World Happiness Report 2023

Gen 2: Person-Level Sampling of Twitter Feeds

Measurement accuracies can be increased substantively by improving the sampling and aggregation methods, especially by aggregating tweets first to the person level. Person-level sampling addresses the disproportionate impact that a small number of highly active accounts can have on geographic estimates. In addition to person-level sampling, demographic person characteristics (such as age and gender) can be estimated through language, and on their basis, post-stratification weights can be determined, which is similar to the methods used in representative phone surveys (see Fig. 5.6 for a method sketch). This approach shows remarkable improvements in accuracy (see Fig. 5.7).

Gen 2 with Level 1 dictionary/annotation-based methods

One of the earliest examples of Gen 2 evaluated the predictive accuracy of community-level language (as measured with Level 1 dictionaries such as LIWC) across 27 health-related outcomes, such as obesity and mentally unhealthy days.87 Importantly, this work evaluated several aggregation methods, including random samples of posts (Gen 1 methods) and a person-focused approach (Gen 2). This person-focused aggregation significantly outperformed the Gen 1 aggregation methods in terms of out-of-sample predictive accuracy (average Pearson r across all 27 health outcomes): .59 for Gen 1 vs. .63 for Gen 2.

Gen 2 using Level 2 machine learning methods

User-level aggregation.
Some researchers have proposed a Level 2 person-centered approach, which first measures word frequencies at the person level and then averages those frequencies to the county level, effectively yielding a county language average across users.88 Furthermore, through sensitivity analyses, this work calibrated minimum thresholds on both the number of tweets needed per person (30 tweets or more) and the number of people needed per county (at least 100 people) to produce stable county-level language estimates; such thresholds are standard techniques in geo-spatial analysis.89 Across several prediction tasks, including estimating life satisfaction, Gen 2 outperformed Gen 1 approaches, as seen in Fig. 5.7. Additional work has shown that Gen 2 language estimates show high external validity (e.g., language estimates of county-level personality correlate with survey-based measures) and are robust to spatial autocorrelations (i.e., county correlations are not an artifact of, or dependent on, the physical spatial nature of the data).90

Correction for representativeness. One common limitation with work on social media text is selection bias – the social media sample is not representative of the population from which we would like to infer additional information. The person-centered approach has also been expanded to consider who uses social media relative to their respective community. When using state-of-the-art machine learning approaches, sociodemographics (such as age, gender, income, and education) can be estimated for each Twitter user from their social media language, thus allowing for the measurement of the sociodemographic makeup of the sample.91 Comparing the sociodemographic distribution of the sample to the population’s distribution gives a measure of Twitter users’ degree of over- or under-representation.
This comparison can be used to reweight each user’s language estimate in the county-aggregation process using post-stratification techniques commonly used in demography and public health.92 Applying these reweighting techniques to closed-vocabulary (e.g., LIWC dictionaries, Level 1)93 and open-vocabulary features (e.g., LDA topics, Level 2)94 increased predictive accuracy above that of previous Gen 2 methods (see Fig. 5.7, top).
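
The person-level aggregation and reweighting steps described above can be illustrated with a minimal sketch. It is not the cited authors' implementation: the record layout `(county, user_id, tweet_text)`, the whitespace tokenizer, the function names, and the toy demographic cells are all assumptions made for illustration. Only the two thresholds (30 tweets per person, 100 people per county) come from the text, and the weight formula is the generic post-stratification ratio of population share to sample share.

```python
from collections import Counter, defaultdict

# Thresholds reported in the text; everything else here is a hypothetical sketch.
MIN_TWEETS_PER_USER = 30
MIN_USERS_PER_COUNTY = 100


def user_word_frequencies(tweets):
    """Relative word frequencies for one user, pooled over all of their tweets.

    Assumes naive lowercase whitespace tokenization.
    """
    counts = Counter(word for tweet in tweets for word in tweet.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def post_stratification_weights(user_cells, population_share):
    """Weight per user = population share of the user's cell / sample share of that cell.

    user_cells: {user_id: cell_label}, e.g. an estimated age bracket (assumed input).
    population_share: {cell_label: share of the county population}.
    Intended to be called separately for the users of each county.
    """
    cell_counts = Counter(user_cells.values())
    n_users = len(user_cells)
    return {
        user: population_share[cell] / (cell_counts[cell] / n_users)
        for user, cell in user_cells.items()
    }


def county_language_estimates(records, weights=None,
                              min_tweets=MIN_TWEETS_PER_USER,
                              min_users=MIN_USERS_PER_COUNTY):
    """Aggregate tweets to users first, then average users within each county.

    records: iterable of (county, user_id, tweet_text) tuples; each user is
             assumed to belong to a single county.
    weights: optional {user_id: weight}; users without a weight count as 1.0.
    """
    tweets_by_user = defaultdict(list)
    county_of = {}
    for county, user_id, text in records:
        tweets_by_user[user_id].append(text)
        county_of[user_id] = county

    # Keep only users with enough tweets for a stable person-level estimate.
    users_by_county = defaultdict(list)
    for user_id, tweets in tweets_by_user.items():
        if len(tweets) >= min_tweets:
            users_by_county[county_of[user_id]].append(user_id)

    estimates = {}
    for county, users in users_by_county.items():
        if len(users) < min_users:
            continue  # too few people for a stable county-level estimate
        sums = defaultdict(float)
        total_weight = 0.0
        for user_id in users:
            w = 1.0 if weights is None else weights.get(user_id, 1.0)
            total_weight += w
            for word, freq in user_word_frequencies(tweets_by_user[user_id]).items():
                sums[word] += w * freq
        if total_weight > 0:
            estimates[county] = {word: s / total_weight for word, s in sums.items()}
    return estimates


# Toy data: one prolific user (4 tweets) and one occasional user (1 tweet).
records = [("A", "u1", "sad")] * 4 + [("A", "u2", "happy")]
unweighted = county_language_estimates(records, min_tweets=1, min_users=2)
weights = post_stratification_weights({"u1": "18-29", "u2": "30+"},
                                      {"18-29": 0.25, "30+": 0.75})
weighted = county_language_estimates(records, weights, min_tweets=1, min_users=2)
print(unweighted["A"], weighted["A"])
```

In this toy county, pooling posts directly (the Gen 1 style) would give "sad" a frequency of 4/5 = 0.8, because the prolific account dominates. Averaging person-level frequencies gives 0.5, and reweighting toward a population in which that user's demographic cell is only a quarter of residents lowers it further to 0.25. The thresholds are relaxed in the demonstration only because the toy data are tiny.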
