Academic Research
CSMaP faculty, postdoctoral fellows, and students publish rigorous, peer-reviewed research in top academic journals and post working papers sharing ongoing work.
-
Journal Article
The Diffusion and Reach of (Mis)Information on Facebook During the U.S. 2020 Election
Sociological Science, 2024
Social media creates the possibility for rapid, viral spread of content, but how many posts actually reach millions? And is misinformation special in how it propagates? We answer these questions by analyzing the virality of and exposure to information on Facebook during the U.S. 2020 presidential election. We examine the diffusion trees of the approximately 1 B posts that were re-shared at least once by U.S.-based adults from July 1, 2020, to February 1, 2021. We differentiate misinformation from non-misinformation posts to show that (1) misinformation diffused more slowly, relying on a small number of active users that spread misinformation via long chains of peer-to-peer diffusion that reached millions; non-misinformation spread primarily through one-to-many affordances (mainly, Pages); (2) the relative importance of peer-to-peer spread for misinformation was likely due to an enforcement gap in content moderation policies designed to target mostly Pages and Groups; and (3) periods of aggressive content moderation proximate to the election coincide with dramatic drops in the spread and reach of misinformation and (to a lesser extent) political content.
-
Journal Article
News Sharing on Social Media: Mapping the Ideology of News Media, Politicians, and the Mass Public
Political Analysis, 2024
-
Journal Article
The Trump Advantage in Policy Recall Among Voters
American Politics Research, 2024
Research in political science suggests campaigns have a minimal effect on voters’ attitudes and vote choice. We evaluate the effectiveness of the 2016 Trump and Clinton campaigns at informing voters by giving respondents an opportunity to name policy positions of candidates that they felt would make them better off. The relatively high rate at which respondents were able to name a Trump policy that would make them better off suggests that the success of his campaign can be partly attributed to its ability to communicate memorable information. Our evidence also suggests that cable television informed voters: respondents exposed to higher levels of liberal news were more likely to be able to name Clinton policies, and voters exposed to higher levels of conservative news were more likely to name Trump policies; these effects hold even conditioning on respondents’ ideology and exposure to mainstream media. Our results demonstrate the advantages of using novel survey questions and provide additional insights into the 2016 campaign that challenge one part of the conventional narrative about the presumed non-importance of operational ideology.
-
Journal Article
A Multi-Stakeholder Approach for Leveraging Data Portability to Support Research on the Digital Information Environment
Journal of Online Trust and Safety, 2024
In this paper, we aim to situate data portability within the evolving discussions of how to support data access for researchers studying the digital information environment. We explore how data donations, enabled by existing data access rights and data portability requirements, provide promising opportunities for supporting research on critical trust and safety topics. Evaluating other data access mechanisms that are more central to policy debates about platform transparency, we argue that data donations are a powerful additional mechanism that offer key legal, ethical, and scientific benefits. We then assess current challenges with using data donations for research and offer recommendations for various stakeholders to better align portability mechanisms with the needs of research. Taken together, we argue that although portability is often considered within a context of competition and user agency, regulators, industry actors, and researchers should understand and leverage portability’s potential impact to empower critical research on the societal impacts of digital platforms and services.
-
Journal Article
Digital Town Square? Nextdoor's Offline Contexts and Online Discourse
Journal of Quantitative Description: Digital Media, 2024
There is scant quantitative research describing Nextdoor, the world's largest and most important hyperlocal social media network. Due to its localized structure, Nextdoor data are notoriously difficult to collect and work with. We build multiple datasets that allow us to generate descriptive analyses of the platform's offline contexts and online content. We first create a comprehensive dataset of all Nextdoor neighborhoods joined with U.S. Census data, which we analyze at the community level (block group). Our findings suggest that Nextdoor is primarily used in communities where the populations are whiter, more educated, more likely to own a home, and have higher levels of average income, potentially impacting the platform's ability to create new opportunities for social capital formation and citizen engagement. At the same time, Nextdoor neighborhoods are more likely to have active government agency accounts, and law enforcement agency accounts in particular, where offline communities are more urban, have larger nonwhite populations, greater income inequality, and higher average home values. We then build a convenience sample of 30 Nextdoor neighborhoods, for which we collect daily posts and comments appearing in the feed (115,716 posts and 163,903 comments), as well as associated metadata. Among the accounts for which we collected posts and comments, posts seeking or offering services were the most frequent, while those reporting potentially suspicious people or activities received the highest average number of comments. Taken together, our study describes the ecosystem of and discussion on Nextdoor and introduces data for quantitatively studying the platform.
-
Journal Article
The Effects of Facebook and Instagram on the 2020 Election: A Deactivation Experiment
Proceedings of the National Academy of Sciences, 2024
We study the effect of Facebook and Instagram access on political beliefs, attitudes, and behavior by randomizing a subset of 19,857 Facebook users and 15,585 Instagram users to deactivate their accounts for 6 weeks before the 2020 U.S. election. We report four key findings. First, both Facebook and Instagram deactivation reduced an index of political participation (driven mainly by reduced participation online). Second, Facebook deactivation had no significant effect on an index of knowledge, but secondary analyses suggest that it reduced knowledge of general news while possibly also decreasing belief in misinformation circulating online. Third, Facebook deactivation may have reduced self-reported net votes for Trump, though this effect does not meet our preregistered significance threshold. Finally, the effects of both Facebook and Instagram deactivation on affective and issue polarization, perceived legitimacy of the election, candidate favorability, and voter turnout were all precisely estimated and close to zero.
-
Journal Article
Estimating the Ideology of Political YouTube Videos
Political Analysis, 2024
We present a method for estimating the ideology of political YouTube videos. As online media increasingly influences how people engage with politics, so does the importance of quantifying the ideology of such media for research. The subfield of estimating ideology as a latent variable has often focused on traditional actors such as legislators, while more recent work has used social media data to estimate the ideology of ordinary users, political elites, and media sources. We build on this work by developing a method to estimate the ideologies of YouTube videos, an important subset of media, based on their accompanying text metadata. First, we take Reddit posts linking to YouTube videos and use correspondence analysis to place those videos in an ideological space. We then train a text-based model with those estimated ideologies as training labels, enabling us to estimate the ideologies of videos not posted on Reddit. These predicted ideologies are then validated against human labels. Finally, we demonstrate the utility of this method by applying it to the watch histories of survey respondents with self-identified ideologies to evaluate the prevalence of echo chambers on YouTube. Our approach gives video-level scores based only on supplied text metadata, is scalable, and can be easily adjusted to account for changes in the ideological climate. This method could also be generalized to estimate the ideology of other items referenced or posted on Reddit.
-
Journal Article
Online Searches to Evaluate Misinformation Can Increase its Perceived Veracity
Nature, 2024
Considerable scholarly attention has been paid to understanding belief in online misinformation, with a particular focus on social networks. However, the dominant role of search engines in the information environment remains underexplored, even though the use of online search to evaluate the veracity of information is a central component of media literacy interventions. Although conventional wisdom suggests that searching online when evaluating misinformation would reduce belief in it, there is little empirical evidence to evaluate this claim. Here, across five experiments, we present consistent evidence that online search to evaluate the truthfulness of false news articles actually increases the probability of believing them. To shed light on this relationship, we combine survey data with digital trace data collected using a custom browser extension. We find that the search effect is concentrated among individuals for whom search engines return lower-quality information. Our results indicate that those who search online to evaluate misinformation risk falling into data voids, or informational spaces in which there is corroborating evidence from low-quality sources. We also find consistent evidence that searching online to evaluate news increases belief in true news from low-quality sources, but inconsistent evidence that it increases belief in true news from mainstream sources. Our findings highlight the need for media literacy programmes to ground their recommendations in empirically tested strategies and for search engines to invest in solutions to the challenges identified here.
-
Journal Article
A Synthesis of Evidence for Policy from Behavioural Science During COVID-19
Nature, 2023
Scientific evidence regularly guides policy decisions, with behavioural science increasingly part of this process. In April 2020, an influential paper proposed 19 policy recommendations (‘claims’) detailing how evidence from behavioural science could contribute to efforts to reduce impacts and end the COVID-19 pandemic. Here we assess 747 pandemic-related research articles that empirically investigated those claims. We report the scale of evidence and whether evidence supports them to indicate applicability for policymaking. Two independent teams, involving 72 reviewers, found evidence for 18 of 19 claims, with both teams finding evidence supporting 16 (89%) of those 18 claims. The strongest evidence supported claims that anticipated culture, polarization and misinformation would be associated with policy effectiveness. Claims suggesting trusted leaders and positive social norms increased adherence to behavioural interventions also had strong empirical support, as did appealing to social consensus or bipartisan agreement. Targeted language in messaging yielded mixed effects and there were no effects for highlighting individual benefits or protecting others. No available evidence existed to assess any distinct differences in effects between using the terms ‘physical distancing’ and ‘social distancing’. Analysis of 463 papers containing data showed generally large samples; 418 involved human participants with a mean of 16,848 (median of 1,699). That statistical power underscored improved suitability of behavioural science research for informing policy decisions. Furthermore, by implementing a standardized approach to evidence selection and synthesis, we amplify broader implications for advancing scientific evidence in policy formulation and prioritization.
-
Journal Article
Testing the Effect of Information on Discerning the Veracity of News in Real Time
Journal of Experimental Political Science, 2023
Despite broad adoption of digital media literacy interventions that provide online users with more information when consuming news, relatively little is known about the effect of this additional information on the discernment of news veracity in real time. Gaining a comprehensive understanding of how information impacts discernment of news veracity has been hindered by challenges of external and ecological validity. Using a series of pre-registered experiments, we measure this effect in real time. Access to the full article relative to solely the headline/lede and access to source information improves an individual's ability to correctly discern the veracity of news. We also find that encouraging individuals to search online increases belief in both false/misleading and true news. Taken together, we provide a generalizable method for measuring the effect of information on news discernment, as well as crucial evidence for practitioners developing strategies for improving the public's digital media literacy.
-
Journal Article
Replicating the Effects of Facebook Deactivation in an Ethnically Polarized Setting
Research & Politics, 2023
The question of how social media usage impacts societal polarization continues to generate great interest among both the research community and broader public. Nevertheless, there are still very few rigorous empirical studies of the causal impact of social media usage on polarization. To explore this question, we replicate the only published study to date that tests the effects of social media cessation on interethnic attitudes (Asimovic et al., 2021). In a study situated in Bosnia and Herzegovina, the authors found that deactivating from Facebook for a week around genocide commemoration in Bosnia and Herzegovina had a negative effect on users’ attitudes toward ethnic outgroups, with the negative effect driven by users with more ethnically homogenous offline networks. Does this finding extend to other settings? In a pre-registered replication study, we implement the same research design in a different ethnically polarized setting: Cyprus. We are not able to replicate the main effect found in Asimovic et al. (2021): in Cyprus, we cannot reject the null hypothesis of no effect. We do, however, find a significant interaction between the heterogeneity of users’ offline networks and the deactivation treatment within our 2021 subsample, consistent with the pattern from Bosnia and Herzegovina. We also find support for recent findings (Allcott et al., 2020; Asimovic et al., 2021) that Facebook deactivation leads to a reduction in anxiety levels and suggestive evidence of a reduction in knowledge of current news, though the latter is again limited to our 2021 subsample.
-
Journal Article
Like-Minded Sources On Facebook Are Prevalent But Not Polarizing
Nature, 2023
Many critics raise concerns about the prevalence of ‘echo chambers’ on social media and their potential role in increasing political polarization. However, the lack of available data and the challenges of conducting large-scale field experiments have made it difficult to assess the scope of the problem. Here we present data from 2020 for the entire population of active adult Facebook users in the USA showing that content from ‘like-minded’ sources constitutes the majority of what people see on the platform, although political information and news represent only a small fraction of these exposures. To evaluate a potential response to concerns about the effects of echo chambers, we conducted a multi-wave field experiment on Facebook among 23,377 users for whom we reduced exposure to content from like-minded sources during the 2020 US presidential election by about one-third. We found that the intervention increased their exposure to content from cross-cutting sources and decreased exposure to uncivil language, but had no measurable effects on eight preregistered attitudinal measures such as affective polarization, ideological extremity, candidate evaluations and belief in false claims. These precisely estimated results suggest that although exposure to content from like-minded sources on social media is common, reducing its prevalence during the 2020 US presidential election did not correspondingly reduce polarization in beliefs or attitudes.
-
Journal Article
Asymmetric Ideological Segregation In Exposure To Political News on Facebook
Science, 2023
Does Facebook enable ideological segregation in political news consumption? We analyzed exposure to news during the US 2020 election using aggregated data for 208 million US Facebook users. We compared the inventory of all political news that users could have seen in their feeds with the information that they saw (after algorithmic curation) and the information with which they engaged. We show that (i) ideological segregation is high and increases as we shift from potential exposure to actual exposure to engagement; (ii) there is an asymmetry between conservative and liberal audiences, with a substantial corner of the news ecosystem consumed exclusively by conservatives; and (iii) most misinformation, as identified by Meta’s Third-Party Fact-Checking Program, exists within this homogeneously conservative corner, which has no equivalent on the liberal side. Sources favored by conservative audiences were more prevalent on Facebook’s news ecosystem than those favored by liberals.
-
Journal Article
Measuring the Ideology of Audiences for Web Links and Domains Using Differentially Private Engagement Data
Proceedings of the International AAAI Conference on Web and Social Media, 2023
This paper demonstrates the use of differentially private hyperlink-level engagement data for measuring ideologies of audiences for web domains, individual links, or aggregations thereof. We examine a simple metric for measuring this ideological position and assess the conditions under which the metric is robust to injected, privacy-preserving noise. This assessment provides insights into and constraints on the level of activity one should observe when applying this metric to privacy-protected data. Grounding this work is a massive dataset of social media engagement activity where privacy-preserving noise has been injected into the activity data, provided by Facebook and the Social Science One (SS1) consortium. Using this dataset, we validate our ideology measures by comparing to similar, published work on sharing-based, homophily- and content-oriented measures, where we show consistently high correlation (>0.87). We then apply this metric to individual links from several popular news domains and demonstrate how one can assess link-level distributions of ideological audiences. We further show this estimator is robust to selection of engagement types besides sharing, where domain-level audience-ideology assessments based on views and likes show no significant difference compared to sharing-based estimates. Estimates of partisanship, however, suggest the viewing audience is more moderate than the audiences who share and like these domains. Beyond providing thresholds on sufficient activity for measuring audience ideology and comparing three types of engagement, this analysis provides a blueprint for ensuring robustness of future work to differential privacy protections.
-
Journal Article
Exposure to the Russian Internet Research Agency Foreign Influence Campaign on Twitter in the 2016 US Election and Its Relationship to Attitudes and Voting Behavior
Nature Communications, 2023
There is widespread concern that foreign actors are using social media to interfere in elections worldwide. Yet data have been unavailable to investigate links between exposure to foreign influence campaigns and political behavior. Using longitudinal survey data from US respondents linked to their Twitter feeds, we quantify the relationship between exposure to the Russian foreign influence campaign and attitudes and voting behavior in the 2016 US election. We demonstrate, first, that exposure to Russian disinformation accounts was heavily concentrated: only 1% of users accounted for 70% of exposures. Second, exposure was concentrated among users who strongly identified as Republicans. Third, exposure to the Russian influence campaign was eclipsed by content from domestic news media and politicians. Finally, we find no evidence of a meaningful relationship between exposure to the Russian foreign influence campaign and changes in attitudes, polarization, or voting behavior. The results have implications for understanding the limits of election interference campaigns on social media.
-
Journal Article
Dictionary-Assisted Supervised Contrastive Learning
Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2022
Text analysis in the social sciences often involves using specialized dictionaries to reason with abstract concepts, such as perceptions about the economy or abuse on social media. These dictionaries allow researchers to impart domain knowledge and note subtle usages of words relating to a concept(s) of interest. We introduce the dictionary-assisted supervised contrastive learning (DASCL) objective, allowing researchers to leverage specialized dictionaries when fine-tuning pretrained language models. The text is first keyword simplified: a common, fixed token replaces any word in the corpus that appears in the dictionary(ies) relevant to the concept of interest. During fine-tuning, a supervised contrastive objective draws closer the embeddings of the original and keyword-simplified texts of the same class while pushing further apart the embeddings of different classes. The keyword-simplified texts of the same class are more textually similar than their original text counterparts, which additionally draws the embeddings of the same class closer together. Combining DASCL and cross-entropy improves classification performance metrics in few-shot learning settings and social science applications compared to using cross-entropy alone and alternative contrastive and data augmentation methods.
-
Journal Article
Using Social Media Data to Reveal Patterns of Policy Engagement in State Legislatures
State Politics & Policy Quarterly, 2022
-
Journal Article
Most Users Do Not Follow Political Elites on Twitter; Those Who Do, Show Overwhelming Preferences for Ideological Congruity.
Science Advances, 2022
We offer comprehensive evidence of preferences for ideological congruity when people engage with politicians, pundits, and news organizations on social media. Using four years of data (2016-2019) from a random sample of 1.5 million Twitter users, we examine three behaviors studied separately to date: (a) following of in-group vs. out-group elites, (b) sharing in-group vs. out-group information (retweeting), and (c) commenting on the shared information (quote tweeting). We find the majority of users (60%) do not follow any political elites. Those who do follow in-group elite accounts at much higher rates than out-group accounts (90% vs. 10%), share information from in-group elites 13 times more frequently than from out-group elites, and often add negative comments to the shared out-group information. Conservatives are twice as likely as liberals to share in-group vs. out-group content. These patterns are robust and emerge across issues and political elites, regardless of users' ideological extremity.
-
Journal Article
Election Fraud, YouTube, and Public Perception of the Legitimacy of President Biden
Journal of Online Trust and Safety, 2022
Skepticism about the outcome of the 2020 presidential election in the United States led to a historic attack on the Capitol on January 6th, 2021, and represents one of the greatest challenges to America's democratic institutions in over a century. Narratives of fraud and conspiracy theories proliferated over the fall of 2020, finding fertile ground across online social networks, although little is known about the extent and drivers of this spread. In this article, we show that users who were more skeptical of the election's legitimacy were more likely to be recommended content that featured narratives about the legitimacy of the election. Our findings underscore the tension between an "effective" recommendation system that provides users with the content they want, and a dangerous mechanism by which misinformation, disinformation, and conspiracies can find their way to those most likely to believe them.
-
Journal Article
What We Learned About The Gateway Pundit from its Own Web Traffic Data
Workshop Proceedings of the 16th International AAAI Conference on Web and Social Media, 2022
To mitigate the spread of false news, researchers need to understand who visits low-quality news sites, what brings people to those sites, and what content they prefer to consume. Due to challenges in observing most direct website traffic, existing research primarily relies on alternative data sources, such as engagement signals from social media posts. However, such signals are at best only proxies for actual website visits. During an audit of far-right news websites, we discovered that The Gateway Pundit (TGP) has made its web traffic data publicly available, giving us a rare opportunity to understand what news pages people actually visit. We collected 68 million web traffic visits to the site over a one-month period and analyzed how people consume news via multiple features. Our referral analysis shows that search engines and social media platforms are the main drivers of traffic; our geo-location analysis reveals that TGP is more popular in counties where more people voted for Trump in 2020. In terms of content, topics related to the 2020 US presidential election and the 2021 US Capitol riot have the highest average number of visits. We also use these data to quantify to what degree social media engagement signals correlate with actual web visit counts. To do so, we collect Facebook and Twitter posts with URLs from TGP during the same time period. We show that all engagement signals positively correlate with web visit counts, but with varying correlation strengths. For example, total interaction on Facebook correlates better than Twitter retweet count. Our insights can also help researchers choose the right metrics when they measure the impact of news URLs on social media.
-
Journal Article
News Credibility Labels Have Limited Average Effects on News Diet Quality and Fail to Reduce Misperceptions
Science Advances, 2022
As the primary arena for viral misinformation shifts toward transnational threats, the search continues for scalable countermeasures compatible with principles of transparency and free expression. We conducted a randomized field experiment evaluating the impact of source credibility labels embedded in users’ social feeds and search results pages. By combining representative surveys (n = 3337) and digital trace data (n = 968) from a subset of respondents, we provide a rare ecologically valid test of such an intervention on both attitudes and behavior. On average across the sample, we are unable to detect changes in real-world consumption of news from low-quality sources after 3 weeks. We can also rule out small effects on perceived accuracy of popular misinformation spread about the Black Lives Matter movement and coronavirus disease 2019. However, we present suggestive evidence of a substantively meaningful increase in news diet quality among the heaviest consumers of misinformation. We discuss the implications of our findings for scholars and practitioners.