Academic Research
CSMaP is a leading academic research institute studying the ever-shifting online environment at scale. We publish peer-reviewed research in top academic journals and produce rigorous data reports on policy-relevant topics.
-
Journal Article
Moderating with the Mob: Evaluating the Efficacy of Real-Time Crowdsourced Fact-Checking
Journal of Online Trust and Safety, 2021
Reducing the spread of false news remains a challenge for social media platforms, as the current strategy of using third-party fact-checkers lacks the capacity to address both the scale and speed of misinformation diffusion. Research on the “wisdom of the crowds” suggests one possible solution: aggregating the evaluations of ordinary users to assess the veracity of information. In this study, we investigate the effectiveness of a scalable model for real-time crowdsourced fact-checking. We select 135 popular news stories and have them evaluated by both ordinary individuals and professional fact-checkers within 72 hours of publication, producing 12,883 individual evaluations. Although we find that machine learning-based models using the crowd perform better at identifying false news than simple aggregation rules, our results suggest that neither approach is able to perform at the level of professional fact-checkers. Additionally, both methods perform best when using evaluations only from survey respondents with high political knowledge, suggesting reason for caution for crowdsourced models that rely on a representative sample of the population. Overall, our analyses reveal that while crowd-based systems provide some information on news quality, they are nonetheless limited, with significant variation, in their ability to identify false news.
-
Journal Article
SARS-CoV-2 RNA Concentrations in Wastewater Foreshadow Dynamics and Clinical Presentation of New COVID-19 Cases
Science of the Total Environment, 2022
Current estimates of COVID-19 prevalence are largely based on symptomatic, clinically diagnosed cases. The existence of a large number of undiagnosed infections hampers population-wide investigation of viral circulation. Here, we quantify the SARS-CoV-2 concentration and track its dynamics in wastewater at a major urban wastewater treatment facility in Massachusetts, between early January and May 2020. SARS-CoV-2 was first detected in wastewater on March 3. SARS-CoV-2 RNA concentrations in wastewater correlated with clinically diagnosed new COVID-19 cases, with the trends appearing 4–10 days earlier in wastewater than in clinical data. We inferred viral shedding dynamics by modeling wastewater viral load as a convolution of back-dated new clinical cases with the average population-level viral shedding function. The inferred viral shedding function showed an early peak, likely before symptom onset and clinical diagnosis, consistent with emerging clinical and experimental evidence. This finding suggests that SARS-CoV-2 concentrations in wastewater may be primarily driven by viral shedding early in infection. This work shows that longitudinal wastewater analysis can be used to identify trends in disease transmission in advance of clinical case reporting, and infer early viral shedding dynamics for newly infected individuals, which are difficult to capture in clinical investigations.
-
Journal Article
Twitter Flagged Donald Trump’s Tweets with Election Misinformation: They Continued to Spread Both On and Off the Platform
Harvard Kennedy School (HKS) Misinformation Review, 2021
We analyze the spread of Donald Trump’s tweets that were flagged by Twitter using two intervention strategies—attaching a warning label and blocking engagement with the tweet entirely. We find that while blocking engagement on certain tweets limited their diffusion, messages we examined with warning labels spread further on Twitter than those without labels. Additionally, the messages that had been blocked on Twitter remained popular on Facebook, Instagram, and Reddit, being posted more often and garnering more visibility than messages that had either been labeled by Twitter or received no intervention at all. Taken together, our results emphasize the importance of considering content moderation at the ecosystem level.
-
Journal Article
Testing the Effects of Facebook Usage in an Ethnically Polarized Setting
Proceedings of the National Academy of Sciences, 2021
Despite the belief that social media is altering intergroup dynamics, whether by bringing people closer or by further alienating them from one another, the impact of social media on interethnic attitudes has yet to be rigorously evaluated, especially within areas with tenuous interethnic relations. We report results from a randomized controlled trial in Bosnia and Herzegovina (BiH), exploring the effects of exposure to social media during one week around genocide remembrance in July 2019 on a set of interethnic attitudes of Facebook users. We find evidence that, counter to preregistered expectations, people who deactivated their Facebook profiles report lower regard for ethnic outgroups than those who remained active. Moreover, we present additional evidence suggesting that this effect is likely conditional on the level of ethnic heterogeneity of respondents’ residence. We also extend the analysis to include measures of subjective well-being and knowledge of news. Here, we find that Facebook deactivation leads to suggestive improvements in subjective well-being and a decrease in knowledge of current events, replicating results from recent research in the United States in a very different context, thus increasing our confidence in the generalizability of these effects.
-
Journal Article
Accessibility and Generalizability: Are Social Media Effects Moderated by Age or Digital Literacy?
Research & Politics, 2021
-
Journal Article
The Times They Are Rarely A-Changin': Circadian Regularities in Social Media Use
Journal of Quantitative Description: Digital Media, 2021
-
Journal Article
Cracking Open the News Feed: Exploring What U.S. Facebook Users See and Share with Large-Scale Platform Data
Journal of Quantitative Description: Digital Media, 2021
-
Journal Article
YouTube Recommendations and Effects on Sharing Across Online Social Platforms
Proceedings of the ACM on Human-Computer Interaction, 2021
-
Journal Article
Tweeting Beyond Tahrir: Ideological Diversity and Political Intolerance in Egyptian Twitter Networks
World Politics, 2021
Do online social networks affect political tolerance in the highly polarized climate of postcoup Egypt? Taking advantage of the real-time networked structure of Twitter data, the authors find that not only is greater network diversity associated with lower levels of intolerance, but also that longer exposure to a diverse network is linked to less expression of intolerance over time. The authors find that this relationship persists in both elite and non-elite diverse networks. Exploring the mechanisms by which network diversity might affect tolerance, the authors offer suggestive evidence that social norms in online networks may shape individuals’ propensity to publicly express intolerant attitudes. The findings contribute to the political tolerance literature and enrich the ongoing debate over the relationship between online echo chambers and political attitudes and behavior by providing new insights from a repressive authoritarian context.
-
Journal Article
Political Psychology in the Digital (mis)Information age: A Model of News Belief and Sharing
Social Issues and Policy Review, 2021
The spread of misinformation, including “fake news,” propaganda, and conspiracy theories, represents a serious threat to society, as it has the potential to alter beliefs, behavior, and policy. Research is beginning to disentangle how and why misinformation is spread and identify processes that contribute to this social problem. We propose an integrative model to understand the social, political, and cognitive psychology risk factors that underlie the spread of misinformation and highlight strategies that might be effective in mitigating this problem. However, the spread of misinformation is a rapidly growing and evolving problem; thus scholars need to identify and test novel solutions, and work with policymakers to evaluate and deploy these solutions. Hence, we provide a roadmap for future research to identify where scholars should invest their energy in order to have the greatest overall impact.
-
Journal Article
You Won’t Believe Our Results! But They Might: Heterogeneity in Beliefs About the Accuracy of Online Media
Journal of Experimental Political Science, 2021
“Clickbait” media has long been espoused as an unfortunate consequence of the rise of digital journalism. But little is known about why readers choose to read clickbait stories. Is it merely curiosity, or might voters think such stories are more likely to provide useful information? We conduct a survey experiment in Italy, where a major political party enthusiastically embraced the esthetics of new media and encouraged its supporters to distrust legacy outlets in favor of online news. We offer respondents a monetary incentive for correct answers to manipulate the relative salience of the motivation for accurate information. This incentive increases differences in the preference for clickbait; older and less educated subjects become even more likely to opt to read a story with a clickbait headline when the incentive to produce a factually correct answer is higher. Our model suggests that a politically relevant subset of the population prefers clickbait media because they trust it more.
-
Journal Article
Trumping Hate on Twitter? Online Hate Speech in the 2016 U.S. Election Campaign and its Aftermath
Quarterly Journal of Political Science, 2021
To what extent did online hate speech and white nationalist rhetoric on Twitter increase over the course of Donald Trump's 2016 presidential election campaign and its immediate aftermath? The prevailing narrative suggests that Trump's political rise — and his unexpected victory — lent legitimacy to and popularized bigoted rhetoric that was once relegated to the dark corners of the Internet. However, our analysis of over 750 million tweets related to the election, in addition to almost 400 million tweets from a random sample of American Twitter users, provides systematic evidence that hate speech did not increase on Twitter over this period. Using both machine-learning-augmented dictionary-based methods and a novel classification approach leveraging data from Reddit communities associated with the alt-right movement, we observe no persistent increase in hate speech or white nationalist language either over the course of the campaign or in the six months following Trump's election. While key campaign events and policy announcements produced brief spikes in hateful language, these bursts quickly dissipated. Overall we find no empirical support for the proposition that Trump's divisive campaign or election increased hate speech on Twitter.
-
Data Report
Issue Discussion in the Georgia Senate Elections
Data Report, NYU's Center for Social Media and Politics, 2020
-
Journal Article
Political Knowledge and Misinformation in the Era of Social Media: Evidence From the 2015 UK Election
British Journal of Political Science, 2022
-
Data Report
Influential Users in the Common Core and Black Lives Matter Social Media Conversation
Data Report, NYU's Center for Social Media and Politics, 2020
-
Working Paper
News Sharing on Social Media: Mapping the Ideology of News Media Content, Citizens, and Politicians
Working Paper, November 2020
-
Journal Article
Political Sectarianism in America
Science, 2020
Political polarization, a concern in many countries, is especially acrimonious in the United States. For decades, scholars have studied polarization as an ideological matter — how strongly Democrats and Republicans diverge vis-à-vis political ideals and policy goals. Such competition among groups in the marketplace of ideas is a hallmark of a healthy democracy. But more recently, researchers have identified a second type of polarization, one focusing less on triumphs of ideas than on dominating the abhorrent supporters of the opposing party. This literature has produced a proliferation of insights and constructs but few interdisciplinary efforts to integrate them. We offer such an integration, pinpointing the superordinate construct of political sectarianism and identifying its three core ingredients: othering, aversion, and moralization. We then consider the causes of political sectarianism and its consequences for U.S. society — especially the threat it poses to democracy. Finally, we propose interventions for minimizing its most corrosive aspects.
-
Working Paper
A Comparison of Methods in Political Science Text Classification: Transfer Learning Language Models for Politics
Working Paper, October 2020
Automated text classification has rapidly become an important tool for political analysis. Recent advances in NLP, enabled by deep learning, now achieve state-of-the-art results on many standard tasks in the field. However, these methods require large amounts of both computing power and text data to learn the characteristics of the language, resources which are not always accessible to political scientists. One solution is a transfer learning approach, where knowledge learned in one area or source task is transferred to another area or a target task. One class of models that embodies this approach is language models, which demonstrate extremely high levels of performance. We investigate the performance of these models in political science by comparing multiple text classification methods. We find that RoBERTa and XLNet, language models that rely on the Transformer, require fewer computing resources and less training data to perform on par with, or outperform, several political science text classification methods. Moreover, we find that the increase in accuracy is especially significant in the case of small labeled datasets, highlighting the potential for reducing the data-labeling cost of supervised methods for political scientists via the use of pretrained language models.
-
Working Paper
-
Data Report
Online Issue Politicization: How the Common Core and Black Lives Matter Discussions Evolved on Social Media
Data Report, NYU's Center for Social Media and Politics, 2020
-
Book
Social Media and Democracy: The State of the Field, Prospects for Reform
Cambridge University Press, 2020
-
Journal Article
Content-Based Features Predict Social Media Influence Operations
Science Advances, 2020
-
Journal Article
Cross-Platform State Propaganda: Russian Trolls on Twitter and YouTube During the 2016 U.S. Presidential Election
The International Journal of Press/Politics, 2020
This paper investigates online propaganda strategies of the Internet Research Agency (IRA)—Russian “trolls”—during the 2016 U.S. presidential election. We assess claims that the IRA sought either to (1) support Donald Trump or (2) sow discord among the U.S. public by analyzing hyperlinks contained in 108,781 IRA tweets. Our results show that although IRA accounts promoted links to both sides of the ideological spectrum, “conservative” trolls were more active than “liberal” ones. The IRA also shared content across social media platforms, particularly YouTube—the second-most linked destination among IRA tweets. Although overall news content shared by trolls leaned moderate to conservative, we find troll accounts on both sides of the ideological spectrum, and these accounts maintain their political alignment. Links to YouTube videos were decidedly conservative, however. While mixed, this evidence is consistent with the IRA’s supporting the Republican campaign, but the IRA’s strategy was multifaceted, with an ideological division of labor among accounts. We contextualize these results as consistent with a pre-propaganda strategy. This work demonstrates the need to view political communication in the context of the broader media ecology, as governments exploit the interconnected information ecosystem to pursue covert propaganda strategies.
-
Journal Article
Automated Text Classification of News Articles: A Practical Guide
Political Analysis, 2021
Automated text analysis methods have made possible the classification of large corpora of text by measures such as topic and tone. Here, we provide a guide to help researchers navigate the consequential decisions they need to make before any measure can be produced from the text. We consider, both theoretically and empirically, the effects of such choices using as a running example efforts to measure the tone of New York Times coverage of the economy. We show that two reasonable approaches to corpus selection yield radically different corpora and we advocate for the use of keyword searches rather than predefined subject categories provided by news archives. We demonstrate the benefits of coding using article segments instead of sentences as units of analysis. We show that, given a fixed number of codings, it is better to increase the number of unique documents coded rather than the number of coders for each document. Finally, we find that supervised machine learning algorithms outperform dictionaries on a number of criteria. Overall, we intend this guide to serve as a reminder to analysts that thoughtfulness and human validation are key to text-as-data methods, particularly in an age when it is all too easy to computationally classify texts without attending to the methodological choices therein.
-
Journal Article
Using Social and Behavioral Science to Support COVID-19 Pandemic Response
Nature Human Behaviour, 2020