Research
CSMaP is a leading academic research institute studying the ever-shifting online environment at scale. We publish peer-reviewed research in top academic journals, produce rigorous reports and analyses on policy-relevant topics, and develop open-source tools and methods to support the broader scholarly community.
Academic Research
-
Journal Article
Quantifying Narrative Similarity Across Languages
Sociological Methods & Research, 2025
How can one understand the spread of ideas across text data? This is a key measurement problem in sociological inquiry, from the study of how interest groups shape media discourse, to the spread of policy across institutions, to the diffusion of organizational structures and institutions themselves. To study how ideas and narratives diffuse across text, we must first develop a method to identify whether texts share the same information and narratives, rather than the same broad themes or exact features. We propose a novel approach to measure this quantity of interest, which we call “narrative similarity,” by using large language models to distill texts to their core ideas and then compare the similarity of claims rather than of words, phrases, or sentences. The result is an estimand much closer to narrative similarity than what is possible with past relevant alternatives, including exact text reuse, which returns lexically similar documents; topic modeling, which returns topically similar documents; or an array of alternative approaches. We devise an approach to providing out-of-sample measures of performance (precision, recall, F1) and show that our approach outperforms relevant alternatives by a large margin. We apply our approach to an important case study: the spread of Russian claims about the development of a Ukrainian bioweapons program in U.S. mainstream and fringe news websites. While we focus on news in this application, our approach can be applied more broadly to the study of propaganda, misinformation, and the diffusion of policy and cultural objects, among other topics.
-
Journal Article
Labeling Social Media Posts: Does Showing Coders Multimodal Content Produce Better Human Annotation, and a Better Machine Classifier?
Political Science Research and Methods, 2025
Reports & Analysis
-
Analysis
Who Has a Policy that Would Benefit You? More Voters Say Trump.
National survey data from the 2016, 2020, and 2024 elections shed light on how candidates' campaign strategies impact voter policy recall.
November 2, 2024
-
Analysis
Reducing Exposure To Misinformation: Evidence from WhatsApp in Brazil
Deactivating multimedia on WhatsApp in Brazil consistently reduced exposure to online misinformation during the pre-election weeks in 2022, but it did not affect whether false news was believed or reduce polarization.
August 16, 2024
Data Collections & Tools
As part of our project to construct comprehensive data sets and empirically test hypotheses related to social media and politics, we have developed a suite of open-source tools and modeling processes.