Data Science Methodology
Our experts produce new methodologies to further understand how social media affects politics and democracy. By developing and deploying code, CSMaP researchers create new ways to quantify social media interactions and their effects.
Academic Research
-
Journal Article
Quantifying Narrative Similarity Across Languages
Sociological Methods & Research, 2025
How can one understand the spread of ideas across text data? This is a key measurement problem in sociological inquiry, from the study of how interest groups shape media discourse, to the spread of policy across institutions, to the diffusion of organizational structures and institutions themselves. To study how ideas and narratives diffuse across text, we must first develop a method to identify whether texts share the same information and narratives, rather than the same broad themes or exact features. We propose a novel approach to measure this quantity of interest, which we call “narrative similarity,” by using large language models to distill texts to their core ideas and then compare the similarity of claims rather than of words, phrases, or sentences. The result is an estimand much closer to narrative similarity than what is possible with past relevant alternatives, including exact text reuse, which returns lexically similar documents; topic modeling, which returns topically similar documents; or an array of alternative approaches. We devise an approach to providing out-of-sample measures of performance (precision, recall, F1) and show that our approach outperforms relevant alternatives by a large margin. We apply our approach to an important case study: the spread of Russian claims about the development of a Ukrainian bioweapons program in U.S. mainstream and fringe news websites. While we focus on news in this application, our approach can be applied more broadly to the study of propaganda, misinformation, and the diffusion of policy and cultural objects, among other topics.
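As a rough illustration of the two ideas in the abstract, comparing documents claim by claim and scoring the result with precision, recall, and F1, here is a minimal Python sketch. It is not the authors' implementation. It assumes the claims have already been distilled from each text (the step the abstract assigns to a large language model), and it uses a bag-of-words cosine only as a placeholder for a semantic, cross-lingual similarity measure, since plain word overlap is exactly the kind of lexical comparison the paper aims to move beyond. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def _vector(text):
    """Lowercased bag-of-words counts (placeholder for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two claims under the placeholder vectors."""
    va, vb = _vector(a), _vector(b)
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def narrative_similarity(claims_a, claims_b):
    """Mean best-match claim similarity, averaged over both directions.

    claims_a and claims_b are lists of claim strings, assumed to have been
    extracted from each document beforehand.
    """
    if not claims_a or not claims_b:
        return 0.0
    a_to_b = sum(max(cosine(a, b) for b in claims_b) for a in claims_a) / len(claims_a)
    b_to_a = sum(max(cosine(b, a) for a in claims_a) for b in claims_b) / len(claims_b)
    return (a_to_b + b_to_a) / 2


def precision_recall_f1(predicted, actual):
    """Precision, recall, and F1 for boolean predictions against boolean labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)
    fp = sum(p and not a for p, a in pairs)
    fn = sum(a and not p for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1
```

In use, one would score each document pair with `narrative_similarity`, mark pairs above a chosen threshold as sharing a narrative, and pass those predictions with held-out human labels to `precision_recall_f1`. The threshold and the labeled pairs are assumptions of this sketch, not details taken from the article.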
-
Journal Article
Labeling Social Media Posts: Does Showing Coders Multimodal Content Produce Better Human Annotation, and a Better Machine Classifier?
Political Science Research and Methods, 2025
Reports & Analysis
-
Analysis
Are Influence Campaigns Trolling Your Social Media Feeds?
Now, there are ways to find out. New data shows that machine learning can identify content created by online political influence operations.
October 13, 2020
News & Commentary
-
Policy
When it Comes to Understanding AI’s Impact on Elections, We’re Still Working in the Dark
Greater transparency around AI-generated political advertising would transform researchers' ability to understand its potential effects on democracy and elections.
March 4, 2025
-
Policy
Mosaics of Insight: Auditing TikTok Through Independent Data Access
Even if TikTok is sold to a non-Chinese buyer, the threat of foreign influence will remain. That’s why researchers need independent data access.
February 21, 2025