United States - NYU’s Center for Social Media, AI, and Politics

Academic Research

Journal Article
Quantifying Narrative Similarity Across Languages
Hannah Waight, Sol Messing, Anton Shirikov, Margaret E. Roberts, Jonathan Nagler, Jason Greenfield, Megan A. Brown, Kevin Aslett, Joshua A. Tucker
Sociological Methods & Research, 2025

How can one understand the spread of ideas across text data? This is a key measurement problem in sociological inquiry, from the study of how interest groups shape media discourse, to the spread of policy across institutions, to the diffusion of organizational structures and institutions themselves. To study how ideas and narratives diffuse across text, we must first develop a method to identify whether texts share the same information and narratives, rather than the same broad themes or exact features. We propose a novel approach to measure this quantity of interest, which we call “narrative similarity,” by using large language models to distill texts to their core ideas and then compare the similarity of claims rather than of words, phrases, or sentences. The result is an estimand much closer to narrative similarity than what is possible with past relevant alternatives, including exact text reuse, which returns lexically similar documents; topic modeling, which returns topically similar documents; or an array of alternative approaches. We devise an approach to providing out-of-sample measures of performance (precision, recall, F1) and show that our approach outperforms relevant alternatives by a large margin. We apply our approach to an important case study: The spread of Russian claims about the development of a Ukrainian bioweapons program in U.S. mainstream and fringe news websites. While we focus on news in this application, our approach can be applied more broadly to the study of propaganda, misinformation, diffusion of policy and cultural objects, among other topics.
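For readers unfamiliar with the evaluation metrics named in the abstract (precision, recall, F1), here is a minimal sketch of how they are computed for a binary "shares the narrative" label. The function name and the toy labels are hypothetical and are not taken from the article, whose evaluation procedure may differ.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, where 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical held-out labels: 1 = document pair shares a narrative.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Computing the metrics on held-out pairs, rather than on the pairs used to tune the method, is what makes them out-of-sample.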
Area of Study: Data Science Methodology, Foreign Influence Campaigns
Date Posted: Jul 14, 2025
Tags: Methods, Text and Content Analysis, Ukraine, Russia, United States, Large Language Models
Journal Article
Labeling Social Media Posts: Does Showing Coders Multimodal Content Produce Better Human Annotation, and a Better Machine Classifier?
Haohan Chen, James Bisbee, Joshua A. Tucker, Jonathan Nagler
Political Science Research and Methods, 2025

The increasing multimodality (e.g., images, videos, links) of social media data presents opportunities and challenges. But text-as-data methods continue to dominate as modes of classification, as multimodal social media data are costly to collect and label. Researchers who face a budget constraint may need to make informed decisions regarding whether to collect and label only the textual content of social media data or their full multimodal content. In this article, we develop five measures and an experimental framework to assist with these decisions. We propose five performance metrics to measure the costs and benefits of multimodal labeling: average time per post, average time per valid response, valid response rate, intercoder agreement, and classifier’s predictive power. To estimate these measures, we introduce an experimental framework to evaluate coders’ performance under text-only and multimodal labeling conditions. We illustrate the method with a tweet labeling experiment.
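As a rough illustration, four of the five metrics listed in the abstract can be computed from a log of coder responses. Everything below is an assumption made for the sketch: the record layout, the function name, the choice to average time over all responses, and the use of simple percent agreement as the intercoder measure. The article's own definitions may differ, and the fifth metric (classifier predictive power) requires training a model and is not shown.

```python
from collections import defaultdict


def labeling_metrics(responses):
    """Summarize coder responses.

    Each response is a dict with keys "post_id", "coder", "seconds",
    and "label" (None marks an invalid or skipped response).
    """
    total_seconds = sum(r["seconds"] for r in responses)
    valid = [r for r in responses if r["label"] is not None]

    # Collect valid labels per post to measure agreement between coders.
    labels_by_post = defaultdict(list)
    for r in valid:
        labels_by_post[r["post_id"]].append(r["label"])
    multi_coded = [ls for ls in labels_by_post.values() if len(ls) >= 2]
    agreed = sum(1 for ls in multi_coded if len(set(ls)) == 1)

    return {
        "avg_time_per_post": total_seconds / len(responses),
        "avg_time_per_valid_response": (
            total_seconds / len(valid) if valid else float("inf")
        ),
        "valid_response_rate": len(valid) / len(responses),
        "percent_agreement": (
            agreed / len(multi_coded) if multi_coded else float("nan")
        ),
    }


# Hypothetical log: three posts, two coders each, one invalid response.
responses = [
    {"post_id": 1, "coder": "a", "seconds": 10, "label": "political"},
    {"post_id": 1, "coder": "b", "seconds": 14, "label": "political"},
    {"post_id": 2, "coder": "a", "seconds": 8, "label": "other"},
    {"post_id": 2, "coder": "b", "seconds": 12, "label": "political"},
    {"post_id": 3, "coder": "a", "seconds": 6, "label": None},
    {"post_id": 3, "coder": "b", "seconds": 10, "label": "other"},
]
print(labeling_metrics(responses))
```

Running the same function separately on the text-only and multimodal conditions would give the side-by-side comparison the abstract describes. A chance-corrected statistic could replace percent agreement if preferred.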
Area of Study: Data Science Methodology, Elite & Mass Political Behavior
Date Posted: Jul 13, 2025
Tags: Text and Content Analysis, Methods, Twitter/X, United States


Reports & Analysis

Analysis
Who Has a Policy that Would Benefit You? More Voters Say Trump.
National survey data from the 2016, 2020, and 2024 elections shed light on how candidates' campaign strategies impact voter policy recall.

November 2, 2024
Analysis
How Americans’ Confidence in Technology Firms has Dropped
Sean Kates, Jonathan Ladd, Joshua A. Tucker
Results from the American Institutional Confidence poll's second wave show that the public's confidence in technology, and tech companies, has markedly decreased over the past five years.

June 14, 2023


News & Commentary

Commentary
Embracing Platform Transparency in a Digital World to Strengthen Democracy
Joshua A. Tucker
Democracy in the digital age depends on transparent data access to ensure accountability, informed policymaking, and public trust.

February 12, 2026
Commentary
Was there censorship on TikTok after the U.S. takeover?
Benjamin Guinaudeau, Kylan Rutherford, Sol Messing, Molly Roberts, Andreu Casas, Keng-Chi Chang, Hennes Barnehl, Joshua A. Tucker
A TikTok outage more likely explains recent anomalies – there’s no evidence of larger platform changes so far.

February 4, 2026
