Data Science Methodology - NYU’s Center for Social Media and Politics

Academic Research

Journal Article
Survey Professionalism: New Evidence from Web Browsing Data
Bernhard Clemm von Hohenberg,

Tiago Ventura,

Jonathan Nagler,

Ericka Menchen-Trevino,

Magdalena Wojcieszak
Political Analysis, 2025

Online panels have become an important resource for research in political science, but the compensation offered to panelists incentivizes them to become “survey professionals,” raising concerns about data quality. We provide evidence on survey professionalism exploring three US samples of subjects who donated their browsing data, recruited via Lucid, YouGov, and Facebook (total 𝑛=3,886). Survey professionalism is common, but varies across samples: by our most conservative estimate, we find 1.7% of respondents on Facebook, 7.6% on YouGov, and 34.7% on Lucid to be professionals (under the assumption that professionals are as likely as non-professionals to donate data after conditioning on observable demographics available from all online survey takers). However, evidence that professionals lower data quality is limited: they do not systematically differ demographically or politically from non-professionals and do not exhibit more response instability. They are, however, somewhat more likely to speed, straightline, and attempt to take questionnaires repeatedly. To address potential selection issues in donating of browsing data, we present sensitivity analyses with lower bounds for survey professionalism. While concerns about professionalism are warranted, we conclude that survey professionals do not, by and large, distort inferences of research based on online panels.
Area of Study

Data Science Methodology

Online Information Environment
Date Posted

Oct 06, 2025
Journal Article
Quantifying Narrative Similarity Across Languages
Hannah Waight,

Sol Messing,

Anton Shirikov,

Margaret E. Roberts,

Jonathan Nagler,

Jason Greenfield,

Megan A. Brown,

Kevin Aslett,

Joshua A. Tucker
Sociological Methods & Research, 2025

How can one understand the spread of ideas across text data? This is a key measurement problem in sociological inquiry, from the study of how interest groups shape media discourse, to the spread of policy across institutions, to the diffusion of organizational structures and institutions themselves. To study how ideas and narratives diffuse across text, we must first develop a method to identify whether texts share the same information and narratives, rather than the same broad themes or exact features. We propose a novel approach to measure this quantity of interest, which we call “narrative similarity,” by using large language models to distill texts to their core ideas and then compare the similarity of claims rather than of words, phrases, or sentences. The result is an estimand much closer to narrative similarity than what is possible with past relevant alternatives, including exact text reuse, which returns lexically similar documents; topic modeling, which returns topically similar documents; or an array of alternative approaches. We devise an approach to providing out-of-sample measures of performance (precision, recall, F1) and show that our approach outperforms relevant alternatives by a large margin. We apply our approach to an important case study: The spread of Russian claims about the development of a Ukrainian bioweapons program in U.S. mainstream and fringe news websites. While we focus on news in this application, our approach can be applied more broadly to the study of propaganda, misinformation, diffusion of policy and cultural objects, among other topics.
Area of Study

Data Science Methodology

Foreign Influence Campaigns
Date Posted

Jul 14, 2025
Tags

Methods,

Text and Content Analysis,

Ukraine,

Russia,

United States,

Large Language Models


Reports & Analysis

Analysis
Are Influence Campaigns Trolling Your Social Media Feeds?
- Meysam Alizadeh,
- Cody L. Buntain,
- Jacob N. Shapiro,
- Joshua A. Tucker
Now, there are ways to find out. New data shows that machine learning can identify content created by online political influence operations.

October 13, 2020


News & Commentary

Commentary
Platform-Independent Experiments on Social Media
- Joshua A. Tucker,
- Jennifer Allen
Two of our core faculty, Joshua Tucker and Jenny Allen, published a perspectives piece in Science responding to the recent article, "Reranking partisan animosity in algorithmic social media feeds alters affective polarization."

November 27, 2025
Policy
Comments on Ofcom’s Call for Evidence on Researcher Access
We responded to Ofcom’s public request for evidence on researcher access to online service data for safety research, highlighting barriers researchers face when accessing social media data, the challenges of limited information sharing, potential ways to improve data access, and examples of robust data-sharing practices.

July 26, 2025

