Academic Research
CSMAP faculty, postdoctoral fellows, and students publish rigorous, peer-reviewed research in top academic journals and post working papers sharing ongoing work.
-
Journal Article
State Media Control Influences Large Language Models
Nature, 2026
Millions of people around the world query large language models (LLMs) for information. Although several studies have compellingly documented the persuasive potential of these models, there is limited evidence of who or what influences the models themselves, leading to a flurry of concerns about which companies and governments build and regulate the models. Here we show through six studies that government control of the media across the world already influences the output of LLMs via their training data. We use a cross-national audit to show that LLMs exhibit a stronger pro-government valence when prompted in the languages of countries with lower media freedom than in those with higher media freedom. This result is correlational, so to triangulate the specific mechanism of how state media control can influence LLMs, we develop a multi-part case study on China’s media. We demonstrate that media scripted and curated by the Chinese state appears in LLM training datasets. To evaluate the plausible effect of this inclusion, we use an open-weight model to show that additional pretraining on Chinese state-coordinated media generates more positive answers to prompts about Chinese political institutions and leaders. We link this phenomenon to commercial models through two audit studies demonstrating that prompting models in Chinese generates more positive responses about China’s institutions and leaders than do the same queries in English. The combination of influence and persuasive potential across languages suggests the troubling conclusion that states and powerful institutions have increased strategic incentives to leverage media control in the hopes of shaping LLM output.
-
Working Paper
Artificial Intelligence, Politics, and Political Science
Working Paper, 2026
This forthcoming edited volume (Cambridge University Press) examines the transformative impact of artificial intelligence on democratic institutions, political behavior, governance, and the discipline of political science itself. The volume represents the report of the American Political Science Association’s Presidential Task Force on AI, Politics, and Political Science, co-chaired by Joshua Tucker and Nathaniel Persily.
Across twelve chapters produced by close to 60 scholars, the report evaluates how generative AI and machine learning systems are reshaping public opinion formation, political communication, labor markets, electoral processes, state capacity, and regulatory frameworks. The authors analyze both the opportunities and risks posed by AI technologies, including concerns surrounding information integrity, ideological personalization, surveillance, democratic accountability, and concentrated technological power. Themes that cut across multiple chapters include: the unprecedented power of a small number of AI corporations; the opacity and non-replicability of model outputs; bias in AI systems; and the absence of agreed-upon benchmarks for evaluation. The volume also addresses methodological and ethical implications for political science research, emphasizing transparency, reproducibility, and the responsible integration of AI tools into scholarly inquiry. Ultimately, the volume argues that AI will not only alter political institutions and citizen-state relations, but also may fundamentally reshape how political knowledge is produced and interpreted. It calls for sustained interdisciplinary collaboration and evidence-based governance to ensure that AI development supports democratic resilience rather than undermining it.
-
Working Paper
Polarization by Default: Auditing Recommendation Bias in LLM-Based Content Curation
Working Paper, 2026
Large Language Models (LLMs) are increasingly deployed to curate and rank human-created content, yet the nature and structure of their biases in these tasks remain poorly understood: which biases are robust across providers and platforms, and which can be mitigated through prompt design. We present a controlled simulation study mapping content selection biases across three major LLM providers (OpenAI, Anthropic, Google) on real social media datasets from Twitter/X, Bluesky, and Reddit, using six prompting strategies (general, popular, engaging, informative, controversial, neutral). Through 540,000 simulated top-10 selections from pools of 100 posts across 54 experimental conditions, we find that biases differ substantially in how structural and how prompt-sensitive they are. Polarization is amplified across all configurations, toxicity handling shows a strong inversion between engagement- and information-focused prompts, and sentiment biases are predominantly negative. Provider comparisons reveal distinct trade-offs: GPT-4o Mini shows the most consistent behavior across prompts; Claude and Gemini exhibit high adaptivity in toxicity handling; Gemini shows the strongest negative sentiment preference. On Twitter/X, where author demographics can be inferred from profile bios, political leaning bias is the clearest demographic signal: left-leaning authors are systematically over-represented despite right-leaning authors forming the pool plurality in the dataset, and this pattern largely persists across prompts.
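The 54 experimental conditions in this abstract are consistent with a full crossing of the three providers, three platforms, and six prompting strategies. The short Python sketch below enumerates that grid. The even split of the 540,000 selections across conditions is an assumption made here for illustration, not something the abstract states.

```python
from itertools import product

# Factor levels as named in the abstract.
providers = ["OpenAI", "Anthropic", "Google"]
platforms = ["Twitter/X", "Bluesky", "Reddit"]
prompts = ["general", "popular", "engaging",
           "informative", "controversial", "neutral"]

# Full factorial crossing of provider x platform x prompt.
conditions = list(product(providers, platforms, prompts))

# Assumption (not stated in the abstract): selections are spread
# evenly across conditions.
total_selections = 540_000
per_condition = total_selections // len(conditions)

print(len(conditions))   # number of experimental conditions
print(per_condition)     # selections per condition under the even-split assumption
```

Under that assumption, each condition would account for 10,000 simulated top-10 selections.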
-
Working Paper
AI summaries in social media improve dialogue but reduce engagement
Working Paper, 2026
-
Journal Article
Quantifying Narrative Similarity Across Languages
Sociological Methods & Research, 2025
How can one understand the spread of ideas across text data? This is a key measurement problem in sociological inquiry, from the study of how interest groups shape media discourse, to the spread of policy across institutions, to the diffusion of organizational structures and institutions themselves. To study how ideas and narratives diffuse across text, we must first develop a method to identify whether texts share the same information and narratives, rather than the same broad themes or exact features. We propose a novel approach to measure this quantity of interest, which we call “narrative similarity,” by using large language models to distill texts to their core ideas and then compare the similarity of claims rather than of words, phrases, or sentences. The result is an estimand much closer to narrative similarity than what is possible with past relevant alternatives, including exact text reuse, which returns lexically similar documents; topic modeling, which returns topically similar documents; or an array of alternative approaches. We devise an approach to providing out-of-sample measures of performance (precision, recall, F1) and show that our approach outperforms relevant alternatives by a large margin. We apply our approach to an important case study: The spread of Russian claims about the development of a Ukrainian bioweapons program in U.S. mainstream and fringe news websites. While we focus on news in this application, our approach can be applied more broadly to the study of propaganda, misinformation, diffusion of policy and cultural objects, among other topics.
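For readers unfamiliar with the performance measures named in this abstract, the sketch below computes precision, recall, and F1 for a set of document pairs flagged as sharing a narrative. It uses the standard definitions only; the function name, the pair identifiers, and the toy data are hypothetical and do not come from the paper.

```python
def precision_recall_f1(predicted, actual):
    """Standard precision, recall, and F1 over two sets of items.

    predicted: items a method flagged as positive (hypothetical here:
               document pairs judged to share a narrative).
    actual:    items labeled positive in held-out ground truth.
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (hypothetical): 4 flagged pairs, 3 of them correct,
# out of 6 pairs that truly share a narrative.
predicted = {("d1", "d2"), ("d1", "d3"), ("d2", "d5"), ("d4", "d6")}
actual = {("d1", "d2"), ("d1", "d3"), ("d2", "d5"),
          ("d3", "d7"), ("d5", "d8"), ("d6", "d9")}

p, r, f1 = precision_recall_f1(predicted, actual)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

On this toy data, precision is 3/4 and recall is 3/6, so F1, their harmonic mean, is 0.6.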
-
Working Paper
Emergent LLM Behaviors are Observationally Equivalent to Data Leakage
Working Paper, 2025
Ashery et al. recently argued that large language models (LLMs), when paired to play a classic "naming game," spontaneously develop linguistic conventions reminiscent of human social norms. Here, we show that their results are better explained by data leakage: the models simply reproduce conventions they already encountered during pre-training. Despite the authors' mitigation measures, we provide multiple analyses demonstrating that the LLMs recognize the structure of the coordination game and recall its outcomes, rather than exhibiting "emergent" conventions. Consequently, the observed behaviors are indistinguishable from memorization of the training corpus. We conclude by pointing to potential alternative strategies and reflecting more generally on the place of LLMs for social science models.