China - NYU’s Center for Social Media, AI, and Politics

Academic Research

Journal Article
State Media Control Influences Large Language Models
Hannah Waight, Eddie Yang, Yin Yuan, Sol Messing, Margaret E. Roberts, Brandon M. Stewart, Joshua A. Tucker
Nature, 2026

Millions of people around the world query large language models (LLMs) for information. Although several studies have compellingly documented the persuasive potential of these models, there is limited evidence of who or what influences the models themselves, leading to a flurry of concerns about which companies and governments build and regulate the models. Here we show through six studies that government control of the media across the world already influences the output of LLMs via their training data. We use a cross-national audit to show that LLMs exhibit a stronger pro-government valence when prompted in the languages of countries with lower media freedom than in those with higher media freedom. This result is correlational, so to triangulate the specific mechanism of how state media control can influence LLMs, we develop a multi-part case study on China’s media. We demonstrate that media scripted and curated by the Chinese state appears in LLM training datasets. To evaluate the plausible effect of this inclusion, we use an open-weight model to show that additional pretraining on Chinese state-coordinated media generates more positive answers to prompts about Chinese political institutions and leaders. We link this phenomenon to commercial models through two audit studies demonstrating that prompting models in Chinese generates more positive responses about China’s institutions and leaders than do the same queries in English. The combination of influence and persuasive potential across languages suggests the troubling conclusion that states and powerful institutions have increased strategic incentives to leverage media control in the hopes of shaping LLM output.
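The paired-language audit described above (the same queries posed in Chinese and in English, then compared on how positive the answers are) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the valence scores are made up, the function names are hypothetical, and the sign-flip permutation test is a generic stand-in rather than the statistical procedure the authors used.

```python
import random
from statistics import mean


def paired_gap(scores_zh, scores_en):
    """Mean per-prompt difference (Chinese minus English) in valence score."""
    if len(scores_zh) != len(scores_en):
        raise ValueError("each prompt needs a score in both languages")
    return mean(z - e for z, e in zip(scores_zh, scores_en))


def sign_flip_p_value(scores_zh, scores_en, n_draws=2000, seed=0):
    """Two-sided sign-flip permutation test on the paired differences.

    Under the null hypothesis that language does not matter, each
    per-prompt difference is equally likely to be positive or negative.
    """
    rng = random.Random(seed)
    diffs = [z - e for z, e in zip(scores_zh, scores_en)]
    observed = abs(mean(diffs))
    hits = 0
    for _ in range(n_draws):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_draws + 1)


# Made-up valence scores for six hypothetical prompts, one score per language.
zh = [0.6, 0.4, 0.7, 0.5, 0.8, 0.3]
en = [0.2, 0.1, 0.5, 0.4, 0.3, 0.2]

gap = paired_gap(zh, en)
p_value = sign_flip_p_value(zh, en)
print(f"mean gap (zh - en): {gap:.3f}, permutation p-value: {p_value:.3f}")
```

Pairing by prompt matters in a design like this: comparing each query only with its own translation removes variation in how favorable different topics are, so the gap reflects the language of the prompt rather than the mix of questions.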
Area of Study: Online Information Environment, Politics of Authoritarianism, Public Opinion
Date Posted: May 13, 2026
Tags: China, Generative AI, Large Language Models, Text and Content Analysis

Journal Article
Content-Based Features Predict Social Media Influence Operations
Meysam Alizadeh, Jacob N. Shapiro, Cody L. Buntain, Joshua A. Tucker
Science Advances, 2020

We study how easy it is to distinguish influence operations from organic social media activity by assessing the performance of a platform-agnostic machine learning approach. Our method uses public activity to detect content that is part of coordinated influence operations based on human-interpretable features derived solely from content. We test this method on publicly available Twitter data on Chinese, Russian, and Venezuelan troll activity targeting the United States, as well as the Reddit dataset of Russian influence efforts. To assess how well content-based features distinguish these influence operations from random samples of general and political American users, we train and test classifiers on a monthly basis for each campaign across five prediction tasks. Content-based features perform well across period, country, platform, and prediction task. Industrialized production of influence campaign content leaves a distinctive signal in user-generated content that allows tracking of campaigns from month to month and across different accounts.
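The month-by-month evaluation described above can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' pipeline: the abstract does not name the features, the classifier, or the exact split, so the four features, the nearest-centroid classifier, the "train on one month, test on the next" reading, and all example posts are hypothetical.

```python
import re
from collections import defaultdict


def content_features(text):
    """Toy human-interpretable features computed from the text alone."""
    words = text.split()
    return [
        float(len(words)),                                     # length in words
        float(len(re.findall(r"#\w+", text))),                 # hashtag count
        float(len(re.findall(r"https?://\S+", text))),         # link count
        sum(w.isupper() for w in words) / max(len(words), 1),  # share of all-caps words
    ]


def fit(posts):
    """posts: list of (text, label). Returns one feature centroid per label."""
    by_label = defaultdict(list)
    for text, label in posts:
        by_label[label].append(content_features(text))
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in by_label.items()
    }


def predict(model, text):
    """Assign the label whose centroid is closest in squared distance."""
    x = content_features(text)
    return min(
        model,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, model[label])),
    )


def monthly_accuracy(posts_by_month):
    """Train on each month and test on the month that follows it."""
    months = sorted(posts_by_month)
    results = {}
    for train_month, test_month in zip(months, months[1:]):
        model = fit(posts_by_month[train_month])
        test = posts_by_month[test_month]
        correct = sum(predict(model, text) == label for text, label in test)
        results[test_month] = correct / len(test)
    return results


# Made-up posts; "troll" and "organic" are hypothetical labels.
posts_by_month = {
    "2018-01": [
        ("BREAKING NEWS #maga #usa http://example.com/a", "troll"),
        ("SHARE THIS #vote #usa http://example.com/b", "troll"),
        ("had a nice walk with the dog this morning", "organic"),
        ("anyone know a good place for lunch downtown", "organic"),
    ],
    "2018-02": [
        ("WAKE UP #usa #news http://example.com/c", "troll"),
        ("coffee with friends then heading to the gym later", "organic"),
    ],
}

print(monthly_accuracy(posts_by_month))
```

Refitting each month lets the evaluation track whether the signal left by a campaign persists over time, which is the property the abstract highlights. A real study would use far richer features and a stronger classifier than this stand-in.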
Area of Study: Data Science Methodology, Foreign Influence Campaigns
Date Posted: Jul 22, 2020
Tags: Twitter/X, Reddit, China, Russia, Venezuela, United States


Reports & Analysis

Analysis
Is Social Media to Blame for Violence at the U.S. Capitol?
This explains how social media can both weaken — and strengthen — democracy. Groups opposed to fundamental tenets of liberal democracy also have found their megaphone.

January 7, 2021
Analysis
Are Influence Campaigns Trolling Your Social Media Feeds?
- Meysam Alizadeh,
- Cody L. Buntain,
- Jacob N. Shapiro,
- Joshua A. Tucker
Now, there are ways to find out. New data shows that machine learning can identify content created by online political influence operations.

October 13, 2020


News & Commentary

Commentary
State media control impacts the output of U.S.-based LLMs
- Hannah Waight,
- Eddie Yang,
- Yin Yuan,
- Sol Messing,
- Brandon M. Stewart,
- Margaret E. Roberts,
- Joshua A. Tucker
Training data for LLMs does not just fall from the sky, our research finds.

May 13, 2026
Policy
Mosaics of Insight: Auditing TikTok Through Independent Data Access
Even if TikTok is sold to a non-Chinese buyer, the threat of foreign influence will remain. That’s why researchers need independent data access.

February 21, 2025

