An enriched, multimodal social media dataset of a UK General Election campaign
How text, images, and platform interactions reveal the digital dynamics of modern political campaigning in the UK.
Citation
Barrie, Christopher, Aybuke Atalay, and Alia ElKattan. "An enriched, multimodal social media dataset of a UK General Election campaign." Journal of Quantitative Description: Digital Media, 5 (2025). https://doi.org/10.51685/jqd.2025.017
Date Posted
Oct 10, 2025
Authors
- Christopher Barrie
- Aybuke Atalay
- Alia ElKattan
Abstract
This article introduces a dataset of all posts by candidates during the 2024 General Election in the United Kingdom with a presence on the X (formerly Twitter) platform. The article relies on a crowd-sourcing innovation in the United Kingdom that, for the first time, provided researchers with early access to a regularly updated candidate list prior to the start of the election. This made it possible to collect real-time data on candidate posts for 1,604 candidates across 53 separate political parties. Additionally, we download and store 53,327 images and 15,982 videos posted within tweets. We enrich the data with the realized vote count and vote share for each candidate as well as text transcripts extracted from the audio of video posts. Overall, the dataset provides a uniquely comprehensive collection of online campaigning material for an election campaign and will be of considerable value to scholars of political communication, elections, and democratic responsiveness. We also analyze the topics and tone — focusing on negativity — across different media formats to identify patterns in the content and style of candidate communication across parties.
Background
Social media data has become central to the study of political communication, especially during election campaigns. Candidate posts can reveal which issues parties emphasize, how campaigns frame political choices, and how candidates use different styles of communication online. Scholars have used these kinds of data to study topics such as agenda setting, issue salience, campaign negativity, and the changing affordances of digital platforms for political communication.
Yet collecting comprehensive campaign data remains difficult. Researchers need to gather posts in real time to avoid missing content that may later be deleted, but candidate lists are often finalized shortly before elections, leaving little time to identify accounts and begin collection. Recent limits on platform API access have made this even harder. This study addresses these challenges by using a crowd-sourced, regularly updated list of declared candidates ahead of the 2024 UK General Election, allowing the authors to collect candidate posts on X during the campaign and build a dataset that includes text, images, videos, transcripts, and election results.
Study
The study introduces a dataset of candidate posts on X during the 2024 UK General Election. The authors begin by identifying candidates through Democracy Club’s regularly updated list of declared candidates, then cross-check candidate profiles through WhoCanIVoteFor and Google Search to locate and validate X accounts. After manually validating candidate usernames, they collect posts from 1,604 candidate accounts across 53 political parties during the campaign period from May 22 to July 3, 2024. The final dataset includes about 185,000 tweets, alongside candidate information, constituency details, vote counts, and vote share.
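As a rough illustration of the campaign-period restriction described above, the sketch below keeps only posts dated between May 22 and July 3, 2024. The record fields (`candidate_id`, `created_at`, `text`) and the example posts are hypothetical, and the sketch ignores time zones; the authors' actual collection pipeline is not described at this level of detail.

```python
from datetime import date, datetime

# Campaign window reported in the study (treated here as inclusive).
CAMPAIGN_START = date(2024, 5, 22)
CAMPAIGN_END = date(2024, 7, 3)


def in_campaign_window(created_at: str) -> bool:
    """Return True if an ISO 8601 timestamp falls inside the campaign window."""
    posted = datetime.fromisoformat(created_at).date()
    return CAMPAIGN_START <= posted <= CAMPAIGN_END


# Hypothetical post records, for illustration only.
posts = [
    {"candidate_id": "c1", "created_at": "2024-05-21T23:59:00", "text": "Before the campaign"},
    {"candidate_id": "c1", "created_at": "2024-05-22T09:00:00", "text": "Campaign launch"},
    {"candidate_id": "c2", "created_at": "2024-07-03T22:30:00", "text": "Final push"},
    {"candidate_id": "c2", "created_at": "2024-07-04T07:00:00", "text": "Polling day"},
]

campaign_posts = [p for p in posts if in_campaign_window(p["created_at"])]
```

In this toy example only the two middle posts survive the filter.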
The study also collects and enriches media attached to candidate posts, making the dataset multimodal. The authors download images and videos shared in tweets, recover 53,327 images and 15,982 videos for the final dataset, and extract transcripts from the audio of video posts. They then analyze campaign communication across formats by measuring topic salience and negativity. Topics are identified using keywords developed from British Election Study responses about the most important issue facing the country, while negativity is measured using a RoBERTa-based sentiment model fine-tuned on manually labeled campaign tweets. This allows the authors to compare what candidates and parties emphasized across tweet text, video transcripts, and media formats.
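The keyword-based topic measure described above could be sketched roughly as follows. The keyword lists, function names, and example texts are hypothetical placeholders, not the authors' dictionary, which was developed from British Election Study responses. The RoBERTa-based negativity model is not reproduced here.

```python
import re

# Hypothetical keyword lists, for illustration only.
TOPIC_KEYWORDS = {
    "immigration": ["immigration", "border", "asylum"],
    "europe": ["brexit", "europe", "eu"],
    "health": ["nhs", "hospital"],
}


def compile_patterns(topic_keywords):
    """Build one case-insensitive, whole-word regex per topic."""
    return {
        topic: re.compile(
            r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
            re.IGNORECASE,
        )
        for topic, keywords in topic_keywords.items()
    }


def tag_topics(text, patterns):
    """Return the set of topics whose keywords appear in the text."""
    return {topic for topic, pattern in patterns.items() if pattern.search(text)}


def topic_salience(texts, patterns):
    """Share of texts that mention each topic at least once."""
    counts = {topic: 0 for topic in patterns}
    for text in texts:
        for topic in tag_topics(text, patterns):
            counts[topic] += 1
    total = len(texts)
    return {topic: (count / total if total else 0.0) for topic, count in counts.items()}


patterns = compile_patterns(TOPIC_KEYWORDS)

# Hypothetical texts; these could be tweet text or video transcripts.
texts = [
    "Our NHS needs investment",
    "Secure the border and fix asylum",
    "Brexit has damaged the NHS",
    "Get out and vote tomorrow",
]

salience = topic_salience(texts, patterns)
```

Because a text can match several topics, the shares need not sum to one. Running the same function separately on tweet text and on transcripts would give one way to compare emphasis across formats.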
Results
The dataset captures a broad range of candidate communication during the 2024 UK General Election. The authors collect posts from 1,604 candidates across 53 political parties during the campaign period, along with 53,327 images and 15,982 videos recovered for the final dataset. The dataset covers more than one-third of all standing candidates and includes candidate-level election results, allowing researchers to connect online campaign communication with constituency information, vote counts, and vote share. The authors also note that missing candidate accounts were primarily associated with lower activity on X, rather than clear regional bias or whether a candidate was a former or sitting MP.
The initial analyses show that parties varied in the topics they emphasized online. Reform UK candidates focused heavily on immigration, while the Scottish National Party placed particular emphasis on Europe and Scotland’s relationship to Europe after Brexit. Across major parties, candidates frequently mentioned Keir Starmer and Rishi Sunak, reflecting the centrality of the two main party leaders during the campaign. Despite the importance of Brexit in recent UK politics, Europe received comparatively little attention from Labour and Conservative candidates. These topical patterns remained broadly similar when the authors balanced tweet counts within party and when they compared results with and without retweets.
The study also finds differences in campaign tone and media use across parties. Labour, the winning party, campaigned in the most positive style among the seven principal parties, while Reform UK was among the most negative. Retweets tended to be more negative than original posts, increasing overall negativity when included, but this pattern was consistent across parties. The authors also find that larger and more established parties, including Labour and the Conservative Party, used images more frequently than Reform UK: 53% of Reform UK candidate posts were text-only, compared with 38% of Labour posts and 40% of Conservative posts. Taken together, the findings show how the dataset can be used to study campaign communication across text, image, video, and transcript data, while offering a public resource for future research on political communication, elections, and online campaigning.
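A text-only share like the one reported above is the fraction of a party's posts that carry no attached media. The sketch below shows one way such a figure could be computed; the record layout (`party`, `media`) and the toy data are assumptions for illustration and do not reproduce the reported percentages.

```python
from collections import defaultdict


def text_only_share(posts):
    """Return, per party, the fraction of posts with no attached media."""
    totals = defaultdict(int)
    text_only = defaultdict(int)
    for post in posts:
        party = post["party"]
        totals[party] += 1
        if not post["media"]:  # an empty list means no image or video attached
            text_only[party] += 1
    return {party: text_only[party] / totals[party] for party in totals}


# Hypothetical posts, for illustration only.
posts = [
    {"party": "Party A", "media": []},
    {"party": "Party A", "media": ["image"]},
    {"party": "Party A", "media": []},
    {"party": "Party A", "media": ["video"]},
    {"party": "Party B", "media": ["image"]},
    {"party": "Party B", "media": ["image", "video"]},
    {"party": "Party B", "media": []},
    {"party": "Party B", "media": ["image"]},
]

shares = text_only_share(posts)
```

In this toy data, half of Party A's posts and a quarter of Party B's posts are text-only.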