Research
CSMaP is a leading academic research institute studying the ever-shifting online environment at scale. We publish peer-reviewed research in top academic journals, produce rigorous reports and analyses on policy relevant topics, and develop open source tools and methods to support the broader scholarly community.
Academic Research
-
Book
Online Data and the Insurrection
Media and January 6th, 2024
Online data is key to understanding the leadup to the January 6 insurrection, including how and why election fraud conspiracies spread online, how conspiracy groups organized online to participate in the insurrection, and other factors of online life that led to the insurrection. However, there are significant challenges in accessing data for this research. First, platforms restrict which researchers get access to data, as well as what researchers can do with the data they access. Second, this data is ephemeral; that is, once users or the platform remove the data, researchers can no longer access it. These factors affect what research questions can ever be asked and answered.
-
Journal Article
Estimating the Ideology of Political YouTube Videos
Political Analysis, 2024
We present a method for estimating the ideology of political YouTube videos. As online media increasingly influences how people engage with politics, so does the importance of quantifying the ideology of such media for research. The subfield of estimating ideology as a latent variable has often focused on traditional actors such as legislators, while more recent work has used social media data to estimate the ideology of ordinary users, political elites, and media sources. We build on this work by developing a method to estimate the ideologies of YouTube videos, an important subset of media, based on their accompanying text metadata. First, we take Reddit posts linking to YouTube videos and use correspondence analysis to place those videos in an ideological space. We then train a text-based model with those estimated ideologies as training labels, enabling us to estimate the ideologies of videos not posted on Reddit. These predicted ideologies are then validated against human labels. Finally, we demonstrate the utility of this method by applying it to the watch histories of survey respondents with self-identified ideologies to evaluate the prevalence of echo chambers on YouTube. Our approach gives video-level scores based only on supplied text metadata, is scalable, and can be easily adjusted to account for changes in the ideological climate. This method could also be generalized to estimate the ideology of other items referenced or posted on Reddit.
Reports & Analysis
-
Analysis
How Americans’ Confidence in Technology Firms has Dropped
Results from the American Institutional Confidence poll's second wave show that the public's confidence in technology, and tech companies, has markedly decreased over the past five years.
June 14, 2023
-
Analysis
Latinos Who Use Spanish-Language Social Media Get More Misinformation
That could affect their votes — and their safety from covid-19.
November 8, 2022
Data Collections & Tools
As part of our project to construct comprehensive data sets and to empirically test hypotheses related to social media and politics, we have developed a suite of open-source tools and modeling processes.