Data Science Methodology

Our experts produce new methodologies to further understand how social media affects politics and democracy. By developing and deploying code, CSMaP researchers create new ways to quantify social media interactions and their effects.

Research

  • Working Paper

    Estimating the Ideology of Political YouTube Videos

    Working Paper, May 2022

    We present a method for estimating the ideology of political YouTube videos. As online media increasingly influences how people engage with politics, quantifying the ideology of such media becomes more important for research. The subfield of estimating ideology as a latent variable has often focused on traditional actors such as legislators, while more recent work has used social media data to estimate the ideology of ordinary users, political elites, and media sources. We build on this work by developing a method to estimate the ideologies of YouTube videos, an important subset of media, based on their accompanying text metadata. First, we take Reddit posts linking to YouTube videos and use correspondence analysis to place those videos in an ideological space. We then train a text-based model with those estimated ideologies as training labels, enabling us to estimate the ideologies of videos not posted on Reddit. These predicted ideologies are then validated against human labels. Finally, we demonstrate the utility of this method by applying it to the watch histories of survey respondents with self-identified ideologies to evaluate the prevalence of echo chambers on YouTube. Our approach gives video-level scores based only on supplied text metadata, is scalable, and can be easily adjusted to account for changes in the ideological climate. This method could also be generalized to estimate the ideology of other items referenced or posted on Reddit.

    Date Posted

    May 02, 2022

  • Working Paper

    Network Embedding Methods for Large Networks in Political Science

    Working Paper, November 2021

    Social networks play an important role in many political science studies. With the rise of social media, these networks have grown in both size and complexity. Analyzing these large networks requires generating feature representations that can be used in machine learning models. One way to generate these feature representations is to use network embedding methods, which learn low-dimensional feature representations of nodes and edges in a network. While there is some literature comparing the advantages and shortcomings of these models, to our knowledge, there has not been any analysis of the applicability of network embedding models to classification tasks in political science. In this paper, we compare the performance of five prominent network embedding methods on predicting the ideology of Twitter users and of Internet domains. We find that LINE provides the best feature representation across all four datasets that we use, yielding the highest accuracy. Finally, we provide guidelines for researchers on using these models in their own research.

    Date Posted

    Nov 12, 2021


View All Related Research
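
As a rough illustration of the first step described in the YouTube ideology abstract above, the sketch below runs a plain correspondence analysis on a small count matrix. It is not the authors' implementation: the function name `correspondence_analysis`, the toy counts, and the layout (rows as subreddits, columns as videos) are all assumptions made for this example. The first dimension of the column coordinates stands in for a one-dimensional ideology score.

```python
import numpy as np


def correspondence_analysis(counts, n_dims=1):
    """Return (row_coords, col_coords) principal coordinates of a count matrix.

    Assumes every row and every column has a nonzero total.
    """
    counts = np.asarray(counts, dtype=float)
    P = counts / counts.sum()          # correspondence matrix
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    # Standardized residuals from the independence model r c^T
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates: rescale singular vectors by singular values and masses
    row_coords = (U[:, :n_dims] * sigma[:n_dims]) / np.sqrt(r)[:, None]
    col_coords = (Vt.T[:, :n_dims] * sigma[:n_dims]) / np.sqrt(c)[:, None]
    return row_coords, col_coords


# Invented toy data: 4 hypothetical subreddits (rows) x 5 hypothetical videos (columns);
# each cell counts posts in that subreddit linking to that video.
counts = np.array([
    [9, 7, 1, 0, 0],
    [8, 6, 2, 1, 0],
    [0, 1, 2, 7, 9],
    [1, 0, 1, 8, 8],
])

subreddit_scores, video_scores = correspondence_analysis(counts)
print(video_scores[:, 0])  # the sign of the axis is arbitrary
```

Videos shared mostly in the same subreddits land near each other on the first dimension, and the direction of that axis is arbitrary, so it would need to be oriented using items of known ideology. In the workflow the abstract describes, scores like these would then serve as training labels for a text-based model, which this sketch does not cover.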

News & Views

  • News

    Our Craig Newmark Philanthropies Graduate Students

    In 2020, Craig Newmark Philanthropies donated $400,000 to support our PhD students, ensuring they could continue their research projects examining some of the biggest questions at the intersection of social media and democracy. Here is an update on what they've been working on this past year thanks to Craig's generous support.

    July 1, 2021

  • News

    Text Classification Using a Transformer-Based Model

    We're committed to supporting open and accessible science, which includes promoting the creation and use of open-source software, providing high-quality replication materials for our publications, and contributing to existing open-source tools and frameworks. To that end, we created an open-source tool that makes transformers easier to use, and in this post we explain how to use it.

    December 8, 2020

View All Related News