A Comparison of Methods in Political Science Text Classification: Transfer Learning Language Models for Politics
Automated text classification has rapidly become an important tool for political analysis, but state-of-the-art methods require large amounts of computing power and text data. One solution is a transfer learning approach.
Citation
Terechshenko, Zhanna, Fridolin Linder, Vishakh Padmakumar, Fengyuan Liu, Jonathan Nagler, Joshua Aaron Tucker, and Richard Bonneau. “A Comparison of Methods in Political Science Text Classification: Transfer Learning Language Models for Politics.” SSRN Electronic Journal (2020). https://doi.org/10.2139/ssrn.3724644
Date Posted
Oct 20, 2020
Authors
- Zhanna Terechshenko
- Fridolin Linder
- Vishakh Padmakumar
- Fengyuan Liu
- Jonathan Nagler
- Joshua A. Tucker
- Richard Bonneau
Abstract
Automated text classification has rapidly become an important tool for political analysis. Recent advancements in NLP, enabled by advances in deep learning, now achieve state-of-the-art results in many standard tasks for the field. However, these methods require large amounts of both computing power and text data to learn the characteristics of the language, resources which are not always accessible to political scientists. One solution is a transfer learning approach, where knowledge learned in one area or source task is transferred to another area or a target task. A class of models that embody this approach are language models, which demonstrate extremely high levels of performance. We investigate the performance of these models in political science by comparing multiple text classification methods. We find that RoBERTa and XLNet, language models that rely on the Transformer architecture, require fewer computing resources and less training data to perform on par with, or outperform, several political science text classification methods. Moreover, we find that the increase in accuracy is especially significant in the case of small labeled data sets, highlighting the potential for reducing the data-labeling cost of supervised methods for political scientists via the use of pretrained language models.
Background
Automated text classification is an increasingly important tool for political analysis. Classifying text data with high accuracy, however, is often expensive: supervised methods require relatively large amounts of human-labeled data, and state-of-the-art deep learning methods additionally require large amounts of computing power and text to learn the characteristics of the language, resources which are not always accessible to political scientists. One solution is a transfer learning approach, in which knowledge learned in one area or source task is transferred to another area or target task. We investigate the performance of these models in political science by comparing multiple text classification methods, aiming to provide political scientists with guidance for text classification across diverse tasks.
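To make the transfer learning approach concrete, the sketch below fine-tunes a pretrained RoBERTa model for a two-class text classification task using the Hugging Face transformers library. It is a minimal illustration only: the example texts, labels, and hyperparameters are assumptions, not the paper's data or the authors' released software.

```python
# Minimal sketch: fine-tuning a pretrained RoBERTa classifier.
# The pretrained weights carry knowledge of the language (the "source
# task"); only a small labeled set is needed for the target task.
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2
)

# Hypothetical labeled examples (e.g., political vs. non-political text).
texts = [
    "The senator introduced a new bill today.",
    "Check out this recipe for banana bread!",
]
labels = torch.tensor([1, 0])

# Tokenize into the fixed-format inputs the pretrained model expects.
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# One gradient step: the pretrained weights are updated (fine-tuned)
# on the target task rather than trained from scratch.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
optimizer.zero_grad()
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
```

In practice one would loop this step over mini-batches of the labeled data for a few epochs and evaluate on a held-out set.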
Study
In this paper, we provide a brief overview of common methods for supervised text classification and discuss transfer learning via pretrained language models as a potential solution for text classification problems in political science. We then test and compare the performance of traditional machine learning models and several transfer learning models across four data sets representative of typical political text classification tasks. We also analyze more difficult classification tasks in which the classifier is trained on labeled data from one data set and applied to a different one.
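For contrast, a traditional supervised baseline of the kind such comparisons typically include might look like the following scikit-learn pipeline, a TF-IDF bag-of-words representation fed to a logistic regression classifier. The specific model choice and data are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of a traditional supervised baseline: unlike a pretrained
# language model, this pipeline learns only from the labeled examples
# it is given and has no prior knowledge of the language.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled training data.
train_texts = [
    "The senator introduced a new bill today.",
    "Check out this recipe for banana bread!",
]
train_labels = [1, 0]

baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # unigram + bigram features
    LogisticRegression(max_iter=1000),
)
baseline.fit(train_texts, train_labels)

print(baseline.predict(["Parliament votes on the budget tomorrow."]))
```

Baselines like this can perform well when labeled data is plentiful, which is precisely why the comparison on small labeled data sets is informative.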
Results
We find that transfer learning models using the Transformer architecture consistently perform on par with, or outperform, other models. We also find that the increase in accuracy is especially significant for the data sets with the smallest amounts of labeled data, highlighting the potential for reducing the data-labeling cost of supervised methods for political scientists. Our results suggest that the Transformer architecture performs consistently well but does not remove the need for labeled data altogether. Furthermore, these models are readily available for several languages, relatively easy to use, and very fast. The open-source software we are releasing in conjunction with this paper aims to make such experiments easier to implement for researchers interested in this method.