Individual Submission Summary

Protest Event Analysis in Autocracies: The Use of NLP to Collect Protest Data

Sun, September 8, 8:00 to 9:30am, Pennsylvania Convention Center (PCC), 110B

Abstract

Political regimes of various kinds actively restrict access to information they deem detrimental to their political ends and dissuade citizens from disseminating particular publications and opinions. These tactics pose substantial challenges for scholars collecting data on protests. Taking the case of Russia, where citizens can face political persecution for posting certain types of information online, we examine the use of state-of-the-art Natural Language Processing (NLP) models, including BERT, GPT-4, Llama, and other Transformer-based machine learning (ML) models, to mine, categorise and analyse data on contentious events reported by users of social networks such as Twitter, YouTube, and VK. The study assesses how effectively these NLP models identify and extract relevant information from diverse datasets that are often characterised by complex and nuanced linguistic expressions. Our approach involves collecting data from these social networks, testing the performance of the models, exploring the types of information the data allows us to retrieve, and then analysing the development of contentious actions in Russia. The result is a dataset recording contentious events, their locations, typologies, and other relevant attributes, suitable for both quantitative and qualitative analysis. We also discuss the limitations of such computational methods and contribute to the literature on the methodology of protest event analysis, particularly in authoritarian regimes, with a focus on social media as the primary data source and language processing technologies as the means of extracting information from it.

Authors