Individual Submission Summary
Large Language Models for Measuring Contested and Multi-Dimensional Concepts

Thu, September 5, 8:00 to 9:30am, Marriott Philadelphia Downtown, 405

Abstract

Large language models (LLMs) are deep neural networks pre-trained on massive amounts of text. These models have demonstrated remarkable success in transferring the linguistic knowledge learned during training to a wide range of downstream applications. A common concern when applying LLMs directly to political science, however, is the complex and contested definition of many target concepts, especially when the scholarly understanding of a concept differs from the general-audience writing on which the LLMs were trained. In this paper, we evaluate the zero-shot performance of four LLMs in measuring populism, one of the most contested and widely discussed concepts in academic and public debate over the past two decades. We compare this zero-shot approach with expert coding and a fine-tuned classifier. We find that ChatGPT 3.5 overall produces results comparable to expert coding and supervised learning. Asking an LLM to predict populism directly, however, can yield a measure with low face validity; with careful prompting, LLMs can provide measures better aligned with the scholarly understanding of a contested concept.

Authors