Individual Submission Summary

Automating the Detection of Mis/Disinformation on Social Media Platforms

Sun, September 8, 8:00 to 9:30am, Pennsylvania Convention Center (PCC), 110B

Abstract

The abundance of content on social media can make timely, efficient detection of mis/disinformation extremely challenging. Expert fact-checking is often resource- and time-intensive. Moreover, new digital technologies, such as generative AI, risk making this problem worse, and existing strategies may struggle to keep pace. Efforts to identify, understand, and respond to mis/disinformation would therefore benefit from automated tools that can help triage high volumes of content, surfacing potentially problematic material for expert review.

In this paper, we present and evaluate a method for automating the detection of patterns of mis/disinformation circulated on social media: a kind of 'early warning system' for potentially problematic claims. Unlike automated methods that train a statistical model to identify mis/disinformation from a pre-adjudicated corpus of posts, our method aims to be flexible, rapidly identifying mis/disinformation with little to no initial labelling by human coders.

Our method proceeds in two principal stages. First, a large language model, guided by a set of generalizable prompts, classifies the underlying claims made in a corpus of social media posts. Second, these claims are clustered in an unsupervised learning framework using a large ensemble of features. These include: linguistic features of the content, such as concreteness, valence, imageability, and morphological complexity; features derived from the public profiles of those who authored or shared the content, such as inferred socio-demographic characteristics, ideology, and partisanship; and features characterizing the networks in which the claims circulated.
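
To make the pipeline concrete, the sketch below shows one way the two stages could be wired together in Python. Everything in it is illustrative rather than the authors' implementation: the prompt wording, the function and feature names, the `llm` callable, and the use of k-means as a stand-in for whatever unsupervised learner the paper employs are all assumptions.

```python
"""Minimal sketch of the two-stage pipeline described above.

All names here (CLAIM_PROMPT, extract_claim, build_feature_matrix,
the `llm` callable) are hypothetical illustrations, not the authors'
actual implementation.
"""

from dataclasses import dataclass

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Stage 1: prompt a large language model to restate the underlying
# claim made in each post. The prompt wording is an assumption.
CLAIM_PROMPT = (
    "Read the social media post below and restate, in one neutral "
    "sentence, the central factual claim it makes (or 'NO CLAIM').\n\n"
    "Post: {post}"
)


def extract_claim(post: str, llm) -> str:
    """`llm` is any callable mapping a prompt string to a completion,
    e.g. a thin wrapper around a hosted or local model."""
    return llm(CLAIM_PROMPT.format(post=post)).strip()


@dataclass
class ClaimFeatures:
    """One row of the feature ensemble for a single extracted claim."""
    linguistic: np.ndarray  # e.g. concreteness, valence, imageability
    author: np.ndarray      # e.g. inferred demographics, ideology
    network: np.ndarray     # e.g. diffusion/centrality of the claim


def build_feature_matrix(rows: list[ClaimFeatures]) -> np.ndarray:
    """Concatenate the three feature families into one design matrix."""
    return np.vstack(
        [np.concatenate([r.linguistic, r.author, r.network]) for r in rows]
    )


# Stage 2: cluster claims with no labels. k-means stands in here for
# whatever unsupervised learner is ultimately used.
def cluster_claims(features: np.ndarray, n_clusters: int = 20) -> np.ndarray:
    scaled = StandardScaler().fit_transform(features)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(scaled)
```

In practice, the three feature families would be populated by separate extraction pipelines: lexical norms for the linguistic features, profile inference for the author features, and diffusion statistics for the network features.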

As an initial case study, we apply this method to a novel dataset of posts about COVID-19 made on X/Twitter by Canadians between March 2020 and March 2023. The paper will seek to validate the approach by assessing how quickly and how accurately mis/disinformation can be identified in the absence of training data. In so doing, we also seek to offer new substantive insights into the proportion of COVID-related claims circulating during this period that were potentially problematic, how this proportion may have changed over time and in response to events, and which features (linguistic, user, and network) point to potentially problematic claims.
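
One way the validation and the over-time analysis could be operationalized is sketched below. It assumes, hypothetically, a per-post table with a datetime `date` column, the `cluster` assignment from stage two, and an `expert_label` column filled in only for a small expert-reviewed subsample (1 = problematic); the idea of flagging entire clusters for review is likewise an illustrative assumption, not the paper's stated procedure.

```python
"""Hypothetical validation helpers; column names are assumptions."""

import pandas as pd


def flagged_proportion_by_week(
    df: pd.DataFrame, flagged_clusters: set
) -> pd.Series:
    """Share of posts per week falling in clusters flagged for review.
    Assumes `date` is a datetime column."""
    df = df.assign(flagged=df["cluster"].isin(flagged_clusters))
    return df.resample("W", on="date")["flagged"].mean()


def precision_on_reviewed_sample(
    df: pd.DataFrame, flagged_clusters: set
) -> float:
    """Of flagged posts that experts reviewed, the share judged truly
    problematic (expert_label == 1)."""
    reviewed = df.dropna(subset=["expert_label"])
    flagged = reviewed[reviewed["cluster"].isin(flagged_clusters)]
    return float((flagged["expert_label"] == 1).mean())
```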
