Our public discourse now largely takes place on digital platforms. Those platforms (in)famously enforce a range of rules against various categories of harmful speech. Trust-and-safety teams working within technology companies legislate a vast array of such rules; once engineers have tested their feasibility, the rules are enforced by a complex bureaucracy of content moderation workers. Crucially, the frontline of this enforcement apparatus is not human but machine. While some enforcement involves simpler forms of automation (such as cryptographic “hashing”), most content moderation deploys increasingly sophisticated machine intelligence, trained on existing datasets of decisions made by human moderators. On the basis of this training, the machines themselves classify speech as protected or unprotected under the rules. Thanks to large language models, these tools are only growing more powerful, writing and executing their own code to enforce human-authored content rules with ever greater speed and, it is hoped, accuracy.
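For concreteness, the enforcement pipeline described above can be caricatured in a few lines of code: exact hash matching against content already ruled unprotected, followed by a classifier trained on prior human moderator decisions. This is a minimal illustrative sketch only; the posts, labels, and threshold below are invented, and no actual platform system or the paper’s own analysis is being reproduced here.

```python
# Illustrative sketch of a two-stage moderation pipeline.
# All data, labels, and thresholds are hypothetical.
import hashlib

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Stage 1: "hashing" -- flag exact copies of items already ruled unprotected.
KNOWN_UNPROTECTED_HASHES = {
    hashlib.sha256(b"previously removed harmful post").hexdigest(),
}

def matches_known_hash(text: str) -> bool:
    return hashlib.sha256(text.encode()).hexdigest() in KNOWN_UNPROTECTED_HASHES

# Stage 2: a classifier trained on past human moderator decisions
# (1 = unprotected under the platform's rules, 0 = protected).
past_posts = [
    "I will hurt you if you show up",        # removed by a human moderator
    "this policy is a disgrace",             # left up
    "threatening message targeting a user",  # removed
    "strong but legitimate criticism",       # left up
]
past_decisions = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(past_posts, past_decisions)

def moderate(text: str) -> str:
    """Return a toy enforcement decision for a new piece of content."""
    if matches_known_hash(text):
        return "remove (hash match)"
    prob_unprotected = model.predict_proba([text])[0][1]
    # A false positive here removes protected speech;
    # a false negative leaves up speech the rules deem unprotected.
    return "remove (classifier)" if prob_unprotected > 0.5 else "leave up"

print(moderate("this policy is a disgrace"))
```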
Should we welcome the fact that our public sphere is now largely policed by machines? The benefits of deploying AI are substantial: it enables platforms to satisfy their moderation duties (to protect people from seriously harmful, unprotected speech) far faster than human moderators alone could, and it spares human moderators the burdens of exposure to harmful content. Still, the use of AI in content moderation is subject to a wide range of normative objections. The central task of the paper is to map these objections, pinpointing the strengths and weaknesses of each. Some protest that AI filters amount to an unacceptable form of prior restraint on speech. Others complain that AI decision-making is unacceptably opaque, while still others invoke the inherent inappropriateness of nonhuman decision-making that affects fundamental rights. Many question the substantive accuracy of AI’s decisions, which notoriously involve false positives (removing legitimate speech that ought to be protected) and false negatives (leaving up harmful speech that ought to be removed). Others point to bias, citing unfair discrimination that flows from the underlying data on which AI tools are trained. While the paper’s main contribution lies in cataloguing these concerns, it also offers some speculative normative conclusions: some of the concerns about using AI for content moderation are overstated, and others are serious yet potentially redressable. The upshot is a less pessimistic appraisal of the prospects for ethical mechanized content moderation by platforms.