Digital media companies have increasingly restricted access to data about public activity on their platforms, which limits scholarship, impairs platform accountability, and empowers abusive users. However, nearly all platforms concentrate users in a few high-volume places (e.g., pages, channels, or subreddits). We exploit this concentration to develop new, scalable methods to reconstruct most public user activity on digital platforms, with or without access to platform-provided APIs. We show that our approach works especially well because the most popular places/channels are the most stable over time, and because lower-engagement users participate overwhelmingly in popular channels. Platforms also show fractal self-similarity, with subcategories of content mirroring the concentration, stable popularity, and ladder of engagement seen across the platform as a whole.
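The concentration argument above can be illustrated with a small calculation. The following is a minimal sketch in Python, not the authors' R package: the function names `top_k_coverage` and `smallest_list_size` are hypothetical, and the activity counts are made-up numbers chosen only to show how a short list of high-volume places can cover most activity.

```python
from typing import Sequence


def top_k_coverage(activity_counts: Sequence[int], k: int) -> float:
    """Share of total activity captured by the k most active places.

    Hypothetical helper for illustration; it assumes one activity count
    (e.g. posts or comments) per page, channel, or subreddit.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    total = sum(activity_counts)
    if total == 0:
        return 0.0
    ranked = sorted(activity_counts, reverse=True)
    return sum(ranked[:k]) / total


def smallest_list_size(activity_counts: Sequence[int], target: float) -> int:
    """Smallest number of top places whose combined share reaches `target`."""
    if not 0.0 <= target <= 1.0:
        raise ValueError("target must be between 0 and 1")
    total = sum(activity_counts)
    if total == 0 or target == 0.0:
        return 0
    running = 0
    for size, count in enumerate(sorted(activity_counts, reverse=True), start=1):
        running += count
        if running / total >= target:
            return size
    return len(activity_counts)


# Synthetic, illustrative counts for five places (not real platform data).
example_counts = [50, 30, 10, 5, 5]
share_top_two = top_k_coverage(example_counts, 2)
places_needed = smallest_list_size(example_counts, 0.85)
```

With a heavy-tailed distribution of this kind, a collection list containing only the top few places already accounts for a large share of activity, which is the property the approach relies on. How quickly such a list goes stale is a separate question that this sketch does not address.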
We deploy these methods in an R package, which can estimate total coverage for a scraping list or API collection list of a given size, and calculate how frequently it needs to be updated to minimize undercollection. Our approach makes it feasible to recover large segments of digital platform activity, both for "big picture" overviews of the highest-visibility content, and within smaller topics and