import numpy as np
import pandas as pd
Michael Mullarkey
November 13, 2022
A recent Washington Post article dove deep on Twitter’s crowd-sourced fact-checking program Birdwatch. Twitter’s owner touted the volunteer-driven initiative around the same time 15% of Trust and Safety staff were laid off. A week later a large number of content moderation contractors1 were fired without notice and the head of Trust and Safety resigned.
Data related to trust and safety are rarely open for audit, but Birdwatch2 has provided open source data, code, and documentation since it started as a small pilot program in January 2021. I decided to do an initial assessment of the open data with an eye toward Birdwatch’s scalability as a content moderation tool.
We don’t need a lot of Python packages to do this analysis,3 and I try to keep my dependencies as light as possible without making my life a nightmare.
We can download the data from this page, where the Birdwatch data is updated daily.
I downloaded this data on November 13, 2022. If you’re accessing this data in the future, the analyses will not exactly reproduce because new data will be included. Feel free to grab the data I’m using from the GitHub repo for this post.
First we’ll make a quick function to read the .tsv files in as pandas DataFrames.
Then we’ll apply that function to all of the Birdwatch data.
To make it easier to explore each data frame, we’ll assign them to separate objects outside the list. The documentation for all three datasets provided by Birdwatch is here.
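The reading code isn't shown here, so here's a minimal sketch of what those three steps could look like. The helper names (`read_birdwatch_tsv`, `read_all_tsv`), the `data` folder, and the file names in the usage comments are my assumptions for illustration, and I return a dict keyed by file name rather than a list so the frames can be pulled out by name.

```python
from pathlib import Path

import pandas as pd


def read_birdwatch_tsv(path):
    """Read one tab-separated export into a pandas DataFrame."""
    return pd.read_csv(path, sep="\t")


def read_all_tsv(data_dir):
    """Read every .tsv file in data_dir into a dict keyed by file stem."""
    return {
        p.stem: read_birdwatch_tsv(p)
        for p in sorted(Path(data_dir).glob("*.tsv"))
    }


# Hypothetical usage (folder and file names are assumptions):
# frames = read_all_tsv("data")
# initial_notes = frames["notes-00000"]
# initial_history = frames["noteStatusHistory-00000"]
```

Keying by file stem avoids relying on the order in which files happen to be listed.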
Before I move forward, one huge positive of this project is that data is available for outside research. Allowing external audits of content moderation approaches is tricky, and I commend the Birdwatch folks for their transparency so far.
Any Twitter user can join Birdwatch with the ultimate goal of adding notes to tweets. Those notes can fact-check, provide additional context, and in theory deter disinformation.
There are checks and balances on the Birdwatch system geared toward stopping bad-faith actors. While anyone can join Birdwatch, you cannot write your own initial notes on tweets right after joining. First, you must consistently submit ratings of others’ initial notes that agree with the general consensus of other Birdwatch members.
Those ratings from Birdwatch members also serve as a powerful bottleneck for which initial notes ultimately appear on tweets. The ranking for whether a note is “helpful” enough to apply to a tweet is more complex than a majority vote among Birdwatch members.
Instead, initial notes that receive a few positive ratings from people who normally disagree on their ratings are more likely to be rated as helpful than initial notes that receive many positive ratings from people who normally agree.
This system is known as bridge-based ranking, and it algorithmically prioritizes this form of consensus over potential alternatives. The Washington Post article notes this approach is unlikely to scale, especially in “an era when left and right often lack a shared set of facts.”
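To make the difference from a simple vote count concrete, here's a toy sketch. It is not Birdwatch's actual ranking algorithm: the `group` labels are handed to the code (an assumption, since a real system would have to infer who usually disagrees with whom), and the "bridging score" is simply the lowest average helpfulness across groups, which I made up for illustration.

```python
import pandas as pd

# Toy ratings. Note A gets six positive ratings from one group;
# note B gets four positive ratings split across two groups.
ratings = pd.DataFrame({
    "noteId": ["A"] * 6 + ["B"] * 4,
    "group": ["left"] * 6 + ["left", "left", "right", "right"],
    "helpful": [1] * 10,
})

groups = ["left", "right"]  # hypothetical viewpoint groups

# Simple popularity: total number of positive ratings per note
positive_count = ratings.groupby("noteId")["helpful"].sum()

# Toy bridging score: mean helpfulness within each group, then the minimum
# across groups. A group that never rated the note counts as 0.
per_group = (
    ratings.groupby(["noteId", "group"])["helpful"].mean()
    .unstack("group")
    .reindex(columns=groups)
    .fillna(0.0)
)
bridging_score = per_group.min(axis=1)

print(positive_count)
print(bridging_score)
```

Note A wins on raw count, but note B wins on the toy bridging score because both groups found it helpful.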
To see how well this approach does or does not scale right now, let’s dive into the data.
Birdwatchers have put initial notes on 28723 tweets since January of 2021.
For context, there are approximately 500,000,000 tweets sent per day.
Even if we assume that only the top 0.0001% of tweets (one in a million) require the scrutiny of Birdwatch, that would mean 500 tweets should be considered for notes per day.
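A quick check of the arithmetic behind this threshold, using the approximate daily tweet volume quoted above (the variable names are mine):

```python
tweets_per_day = 500_000_000  # approximate daily volume quoted above
threshold = 500               # notes-per-day bar used in this post

# What share of daily tweets does the threshold represent, in percent?
share_percent = threshold * 100 / tweets_per_day

# For comparison, how many tweets would 0.1% of daily volume be?
one_tenth_percent = tweets_per_day // 1000

print(f"{threshold} tweets is {share_percent:.4f}% of daily volume")
print(f"0.1% of daily volume would be {one_tenth_percent:,} tweets")
```

So a bar of 500 notes per day corresponds to about one tweet in a million, which is a far more lenient standard than 0.1%.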
We can be extra generous and say fewer tweets than that might require notes, but we’d still expect around 500 notes per day. How many days in the Birdwatch data meet that criterion?
# Converting to date instead of milliseconds since epoch
initial_notes["dateCreated"] = pd.to_datetime(initial_notes["createdAtMillis"], unit = "ms").dt.date
# Counting the number of notes per day
tweet_initial_dates = initial_notes.groupby(["dateCreated"]).count()
# Finding the earliest date
min_tweet_date = tweet_initial_dates.index.min()
# Keeping only the dates on which 500 or more notes were created
days_500_per = tweet_initial_dates[tweet_initial_dates.tweetId >= 500]
# Getting the dates where >= 500 notes were created (only one in this data)
only_date_over = days_500_per.index.values
print(f"In all Birdwatch data going back to {min_tweet_date}, there was {len(days_500_per)} day where at least 500 notes were written - {only_date_over[0]}")
In all Birdwatch data going back to 2021-01-23, there was 1 day where at least 500 notes were written - 2021-01-28
Even with relaxed criteria, there was only 1 day at the very beginning of Birdwatch where the community reviewed approximately 0.0001% (500 of roughly 500,000,000) of all tweets in a day.
This relatively low review volume is understandable given Birdwatch is an almost all-volunteer effort. However, this precedent of not operating at scale becomes concerning if Birdwatch is expected to play a large role in preventing disinformation on the platform.
# Getting status for which notes were rated as helpful or not
note_status = initial_history[["noteId","currentStatus"]]
# Seeing what percentage of notes with evaluations need more evaluation
status_counts = note_status.currentStatus.value_counts()
pd.DataFrame(status_counts)\
.assign(percent = lambda x: (x["currentStatus"] / x["currentStatus"].sum()) * 100)\
.round(2)
| | currentStatus | percent |
|---|---|---|
| NEEDS_MORE_RATINGS | 14517 | 86.90 |
| CURRENTLY_RATED_HELPFUL | 1506 | 9.01 |
| CURRENTLY_RATED_NOT_HELPFUL | 683 | 4.09 |
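One caveat if you rerun this: in pandas 2.0 and later, `value_counts()` names its result `count` rather than reusing the column name, so `x["currentStatus"]` in the snippet above would raise a `KeyError`. Here's a sketch of the same summary that doesn't depend on that naming, run on toy data. The column names `status` and `n` are my own choices.

```python
import pandas as pd

# Toy stand-in for the note status history
note_status = pd.DataFrame({
    "currentStatus": ["NEEDS_MORE_RATINGS"] * 3 + ["CURRENTLY_RATED_HELPFUL"],
})

status_table = (
    note_status["currentStatus"]
    .value_counts()
    .rename_axis("status")   # name the index explicitly
    .reset_index(name="n")   # name the counts column explicitly
    .assign(percent=lambda x: (x["n"] / x["n"].sum() * 100).round(2))
)

print(status_table)
```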
Another indication that an all-volunteer effort isn’t enough to scale this form of content moderation - nearly 87% of initial notes need more ratings to determine whether they could be helpful or not.
All initial notes start out as “Needs More Ratings” until they’ve received at least 5 ratings, and it appears a vast majority of notes never meet that threshold.
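If you want to check that threshold directly against the ratings file, one way is to count ratings per note and see how many notes fall short. This is a sketch on toy data: `ratings` is a hypothetical name for the ratings data frame, and I'm assuming it has one row per rating with a `noteId` column.

```python
import pandas as pd

# Toy stand-ins for the notes and ratings data frames
initial_notes = pd.DataFrame({"noteId": [1, 2, 3, 4]})
ratings = pd.DataFrame({"noteId": [1] * 6 + [2] * 2 + [3] * 5})

MIN_RATINGS = 5  # threshold described in the text

# Ratings per note, with 0 filled in for notes nobody has rated
ratings_per_note = (
    ratings.groupby("noteId").size()
    .reindex(initial_notes["noteId"], fill_value=0)
)

below_threshold = ratings_per_note < MIN_RATINGS
print(f"{below_threshold.mean():.0%} of notes have fewer than {MIN_RATINGS} ratings")
```

The `reindex` step matters because notes with zero ratings never appear in the ratings file at all.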
There could be multiple reasons for this, ranging from charitable4 to less so.5 There could be reasons internal to the Birdwatch community I’m unaware of that drive this pattern.
And no matter what, the current Birdwatch system is failing to identify whether a vast majority of initial notes are helpful. This is true even though the volume of initial notes is infinitesimal compared to the total volume of tweets. If more initial notes were written to better keep up with overall tweet volume, there’s a chance this lack of ratings problem would be exacerbated.
print(f"The tweet with the most initial notes had {tweets_initial_notes.noteId.max()} notes.")
# tweets_initial_notes[tweets_initial_notes.noteId == 58]
# Can use this website to get tweets from tweetId without using the API https://www.bram.us/2017/11/22/accessing-a-tweet-using-only-its-id-and-without-the-twitter-api/
The tweet with the most initial notes had 58 notes.
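The `tweets_initial_notes` data frame isn't defined in the code shown above. One plausible construction, and this is my assumption rather than the original code, is a per-tweet count of notes, sketched here on toy data:

```python
import pandas as pd

# Toy stand-in for the notes data frame
initial_notes = pd.DataFrame({
    "noteId": [101, 102, 103, 104, 105],
    "tweetId": [1, 1, 1, 2, 3],
})

# One row per tweet; the noteId column holds the number of notes on that tweet
tweets_initial_notes = initial_notes.groupby("tweetId").count()

n_tweets_with_notes = len(tweets_initial_notes)             # distinct tweets with a note
most_noted_tweet = tweets_initial_notes["noteId"].idxmax()  # tweetId with the most notes
max_notes = tweets_initial_notes["noteId"].max()

print(n_tweets_with_notes, most_noted_tweet, max_notes)
```

The same frame also gives the number of distinct tweets with at least one note, which matches `initial_notes["tweetId"].nunique()`.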
The tweet with the most initial notes was by Rep. Alexandria Ocasio-Cortez in response to Senator Ted Cruz. The tweet touched on the trading platform Robinhood’s decision to prevent retail investors from trading certain stocks and the January 6th insurrection.
This tweet ultimately did not have a note attached to it.
A tweet can end up without an attached note for 2 reasons:
1. There is no note rated as helpful
2. There is at least one note rated as helpful but the Tweet is not marked as “potentially misleading”
In this case no initial note was rated as helpful, and to boot none of the initial notes had enough ratings to even be considered.
# Getting all noteIds in reference to the AOC tweet into a list
aoc_note_ids = initial_notes[initial_notes["tweetId"] == 1354848253729234944].noteId.to_list()
# Filtering the note history based on this list and counting values
initial_history[initial_history["noteId"].isin(aoc_note_ids)].currentStatus.value_counts()
NEEDS_MORE_RATINGS 54
Name: currentStatus, dtype: int64
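The status count above (54) is lower than the 58 notes reported earlier for this tweet. I haven't confirmed why, but one way to check whether some notes simply have no row in the status history is a left merge with an indicator column. This is a sketch on toy data, with a made-up `tweetId` of 9.

```python
import pandas as pd

# Toy stand-ins: four notes on one tweet, only three with a status history row
initial_notes = pd.DataFrame({"noteId": [1, 2, 3, 4], "tweetId": [9, 9, 9, 9]})
initial_history = pd.DataFrame({
    "noteId": [1, 2, 3],
    "currentStatus": ["NEEDS_MORE_RATINGS"] * 3,
})

tweet_notes = initial_notes[initial_notes["tweetId"] == 9]

# indicator=True adds a _merge column saying where each row was found
merged = tweet_notes.merge(initial_history, on="noteId", how="left", indicator=True)
missing_history = merged.loc[merged["_merge"] == "left_only", "noteId"].tolist()

print(f"Notes without a status history row: {missing_history}")
```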
Even if you believe this tweet should not have received a note,6 it’s troubling that its status remained up in the air rather than seeing a definitive “not helpful” label applied to all initial notes.
The two previous scalability issues could, at least in principle, be solved by having a lot more people joining Birdwatch. More initial notes could be written, more initial notes could receive ratings, and the system could achieve at least some scalability.
However, there are reasons to believe that, under its current standards, more members could actually make Birdwatch less scalable.
Think back to the example of the tweet with the most initial notes ever. Lots of people wrote initial notes, but nowhere near enough people rated all those initial notes.
It’s possible Birdwatch has better procedures in place now, but it seems like more Birdwatch members could exacerbate this coordination problem: too many people writing initial notes and, after the initial probationary period, not enough people rating them.
This analysis is only possible because Birdwatch has open data. Trust and Safety measures require some procedures be kept under lock and key, and the considered transparency baked into Birdwatch’s approach since January 2021 is admirable. The now-defunct META team at Twitter also made considered transparency a consistent practice.
The more teams follow these examples the better we’ll be able to moderate content in helpful, just ways.
There are many stones still left to turn in this data. For example, I think8 that a vast majority9 of the tweets with initial notes are in English. Someone could look into that and contextualize the volume of tweets Birdwatch hasn’t even attempted to moderate.
I hope I can inspire at least a couple of other people to take a closer look, and if you find anything interesting please get in touch.
1. Including a contractor making critical changes to child safety workflows https://twitter.com/CaseyNewton/status/1591608307927556096?s=20&t=4lurUg2rjlnq6mZ8xqquNQ
2. Now referred to by some people as Community Notes, though I’ll be using Birdwatch throughout
3. And if you don’t care about the code you can ignore it. You don’t need to know Python to read this post!
4. The Birdwatch community actively doesn’t bother rating initial notes from obvious trolls, notes on low value tweets, or some other combination of undesirable features
5. There just aren’t enough people who can volunteer their time to such an intensive effort so most initial notes never receive enough ratings
6. Cards on the table, I don’t think this tweet needs a note
7. I think the Birdwatch community has on balance elected to prioritize high-value tweets such as misleading tweets from Twitter’s owner. I’m also certain Twitter has more data that could help a system like this nip misinformation in the bud before it’s accrued millions of impressions
8. But haven’t directly confirmed!
9. And maybe all