Read the Tea Leaves

Mastodon and the challenges of abuse in a federated system

This post will probably only make sense to those deeply involved in Mastodon and the fediverse. So if that’s not your thing, or you’re not interested in issues of social media and safety online, you may want to skip this post.

I keep thinking of the Wil Wheaton fiasco as a kind of security incident that the Mastodon community needs to analyze and understand to prevent it from happening again.

Similar to this thread by CJ Silverio, I’m not thinking about this in terms of whether Wil Wheaton or his detractors were right or wrong. Rather, I’m thinking about how this incident demonstrates that a large-scale harassment attack by motivated actors is not only possible in the fediverse, but is arguably easier than in a centralized system like Twitter or Facebook, where automated tools can help moderators to catch dogpiling as it happens.

As someone who both administrates and moderates Mastodon instances, and who believes in Mastodon’s mission to make social media a more pleasant and human-centric place, this post is my attempt to define the attack vector and propose strategies to prevent it in the future.

Harassment as an attack vector

First off, it’s worth pointing out that there is probably a higher moderator-to-user ratio in the fediverse than on centralized social media like Facebook or Twitter.

According to a Motherboard report, Facebook has about 7,500 moderators for 2 billion users. This works out to roughly 1 moderator per 267,000 users.

Compared to that, a small Mastodon instance like toot.cafe has about 450 active users and one moderator, which is better than Facebook’s ratio by a factor of 500. Similarly, a large instance like mastodon.cloud (where Wil Wheaton had his account) apparently has one moderator and about 5,000 active users, which is still better than Facebook by a factor of 50. But it wasn’t enough to protect Wil Wheaton from mobbing.
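For anyone who wants to check the arithmetic, here is a small sketch using only the figures quoted above. The function and variable names are my own, and the user counts are the rough estimates from this post, not measured data.

```python
# Figures quoted in the post; all of them are rough estimates.
FACEBOOK_MODERATORS = 7_500
FACEBOOK_USERS = 2_000_000_000


def users_per_moderator(users: int, moderators: int) -> float:
    """Average number of users each moderator is responsible for."""
    return users / moderators


def advantage_over(baseline: float, users: int, moderators: int) -> float:
    """How many times better an instance's ratio is than the baseline."""
    return baseline / users_per_moderator(users, moderators)


facebook = users_per_moderator(FACEBOOK_USERS, FACEBOOK_MODERATORS)
toot_cafe = advantage_over(facebook, users=450, moderators=1)
mastodon_cloud = advantage_over(facebook, users=5_000, moderators=1)

print(f"Facebook: 1 moderator per {facebook:,.0f} users")
print(f"toot.cafe: about {toot_cafe:.0f}x better")
print(f"mastodon.cloud: about {mastodon_cloud:.0f}x better")
```

The exact quotients come out slightly higher than the round numbers in the text, so "a factor of 500" and "a factor of 50" are conservative.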

The attack vector looks like this: a group of motivated harassers chooses a target somewhere in the fediverse. Every time that person posts, they immediately respond, maybe with something clever like “fuck you” or “log off.” So from the target’s point of view, every time they post something, even something innocuous like a painting or a photo of their dog, they immediately get a dozen comments saying “fuck you” or “go away” or “you’re not welcome here” or whatever. This makes it essentially impossible for them to use the social media platform.

The second part of the attack is that, when the target posts something, harassers from across the fediverse click the “report” button and send a report to their local moderators as well as the moderator of the target’s instance. This overwhelms both the local moderators and (especially) the remote moderator. In mastodon.cloud’s case, it appears the moderator got 60 reports overnight, which was so much trouble that they decided to evict Wil Wheaton from the instance rather than deal with the deluge.

[Image: A list of reports in the Mastodon moderation UI]

For anyone who has actually done Mastodon moderation, this is totally understandable. The interface is good, but for something like 60 reports, even if your goal is to dismiss them on sight, it’s a lot of tedious pointing-and-clicking. There are currently no batch-operation tools in the moderator UI, and the API is incomplete, so it’s not yet possible to write third-party tools on top of it.
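To make "batch operation" concrete, here is a minimal sketch of the kind of grouping a third-party tool might do. Everything in it is hypothetical: the `Report` shape, the `group_by_target` and `batch_candidates` helpers, and the threshold are my own inventions, and the sketch assumes the reports have already been loaded somehow, since the API doesn't expose them yet.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    # Hypothetical shape; real Mastodon report objects look different.
    report_id: int
    reporter: str  # account that filed the report
    target: str    # account being reported


def group_by_target(reports: list[Report]) -> dict[str, list[Report]]:
    """Bucket open reports by the account they were filed against."""
    grouped: dict[str, list[Report]] = defaultdict(list)
    for report in reports:
        grouped[report.target].append(report)
    return dict(grouped)


def batch_candidates(reports: list[Report], threshold: int = 10):
    """Targets with at least `threshold` open reports, busiest first."""
    busy = [
        (target, items)
        for target, items in group_by_target(reports).items()
        if len(items) >= threshold
    ]
    return sorted(busy, key=lambda pair: len(pair[1]), reverse=True)


# Made-up data: 60 reports against one account, plus one unrelated report.
reports = [
    Report(i, f"user{i}@remote.example", "target@instance.example")
    for i in range(60)
]
reports.append(Report(60, "someone@remote.example", "other@instance.example"))

batches = batch_candidates(reports)
```

With this grouping, the moderator could review the 60 reports as a single item and resolve them with one decision, while the unrelated report stays in the normal queue.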

Comparisons to spam

These moderation difficulties also apply to spam, which, as Sarah Jeong points out, is a close cousin to harassment if not the same thing.

During a recent spambot episode in the fediverse, I personally spent hours reporting hundreds of accounts and then suspending them. Many admins like myself closed registrations as a temporary measure to prevent new spambot accounts, until email domain blocking was added to Mastodon and we were able to block the spambots in one fell swoop. (The spambots used various domains in their email addresses, but they all used the same email MX domain.)

This was a good solution, but obviously it’s not ideal. If another spambot wave arrives, admins will have to coordinate yet again to block the email domain, and there’s no guarantee that the next attacker will be unsophisticated enough to use the same email domain for each account.
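As a rough illustration of how the shared MX domain gives the spambots away, here is a sketch that counts reported accounts per MX host. It is not how Mastodon implements email domain blocking. The `mx_lookup` dictionary is an assumption standing in for a real DNS MX query, and all addresses and hostnames are made up.

```python
from collections import Counter


def email_domain(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[1].lower()


def shared_mx_hosts(addresses, mx_lookup, min_accounts=5):
    """Return MX hosts used by at least `min_accounts` reported addresses.

    `mx_lookup` maps an email domain to its MX host. Here it is a plain
    dict prepared in advance; a real tool would query DNS instead.
    """
    counts = Counter()
    for address in addresses:
        mx_host = mx_lookup.get(email_domain(address))
        if mx_host is not None:
            counts[mx_host] += 1
    return [host for host, n in counts.most_common() if n >= min_accounts]


# Made-up data: nine bots spread over three domains, plus one human.
reported = [f"bot{i}@spam{i % 3}.example" for i in range(9)]
reported.append("human@mail.example")

mx_lookup = {
    "spam0.example": "mx.bulk.example",
    "spam1.example": "mx.bulk.example",
    "spam2.example": "mx.bulk.example",
    "mail.example": "mx.mail.example",
}

suspicious = shared_mx_hosts(reported, mx_lookup)
```

No single email domain stands out here, but the common MX host does. As noted above, this only works as long as the attacker is careless enough to reuse one mail host.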

The moderator’s view

Back to harassment campaigns: the point is that moderators are often in the disadvantageous position of being a small number of humans, with all the standard human frailties, trying to use a moderation UI that leaves a lot to be desired.

As a moderator, I might get an email notifying me of a new report while I’m on vacation, on my phone, using a 3G connection somewhere in the countryside, and I might try to resolve the report using a tiny screen with my fumbly human fingers. Or I might get the report when I’m asleep, so I can’t even resolve it for another 8 hours.

Even in the best of conditions, resolving a report is hard. There may be a lot of context behind the report. For instance, if the harassing comment is “lol you got bofa’d in the ligma” then suddenly there’s a lot of context that the moderator has to unpack. (And in case you’re wondering, Urban Dictionary is almost useless for this kind of stuff, because the whole point of the slang and in-jokes is to ensure that the uninitiated aren’t in on the joke, so the top-voted Urban Dictionary definitions usually contain a lot of garbage.)

[Image: The Mastodon moderation UI]

So now, as a moderator, I might be looking through a thread history and trying to figure out whether something actually constitutes harassment or not, who the reported account is, who reported it, which instance they’re on, etc.

If I choose to suspend, I have to be careful because a remote suspension is not the same thing as a local suspension: a remote suspension merely hides the remote content, whereas a local suspension permanently deletes the account and all their toots. So account moderation can feel like a high-wire act, where if you click the wrong thing, you can completely ruin someone’s Mastodon experience with no recourse. (Note though that in Mastodon 2.5.0 a confirmation dialogue was added for local suspensions, which makes it less scary.)

As a moderator working on a volunteer basis, it can also be hard to muster the willpower to respond to a report in a timely manner. Whenever I see a new report for my instance, I groan and think to myself, “Oh great, what horrible thing do I have to look at now.” Hate speech, vulgar images, potentially illegal content – this is all stuff I’d rather not deal with, especially if I’m on my phone, away from my computer, trying to enjoy my free time. If I’m at work, I may even have to switch away from my work computer and use a separate device and Internet connection, since otherwise I could get flagged by my work’s IT admin for downloading illegal or pornographic content.

In short: moderation is a stressful and thankless job, and those who do it deserve our respect and admiration.

Now take all these factors into account, and imagine that a coordinated group of harassers has dumped 60 (or more!) reports into the moderator’s lap all at once. This is such a stressful and laborious task that it’s not surprising that the admin may decide to suspend the target’s account rather than deal with the coordinated attack. Even if the moderator does decide to deal with it, a sustained harassment campaign could mean that managing the onslaught has become their full-time job.

A harassment campaign is also something like a human DDoS attack: it can flare up and reach its peak in a matter of hours or minutes, depending on how exactly the mob gets whipped up. This means that a moderator who doesn’t handle it on-the-spot may miss the entire thing. So again: a moderator going to sleep, turning off notifications, or just living their life is a liability, at least from the point of view of the harassment target.
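Since the DDoS comparison is about timing, here is a sketch of how a tool might notice a surge while it is happening, using a sliding time window over report timestamps. The `find_report_surge` function, the one-hour window, and the threshold of 20 are assumptions I picked for illustration; nothing like this exists in Mastodon today as far as I know.

```python
from collections import deque
from datetime import datetime, timedelta


def find_report_surge(timestamps, window=timedelta(hours=1), threshold=20):
    """Return the moment the report count within `window` first reaches
    `threshold`, or None if it never does."""
    recent = deque()
    for ts in sorted(timestamps):
        recent.append(ts)
        # Drop reports that have fallen out of the trailing window.
        while ts - recent[0] > window:
            recent.popleft()
        if len(recent) >= threshold:
            return ts
    return None


# Made-up data: an arbitrary start time, then one report per minute.
start = datetime(2018, 1, 1, 23, 0)
surge = [start + timedelta(minutes=i) for i in range(60)]

# Made-up data: a quiet instance with one report every six hours.
quiet = [start + timedelta(hours=6 * i) for i in range(10)]

alarm_time = find_report_surge(surge)
no_alarm = find_report_surge(quiet)
```

In this made-up scenario the alarm fires at the twentieth report, nineteen minutes in. A tool could use that moment to page a second moderator or temporarily hold new reports, rather than leaving everything for whoever wakes up first.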

Potential solutions

Now let’s start talking about solutions. First off, let’s see what the best-in-class defense is, given how Mastodon currently works.

Someone who wants to avoid a harassment campaign has a few different options:

  1. Use a private (locked) account
  2. Run their own single-user instance
  3. Move to an instance that uses instance whitelisting rather than blacklisting

Let’s go through each of these in turn.

Using a private (locked) account

Using a locked account makes your toots “followers-only” by default and requires approval before someone can follow you or view those toots. This prevents a large-scale harassment attack, since nobody but your approved followers can interact with you. However, it’s sort of a nuclear option, and from the perspective of a celebrity like Wil Wheaton trying to reach his audience, it may not be considered feasible.

Account locking can also be turned on and off at any time. Unlike Twitter, though, this doesn’t affect the visibility of past posts, so an attacker could still send harassing replies to any of your previous toots, even if your account is currently locked. This means that if you’re under siege from a sudden harassment campaign that flares up and dies down over the course of a few hours, keeping your account locked during that time is not an effective strategy.

Running your own single-user instance

A harassment target could move to an instance where they are the admin, the moderator, and the only user. This gives them wide latitude to apply instance blocks across the entire instance, but those same instance blocks are already available at an individual level, so it doesn’t change much. On the other hand, it allows them to deal with reports about themselves by simply ignoring them, so it does solve the “report deluge” problem.

However, it doesn’t solve the problem of getting an onslaught of harassing replies from different accounts across the fediverse. Each harassing account will still require a block or an instance block, which are tools that are already available even if you don’t own your own instance.

Running your own instance may also require a level of technical savvy and dedication to learning the ins and outs of Mastodon (or another fediverse technology like Pleroma), which the harassment target may consider too much effort with little payoff.

Moving to a whitelisting instance

By default, a Mastodon instance federates with all other instances unless the admin explicitly applies a “domain block.” Some Mastodon instances, though, such as awoo.space, have forked the Mastodon codebase to allow for whitelisting rather than blacklisting.

This means that awoo.space doesn’t federate with other instances by default. Instead, awoo.space admins have to explicitly choose the instances that they federate with. This can limit the attack surface, since awoo.space isn’t exposed to every non-blacklisted instance in the fediverse; instead, it’s exposed only to a subset of instances that have already been vetted and considered safe to federate with.
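The difference between the two policies fits in a few lines. This is only a sketch of the decision logic, not code from Mastodon or from any fork of it; the `FederationMode` enum, the `may_federate` function, and the domain names are all hypothetical.

```python
from enum import Enum


class FederationMode(Enum):
    BLACKLIST = "blacklist"  # federate with everyone except listed domains
    WHITELIST = "whitelist"  # federate only with listed domains


def may_federate(domain: str, mode: FederationMode, listed: set[str]) -> bool:
    """Decide whether to accept activity from `domain` under `mode`."""
    domain = domain.lower()
    if mode is FederationMode.BLACKLIST:
        return domain not in listed
    return domain in listed


# A brand-new instance that no admin has ever looked at.
unknown = "brand-new.example"

open_by_default = may_federate(
    unknown, FederationMode.BLACKLIST, {"known-bad.example"}
)
closed_by_default = may_federate(
    unknown, FederationMode.WHITELIST, {"vetted-friend.example"}
)
```

Under blacklisting the unknown instance gets through until someone blocks it; under whitelisting it is shut out until someone vets it. An attacker can spin up a new domain to dodge the first policy but not the second.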

In the face of a sudden coordinated attack, though, even a cautious instance like awoo.space probably federates with enough instances that a group of motivated actors could set up new accounts on the whitelisted instances and attack the target, potentially overwhelming the target instance’s moderators as well as the moderators of the connected instances. So whitelisting reduces the surface area but doesn’t prevent the attack.

Now, the target could both run their own single-user instance and enable whitelisting. If they were very cautious about which instances to federate with, this could prevent the bulk of the attack, but would require a lot of time investment and have similar problems to a locked account in terms of limiting the reach to their audience.

Conclusion

I don’t have any good answers yet as to how to prevent another dogpiling incident like the one that targeted Wil Wheaton. But I do have some ideas.

First off, the Mastodon project needs better tools for moderation. The current moderation UI is good but a bit clunky, and the API needs to be opened up so that third-party tools can be written on top of it. For instance, a tool could automatically find the email domains for reported spambots and block them. Or, another tool could read the contents of a reported toot, check for certain blacklisted curse words, and immediately delete the toot or silence/suspend the account.
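To give a sense of what the second kind of tool could look like, here is a minimal sketch of a word-list check on a reported toot. The `build_matcher` and `triage` helpers, the action labels, and the placeholder word list are all made up for illustration, and the sketch only suggests an action instead of carrying one out, since the API for that isn't available.

```python
import re


def build_matcher(blocked_words):
    """Compile a case-insensitive, whole-word pattern from a word list."""
    if not blocked_words:
        raise ValueError("blocked_words must not be empty")
    alternatives = "|".join(re.escape(word) for word in sorted(blocked_words))
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def triage(toot_text: str, matcher) -> str:
    """Suggest an action for a reported toot.

    Returns "flag" when a blocked word appears, otherwise "needs_review"
    so that a human still looks at it.
    """
    return "flag" if matcher.search(toot_text) else "needs_review"


# Placeholder list; a real one would be chosen by the instance's moderators.
matcher = build_matcher({"badword", "otherbadword"})

flagged = triage("You are a BADWORD, log off", matcher)
unclear = triage("lol you got bofa'd in the ligma", matcher)
```

Note that the in-joke example from earlier sails straight through, which is the limitation of this approach: a word list can take the obvious cases off a moderator's plate, but it can't unpack context.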

Second off, Mastodon admins need to take the problem of moderation more seriously. Maybe having a team of moderators living in multiple time zones should just be considered the “cost of doing business” when running an instance. Like security features, it’s not a cost that pays visible dividends every single day, but in the event of a sudden coordinated attack it could make the difference between a good experience and a horrible experience.

Perhaps more instances should consider having paid moderators. mastodon.social already pays its moderation team via the main Mastodon Patreon page. Another possible model is for an independent moderator to be employed by multiple instances and get paid through their own Patreon page.

However, I think the Mastodon community also needs to acknowledge the weaknesses of the federated system in handling spam and harassment compared to a centralized system. As Sarah Jamie Lewis says in “On Emergent Centralization”:

Email is a perfect example of an ostensibly decentralized, distributed system that, in defending itself from abuse and spam, became a highly centralized system revolving around a few major oligarchical organizations. The majority of email sent […] today is likely to find itself being processed by the servers of one of these organizations.

Mastodon could eventually move in a similar direction, if the problems aren’t anticipated and headed off at the pass. The fediverse is still relatively peaceful, but right now that’s mostly a function of its size. The fediverse is just not as interesting a target for attackers, because there aren’t that many people using it.

However, if the fediverse gets much bigger, it could become inundated by dedicated harassment, disinformation, or spambot campaigns (as Twitter and Facebook already are), and it could shift towards centralization as a defense mechanism. For instance, a centralized service might be set up to check toots for illegal content, or to verify accounts, or something similar.

To prevent this, Mastodon needs to recognize its inherent structural weaknesses and find solutions to mitigate them. If it doesn’t, then enough people might be harassed or spammed off of the platform that Mastodon will lose its credibility as a kinder, gentler social network. At that point, it would be abandoned by its responsible users, leaving only the spammers and harassers behind.

Thanks to Eugen Rochko for feedback on a draft of this blog post.