
YubiKeys are neat

I recently picked up a YubiKey, because we use them at work and I was impressed with how simple and easy-to-use they are. I’ve been really happy with it so far – enough to write a blog post about it.

Photo of my YubiKeys on a keychain on a table

Basically, YubiKey works like this: whenever you need to do two-factor authentication (2FA), you just plug this little wafer into a USB port and tap a button, and it types out your one-time pass code. Interestingly, it does this by pretending to be a keyboard, which means it doesn’t require any special drivers. (Although it’s funny how Mac pops up a window saying, “Set up your keyboard…”)

The YubiKey Neo, which is the one I got, also supports NFC, so you can use it on a phone or tablet as well. I’ve only tested it on Android, but apparently iOS has some support too.

YubiKey is especially nice for sites like Google, GitHub, and Dropbox, because it runs directly in the browser using the FIDO U2F standard. Currently this is only supported in Chrome, but in Firefox you can also set security.webauth.u2f to true in about:config and it works just fine. (I use Firefox as my main browser, so I can confirm that this works across a variety of websites.)

One thing that pleasantly surprised me about YubiKey is that you can even use it for websites that don’t support U2F devices. Just download the Yubico Authenticator app, plug in your YubiKey, and now your YubiKey is an OTP app, i.e. a replacement for Google Authenticator, Authy, FreeOTP, etc. (Note that Yubico Authenticator doesn’t seem to support iOS, but it runs on desktops and Android, and is even open source on F-Droid.)

What I like the most about Yubico Authenticator is that it works the same across multiple devices, as long as you’re using the same YubiKey. This is great for me, because I have a weird Android setup, and so I’m frequently factory-resetting my phone, meaning I’d normally have to go through the hassle of setting up all my 2FA accounts again. But with YubiKey, I just have to remember to hold onto this little device that’s smaller than a stick of gum and fits on a keyring.

One thing I did find a bit annoying, though, is that the NFC communication between my YubiKey and OnePlus 5T is pretty spotty. To get it to work, I have to remove my phone from its case and the YubiKey from my keyring and clumsily mash them together a few times until it finally registers. But it does work.

Overall though, YubiKey is really cool. Definitely a worthy addition to one’s keyring, and as a bonus it makes me feel like a 21st-century James Bond. (I mean, when I plug it in and it “just works,” not when I’m mashing it into my phone like a monkey.)

If you’d like to read more about YubiKey and security, you might enjoy this article by Maciej Ceglowski on “basic security precautions for non-profits and journalists in the United States.”

Update: In addition to U2F, there is also an emerging standard called WebAuthn which is supported in Chrome, Firefox, and Edge without flags and is supported by YubiKey. So far though, website support seems limited, with Dropbox being a major exception.

Moving on from Microsoft

When I first joined Microsoft, I wrote an idealistic blog post describing the web as a public good, one that I hoped to serve by working for a browser vendor. I still believe in that vision of the web: it’s the freest platform on earth, it’s not totally dominated by any one entity, and it provides both users and developers with a degree of control that isn’t really available in the walled-garden app models of iOS, Android, etc.

Plaque saying "this is for everyone", dedicated to Tim Berners-Lee

“This is for everyone”, a tribute to the web and Tim Berners-Lee (source)

After joining Microsoft, I continued to be blown away by the dedication and passion of those working on the Edge browser, many of whom shared my feelings about the open web. There were folks who were ex-Mozilla, ex-Opera, ex-Google, ex-Samsung – basically ex-AnyBrowserVendor. There were also plenty of Microsoft veterans who took their expertise in Windows, the CLR, and other Microsoft technologies and applied it with gusto to the unique challenges of the web.

It was fascinating to see so many people from so many different backgrounds working on something as enormously complex as a browser, and collaborating with like-minded folks at other browser vendors in the open forums of the W3C, TC39, WHATWG, and other standards bodies. Getting a peek behind the curtain at how the web “sausage” is made was a unique experience, and one that I cherish.

I’m also proud of the work I accomplished during my tenure at Microsoft. I wrote a blog post on how browser scrolling works, which not only provided a comprehensive overview for web developers, but I suspect may have even spurred Mozilla to up their scrolling game. (Congrats to Firefox for becoming the first browser to support asynchronous keyboard scrolling!)

I also did some work on the Intersection Observer spec, which I became interested in after discovering some cross-browser incompatibilities prompted by a bug in Mastodon. This was exactly the kind of thing I wanted to do for the web:

  1. Find a browser bug while working on an open-source project.
  2. Rather than just work around the bug, actually fix the browsers!
  3. Discuss the problem with the spec owners at other browser vendors.
  4. Submit the fixes to the spec and the web platform tests.

I didn’t do as much spec work as I would have liked (in particular, as a member of the Web Performance Working Group), but I am happy with the small contributions I managed to make.

While at Microsoft, I was also given the opportunity to speak at several conferences, an experience I found exhilarating if not a bit exhausting. (Eight talks in one year was perhaps too ambitious of me!) Overall, though, being a public speaker was a part of the browser gig that I thoroughly enjoyed, and the friendships I made with other conference attendees will surely linger in my mind long after I’ve forgotten whatever it was I gave a talk about. (Thankfully, though, there are always the videos!)

Photo of me delivering the talk "Solving the web performance crisis", a talk I gave on web performance

“Solving the web performance crisis,” a talk I gave on JavaScript performance (video link)

I also wrote a blog post on input responsiveness, which later inspired a post on JavaScript timers. It’s amazing how much you can learn about how a browser works by (wait for it) working on a browser! During my first year at Microsoft, I found myself steeped in discussions about the HTML5 event loop, Promises, setTimeout, setImmediate, and all the other wonderful ways of scheduling things on the web, because at the time we were knee-deep in rewriting the EdgeHTML event loop to improve performance and reliability.

Some of this work was even paralleled by other browser vendors, as shown in this blog post by Ben Kelly about Firefox 52. I fondly recall Ben and me swapping some benchmarks at a TPAC meeting. (When you get into the nitty-gritty of this stuff, sometimes it feels like other browser vendors are the only ones who really understand what you’re going through!)

I also did some work internal to Microsoft that I believe had a positive impact. In short, I met with lots of web teams and coached them on performance – walking through traces of Edge, IE, and Chrome – and helped them improve the performance of their sites across all browsers. Most of this coaching involved Windows Performance Analyzer, which is a surprisingly powerful tool despite being somewhat under-documented. (Although this post by my colleague Todd Reifsteck goes a long way toward demystifying some of the trickier aspects.)

I discussed this work a bit in an interview I did for Between the Wires, but most of it is private to the teams I worked with, since performance can be a tricky subject to talk about publicly. In general, neither browser vendors nor website owners want to shout to the heavens about their performance problems, so to avoid embarrassing both parties, most of the work I did in this area will probably never be public.

Presentation slide saying "you've found a perf issue," either "website looks bad compared to competitors," or "browser looks bad compared to competitors" with hand-drawn "websites" and browser logos

A slide from a talk I gave at the Edge Web Summit (video link, photo source)

Still, this work (we called it “Performance Clubs”) was one of my favorite parts of working at Microsoft. Being a “performance consultant,” analyzing traces, and reasoning about the interplay between browser architecture and website architecture was something I really enjoyed. It was part education (I gave a lot of impromptu speeches in front of whiteboards!) and part detective work (lots of puzzling over traces, muttering to myself “this thread isn’t supposed to do that!”). But as someone who is fond of both people and technology, I think I was well-suited for the task.

After Microsoft, I’ll continue doing this same sort of work, but in a new context. I’ll be joining Salesforce as a developer working on the Lightning Platform. I’m looking forward to the challenges of building an easy-to-use web framework that doesn’t sacrifice on performance – anticipating the needs of developers as well as the inherent limitations of CPUs, GPUs, memory, storage, and networks.

It will also be fun to apply my knowledge of cross-browser differences to the enterprise space, where developers often don’t have the luxury of saying “just use another browser.” It’s an unforgiving environment to develop in, but those are exactly the kinds of challenges I relish about the web!

For those who follow me mostly for my open-source work, nothing will change with my transition to Salesforce. I intend to continue working on Mastodon- and JavaScript-related projects in my spare time, including Pinafore. In fact, my experience with Pinafore and SvelteJS may pay dividends at my new gig; one of my new coworkers even mentioned SvelteJS as their north star for building a great JavaScript framework. (Seems I may have found my tribe!) Much of my Salesforce work will also be open-source, so I’m looking forward to spending more time back on GitHub as well. (Although perhaps not as intensely as I used to.)

Leaving Microsoft is a bit bittersweet for me. I’m excited by the new challenges, but I’m also going to miss all the talented and passionate people whose company I enjoyed on the Microsoft Edge team. That said, the web is nothing if not a big tent, and there’s plenty of room to work in, on, and around it. To everyone else who loves the web as I do: I’m sure our paths will cross again.

A tour of JavaScript timers on the web

Pop quiz: what is the difference between these JavaScript timers?

  • Promises
  • setTimeout
  • setInterval
  • setImmediate
  • requestAnimationFrame
  • requestIdleCallback

More specifically, if you queue up all of these timers at once, do you have any idea which order they’ll fire in?

If not, you’re probably not alone. I’ve been doing JavaScript and web programming for years, I’ve worked for a browser vendor for two of those years, and it’s only recently that I really came to understand all these timers and how they play together.

In this post, I’m going to give a high-level overview of how these timers work, and when you might want to use them. I’ll also cover the Lodash functions debounce() and throttle(), because I find them useful as well.

Promises and microtasks

Let’s get this one out of the way first, because it’s probably the simplest. A Promise callback is also called a “microtask,” and it runs with the same timing as MutationObserver callbacks, since both go through the microtask queue. Assuming queueMicrotask() ever makes it out of spec-land and into browser-land, it will also be the same thing.

I’ve already written a lot about promises. One quick misconception about promises that’s worth covering, though, is that they don’t give the browser a chance to breathe. Just because you’re queuing up an asynchronous callback, that doesn’t mean that the browser can render, or process input, or do any of the stuff we want browsers to do.

For example, let’s say we have a function that blocks the main thread for 1 second:

function block() {
  var start = Date.now()
  while (Date.now() - start < 1000) { /* wheee */ }
}

If we were to queue up a bunch of microtasks to call this function:

for (var i = 0; i < 100; i++) {
  Promise.resolve().then(block)
}

This would block the browser for about 100 seconds. It’s basically the same as if we had done:

for (var i = 0; i < 100; i++) {
  block()
}

Microtasks execute immediately after any synchronous execution is complete. There’s no chance to fit in any work between the two. So if you think you can break up a long-running task by separating it into microtasks, then it won’t do what you think it’s doing.
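To see that ordering for yourself, here’s a minimal sketch that records when each kind of callback runs. The `order` array and its labels are just illustrative names.

```javascript
// Records the order in which each kind of callback runs.
const order = []

// A task: queued first, but it has to wait for the microtask below.
setTimeout(() => order.push('task (setTimeout)'), 0)

// A microtask: runs as soon as the current synchronous code finishes.
Promise.resolve().then(() => order.push('microtask (promise)'))

order.push('synchronous code')

// Check the result once everything above has had a chance to run.
// The order is: synchronous code, then the microtask, then the task.
setTimeout(() => console.log(order), 50)
```

Even though the setTimeout was queued before the promise callback, the microtask still runs first.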

setTimeout and setInterval

These two are cousins: setTimeout queues a task to run in x number of milliseconds, whereas setInterval queues a recurring task to run every x milliseconds.

The thing is… browsers don’t really respect that milliseconds thing. You see, historically, web developers have abused setTimeout. A lot. To the point where browsers have had to add mitigations for setTimeout(/* ... */, 0) to avoid locking up the browser’s main thread, because a lot of websites tended to throw around setTimeout(0) like confetti.

This is the reason that a lot of the tricks in crashmybrowser.com don’t work anymore, such as queuing up a setTimeout that calls two more setTimeouts, which call two more setTimeouts, etc. I covered a few of these mitigations from the Edge side of things in “Improving input responsiveness in Microsoft Edge”.

Broadly speaking, a setTimeout(0) doesn’t really run in zero milliseconds. Usually, it runs in 4. Sometimes, it may run in 16 (this is what Edge does when it’s on battery power, for instance). Sometimes it may be clamped to 1 second (e.g., when running in a background tab). These are the sorts of tricks that browsers have had to invent to prevent runaway web pages from chewing up your CPU doing useless setTimeout work.

So that said, setTimeout does allow the browser to run some work before the callback fires (unlike microtasks). But if your goal is to allow input or rendering to run before the callback, setTimeout is usually not the best choice because it only incidentally allows those things to happen. Nowadays, there are better browser APIs that can hook more directly into the browser’s rendering system.
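For illustration, here’s a rough sketch of what breaking up a long-running job with setTimeout might look like. `processInChunks` and its parameters are hypothetical names, not a browser API, and as noted above this only incidentally gives the browser room for input and rendering.

```javascript
// Hypothetical helper: processes `items` in small batches, yielding to the
// event loop between batches so that other tasks get a chance to run.
function processInChunks(items, handleItem, chunkSize, onDone) {
  let index = 0

  function runChunk() {
    const end = Math.min(index + chunkSize, items.length)
    while (index < end) {
      handleItem(items[index])
      index++
    }
    if (index < items.length) {
      // Queue a new task (not a microtask) for the next batch.
      setTimeout(runChunk, 0)
    } else if (onDone) {
      onDone()
    }
  }

  // The first batch runs synchronously; later batches run in their own tasks.
  runChunk()
}

processInChunks([1, 2, 3, 4, 5], item => console.log('handled', item), 2)
```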

setImmediate

Before moving on to those “better browser APIs,” it’s worth mentioning this thing. setImmediate is, for lack of a better word … weird. If you look it up on caniuse.com, you’ll see that only Microsoft browsers support it. And yet it also exists in Node.js, and has lots of “polyfills” on npm. What the heck is this thing?

setImmediate was originally proposed by Microsoft to get around the problems with setTimeout described above. Basically, setTimeout had been abused, and so the thinking was that we can create a new thing to allow setImmediate(0) to actually be setImmediate(0) and not this funky “clamped to 4ms” thing. You can see some discussion about it from Jason Weber back in 2011.

Unfortunately, setImmediate was only ever adopted by IE and Edge. Part of the reason it’s still in use is that it has a sort of superpower in IE, where it allows input events like keyboard and mouse clicks to “jump the queue” and fire before the setImmediate callback is executed, whereas IE doesn’t have the same magic for setTimeout. (Edge eventually fixed this, as detailed in the previously-mentioned post.)

Also, the fact that setImmediate exists in Node means that a lot of “Node-polyfilled” code is using it in the browser without really knowing what it does. It doesn’t help that the differences between Node’s setImmediate and process.nextTick are very confusing, and even the official Node docs say the names should really be reversed. (For the purposes of this blog post though, I’m going to focus on the browser rather than Node because I’m not a Node expert.)

Bottom line: use setImmediate if you know what you’re doing and you’re trying to optimize input performance for IE. If not, then just don’t bother. (Or only use it in Node.)
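If you do go down this road, a simple feature check is one way to do it. This is only a sketch: `defer` is a made-up name, and the setTimeout fallback is still subject to the clamping described earlier.

```javascript
// Use setImmediate where it exists (IE, Edge, Node); otherwise fall back
// to setTimeout.
const defer = typeof setImmediate === 'function'
  ? callback => setImmediate(callback)
  : callback => setTimeout(callback, 0)

defer(() => console.log('deferred callback ran'))
console.log('this logs first')
```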

requestAnimationFrame

Now we get to the most important setTimeout replacement, a timer that actually hooks into the browser’s rendering loop. By the way, if you don’t know how the browser event loop works, I strongly recommend this talk by Jake Archibald. Go watch it, I’ll wait.

Okay, now that you’re back, requestAnimationFrame basically works like this: it’s sort of like a setTimeout, except instead of waiting for some unpredictable amount of time (4 milliseconds, 16 milliseconds, 1 second, etc.), it executes before the browser’s next style/layout calculation step. Now, as Jake points out in his talk, there is a minor wrinkle in that it actually executes after this step in Safari, IE, and Edge <18, but let's ignore that for now since it's usually not an important detail.

The way I think of requestAnimationFrame is this: whenever I want to do some work that I know is going to modify the browser's style or layout – for instance, changing CSS properties or starting up an animation – I stick it in a requestAnimationFrame (abbreviated to rAF from here on out). This ensures a few things:

  1. I'm less likely to layout thrash, because all of the changes to the DOM are being queued up and coordinated.
  2. My code will naturally adapt to the performance characteristics of the browser. For instance, if it's a low-cost device that is struggling to render some DOM elements, rAF will naturally slow down from the usual 16.7ms intervals (on 60 Hertz screens) and thus it won't bog down the machine in the same way that running a lot of setTimeouts or setIntervals might.

This is why animation libraries that don't rely on CSS transitions or keyframes, such as GreenSock or React Motion, will typically make their changes in a rAF callback. If you're animating an element between opacity: 0 and opacity: 1, there's no sense in queuing up a billion callbacks to animate every possible intermediate state, including opacity: 0.0000001 and opacity: 0.9999999.

Instead, you're better off just using rAF to let the browser tell you how many frames you're able to paint during a given period of time, and calculate the "tween" for that particular frame. That way, slow devices naturally end up with a slower framerate, and faster devices end up with a faster framerate, which wouldn't necessarily be true if you used something like setTimeout, which operates independently of the browser's rendering speed.
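As a rough illustration of that idea, here is a minimal tween sketch. `animate` is a hypothetical helper, not how GreenSock or React Motion are actually implemented, and the `requestFrame` parameter exists only so the scheduler can be swapped out; in a browser it defaults to requestAnimationFrame.

```javascript
// Hypothetical helper: tweens from 0 to 1 over `duration` milliseconds,
// calling `onFrame(progress)` once per animation frame.
function animate(duration, onFrame, requestFrame = globalThis.requestAnimationFrame) {
  let start = null

  function tick(timestamp) {
    if (start === null) start = timestamp
    // Compute the tween from elapsed time, not from a frame count, so slow
    // devices simply render fewer intermediate states.
    const progress = Math.min((timestamp - start) / duration, 1)
    onFrame(progress)
    if (progress < 1) requestFrame(tick)
  }

  requestFrame(tick)
}

// In a browser, assuming an element with the (made-up) id "box" exists:
// const box = document.getElementById('box')
// animate(300, progress => { box.style.opacity = String(progress) })
```

A fast device might call onFrame twenty times during those 300 milliseconds and a slow one only five, but both end at opacity: 1 at roughly the same moment.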

requestIdleCallback

rAF is probably the most useful timer in the toolkit, but requestIdleCallback is worth talking about as well. The browser support isn't great, but there's a polyfill that works just fine (and it uses rAF under the hood).

In many ways rAF is similar to requestIdleCallback. (I'll abbreviate it to rIC from now on. Starting to sound like a pair of troublemakers from West Side Story, huh? "There go Rick and Raff, up to no good!")

Like rAF, rIC will naturally adapt to the browser's performance characteristics: if the device is under heavy load, rIC may be delayed. The difference is that rIC fires on the browser "idle" state, i.e. when the browser has decided it doesn't have any tasks, microtasks, or input events to process, and you're free to do some work. It also gives you a "deadline" to track how much of your budget you're using, which is a nice feature.
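Here is a minimal sketch of how that deadline might be used to work through a list of low-priority jobs. `runWhenIdle` and `tasks` are hypothetical names, and the `requestIdle` parameter is only there so a polyfill (or a fake, for testing) can be passed in; in a supporting browser it defaults to requestIdleCallback.

```javascript
// Hypothetical helper: runs queued functions during idle periods, stopping
// whenever the browser reports that the idle budget is used up.
function runWhenIdle(tasks, requestIdle = globalThis.requestIdleCallback) {
  function work(deadline) {
    // Keep going only while there is idle time left in this period.
    while (tasks.length > 0 && deadline.timeRemaining() > 0) {
      const task = tasks.shift()
      task()
    }
    // Anything left over waits for the next idle period.
    if (tasks.length > 0) requestIdle(work)
  }

  requestIdle(work)
}

// In a browser:
// runWhenIdle([() => console.log('first job'), () => console.log('second job')])
```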

Dan Abramov has a good talk from JSConf Iceland 2018 where he shows how you might use rIC. In the talk, he has a webapp that calls rIC for every keyboard event while the user is typing, and then it updates the rendered state inside of the callback. This is great because a fast typist can cause many keydown/keyup events to fire very quickly, but you don't necessarily want to update the rendered state of the page for every keypress.

Another good example of this is a “remaining character count” indicator on Twitter or Mastodon. I use rIC for this in Pinafore, because I don't really care if the indicator updates for every single key that I type. If I'm typing quickly, it's better to prioritize input responsiveness so that I don't lose my sense of flow.

Screenshot of Pinafore with some text entered in the text box and a digit counter showing the number of remaining characters

In Pinafore, the little horizontal bar and the “characters remaining” indicator update as you type.
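A character counter along these lines could be sketched roughly as follows. This is not Pinafore’s actual code: `createIdleUpdater`, `textarea`, `counter`, and the 500-character limit are all assumptions for the sake of the example.

```javascript
// Hypothetical helper: runs `update` at most once per idle period, no matter
// how many times `schedule()` is called in between.
function createIdleUpdater(update, requestIdle = globalThis.requestIdleCallback) {
  let scheduled = false

  return function schedule() {
    if (scheduled) return // an update is already waiting for idle time
    scheduled = true
    requestIdle(() => {
      scheduled = false
      update()
    })
  }
}

// In a browser, assuming `textarea` and `counter` elements already exist:
// const scheduleCount = createIdleUpdater(() => {
//   counter.textContent = String(500 - textarea.value.length)
// })
// textarea.addEventListener('input', scheduleCount)
```

Typing ten characters in quick succession then results in one counter update rather than ten.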

One thing I’ve noticed about rIC, though, is that it’s a little finicky in Chrome. In Firefox it seems to fire whenever I would, intuitively, think that the browser is “idle” and ready to run some code. (Same goes for the polyfill.) In mobile Chrome for Android, though, I’ve noticed that whenever I scroll with touch scrolling, it might delay rIC for several seconds even after I’m done touching the screen and the browser is doing absolutely nothing. (I suspect the issue I’m seeing is this one.)

Update: Alex Russell from the Chrome team informs me that this is a known issue and should be fixed soon!

In any case, rIC is another great tool to add to the tool chest. I tend to think of it this way: use rAF for critical rendering work, use rIC for non-critical work.

debounce and throttle

These two functions aren’t built in to the browser, but they’re so useful that they’re worth calling out on their own. If you aren’t familiar with them, there’s a good breakdown in CSS Tricks.

My standard use for debounce is inside of a resize callback. When the user is resizing their browser window, there’s no point in updating the layout for every resize callback, because it fires too frequently. Instead, you can debounce for a few hundred milliseconds, which will ensure that the callback eventually fires once the user is done fiddling with their window size.
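To make the idea concrete, here’s a bare-bones debounce. It’s a simplified sketch rather than Lodash’s implementation (which has more options), and `updateLayout` in the usage comment is a made-up function.

```javascript
// Simplified debounce: `fn` runs only once `wait` milliseconds have passed
// without another call.
function debounce(fn, wait) {
  let timer = null
  return function debounced(...args) {
    // Every new call cancels the pending one and restarts the clock.
    clearTimeout(timer)
    timer = setTimeout(() => fn(...args), wait)
  }
}

// In a browser, assuming some hypothetical updateLayout() function:
// window.addEventListener('resize', debounce(updateLayout, 300))
```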

throttle, on the other hand, is something I use much more liberally. For instance, a good use case is inside of a scroll event. Once again, it’s usually senseless to try to update the rendered state of the app for every scroll callback, because it fires too frequently (and the frequency can vary from browser to browser and from input method to input method… ugh). Using throttle normalizes this behavior, and ensures that it only fires every x number of milliseconds. You can also tweak Lodash’s throttle (or debounce) function to fire at the start of the delay, at the end, both, or neither.

In contrast, I wouldn’t use debounce for the scrolling scenario, because I don’t want the UI to only update after the user has explicitly stopped scrolling. That can get annoying, or even confusing, because the user might get frustrated and try to keep scrolling in order to update the UI state (e.g. in an infinite-scrolling list). throttle is better in this case, because it doesn’t wait for the scroll event to stop firing.
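For comparison, here’s a bare-bones throttle that fires at the start of the delay and once more at the end if calls kept arriving. Again, this is a simplified sketch and not Lodash’s implementation; `updateScrollState` in the usage comment is a made-up function.

```javascript
// Simplified throttle: `fn` runs at most once every `wait` milliseconds,
// with a trailing call so the most recent arguments are not lost.
function throttle(fn, wait) {
  let lastCall = 0
  let timer = null
  let pendingArgs = null

  return function throttled(...args) {
    const now = Date.now()
    const remaining = wait - (now - lastCall)
    if (remaining <= 0) {
      // Enough time has passed: call right away.
      if (timer !== null) {
        clearTimeout(timer)
        timer = null
      }
      lastCall = now
      fn(...args)
    } else {
      // Too soon: remember the latest arguments for a single trailing call.
      pendingArgs = args
      if (timer === null) {
        timer = setTimeout(() => {
          timer = null
          lastCall = Date.now()
          fn(...pendingArgs)
        }, remaining)
      }
    }
  }
}

// In a browser, assuming some hypothetical updateScrollState() function:
// window.addEventListener('scroll', throttle(updateScrollState, 100))
```

Unlike debounce, the first call goes through immediately, so the UI keeps updating while the user is still scrolling.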

throttle is a function I use all over the place for all kinds of user input, and even for some regularly-scheduled tasks like IndexedDB cleanups. It’s extremely useful. Maybe it should just be baked into the browser some day!

Conclusion

So that’s my whirlwind tour of the various timer functions available in the browser, and how you might use them. I probably missed a few, because there are certainly some exotic ones out there (postMessage or lifecycle events, anyone?). But hopefully this at least provides a good overview of how I think about JavaScript timers on the web.

How to deal with “discourse”

It was chaotic human weather. There’d be a nice morning and then suddenly a storm would roll in.

– Jaron Lanier, describing computer message boards in the 1970s (source, p. 42)

Are you tired of the “discourse” and drama in Mastodon and the fediverse? When it happens, do you wish it would just go away?

Here’s one simple trick to stop discourse dead in its tracks:

Don’t talk about it.

Now, this may sound too glib and oversimplified, so to put it in other words:

When discourse is happening, just don’t talk about it.

That’s it. That’s the way you solve discourse. It’s really as easy as that.

Discourse is a reflection of the innate human desire to not only look at a car crash, but to slow down and gawk at it, causing traffic to grind to a halt so that everyone else says, “Well, I may as well see what the fuss is about.” The more you talk about it, the more you feed it.

So just don’t. Don’t write hot takes on it, don’t make jokes about it, don’t comment on how you’re tired of it, don’t try to calm everybody down, don’t write a big thread about how discourse is ruining the fediverse and won’t it please stop. Just don’t. Pretend like it’s not even there.

There’s a scene in a Simpsons Halloween episode where a bunch of billboard ads have come to life and are running amuck, destroying Springfield. Eventually though, Lisa realizes that the only power ads have is the power we give them, and if you “just don’t look” then they’ll keel over and die.

Simpsons animation of billboard ads wrecking buildings with subtitle "Just don't look"

The “discourse” is exactly the same. Every time you talk about it, even just to mention it offhand or make a joke about it, it encourages more people to say to themselves, “Ooh, a fight! I gotta check this out.” Then they scroll back in their timeline to try to figure out the context, and the cycle begins anew. It’s like a disease that spreads by people complaining about it.

This is why whenever discourse is happening, I just talk about something else. I might also block or mute anyone who is talking about it, because I find the endless drama boring.

Like a car crash, it’s never really interesting. It’s never something that’s going to change your life by finding out about it. It’s always the same petty squabbling you’ve seen a hundred times online.

Once the storm has passed, though, it’s safe to talk about it. You may even write a longwinded blog post about it. But while it’s happening, remember: “just don’t look, just don’t look.”

Mastodon and the challenges of abuse in a federated system

This post will probably only make sense to those deeply involved in Mastodon and the fediverse. So if that’s not your thing, or you’re not interested in issues of social media and safety online, you may want to skip this post.

I keep thinking of the Wil Wheaton fiasco as a kind of security incident that the Mastodon community needs to analyze and understand to prevent it from happening again.

Similar to this thread by CJ Silverio, I’m not thinking about this in terms of whether Wil Wheaton or his detractors were right or wrong. Rather, I’m thinking about how this incident demonstrates that a large-scale harassment attack by motivated actors is not only possible in the fediverse, but is arguably easier than in a centralized system like Twitter or Facebook, where automated tools can help moderators to catch dogpiling as it happens.

As someone who both administrates and moderates Mastodon instances, and who believes in Mastodon’s mission to make social media a more pleasant and human-centric place, this post is my attempt to define the attack vector and propose strategies to prevent it in the future.

Harassment as an attack vector

First off, it’s worth pointing out that there is probably a higher moderator-to-user ratio in the fediverse than on centralized social media like Facebook or Twitter.

According to a Motherboard report, Facebook has about 7,500 moderators for 2 billion users. This works out to roughly 1 moderator per 270,000 users.

Compared to that, a small Mastodon instance like toot.cafe has about 450 active users and one moderator, which is better than Facebook’s ratio by a factor of 500. Similarly, a large instance like mastodon.cloud (where Wil Wheaton had his account) apparently has one moderator and about 5,000 active users, which is still better than Facebook by a factor of 50. But it wasn’t enough to protect Wil Wheaton from mobbing.
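As a rough check on the arithmetic, here is the same comparison in a few lines of JavaScript. The variable names are made up for illustration; the figures are the ones quoted above.

```javascript
// Users per moderator, using the figures quoted above.
const facebookUsersPerMod = 2e9 / 7500 // about 266,667
const tootCafeUsersPerMod = 450 / 1
const mastodonCloudUsersPerMod = 5000 / 1

// How many times better each instance's ratio is than Facebook's.
const tootCafeFactor = Math.round(facebookUsersPerMod / tootCafeUsersPerMod)
const mastodonCloudFactor = Math.round(facebookUsersPerMod / mastodonCloudUsersPerMod)

console.log(tootCafeFactor) // 593
console.log(mastodonCloudFactor) // 53
```

So “a factor of 500” and “a factor of 50” are, if anything, slightly conservative.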

The attack vector looks like this: a group of motivated harassers chooses a target somewhere in the fediverse. Every time that person posts, they immediately respond, maybe with something clever like “fuck you” or “log off.” So from the target’s point of view, every time they post something, even something innocuous like a painting or a photo of their dog, they immediately get a dozen comments saying “fuck you” or “go away” or “you’re not welcome here” or whatever. This makes it essentially impossible for them to use the social media platform.

The second part of the attack is that, when the target posts something, harassers from across the fediverse click the “report” button and send a report to their local moderators as well as the moderator of the target’s instance. This overwhelms both the local moderators and (especially) the remote moderator. In mastodon.cloud’s case, it appears the moderator got 60 reports overnight, which was so much trouble that they decided to evict Wil Wheaton from the instance rather than deal with the deluge.

Screenshot of a list of reports in the Mastodon moderation UI

A list of reports in the Mastodon moderation UI

For anyone who has actually done Mastodon moderation, this is totally understandable. The interface is good, but for something like 60 reports, even if your goal is to dismiss them on sight, it’s a lot of tedious pointing-and-clicking. There are currently no batch-operation tools in the moderator UI, and the API is incomplete, so it’s not yet possible to write third-party tools on top of it.

Comparisons to spam

These moderation difficulties also apply to spam, which, as Sarah Jeong points out, is a close cousin to harassment if not the same thing.

During a recent spambot episode in the fediverse, I personally spent hours reporting hundreds of accounts and then suspending them. Many admins like myself closed registrations as a temporary measure to prevent new spambot accounts, until email domain blocking was added to Mastodon and we were able to block the spambots in one fell swoop. (The spambots used various domains in their email addresses, but they all used the same email MX domain.)

This was a good solution, but obviously it’s not ideal. If another spambot wave arrives, admins will have to coordinate yet again to block the email domain, and there’s no guarantee that the next attacker will be unsophisticated enough to use the same email domain for each account.

The moderator’s view

Back to harassment campaigns: the point is that moderators are often in the disadvantageous position of being a small number of humans, with all the standard human frailties, trying to use a moderation UI that leaves a lot to be desired.

As a moderator, I might get an email notifying me of a new report while I’m on vacation, on my phone, using a 3G connection somewhere in the countryside, and I might try to resolve the report using a tiny screen with my fumbly human fingers. Or I might get the report when I’m asleep, so I can’t even resolve it for another 8 hours.

Even in the best of conditions, resolving a report is hard. There may be a lot of context behind the report. For instance, if the harassing comment is “lol you got bofa’d in the ligma” then suddenly there’s a lot of context that the moderator has to unpack. (And in case you’re wondering, Urban Dictionary is almost useless for this kind of stuff, because the whole point of the slang and in-jokes is to ensure that the uninitiated aren’t in on the joke, so the top-voted Urban Dictionary definitions usually contain a lot of garbage.)

Screenshot of the Mastodon moderation UI

The Mastodon moderation UI

So now, as a moderator, I might be looking through a thread history and trying to figure out whether something actually constitutes harassment or not, who the reported account is, who reported it, which instance they’re on, etc.

If I choose to suspend, I have to be careful because a remote suspension is not the same thing as a local suspension: a remote suspension merely hides the remote content, whereas a local suspension permanently deletes the account and all their toots. So account moderation can feel like a high-wire act, where if you click the wrong thing, you can completely ruin someone’s Mastodon experience with no recourse. (Note though that in Mastodon 2.5.0 a confirmation dialogue was added for local suspensions, which makes it less scary.)

As a moderator working on a volunteer basis, it can also be hard to muster the willpower to respond to a report in a timely manner. Whenever I see a new report for my instance, I groan and think to myself, “Oh great, what horrible thing do I have to look at now.” Hate speech, vulgar images, potentially illegal content – this is all stuff I’d rather not deal with, especially if I’m on my phone, away from my computer, trying to enjoy my free time. If I’m at work, I may even have to switch away from my work computer and use a separate device and Internet connection, since otherwise I could get flagged by my work’s IT admin for downloading illegal or pornographic content.

In short: moderation is a stressful and thankless job, and those who do it deserve our respect and admiration.

Now take all these factors into account, and imagine that a coordinated group of harassers have dumped 60 (or more!) reports into the moderator’s lap all at once. This is such a stressful and laborious task that it’s not surprising that the admin may decide to suspend the target’s account rather than deal with the coordinated attack. Even if the moderator does decide to deal with it, a sustained harassment campaign could mean that managing the onslaught has become their full-time job.

A harassment campaign is also something like a human DDoS attack: it can flare up and reach its peak in a matter of hours or minutes, depending on how exactly the mob gets whipped up. This means that a moderator who doesn’t handle it on the spot may miss the entire thing. So again: a moderator going to sleep, turning off notifications, or just living their life is a liability, at least from the point of view of the harassment target.

Potential solutions

Now let’s start talking about solutions. First off, let’s see what the best-in-class defense is, given how Mastodon currently works.

Someone who wants to avoid a harassment campaign has a few different options:

  1. Use a private (locked) account
  2. Run their own single-user instance
  3. Move to an instance that uses instance whitelisting rather than blacklisting

Let’s go through each of these in turn.

Using a private (locked) account

Using a locked account makes your toots “followers-only” by default and requires approval before someone can follow you or view those toots. This prevents a large-scale harassment attack, since nobody but your approved followers can interact with you. However, it’s sort of a nuclear option, and from the perspective of a celebrity like Wil Wheaton trying to reach his audience, it may not be considered feasible.

Account locking can also be turned on and off at any time. Unlike Twitter, though, this doesn’t affect the visibility of past posts, so an attacker could still send harassing replies to any of your previous toots, even if your account is currently locked. This means that if you’re under siege from a sudden harassment campaign that flares up and dies down over the course of a few hours, keeping your account locked during that time is not an effective strategy.

Running your own single-user instance

A harassment target could move to an instance where they are the admin, the moderator, and the only user. This gives them wide latitude to apply instance blocks across the entire instance, but those same instance blocks are already available at an individual level, so it doesn’t change much. On the other hand, it allows them to deal with reports about themselves by simply ignoring them, so it does solve the “report deluge” problem.

However, it doesn’t solve the problem of getting an onslaught of harassing replies from different accounts across the fediverse. Each harassing account will still require a block or an instance block, which are tools that are already available even if you don’t own your own instance.

Running your own instance may also require a level of technical savvy and dedication to learning the ins and outs of Mastodon (or another fediverse technology like Pleroma), which the harassment target may consider too much effort with little payoff.

Moving to a whitelisting instance

By default, a Mastodon instance federates with all other instances unless the admin explicitly applies a “domain block.” Some Mastodon instances, though, such as awoo.space, have forked the Mastodon codebase to allow for whitelisting rather than blacklisting.

This means that awoo.space doesn’t federate with other instances by default. Instead, awoo.space admins have to explicitly choose the instances that they federate with. This can limit the attack surface, since awoo.space isn’t exposed to every non-blacklisted instance in the fediverse; instead, it’s exposed only to a subset of instances that have already been vetted and considered safe to federate with.

In the face of a sudden coordinated attack, though, even a cautious instance like awoo.space probably federates with enough instances that a group of motivated actors could set up new accounts on the whitelisted instances and attack the target, potentially overwhelming the target instance’s moderators as well as the moderators of the connected instances. So whitelisting reduces the surface area but doesn’t prevent the attack.

Now, the target could both run their own single-user instance and enable whitelisting. If they were very cautious about which instances to federate with, this could prevent the bulk of the attack, but would require a lot of time investment and have similar problems to a locked account in terms of limiting the reach to their audience.
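To make the difference between the two federation policies concrete, here is a minimal sketch in Python. It is not taken from Mastodon or from the awoo.space fork; the names FederationMode and should_federate, and the example domains, are made up purely for illustration.

```python
from enum import Enum


class FederationMode(Enum):
    BLACKLIST = "blacklist"  # stock behavior: federate unless domain-blocked
    WHITELIST = "whitelist"  # awoo.space-style: federate only if vetted


def should_federate(domain: str, mode: FederationMode,
                    blocked: set[str], allowed: set[str]) -> bool:
    """Return True if this (hypothetical) instance would federate with `domain`."""
    domain = domain.lower()
    if mode is FederationMode.BLACKLIST:
        return domain not in blocked
    return domain in allowed


blocked = {"spam.example"}
allowed = {"friendly.example"}

# A brand-new, unknown instance gets through in blacklist mode...
print(should_federate("new.example", FederationMode.BLACKLIST, blocked, allowed))  # True
# ...but not in whitelist mode, until an admin explicitly adds it.
print(should_federate("new.example", FederationMode.WHITELIST, blocked, allowed))  # False
```

Note that in whitelist mode an attacker doesn't need a new instance at all: an account on friendly.example passes the check, which is exactly the limitation described above.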

Conclusion

I don’t have any good answers yet as to how to prevent another dogpiling incident like the one that targeted Wil Wheaton. But I do have some ideas.

First off, the Mastodon project needs better tools for moderation. The current moderation UI is good but a bit clunky, and the API needs to be opened up so that third-party tools can be written on top of it. For instance, a tool could automatically find the email domains for reported spambots and block them. Or, another tool could read the contents of a reported toot, check for certain blacklisted curse words, and immediately delete the toot or silence/suspend the account.
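As a rough idea of what such third-party tools might look like, here is a small Python sketch. It doesn't call any real Mastodon API (the moderation API I'm wishing for is exactly what's missing), so the input lists, the function names, and the threshold are all assumptions. It also groups by the domain in the email address; catching a wave like the one described earlier would mean grouping by MX host instead, which requires a DNS lookup that I've left out.

```python
import re
from collections import Counter


def email_domain(address: str) -> str:
    """Return the lowercased part of an email address after the last '@'."""
    return address.rsplit("@", 1)[-1].lower()


def domains_to_block(reported_emails: list[str], threshold: int = 3) -> list[str]:
    """Suggest email domains shared by at least `threshold` reported accounts."""
    counts = Counter(email_domain(e) for e in reported_emails)
    return sorted(d for d, n in counts.items() if n >= threshold)


def contains_blocked_word(toot_text: str, blocked_words: set[str]) -> bool:
    """Check a toot's words against a (lowercase) list of blocked words."""
    words = re.findall(r"[a-z0-9']+", toot_text.lower())
    return any(w in blocked_words for w in words)


reports = ["a@spam.example", "b@spam.example", "c@SPAM.example", "d@ok.example"]
print(domains_to_block(reports))                                   # ['spam.example']
print(contains_blocked_word("Buy CHEAP pills now!", {"pills"}))    # True
```

A real tool would probably want a human to confirm the suggestions before anything gets blocked or suspended, since a naive word list is easy to evade and easy to trip by accident.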

Second, Mastodon admins need to take the problem of moderation more seriously. Maybe having a team of moderators living in multiple time zones should just be considered the “cost of doing business” when running an instance. Like security features, it’s not a cost that pays visible dividends every single day, but in the event of a sudden coordinated attack it could make the difference between a good experience and a horrible experience.

Perhaps more instances should consider having paid moderators. mastodon.social already pays its moderation team via the main Mastodon Patreon page. Another possible model is for an independent moderator to be employed by multiple instances and get paid through their own Patreon page.

However, I think the Mastodon community also needs to acknowledge the weaknesses of the federated system in handling spam and harassment compared to a centralized system. As Sarah Jamie Lewis says in “On Emergent Centralization”:

Email is a perfect example of an ostensibly decentralized, distributed system that, in defending itself from abuse and spam, became a highly centralized system revolving around a few major oligarchical organizations. The majority of email sent […] today is likely to find itself being processed by the servers of one of these organizations.

Mastodon could eventually move in a similar direction, if the problems aren’t anticipated and headed off at the pass. The fediverse is still relatively peaceful, but right now that’s mostly a function of its size. The fediverse is just not as interesting a target for attackers, because there aren’t that many people using it.

However, if the fediverse gets much bigger, it could become inundated by dedicated harassment, disinformation, or spambot campaigns (as Twitter and Facebook already are), and it could shift towards centralization as a defense mechanism. For instance, a centralized service might be set up to check toots for illegal content, or to verify accounts, or something similar.

To prevent this, Mastodon needs to recognize its inherent structural weaknesses and find solutions to mitigate them. If it doesn’t, then enough people might be harassed or spammed off of the platform that Mastodon will lose its credibility as a kinder, gentler social network. At that point, it would be abandoned by its responsible users, leaving only the spammers and harassers behind.

Thanks to Eugen Rochko for feedback on a draft of this blog post.

Should computers serve humans, or should humans serve computers?

The best science fiction doesn’t necessarily tell us something about the future, but it might tell us something about the present.

At its best, sci-fi finds something true about human nature or human society and then places it in a new context, where we can look at it with fresh eyes. Sci-fi helps us see ourselves more clearly.

This is a video made by Microsoft in 2011 that shows one sci-fi vision of the future:

This is a utopian vision of technology. Computers exist to make people more productive, to extend the natural capabilities of our bodies, to serve as a true “bicycle of the mind”. Computers are omnipresent, but they are at our beck and call, and they exist to serve us.

This is a video showing a different vision of the future:

 

This is a dystopian vision of technology. Computers are omnipresent, but instead of enabling us to be more productive or to grant us more leisure time, they exist to distract us, harass us, and cajole us. In this world, the goal of technology is to convince us to buy more things, or to earn points in a useless game, or to send us on odd jobs the computer chose for us.

A similar vision of the future comes from Audrey Schulman’s Theory of Bastards. The protagonist rides a self-driving car, but she can’t turn off the video advertisements because her implant is six months out of date, and so the commands she barks at the car fail with an “unknown” error.

She blames herself for failing to upgrade her implant, in the way you might chide yourself for forgetting to see the dentist.

As the car arrives, she pays for the trip. Then she notes:

“At least in terms of payment, the manufacturers made sure there was never any difficulty with version differences. It was only the actual applications that gradually became impossible to control.”

Between the utopian and dystopian, which vision of the future seems more likely to you? Which vision seems more true to how we currently live with technology, in the form of our smartphones and social media apps?

I know which one seems more likely to me, and it gives me the willies.

The core question we technologists should be asking ourselves is: do we want to live in a world where computers serve humans, or where humans serve computers?

Or to put it another way: do we want to live in a world where the users of technology are in control of their devices? Or do we want to live in a world where the owners of technology use it as yet another means of control over those without the resources, the knowledge, or the privilege to fight back?

Are we building technology for a world of masters, or a world of slaves?

Introducing Pinafore for Mastodon

Today I’m happy to announce a project I’ve been quietly working on for some time: Pinafore. Pinafore is an alternative web client for Mastodon, which looks like this:

Screenshot of Pinafore home page

Here are some of its features:

  • Speed. Pinafore is built on Svelte, meaning it’s faster and lighter-weight[1] than most web apps.
  • Simplicity. Single-column layout, easy-to-read text, and large images.
  • Multi-account support. Log in to multiple instances and set a custom theme for each one.
  • Works offline. Recently-viewed timelines are fully browsable offline.
  • PWA. Pinafore is a Progressive Web App, so you can add it to your phone’s home screen and it will work like a native app.
  • Private. All communication is private between your browser and your instance. No ads or third-party trackers.

Pinafore is still beta quality, but I’m releasing it now to get early feedback. Of course it’s also open-source, so feel free to browse the source code.

In the rest of this post, I want to share a bit about the motivation behind Pinafore, as well as some technical details about how it works.

If you don’t care about technical details, you can skip to pinafore.social or read the user guide.

The need for speed

I love the Mastodon web app, and I’ve even contributed code to it. It’s a PWA, it’s responsive, and it works well across multiple devices. But eventually, I felt like I could make something interesting by rewriting the frontend from scratch. I had a few main goals.

First off, I wanted the UI to be fast even on low-end laptops or phones. For Pinafore, my litmus test was whether it could work well on a Nexus 5 (released in 2013).

Having set the bar that high, I made some hard choices to squeeze out better performance:

  • For the framework, I chose Sapper and Svelte because they offer state-of-the-art performance, essentially compiling to vanilla JavaScript.
  • To be resilient on slow connections, or Lie-Fi, or when offline altogether, Pinafore stores data locally in IndexedDB.
  • To use less memory, Pinafore keeps only a fraction of its UI state in memory (most is stored in IndexedDB), and it uses a fully virtual list to reduce the number of DOM nodes.
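For readers unfamiliar with virtual lists, the core idea is just arithmetic: given the scroll position, work out which items intersect the viewport and only render those. The sketch below shows that calculation in Python rather than JavaScript, and it is a generic illustration, not Pinafore's actual code; the function name and the overscan parameter are assumptions.

```python
def visible_range(scroll_top: int, viewport_height: int,
                  item_heights: list[int], overscan: int = 2) -> tuple[int, int]:
    """Return (start, end) indices, end exclusive, of the items worth rendering.

    `overscan` adds a few extra items above and below the viewport so that
    small scrolls don't immediately reveal blank space.
    """
    start = None
    end = 0
    top = 0
    for i, height in enumerate(item_heights):
        bottom = top + height
        # An item is visible if it overlaps the viewport at all.
        if bottom > scroll_top and top < scroll_top + viewport_height:
            if start is None:
                start = i
            end = i + 1
        top = bottom
    if start is None:
        return (0, 0)
    return (max(0, start - overscan), min(len(item_heights), end + overscan))


# 1,000 toots of 100px each, scrolled 5,000px down an 800px-tall viewport:
print(visible_range(5000, 800, [100] * 1000))  # (48, 60)
```

In this example only 12 of the 1,000 items would need DOM nodes at any one time, which is where the memory savings come from.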

Other design decisions that impacted performance:

I also chose to only support modern browsers – the latest versions of Chrome, Edge, Firefox, and Safari. Because of this, Pinafore is able to directly ship modern JavaScript instead of using something like Babel to compile down to a more bloated ES5 format. It also loads a minimum of polyfills, and only for those browsers that need them.

Privacy is key

Thanks to the magic of CORS, Pinafore is an almost purely client-side web app[2]. When you use the Pinafore website, your browser communicates directly with your instance’s public API, just like a native app would. The only job of the pinafore.social server is to serve up HTML, CSS, and JavaScript.

This not only makes the implementation simpler: it also guarantees privacy. You don’t have to trust Pinafore to keep your data safe, because it never handles it in the first place! All user data is stored in your browser, and logging out of your instance simply deletes it.

And even if you don’t trust the Pinafore server, it’s an open-source project, so you can always run it locally. Like the Mastodon project itself, I gave it an AGPL license, so you can host it yourself as long as you make the modified source code available.
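To illustrate what “communicates directly with your instance” means, here is a small Python sketch that builds (but doesn't send) the kind of request a client might make for the home timeline. Pinafore itself does this from the browser in JavaScript; the helper name, the instance name mastodon.example, and the token are placeholders. The point is simply that the request goes to the instance and never touches pinafore.social.

```python
from urllib.request import Request


def home_timeline_request(instance: str, access_token: str) -> Request:
    """Build a GET request for the home timeline on the user's own instance."""
    url = f"https://{instance}/api/v1/timelines/home"
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})


req = home_timeline_request("mastodon.example", "TOKEN")
print(req.get_method(), req.full_url)
print(req.host)  # the instance itself, not the server that served the app
```

In the browser, the same request only works because the instance's API allows cross-origin requests, which is the CORS part.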

Q & A

What’s with the name?

Pinafore is named after my favorite Gilbert and Sullivan play. If you’re unfamiliar, this bit from The Simpsons is a great intro.

Does it work without JavaScript?

Pinafore’s landing page works without JavaScript for SEO reasons, but the app itself requires JavaScript. Although Server-Side Rendering (SSR) is possible, it would require storing user data on Pinafore’s servers, and so I chose to avoid it.

Why are you trying to compete with Mastodon?

Pinafore doesn’t compete with Mastodon; it complements it. Mastodon has a very open API, which is what allows for the flourishing of mobile apps, command-line tools, and even web clients like halcyon.social or Pinafore itself.

One of my goals with Pinafore is to take a bit of the pressure off the Mastodon project to try to be all things to all people. There are lots of demands on Mastodon to make small UI tweaks to suit various users’ preferences, which is a major source of contention, and also partly the motivation for forks like glitch-soc.

But ideally, the way a user engages with their Mastodon instance should be up to that user. As a user, if I want a different background color or for images to be rendered differently, then why should I wait for the Mastodon maintainers or my admin to make that change? I use whatever mobile app I like, so why should the web UI be any different?

As Eugen has said, the web UI is just one application out of many. And I don’t even intend for Pinafore to replace the web UI. There are various features in that UI that I have no plans to implement, such as administration tools, moderation tools, and complex multi-column scenarios. Plus, the web UI is the landing page for each instance, and an opportunity for those instances to be creative and express themselves.

Why didn’t you implement <x feature>?

As with any project, I prioritized some features at the expense of others. Some of these decisions were based on design goals, whereas others were just to get a working beta out the door. I have a list of goals and non-goals in the project README, as well as a roadmap for basic feature parity with the Mastodon web UI.

Why didn’t you use the ActivityPub client-to-server API?

ActivityPub defines both a server-to-server and a client-to-server API, but Mastodon only supports the former. Pinafore therefore uses Mastodon’s own client-to-server API, which is also supported by other projects like Pleroma, so for now, it’s the most practical choice.

What’s your business model?

None. I wrote Pinafore for fun, out of love for the Mastodon project.

How can I chip in?

I’m a privileged white dude living in a developed country who works for a large tech company. I don’t need any donations. Please donate to Eugen instead so he can continue making Mastodon better!

Thanks!

If you’ve read this far, give Pinafore a try and tell me what you think.

Footnotes

1. Measuring the size of the JavaScript payload after gzip when loading the home feed on a desktop browser, I recorded 333KB for Pinafore, 1.01MB for mastodon.social, and 2.25MB for twitter.com.

2. For the purpose of readable URLs, some minor routing logic is done on the server-side. For instance, account IDs, status IDs, instance names, and hashtags may be sent to the server as part of the URL. But on modern browsers, this will only occur if you explicitly navigate to a page with that URL and the Service Worker hasn’t already been activated, or you hard-refresh. In the vast majority of cases, the Service Worker should handle these URLs, and thus even this light metadata is not sent to the server.