Archive for August, 2021

My love-hate affair with technology

Ten years ago I would have considered myself someone who was excited about new technology. I always had the latest smartphone, I would read the reviews of new Android releases with a lot of interest, and I was delighted when things like Google Maps Navigation, speech-to-text, or keyboard swiping made my life easier.

Nowadays, to the average person I probably look like a technology curmudgeon. I don’t have a smart speaker, a smart watch, or any smart home appliances. My 4-year-old phone runs a de-Googled LineageOS that barely runs any apps other than Signal and F-Droid. My house has a Raspberry Pi running Nextcloud for file storage and Pi-hole for ad blocking. When I bought a new TV I refused to connect it to the Internet; instead, I hooked it up to an old PC running Ubuntu so I can watch Netflix, Hulu, etc.

My wife complains that none of the devices in our house work, and she’s right. The Pi-hole blocks a lot of websites, and it’s a struggle to unblock them. Driving the TV with a wireless keyboard is cumbersome. Nextcloud is clunky compared to something like Dropbox or Google Drive. I even tried cloudflared for a while, but I had to give up when DNS kept periodically failing.

One time – no joke – I had a dream that I was using some open-source alternative to a popular piece of software, and it was slow and buggy. I don’t even remember what it was, but I remember being frustrated. This is just what I’m used to nowadays – not using a technology because it’s the best-in-class or makes my life easier, but because it meets some high-minded criteria about how I think software should be: privacy-respecting, open-source, controlled by the user, etc.

To the average person, this is probably crazy. “Nolan,” they’d say. “You couldn’t order a Lyft because their web app didn’t work in Firefox for Android. Your files don’t sync away from home because you’re only running Nextcloud on your local network. Your friends can’t even message you on WhatsApp, Facebook, or Twitter because you don’t have an account and the apps don’t work on your phone. If you want to live in the eighteenth century so bad, why don’t you get a horse and buggy while you’re at it?”

Maybe this nagging voice in my head is right (and I do think these thoughts sometimes). Maybe what I’m practicing is a kind of tech veganism that, like real veganism, is a great idea in theory but really hard to stick to in practice. (And yes, I’ve tried real veganism too. Maybe I should join a monastery at this point.)

On the other hand, I have to remind myself that there are benefits to the somewhat ascetic lifestyle I’ve chosen. The thing that finally pushed me to switch from stock Android to de-Googled LineageOS was all the ads and notifications in Google Maps. I remember fumbling around with a dozen settings, but never being able to get rid of the “Hey, rate this park” message. (Because everything on Earth needs a star rating apparently.)

And now, I don’t have to deal with Google Maps anymore! Instead I deal with OsmAnd~, which broke down the other day and failed to give me directions. So it goes.

Maybe someday I’ll relent. Maybe I’ll say, “I’m too old for this shit” and start using technology that actually works instead of technology that meets some idealistic and probably antiquated notion of software purity. Maybe I’ll be forced to, because I need a pacemaker that isn’t open-source. Or maybe there will be some essential government service that requires a Google or Apple phone – my state’s contact tracing app does! I got jury duty recently and was unsurprised to find that they do everything through Zoom. At what point will it be impossible to be a tech hermit, without being an actual hermit?

That said, I’m still doing what I’m doing for now. It helps that I’m on Mastodon, where there are plenty of folks who are even more hardcore than me. (“I won’t even look at a computer if it’s running non-FLOSS software,” they smirk, typing from their BSD laptop behind five layers of Tor.) Complaining to this crowd about how I can’t buy a TV anymore without it spying on me makes me feel a little bit normal. Just a bit.

The thing that has always bothered me about this, and which continues to bother me, is that I’m only able to live this lifestyle because I have the technical know-how. The average person would neither know how to do any of the things I’m doing (installing a custom Android ROM, setting up Nextcloud, etc.), nor would they probably want to, given that it’s a lot of extra hassle for a sub-par experience.

And who am I, anyway? Edward Snowden? Why am I LARPing as a character in a spy novel when I could be focusing on any one of a million other hobbies in the world?

I guess the answer is: this is my hobby. Figuring out how to get my Raspberry Pi to auto-update is a hobby. Tinkering with my TV setup so that I can get Bluetooth headphones working while the TV is in airplane mode is a hobby. Like a gearhead who’s delighted when their car breaks down (“Hey! Now I can fix it!”), I don’t mind when the technology around me doesn’t work – it gives me something to do on the weekend! But I have no illusions that this lifestyle makes sense for most people. Or that it will even make sense for me, once I get older and probably bored of my hobby.

For the time being, though, I’m going to keep acting like technology is an enemy I need to subdue rather than a purveyor of joys and delights. So if you want to know how it’s going, subscribe to my blog via RSS or message me on Signal. Or if that fails, come visit me in a horse and buggy.

Speeding up IndexedDB reads and writes

Recently I read James Long’s article “A future for SQL on the web”. It’s a great post, and if you haven’t read it, you should definitely go take a look!

I don’t want to comment on the specifics of the tool James created, except to say that I think it’s a truly amazing feat of engineering, and I’m excited to see where it goes in the future. But one thing in the post that caught my eye was the benchmark comparisons of IndexedDB read/write performance (compared to James’s tool, absurd-sql).

The IndexedDB benchmarks are fair enough, in that they demonstrate the idiomatic usage of IndexedDB. But in this post, I’d like to show how raw IndexedDB performance can be improved using a few tricks that are available as of IndexedDB v2 and v3:

  • Pagination (v2)
  • Relaxed durability (v3)
  • Explicit transaction commits (v3)

Let’s go over each of these in turn.

Pagination

Years ago when I was working on PouchDB, I hit upon an IndexedDB pattern that, at the time, improved performance in Firefox and Chrome by roughly 40-50%. I’m probably not the first person to come up with this idea, but I’ll lay it out here.

In IndexedDB, a cursor is basically a way of iterating through the data in a database one-at-a-time. And that’s the core problem: one-at-a-time. Sadly, this tends to be slow, because at every step of the iteration, JavaScript can respond to a single item from the cursor and decide whether to continue or stop the iteration.

Effectively this means that there’s a back-and-forth between the JavaScript main thread and the IndexedDB engine (running off-main-thread). You can see it in this screenshot of the Chrome DevTools performance profiler:

Screenshot of Chrome DevTools profiler showing multiple small tasks separated by a small amount of idle time each

Or in Chrome tracing, which shows a bit more detail:

Screenshot of Chrome tracing tool showing multiple separate tasks, separated by a bit of idle time. The top of each task says RunNormalPriorityTask, and near the bottom each one says IDBCursor continue.

Notice that each call to cursor.continue() gets its own little JavaScript task, and the tasks are separated by a bit of idle time. That’s a lot of wasted time for each item in a database!

Luckily, in IndexedDB v2, we got two new APIs to help out with this problem: getAll() and getAllKeys(). These allow you to fetch multiple items from an object store or index in a single go. They can also start from a given key range and return a given number of items, meaning that we can implement a paginated cursor:

const batchSize = 100
let keys, values, keyRange = null

function fetchMore() {
  // If there could be more results, fetch them
  if (keys && values && values.length === batchSize) {
    // Find keys greater than the last key
    keyRange = IDBKeyRange.lowerBound(keys.at(-1), true)
    keys = values = undefined
    next()
  }
}

function next() {
  store.getAllKeys(keyRange, batchSize).onsuccess = e => {
    keys = e.target.result
    fetchMore()
  }
  store.getAll(keyRange, batchSize).onsuccess = e => {
    values = e.target.result
    fetchMore()
  }
}

next()

In the example above, we iterate through the object store, fetching 100 items at a time rather than just 1. Using a modified version of the absurd-sql benchmark, we can see that this improves performance considerably. Here are the results for the “read” benchmark in Chrome:

Chart image, see table below

Click for table

DB size (columns) vs batch size (rows):

100 1000 10000 50000
1 8.9 37.4 241 1194.2
100 7.3 34 145.1 702.8
1000 6.5 27.9 100.3 488.3

(Note that a batch size of 1 means a cursor, whereas 100 and 1000 use a paginated cursor.)

And here’s Firefox:

Chart image, see table below

Click for table

DB size (columns) vs batch size (rows):

100 1000 10000 50000
1 2 15 125 610
100 2 9 70 468
1000 2 8 51 271

And Safari:

Chart image, see table below

Click for table

DB size (columns) vs batch size (rows):

100 1000 10000 50000
1 11 106 957 4673
100 1 5 44 227
1000 1 3 26 127

All benchmarks were run on a 2015 MacBook Pro, using Chrome 92, Firefox 91, and Safari 14.1. Tachometer was configured with 15 minimum iterations, a 1% horizon, and a 10-minute timeout. I’m reporting the median of all iterations.

As you can see, the paginated cursor is particularly effective in Safari, but it improves performance in all browser engines.

Now, this technique isn’t without its downsides. For one, you have to choose an explicit batch size, and the ideal number will depend on the size of the data and the usage patterns. You may also want to consider the downsides of overfetching – i.e. if the cursor should stop at a given value, you may end up fetching more items from the database than you really need. (Although ideally, you can use the upper bound of the key range to guard against that.)

The main downside of this technique is that it only works in one direction: you cannot build a paginated cursor in descending order. This is a flaw in the IndexedDB specification, and there are ideas to fix it, but currently it’s not possible.

Of course, instead of implementing a paginated cursor, you could also just use getAll() and getAllKeys() as-is and fetch all the data at once. This probably isn’t a great idea if the database is large, though, as you may run into memory pressure, especially on constrained devices. But it could be useful if the database is small.

getAll() and getAllKeys() both have great browser support, so this technique can be widely adopted for speeding up IndexedDB read patterns, at least in ascending order.

Relaxed durability

The paginated cursor can speed up database reads, but what about writes? In this case, we don’t have an equivalent to getAll()/getAllKeys() that we can lean on. Apparently there was some effort put into building a putAll(), but currently it’s abandoned because it didn’t actually improve write performance in Chrome.

That said, there are other ways to improve write performance. Unfortunately, none of these techniques are as effective as the paginated cursor, but they are worth investigating, so I’m reporting my results here.

The most significant way to improve write performance is with relaxed durability. This API is currently only available in Chrome, but it has also been implemented in WebKit as of Safari Technology Preview 130.

The idea behind relaxed durability is to resolve some disagreement between the browser vendors as to whether IndexedDB transactions should optimize for durability (writes succeed even in the event of a power failure or crash) or performance (writes succeed quickly, even if not fully flushed to disk).

It’s been well documented that Chrome’s IndexedDB performance is worse than Firefox’s or Safari’s, and part of the reason seems to be that Chrome defaults to a durable-by-default mode. But rather than sacrifice durability across-the-board, the Chrome team wanted to expose an explicit API for developers to decide which mode to use. (After all, only the web developer knows if IndexedDB is being used as an ephemeral cache or a store of priceless family photos.) So now we have three durability options: default, relaxed, and strict.

Using the “write” benchmark, we can test out relaxed durability in Chrome and see the improvement:

Chart image, see table below

Click for table
Durability 100 1000 10000 50000
Default 26.4 125.9 1373.7 7171.9
Relaxed 17.1 112.9 1359.3 6969.8

As you can see, the results are not as dramatic as with the pagination technique. The effect is most visible in the smaller database sizes, and the reason turns out to be that relaxed durability is better at speeding up multiple small transactions than one big transaction.

Modifying the benchmark to do one transaction per item in the database, we can see a much clearer impact of relaxed durability:

Chart image, see table below

Click for table
Durability 100 1000
Default 1074.6 10456.2
Relaxed 65.4 630.7

(I didn’t measure the larger database sizes, because they were too slow, and the pattern is clear.)

Personally, I find this option to be nice-to-have, but underwhelming. If performance is only really improved for multiple small transactions, then usually there is a simpler solution: use fewer transactions.

It’s also underwhelming given that, even with this option enabled, Chrome is still much slower than Firefox or Safari:

Chart image, see table below

Click for table
Browser 100 1000 10000 50000
Chrome (default) 26.4 125.9 1373.7 7171.9
Chrome (relaxed) 17.1 112.9 1359.3 6969.8
Firefox 8 53 436 1893
Safari 3 28 279 1359

That said, if you’re not storing priceless family photos in IndexedDB, I can’t see a good reason not to use relaxed durability.

Explicit transaction commits

The last technique I’ll cover is explicit transaction commits. I found it to be an even smaller performance improvement than relaxed durability, but it’s worth mentioning.

This API is available in both Chrome and Firefox, and (like relaxed durability) has also been implemented in Safari Technology Preview 130.

The idea is that, instead of allowing the transaction to auto-close based on the normal flow of the JavaScript event loop, you can explicitly call transaction.close() to signal that it’s safe to close the transaction immediately. This results in a very small performance boost because the IndexedDB engine is no longer waiting for outstanding requests to be dispatched. Here is the improvement in Chrome using the “write” benchmark:

Chart image, see table below

Click for table
Relaxed / Commit 100 1000 10000 50000
relaxed=false, commit=false 26.4 125.9 1373.7 7171.9
relaxed=false, commit=true 26 125.5 1373.9 7129.7
relaxed=true, commit=false 17.1 112.9 1359.3 6969.8
relaxed=true, commit=true 16.8 112.8 1356.2 7215

You’d really have to squint to see the improvement, and only for the smaller database sizes. This makes sense, since explicit commits can only shave a bit of time off the end of each transaction. So, like relaxed durability, it has a bigger impact on multiple small transactions than one big transaction.

The results are similarly underwhelming in Firefox:

Chart image, see table below

Click for table
Commit 100 1000 10000 50000
No commit 8 53 436 1893
Commit 8 52 434 1858

That said, especially if you’re doing multiple small transactions, you might as well use it. Since it’s not supported in all browsers, though, you’ll probably want to use a pattern like this:

if (transaction.commit) {
  transaction.commit()
}

If transaction.commit is undefined, then the transaction can just close automatically, and functionally it’s the same.

Update: Daniel Murphy points out that transaction.commit() can have bigger perf gains if the page is busy with other JavaScript tasks, which would delay the auto-closing of the transaction. This is a good point! My benchmark doesn’t measure this.

Conclusion

IndexedDB has a lot of detractors, and I think most of the criticism is justified. The IndexedDB API is awkward, it has bugs and gotchas in various browser implementations, and it’s not even particularly fast, especially compared to a full-featured, battle-hardened, industry-standard tool like SQLite. The new APIs unveiled in IndexedDB v3 don’t even move the needle much. It’s no surprise that many developers just say “forget it” and stick with localStorage, or they create elaborate solutions on top of IndexedDB, such as absurd-sql.

Perhaps I just have Stockholm syndrome from having worked with IndexedDB for so long, but I don’t find it to be so bad. The nomenclature and the APIs are a bit weird, but once you wrap your head around it, it’s a powerful tool with broad browser support – heck, it even works in Node.js via fake-indexeddb and indexeddbshim. For better or worse, IndexedDB is here to stay.

That said, I can definitely see a future where IndexedDB is not the only player in the browser storage game. We had WebSQL, and it’s long gone (even though I’m still maintaining a Node.js port!), but that hasn’t stopped people from wanting a more high-level database API in the browser, as demonstrated by tools like absurd-sql. In the future, I can imagine something like the Storage Foundation API making it more straightforward to build custom databases of top of low-level storage primitives – which is what IndexedDB was designed to do, and arguably failed at. (PouchDB, for one, makes extensive use of IndexedDB’s capabilities, but I’ve seen plenty of storage wrappers that essentially use IndexedDB as a dumb key-value store.)

I’d also like to see the browser vendors (especially Chrome) improve their IndexedDB performance. The Chrome team has said that they’re focused on read performance rather than write performance, but really, both matter. A mobile app developer can ship a prebuilt SQLite .db file in their app; in terms of quickly populating a database, there is nothing even remotely close for IndexedDB. As demonstrated above, cursor performance is also not great

For those web developers sticking it out with IndexedDB, though, I hope I’ve made a case that it’s not completely a lost cause, and that its performance can be improved. Who knows: maybe the browser vendors still have some tricks up their sleeves, especially if we web developers keep complaining about IndexedDB performance. It’ll be interesting to watch this space evolve and to see how both IndexedDB and its alternatives improve over the years.

Does shadow DOM improve style performance?

Short answer: Kinda. It depends. And it might not be enough to make a big difference in the average web app. But it’s worth understanding why.

First off, let’s review the browser’s rendering pipeline, and why we might even speculate that shadow DOM could improve its performance. Two fundamental parts of the rendering process are style calculation and layout calculation, or simply “style” and “layout.” The first part is about figuring out which DOM nodes have which styles (based on CSS), and the second part is about figuring out where to actually place those DOM nodes on the page (using the styles calculated in the previous step).

Screenshot of Chrome DevTools showing a performance trace with JavaScript stacks followed by a purple Style/Layout region and green Paint region

A performance trace in Chrome DevTools, showing the basic JavaScript → Style → Layout → Paint pipeline.

Browsers are complex, but in general, the more DOM nodes and CSS rules on a page, the longer it will take to run the style and layout steps. One of the ways we can improve the performance of this process is to break up the work into smaller chunks, i.e. encapsulation.

For layout encapsulation, we have CSS containment. This has already been covered in other articles, so I won’t rehash it here. Suffice it to say, I think there’s sufficient evidence that CSS containment can improve performance (I’ve seen it myself), so if you haven’t tried putting contain: content on parts of your UI to see if it improves layout performance, you definitely should!

For style encapsulation, we have something entirely different: shadow DOM. Just like how CSS containment can improve layout performance, shadow DOM should (in theory) be able to improve style performance. Let’s consider why.

What is style calculation?

As mentioned before, style calculation is different from layout calculation. Layout calculation is about the geometry of the page, whereas style calculation is more explicitly about CSS. Basically, it’s the process of taking a rule like:

div > button {
  color: blue;
}

And a DOM tree like:

<div>
  <button></button>
</div>

…and figuring out that the <button> should have color: blue because its parent is a <div>. Roughly speaking, it’s the process of evaluating CSS selectors (div > button in this case).

Now, in the worst case, this is an O(n * m) operation, where n is the number of DOM nodes and m is the number of CSS rules. (I.e. for each DOM node, and for each rule, figure out if they match each other.) Clearly, this isn’t how browsers do it, or else any decently-sized web app would become grindingly slow. Browsers have a lot of optimizations in this area, which is part of the reason that the common advice is not to worry too much about CSS selector performance (see this article for a good, recent treatment of the subject).

That said, if you’ve worked on a non-trivial codebase with a fair amount of CSS, you may notice that, in Chrome performance profiles, the style costs are not zero. Depending on how big or complex your CSS is, you may find that you’re actually spending more time in style calculation than in layout calculation. So it isn’t a completely worthless endeavor to look into style performance.

Shadow DOM and style calculation

Why would shadow DOM improve style performance? Again, it’s because of encapsulation. If you have a CSS file with 1,000 rules, and a DOM tree with 1,000 nodes, the browser doesn’t know in advance which rules apply to which nodes. Even if you authored your CSS with something like CSS Modules, Vue scoped CSS, or Svelte scoped CSS, ultimately you end up with a stylesheet that is only implicitly coupled to the DOM, so the browser has to figure out the relationship at runtime (e.g. using class or attribute selectors).

Shadow DOM is different. With shadow DOM, the browser doesn’t have to guess which rules are scoped to which nodes – it’s right there in the DOM:

<my-component>
  #shadow-root
    <style>div {color: green}</style>
    <div></div>
<my-component>
<another-component>
  #shadow-root
    <style>div {color: blue}</style>
    <div></div>
</another-component>

In this case, the browser doesn’t need to test the div {color: green} rule against every node in the DOM – it knows that it’s scoped to <my-component>. Ditto for the div {color: blue} rule in <another-component>. In theory, this can speed up the style calculation process, because the browser can rely on explicit scoping through shadow DOM rather than implicit scoping through classes or attributes.

Benchmarking it

That’s the theory, but of course things are always more complicated in practice. So I put together a benchmark to measure the style calculation performance of shadow DOM. Certain CSS selectors tend to be faster than others, so for decent coverage, I tested the following selectors:

  • ID (#foo)
  • class (.foo)
  • attribute ([foo])
  • attribute value ([foo="bar"])
  • “silly” ([foo="bar"]:nth-of-type(1n):last-child:not(:nth-of-type(2n)):not(:empty))

Roughly, I would expect IDs and classes to be the fastest, followed by attributes and attribute values, followed by the “silly” selector (thrown in to add something to really make the style engine work).

To measure, I used a simple requestPostAnimationFrame polyfill, which measures the time spent in style, layout, and paint. Here is a screenshot in the Chrome DevTools of what’s being measured (note the “total” under the Timings section):

Screenshot of Chrome DevTools showing a "total" measurement in Timings which corresponds to style, layout, and other purple "rendering" blocks in the "Main" section

To run the actual benchmark, I used Tachometer, which is a nice tool for browser microbenchmarks. In this case, I just took the median of 51 iterations.

The benchmark creates several custom elements, and either attaches a shadow root with its own <style> (shadow DOM “on”) , or uses a global <style> with implicit scoping (shadow DOM “off”). In this way, I wanted to make a fair comparison between shadow DOM itself and shadow DOM “polyfills” – i.e. systems for scoping CSS that don’t rely on shadow DOM.

Each CSS rule looks something like this:

#foo {
  color: #000000;
}

And the DOM structure for each component looks like this:

<div id="foo">hello</div>

(Of course, for attribute and class selectors, the DOM node would have an attribute or class instead.)

Benchmark results

Here are the results in Chrome for 1,000 components and 1 CSS rule for each component:

Chart of Chrome with 1000 components and 1 rule. See tables for full data

Click for table
id class attribute attribute-value silly
Shadow DOM 67.90 67.20 67.30 67.70 69.90
No Shadow DOM 57.50 56.20 120.40 117.10 130.50

As you can see, classes and IDs are about the same with shadow DOM on or off (in fact, it’s a bit faster without shadow DOM). But once the selectors get more interesting (attribute, attribute value, and the “silly” selector), shadow DOM stays roughly constant, whereas the non-shadow DOM version gets more expensive.

We can see this effect even more clearly if we bump it up to 10 CSS rules per component:

Chart of Chrome with 1000 components and 10 rules. See tables for full data

Click for table
id class attribute attribute-value silly
Shadow DOM 70.80 70.60 71.10 72.70 81.50
No Shadow DOM 58.20 58.50 597.10 608.20 740.30

The results above are for Chrome, but we see similar numbers in Firefox and Safari. Here’s Firefox with 1,000 components and 1 rule each:

Chart of Firefox with 1000 components and 1 rule. See tables for full data

Click for table
id class attribute attribute-value silly
Shadow DOM 27 25 25 25 25
No Shadow DOM 18 18 32 32 32

And Firefox with 1,000 components, 10 rules each:

Chart of Firefox with 1000 components and 10 rules. See tables for full data

Click for table
id class attribute attribute-value silly
Shadow DOM 30 30 30 30 34
No Shadow DOM 22 22 143 150 153

And here’s Safari with 1,000 components and 1 rule each:

Chart of Safari with 1000 components and 1 rule. See tables for full data

Click for table
id class attribute attribute-value silly
Shadow DOM 57 58 61 63 64
No Shadow DOM 52 52 126 126 177

And Safari with 1,000 components, 10 rules each:

Chart of Safari with 1000 components and 10 rules. See tables for full data

Click for table
id class attribute attribute-value silly
Shadow DOM 60 61 81 81 92
No Shadow DOM 56 56 710 716 1157

All benchmarks were run on a 2015 MacBook Pro with the latest version of each browser (Chrome 92, Firefox 91, Safari 14.1).

Conclusions and future work

We can draw a few conclusions from this data. First off, it’s true that shadow DOM can improve style performance, so our theory about style encapsulation holds up. However, ID and class selectors are fast enough that actually it doesn’t matter much whether shadow DOM is used or not – in fact, they’re slightly faster without shadow DOM. This indicates that systems like Svelte, CSS Modules, or good old-fashioned BEM are using the best approach performance-wise.

This also indicates that using attributes for style encapsulation does not scale well compared to classes. So perhaps scoping systems like Vue would be better off switching to classes.

Another interesting question is why, in all three browser engines, classes and IDs are slightly slower when using shadow DOM. This is probably a better question for the browser vendors themselves, and I won’t speculate. I will say, though, that the differences are small enough in absolute terms that I don’t think it’s worth it to favor one or the other. The clearest signal from the data is just that shadow DOM helps to keep the style costs roughly constant, whereas without shadow DOM, you would want to stick to simple selectors like classes and IDs to avoid hitting a performance cliff.

As for future work: this is a pretty simple benchmark, and there are lots of ways to expand it. For instance, the benchmark only has one inner DOM node per component, and it only tests flat selectors – no descendant or sibling selectors (e.g. div div, div > div, div ~ div, and div + div). In theory, these scenarios should also favor shadow DOM, especially since these selectors can’t cross shadow boundaries, so the browser doesn’t need to look outside of the shadow root to find the relevant ancestors or siblings. (Although the browser’s Bloom filter makes this more complicated – see these notes for an good explanation of how this optimization works.)

Overall, though, I’d say that the numbers above are not big enough that the average web developer should start worrying about optimizing their CSS selectors, or migrating their entire web app to shadow DOM. These benchmark results are probably only relevant if 1) you’re building a framework, so any pattern you choose is magnified multiple times, or 2) you’ve profiled your web app and are seeing lots of high style calculation costs. But for everyone else, I hope at least that these results are interesting, and reveal a bit about how shadow DOM works.

Update: Thomas Steiner wondered about tag selectors as well (e.g. div {}), so I modified the benchmark to test it out. I’ll only report the results for the Shadow DOM version, since the benchmark uses divs, and in the non-shadow case it wouldn’t be possible to use tags alone to distinguish between different divs. In absolute terms, the numbers look pretty close to those for IDs and classes (or even a bit faster in Chrome and Firefox):

Click for table
Chrome Firefox Safari
1,000 components, 1 rule 53.9 19 56
1,000 components, 10 rules 62.5 20 58

Improving responsiveness in text inputs

For me, one of the most aggravating performance issues on the web is when it’s slow to type into a text input. I’m a fairly fast typist, so if there’s even a tiny delay in a <textarea> or <input>, I can feel it slowing me down, and it drives me nuts.

I find this problem especially irksome because it’s usually solvable with a few simple tricks. There’s no reason for a chat app or a social media app to be slow to type into, except that web developers often take the naïve approach, and that’s where the delay comes from.

To understand the source of input delays, let’s take a concrete example. Imagine a Twitter-like UI with a text field and a “remaining characters” count. As you type, the number gradually decreases down to zero.

Screenshot of a text area with the text "Hello I'm typing!" and the text "Characters remaining: 263"

Here’s the naïve way to implement this:

  1. Attach an input event listener to the <textarea>.
  2. Whenever the event fires, update some global state (e.g. in Redux).
  3. Update the “remaining characters” display based on that global state.

And here’s a live example. Really mash on the keyboard if you don’t notice the input delay:

Note: This example contains an artificial 70-millisecond delay to simulate a heavy real-world app, and to make the demo consistent across devices. Bear with me for a moment.

The problem with the naïve approach is that it usually ends up doing far too much work relative to the benefit that the user gets out of the “remaining characters” display. In the worst case, changing the global state could cause the entire UI to re-render (e.g. in a poorly-optimized React app), meaning that as the user types, every keypress causes a full global re-render.

Also, because we are directly listening to the input event, there will be a delay between the actual keypress and the character appearing in the <textarea>. Because the DOM is single-threaded, and because we’re doing blocking work on the main thread, the browser can’t render the new input until that work finishes. This can lead to noticeable typing delays and therefore user frustration.

My preferred solution to this kind of problem is to use requestIdleCallback to wait for the UI thread to be idle before running the blocking code. For instance, something like this:

let queued = false
textarea.addEventListener('input', () => {
  if (!queued) {
    queued = true
    requestIdleCallback(() => {
      updateUI(textarea.value)
      queued = false
    })
  }
})

This technique has several benefits:

  1. We are not directly blocking the input event with anything expensive, so there shouldn’t be a delay between typing a character and seeing that character appear in the <textarea>.
  2. We are not updating the UI for every keypress. requestIdleCallback will batch the UI updates when the user pauses between typing characters. This is sensible, because the user probably doesn’t care if the “remaining characters” count updates for every single keypress – their attention is on the text field, not on the remaining characters.
  3. On a slower machine, requestIdleCallback will naturally do fewer batches-per-keypress than on a faster machine. So a user on a faster device will have the benefit of a faster-updating UI, but neither user will experience poor input responsiveness.

And here’s a live example of the optimized version. Feel free to mash on the keyboard: you shouldn’t see (much) of a delay!

In the past, you might have used something like debouncing to solve this problem. But I like requestIdleCallback because of the third point above: it naturally adapts to the characteristics of the user’s device, rather than forcing us to choose a hardcoded delay.

Note: Running your state logic in a web worker is also a way to avoid this problem. But the vast majority of web apps aren’t architected this way, so I find requestIdleCallback to be better as a bolt-on solution.

To be fair, this technique isn’t foolproof. Some UIs really need to respond immediately to every keypress: for instance, to disallow certain characters or resize the <textarea> as it grows. (In those cases, though, I would throttle with requestAnimationFrame.) Also, some UIs may still lag if the work they’re doing is large enough that it’s perceptible even when batched. (In the live examples above, I set an artificial delay of 70 milliseconds, which you can still “feel” with the optimized version.) But for the most part, using requestIdleCallback is enough to get rid of any major responsiveness issues.

If you want to test this on your own website, I’d recommend putting the Chrome DevTools at 6x CPU slowdown and then mashing the keyboard as fast as you can. On a vanilla <textarea> or <input> with no JavaScript handlers, you won’t see any delay. Whereas if your own website feels sluggish, then maybe it’s time to optimize your text inputs!

Handling properties in custom element upgrades

It’s been well-documented that one of the most awkward parts of working with custom elements is handling properties and attributes. In this post, I want to go a step further and talk about a tricky situation with properties and the component lifecycle.

The problem

First off, see if you can find the bug in this code:

<hello-world></hello-world>
<script src="./hello.js" type="module"></script>
<script>
  document.querySelector('hello-world').mode = 'dark'
</script>

And here’s the component we’re loading, which is just a “hello world” that switches between dark and light mode:

// hello.js
customElements.define('hello-world', class extends HTMLElement {
  constructor() {
    super()
    this.innerHTML = '<div>Hello world!</div>'
  }

  set mode (mode) {
    this.querySelector('div')
      .setAttribute('style', mode === 'light'
        ? 'background: white; color: black;'
        : 'background: black; color: white;'
    )
  }
})

Do you see it? Don’t worry if you missed it; it’s extremely subtle and took me by surprise, too.

The problem is the timing. There are two <script>s – one loading hello.js as a module, and the other setting the mode property on the <hello-world> element. The problem is that the first <script> is type="module", meaning it’s deferred by default, whereas the second is an inline script, which runs immediately. So the first script will always run after the second script.

In terms of custom elements, this means that the set mode setter will never actually get called! The HTML element goes through the custom element upgrade process after its mode has already been set, so the setter has no impact. The component is still in light mode.

Note: Curiously, this is not the case for attributes. As long as we have observedAttributes and attributeChangedCallback defined in the custom element, we’ll be able to handle any attributes that existed before the upgrade. But, in the tradition of funky differences between properties and attributes, this isn’t true of properties.

The fix

To work around this issue, the first option is to just do nothing. After all, this is kind of an odd timing issue, and you can put the onus on consumers to load the custom element script before setting any properties on it.

I find this a bit unsatisfying, though. It feels like it should work, so why shouldn’t it? And as it turns out, there is a fix.

When the custom element is defined, all existing HTML elements are upgraded. This means they go through the constructor() callback, and we can check for any existing properties in that block:

constructor() {
  /* ... */
  if (Object.prototype.hasOwnProperty.call(this, 'mode')) {
    const mode = this.mode
    delete this.mode
    this.mode = mode
  }
}

Let’s break it down step-by-step:

Object.prototype.hasOwnProperty.call(this, 'mode')

Here we check if we already have a property defined called mode. The hasOwnProperty is necessary because we’re checking if the object has its own mode as opposed to the one it gets from the class (i.e. its prototype).

The Object.prototype dance is just an ESLint-recommended safety measure. Using this.hasOwnProperty directly is probably fine too.

const mode = this.mode
delete this.mode

Next, we cache and delete the mode that was set on the object. This way, the object no longer has its own mode property.

this.mode = mode

At this point, we can just set the mode and the setter from the prototype (set mode) will be invoked.

Here is a full working example if you’re curious.

Conclusion

Properties and attributes are an awkward part of working with web components, and this is a particularly tricky situation. But it’s not impossible to work around, with just a bit of extra constructor code.

Also, you shouldn’t have to deal with this unless you’re writing your own vanilla custom element, or a wrapper around a framework. Many frameworks have built-in support for building custom elements, which means they should handle this logic automatically.

For more reading on this topic, you can check out Google’s Web Fundamentals or take a look at how Lit and Stencil handle this situation.

Why it’s okay for web components to use frameworks

Should standalone web components be written in vanilla JavaScript? Or is it okay if they use (or even bundle) their own framework? With Vue 3 announcing built-in support for building web components, and with frameworks like Svelte and Lit having offered this functionality for some time, it seems like a good time to revisit the question.

First off, I should state my own bias. When I released emoji-picker-element, I made the decision to bundle its framework (Svelte) directly into the component. Clearly I don’t think this is a bad idea (despite my reputation as a perf guy!), so I’d like to explain why it doesn’t shock me for a web component to rely on a framework.

Size concerns

Many web developers might bristle at the idea of a standalone web component relying on its own framework. If I want a date picker, or a modal dialog, or some other utility component, why should I pay the tax of including its entire framework in my bundle? But I think this is the wrong way to look at things.

First off, JavaScript frameworks have come a long way from the days when they were huge, kitchen-sink monoliths. Today’s frameworks like Svelte, Lit, Preact, Vue, and others tend to be smaller, more focused, and more tree-shakeable. A Svelte “hello world” is 1.18 kB (minified and compressed), a Lit “hello world” is 5.7 kB, and petite-vue aims for a 5.8 kB compressed size. These are not huge by any stretch of the imagination.

If you dig deeper, the situation gets even more interesting. As Evan You points out, some frameworks (such as Vue) have a relatively high baseline cost that is amortized by a small per-component size, whereas other frameworks (such as Svelte) have a lower baseline cost but a higher per-component size. The days when you could confidently say “Framework X costs Y kilobytes” are over – the conversation has become much more complex and nuanced.

Second, with code-splitting becoming more common, the individual cost of a dependency has become less important than whether it can be lazy-loaded. For instance, if you use a date picker or modal dialog that bundles its own framework, why not dynamically import() it when it actually needs to be shown? There’s no reason to pay the cost on initial page load for a component that the user may never even need.

Third, bundle size is not the only performance metric that matters. There are also considerations like runtime cost, memory overhead, and energy usage that web developers rarely consider.

Looking at runtime cost, a framework can be small, but that’s not necessarily the same thing as being fast. Sometimes it takes more code to make an algorithm faster! For example, Inferno aims for faster runtime performance at the cost of a higher bundle size when compared to something like Preact. So it’s worth considering whether a component is fast in other metrics beside bundle size.

Caveats

That said, I don’t think “bring your own framework” is without its downsides. So let’s go over some problems you may run into when you mix-and-match frameworks.

You can imagine that, if every web component came with its own framework, then you might end up with multiple copies of the same framework on the same page. And this is definitely a concern! But assuming that the component externalizes its framework dependency (e.g. import 'my-framework'), then multiple components should be able to share the same framework code under the hood.

I used this technique in my own emoji-picker-element. If you’re already using Svelte in your project, then you can import 'emoji-picker-element/svelte' and get a version that doesn’t bundle its own framework, ensuring de-duplication. This saves a paltry 1.4 kB out of 13.9 kB total (compressed), but hey, it’s there. (Potentially I could make this the default behavior, but I like the bundled version for the benefit of folks who use <script> tags instead of bundlers. Maybe something like Skypack could make this simpler in the future.)

Another potential downside of bring-your-own-framework is when frameworks mutate global state, which can lead to conflicts between frameworks. For instance, React has historically attached global event listeners to the document (although thankfully this changed in React v17). Also, Angular’s Zone.js overrides the global Object.defineProperty (although there is a workaround). When mixing-and-matching frameworks, it’s best to avoid frameworks that mutate global state, or to carefully ensure that they don’t conflict with one another.

If you look at the compiled output for a framework like Svelte, though, you’ll see that it’s basically just a collection of pure functions that don’t modify the global state. Combining such frameworks in the same codebase is no more harmful than bundling different versions of Lodash or Underscore.

Now, to be clear: in an ideal world, your web app would only contain one framework. Otherwise it’s shipping duplicate code that essentially does the same thing. But web development is all about tradeoffs, and I don’t believe that it’s worth rejecting a component out-of-hand just to avoid a few extra kBs from a tiny framework like Preact or Lit. (Of course, for a larger framework, this may be a different story. But this is true of any component dependency, not just a framework.)

Framework chauvinism

In general, I don’t think the question should be whether a component uses its own framework or not. Instead, the question should be: Is this component small enough/fast enough for my use case? After all, a component can be huge without using a framework, and it can be slow even when written in vanilla JS. The framework is part of the story, but it’s not the whole story.

I also think that focusing too much on frameworks plays against the strengths of web components. The whole point of web components is to have a standard, interoperable way to add a component to a page without worrying about what framework it’s using under the hood (or if it’s using a framework at all).

Web components also serve as a fantastic glue layer between frameworks. If there’s a great React component out there that you want to use in your Vue codebase, why not wrap it in Remount (2.4 kB) and Preact (4 kB) and call it a day? Even if you spent the time to laboriously create your own Vue version of the component, are you really sure you’ll improve upon the battle-tested version that already exists on npm?

Part of the reason I wrote emoji-picker-element as a web component (and not, for instance, as a Svelte component) is that I think it’s silly to re-implement something like an emoji picker in multiple frameworks. The core business logic of an emoji picker has nothing to do with frameworks – in fact, I think my main contribution to the emoji picker landscape was in innovating around IndexedDB, accessibility, and data loading. Should we really re-implement all of those things just to satisfy developers who want their codebase to be pure Vue, or pure Lit, or pure React, or pure whatever? Do we need an entirely new ecosystem every time a new framework comes out?

The belief that it’s unacceptable for a web app to contain more than one framework is something I might call “framework chauvinism.” And honestly, if you feel this way, then you may as well choose the framework that has the most market share and biggest ecosystem – i.e. you may as well choose React. After all, if you chose Vue or Svelte or some other less-popular framework, then you might find that when you reach for some utility component on npm, nobody has written it in your framework of choice.

Now, if you like living in a React-only world: that’s great. You can definitely do so, given how enormous the React ecosystem is. But personally, I like playing around with different frameworks, comparing their strengths and weaknesses, and letting developers use whichever one tickles their fancy. The vision of a React-only future fills me with a deep boredom. I would much rather see frameworks continue to compete and innovate and push the boundaries of what’s possible in web development than to see one framework “solve” web development forever. (Or to see frameworks locked in a perpetual ecosystem race against each other.)

To me, the main benefit of web components is that they liberate us from the tyranny of frameworks. Rather than focusing on cosmetic questions of how a component is written (did you use React? did you use Vue? who cares!), we can focus on more important questions of performance, accessibility, correctness, and things that have nothing to do with whether you use HTML templates or a render() function. Balking at web components that use frameworks is, in my opinion, missing the entire point of web components.

Thanks to Thomas Steiner and Thomas Wilburn for their thoughtful feedback on a draft of this blog post.