Posts Tagged ‘performance’

Accurately measuring layout on the web

We all want to make faster websites. The question is just what to measure, and how to use that information to determine what’s “slow” and what could be made faster.

The browser rendering pipeline is complicated. For that reason, it’s tricky to measure the performance of a webpage, especially when components are rendered client-side and everything becomes an intricate ballet between JavaScript, the DOM, styling, layout, and rendering. Many folks stick to what they understand, and so they may under-measure or completely mis-measure their website’s frontend performance.

So in this post, I want to demystify some of these concepts, and offer techniques for accurately measuring what’s going on when we render things on the web.

The web rendering pipeline

Let’s say we have a component that is rendered client-side, using JavaScript. To keep things simple, I wrote a demo component in vanilla JS, but everything I’m about to say would also apply to React, Vue, Angular, etc.

When we use the handy Performance profiler in the Chrome Dev Tools, we see something like this:

Screenshot of Chrome Dev Tools showing work on the UI thread divided into JavaScript, then Style, then Layout, then Render

This is a view of the CPU costs of our component, in terms of milliseconds on the UI thread. To break things down, here are the steps required:

  1. Execute JavaScript – executing (but not necessarily compiling) JavaScript, including any state manipulation, “virtual DOM diffing,” and modifying the DOM.
  2. Calculate style – taking a CSS stylesheet and matching its selector rules with elements in the DOM. This is also known as “formatting.”
  3. Calculate layout – taking those CSS styles we calculated in step #2 and figuring out where the boxes should be laid out on the screen. This is also known as “reflow.”
  4. Render – the process of actually putting pixels on the screen. This often involves painting, compositing, GPU acceleration, and a separate rendering thread.

All of these steps incur CPU costs, and therefore all of them can impact the user experience. If any one of them takes a long time, it can lead to the appearance of a slow-loading component.

The naïve approach

Now, the most common mistake that folks make when trying to measure this process is to skip steps 2, 3, and 4 entirely. In other words, they just measure the time spent executing JavaScript, and completely ignore everything after that.

Screenshot of Chrome Dev Tools, showing an arrow pointing after JavaScript but before Style and Layout with the text 'Most devs stop measuring here'

When I worked as a browser performance engineer, I would often look at a trace of a team’s website and ask them which mark they used to measure “done.” More often than not, it turned out that their mark landed right after JavaScript, but before style and layout, meaning the last bit of CPU work wasn’t being measured.

So how do we measure these costs? For the purposes of this post, let’s focus on how we measure style and layout in particular. As it turns out, the render step is much more complicated to measure, and indeed it’s impossible to measure accurately, because rendering is often a complex interplay between separate threads and the GPU, and therefore isn’t even visible to userland JavaScript running on the main thread.

Style and layout calculations, however, are 100% measurable because they block the main thread. And yes, this is true even with something like Firefox’s Stylo engine – even if multiple threads can be employed to speed up the work, ultimately the main thread has to wait on all the other threads to deliver the final result. This is just the way the web works, as specc’ed.

What to measure

So in practical terms, we want to put a performance mark before our JavaScript starts executing, and another one after all the additional work is done:

Screenshot of Chrome Dev Tools, with arrow pointing before JavaScript execution saying 'Ideal start' and arrow pointing after Render (Paint) saying 'Ideal end'

I’ve written previously about various JavaScript timers on the web. Can any of these help us out?

As it turns out, requestAnimationFrame will be our main tool of choice, but there’s a problem. As Jake Archibald explains in his excellent talk on the event loop, browsers disagree on where to fire this callback:

Screenshot of Chrome Dev Tools showing arrow pointing before style/layout saying "Chrome, FF, Edge >= 18" and arrow pointing after style/layout saying "Safari, IE, Edge < 18"

Now, per the HTML5 event loop spec, requestAnimationFrame is indeed supposed to fire before style and layout are calculated. Edge has already fixed this in v18, and perhaps Safari will fix it in the future as well. But that would still leave us with inconsistent behavior in IE, as well as in older versions of Safari and Edge.

Also, if anything, the spec-compliant behavior actually makes it more difficult to measure style and layout! In an ideal world, the spec would have two timers – one for requestAnimationFrame, and another for requestAnimationFrameAfterStyleAndLayout (or something like that). In fact, there has been some discussion at the WHATWG about adding an API for this, but so far it’s just a gleam in the spec authors’ eyes.

Unfortunately, we live in the real world with real constraints, and we can’t wait for browsers to add this timer. So we’ll just have to figure out how to crack this nut, even with browsers disagreeing on when requestAnimationFrame should fire. Is there any solution that will work cross-browser?

Cross-browser “after frame” callback

There’s no solution that will work perfectly to place a callback right after style and layout, but based on the advice of Todd Reifsteck, I believe this comes closest:

requestAnimationFrame(() => {
  // in spec-compliant browsers, this callback fires before style/layout
  setTimeout(() => {
    // this callback fires after style/layout in all browsers
    performance.mark('end')
  })
})

Let’s break down what this code is doing. In the case of spec-compliant browsers, such as Chrome, it looks like this:

Screenshot of Chrome Dev Tools showing 'Start' before JavaScript execution, requestAnimationFrame before style/layout, and setTimeout falling a bit after Paint/Render

Note that rAF fires before style and layout, but the next setTimeout fires just after those steps (including “paint,” in this case).

And here’s how it works in non-spec-compliant browsers, such as Edge 17:

Screenshot of Edge F12 Tools showing 'Start' before JavaScript execution, and requestAnimationFrame/setTimeout both almost immediately after style/layout

Note that rAF fires after style and layout, and the next setTimeout happens so soon that the Edge F12 Tools actually render the two marks on top of each other.

So essentially, the trick is to queue a setTimeout callback inside of a rAF, which ensures that the second callback happens after style and layout, regardless of whether the browser is spec-compliant or not.
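
Putting it all together, a complete measurement might look something like this (a sketch – renderComponent() is a hypothetical function standing in for whatever actually kicks off your component):

performance.mark('start')

renderComponent() // hypothetical function that modifies the DOM

requestAnimationFrame(() => {
  setTimeout(() => {
    performance.mark('end')
    // this measure now includes the style and layout costs
    performance.measure('component render', 'start', 'end')
  })
})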

Downsides and alternatives

Now to be fair, there are a lot of problems with this technique:

  1. setTimeout is somewhat unpredictable in that it may be clamped to 4ms (or more in some cases).
  2. If there are any other setTimeout callbacks that have been queued elsewhere in the code, then ours may not be the last one to run.
  3. In the non-spec-compliant browsers, doing the setTimeout is actually a waste, because we already have a perfectly good place to set our mark – right inside the rAF!

However, if you’re looking for a one-size-fits-all solution for all browsers, rAF + setTimeout is about as close as you can get. Let’s consider some alternative approaches and why they wouldn’t work so well:

rAF + microtask

requestAnimationFrame(() => {
  Promise.resolve().then(() => {
    performance.mark('after')
  })
})

This one doesn’t work at all, because microtasks (e.g. Promises) run immediately after JavaScript execution has completed. So it doesn’t wait for style and layout at all:

Screenshot of Chrome Dev Tools showing microtask firing before style/layout

rAF + requestIdleCallback

requestAnimationFrame(() => {
  requestIdleCallback(() => {
    performance.mark('after')
  })
})

Calling requestIdleCallback from inside of a requestAnimationFrame will indeed capture style and layout:

Screenshot of Chrome Dev Tools showing requestIdleCallback firing a bit after render/paint

However, if the microtask version fires too early, I would worry that this one would fire too late. The screenshot above shows it firing fairly quickly, but if the main thread is busy doing other work, rIC could be delayed a long time waiting for the browser to decide that it’s safe to run some “idle” work. This one is far less of a sure bet than setTimeout.

rAF + rAF

requestAnimationFrame(() => {
  requestAnimationFrame(() => {
    performance.mark('after')
  })
})

This one, also called a “double rAF,” is a perfectly fine solution, but compared to the setTimeout version, it probably captures more idle time – roughly 16.7ms on a 60Hz screen, as opposed to the standard 4ms for setTimeout – and is therefore slightly more inaccurate.

Screenshot of Chrome Dev Tools showing a second requestAnimationFrame firing a bit after render/paint

You might wonder about that, given that I’ve already talked about setTimeout(0) not really firing in 0 (or even necessarily 4) milliseconds in a previous blog post. But keep in mind that, even though setTimeout() may be clamped by as much as a second, this only occurs in a background tab. And if we’re running in a background tab, we can’t count on rAF at all, because it may be paused altogether. (How to deal with noisy telemetry from background tabs is an interesting but separate question.)

So rAF+setTimeout, despite its flaws, is probably still better than rAF+rAF.

Not fooling ourselves

In any case, whether we choose rAF+setTimeout or double rAF, we can rest assured that we’re capturing any event-loop-driven style and layout costs. With this measure in place, it’s much less likely that we’ll fool ourselves by only measuring JavaScript and direct DOM API performance.

As an example, let’s consider what would happen if our style and layout costs weren’t just invoked by the event loop – that is, if our component were calling one of the many APIs that force style/layout recalculation, such as getBoundingClientRect(), offsetTop, etc.
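
To make this concrete, here's a sketch of what such a forced recalculation might look like (the element and class names are just for illustration):

const el = document.createElement('div')
el.className = 'fancy-component'
document.body.appendChild(el)

// asking for the element's dimensions forces the browser to calculate
// style and layout synchronously, in the middle of JavaScript execution
const rect = el.getBoundingClientRect()
console.log(rect.width, rect.height)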

If we call getBoundingClientRect() just once, notice that the style and layout calculations shift over into the middle of JavaScript execution:

Screenshot of Chrome Dev Tools showing style/layout costs moved to the left inside of JavaScript execution under getBoundingClientRect with red triangles on each purple rectangle

The important point here is that we’re not doing anything any slower or faster – we’ve merely moved the costs around. If we don’t measure the full costs of style and layout, though, we might deceive ourselves into thinking that calling getBoundingClientRect() is slower than not calling it! In fact, though, it’s just a case of robbing Peter to pay Paul.

It’s worth noting, though, that the Chrome Dev Tools have added little red triangles to our style/layout calculations, with the message “Forced reflow is a likely performance bottleneck.” This can be a bit misleading in this case, because again, the costs are not actually any higher – they’ve just moved to earlier in the trace.

(Now it’s true that, if we call getBoundingClientRect() repeatedly and change the DOM in the process, then we might invoke layout thrashing, in which case the overall costs would indeed be higher. So the Chrome Dev Tools are right to warn folks in that case.)

In any case, my point is that it’s easy to fool yourself if you only measure explicit JavaScript execution, and ignore any event-loop-driven style and layout costs that come afterward. The two costs may be scheduled differently, but they both impact performance.

Conclusion

Accurately measuring layout on the web is hard. There’s no perfect metric to capture style and layout – or indeed, rendering – even though all three can impact the user experience just as much as JavaScript.

However, it’s important to understand how the HTML5 event loop works, and to place performance marks at the appropriate points in the component rendering lifecycle. This can help avoid any mistaken conclusions about what’s “slower” or “faster” based on an incomplete view of the pipeline, and ensure that style and layout costs are accounted for.

I hope this blog post was useful, and that the art of measuring client-side performance is a little less mysterious now. And maybe it’s time to push browser vendors to add requestAnimationFrameAfterStyleAndLayout (we’ll bikeshed on the name though!).

Thanks to Ben Kelly, Todd Reifsteck, and Alex Russell for feedback on a draft of this blog post.

A tour of JavaScript timers on the web

Pop quiz: what is the difference between these JavaScript timers?

  • Promises
  • setTimeout
  • setInterval
  • setImmediate
  • requestAnimationFrame
  • requestIdleCallback

More specifically, if you queue up all of these timers at once, do you have any idea which order they’ll fire in?

If not, you’re probably not alone. I’ve been doing JavaScript and web programming for years, I’ve worked for a browser vendor for two of those years, and it’s only recently that I really came to understand all these timers and how they play together.

In this post, I’m going to give a high-level overview of how these timers work, and when you might want to use them. I’ll also cover the Lodash functions debounce() and throttle(), because I find them useful as well.

Promises and microtasks

Let’s get this one out of the way first, because it’s probably the simplest. A Promise callback is also called a “microtask,” and it runs at the same frequency as MutationObserver callbacks. Assuming queueMicrotask() ever makes it out of spec-land and into browser-land, it will also be the same thing.

I’ve already written a lot about promises. One quick misconception about promises that’s worth covering, though, is that they don’t give the browser a chance to breathe. Just because you’re queuing up an asynchronous callback, that doesn’t mean that the browser can render, or process input, or do any of the stuff we want browsers to do.

For example, let’s say we have a function that blocks the main thread for 1 second:

function block() {
  var start = Date.now()
  while (Date.now() - start < 1000) { /* wheee */ }
}

If we were to queue up a bunch of microtasks to call this function:

for (var i = 0; i < 100; i++) {
  Promise.resolve().then(block)
}

This would block the browser for about 100 seconds. It’s basically the same as if we had done:

for (var i = 0; i < 100; i++) {
  block()
}

Microtasks execute immediately after any synchronous execution is complete, and the browser gets no chance to fit in any rendering or input handling between them. So if you think you can break up a long-running task by separating it into microtasks, then it won’t do what you think it’s doing.
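
For contrast, here's a rough sketch of how you could actually break up the work using a macrotask like setTimeout (covered in the next section), reusing the block() function from above:

function runInChunks(tasks) {
  if (tasks.length === 0) return
  tasks[0]() // run one chunk of (blocking) work
  // queue the next chunk as a macrotask, giving the browser a chance
  // to render and process input in between
  setTimeout(() => runInChunks(tasks.slice(1)))
}

runInChunks(new Array(100).fill(block))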

setTimeout and setInterval

These two are cousins: setTimeout queues a task to run in x number of milliseconds, whereas setInterval queues a recurring task to run every x milliseconds.

The thing is… browsers don’t really respect that milliseconds thing. You see, historically, web developers have abused setTimeout. A lot. To the point where browsers have had to add mitigations for setTimeout(/* ... */, 0) to avoid locking up the browser’s main thread, because a lot of websites tended to throw around setTimeout(0) like confetti.

This is the reason that a lot of the tricks in crashmybrowser.com don’t work anymore, such as queuing up a setTimeout that calls two more setTimeouts, which call two more setTimeouts, etc. I covered a few of these mitigations from the Edge side of things in “Improving input responsiveness in Microsoft Edge”.

Broadly speaking, a setTimeout(0) doesn’t really run in zero milliseconds. Usually, it runs in 4. Sometimes, it may run in 16 (this is what Edge does when it’s on battery power, for instance). Sometimes it may be clamped to 1 second (e.g., when running in a background tab). These are the sorts of tricks that browsers have had to invent to prevent runaway web pages from chewing up your CPU doing useless setTimeout work.

So that said, setTimeout does allow the browser to run some work before the callback fires (unlike microtasks). But if your goal is to allow input or rendering to run before the callback, setTimeout is usually not the best choice because it only incidentally allows those things to happen. Nowadays, there are better browser APIs that can hook more directly into the browser’s rendering system.

setImmediate

Before moving on to those “better browser APIs,” it’s worth mentioning this thing. setImmediate is, for lack of a better word … weird. If you look it up on caniuse.com, you’ll see that only Microsoft browsers support it. And yet it also exists in Node.js, and has lots of “polyfills” on npm. What the heck is this thing?

setImmediate was originally proposed by Microsoft to get around the problems with setTimeout described above. Basically, setTimeout had been abused, and so the thinking was that we can create a new thing to allow setImmediate(0) to actually be setImmediate(0) and not this funky “clamped to 4ms” thing. You can see some discussion about it from Jason Weber back in 2011.

Unfortunately, setImmediate was only ever adopted by IE and Edge. Part of the reason it’s still in use is that it has a sort of superpower in IE, where it allows input events like keyboard and mouseclicks to “jump the queue” and fire before the setImmediate callback is executed, whereas IE doesn’t have the same magic for setTimeout. (Edge eventually fixed this, as detailed in the previously-mentioned post.)

Also, the fact that setImmediate exists in Node means that a lot of “Node-polyfilled” code is using it in the browser without really knowing what it does. It doesn’t help that the differences between Node’s setImmediate and process.nextTick are very confusing, and even the official Node docs say the names should really be reversed. (For the purposes of this blog post though, I’m going to focus on the browser rather than Node because I’m not a Node expert.)

Bottom line: use setImmediate if you know what you’re doing and you’re trying to optimize input performance for IE. If not, then just don’t bother. (Or only use it in Node.)

requestAnimationFrame

Now we get to the most important setTimeout replacement, a timer that actually hooks into the browser’s rendering loop. By the way, if you don’t know how the browser event loop works, I strongly recommend this talk by Jake Archibald. Go watch it, I’ll wait.

Okay, now that you’re back, requestAnimationFrame basically works like this: it’s sort of like a setTimeout, except instead of waiting for some unpredictable amount of time (4 milliseconds, 16 milliseconds, 1 second, etc.), it executes before the browser’s next style/layout calculation step. Now, as Jake points out in his talk, there is a minor wrinkle in that it actually executes after this step in Safari, IE, and Edge <18, but let's ignore that for now since it's usually not an important detail.

The way I think of requestAnimationFrame is this: whenever I want to do some work that I know is going to modify the browser's style or layout – for instance, changing CSS properties or starting up an animation – I stick it in a requestAnimationFrame (abbreviated to rAF from here on out). This ensures a few things:

  1. I'm less likely to layout thrash, because all of the changes to the DOM are being queued up and coordinated.
  2. My code will naturally adapt to the performance characteristics of the browser. For instance, if it's a low-cost device that is struggling to render some DOM elements, rAF will naturally slow down from the usual 16.7ms intervals (on 60 Hertz screens) and thus it won't bog down the machine in the same way that running a lot of setTimeouts or setIntervals might.

This is why animation libraries that don't rely on CSS transitions or keyframes, such as GreenSock or React Motion, will typically make their changes in a rAF callback. If you're animating an element between opacity: 0 and opacity: 1, there's no sense in queuing up a billion callbacks to animate every possible intermediate state, including opacity: 0.0000001 and opacity: 0.9999999.

Instead, you're better off just using rAF to let the browser tell you how many frames you're able to paint during a given period of time, and calculate the "tween" for that particular frame. That way, slow devices naturally end up with a slower framerate, and faster devices end up with a faster framerate, which wouldn't necessarily be true if you used something like setTimeout, which operates independently of the browser's rendering speed.
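
As a sketch of that idea (the element selector and duration here are arbitrary), a rAF-driven opacity tween might look like this:

const el = document.querySelector('.fade-me-in') // hypothetical element
const duration = 1000 // milliseconds
const start = performance.now()

function tick(now) {
  // calculate the "tween" from elapsed time, not from a frame count,
  // so slow devices simply render fewer intermediate frames
  const progress = Math.min((now - start) / duration, 1)
  el.style.opacity = progress
  if (progress < 1) {
    requestAnimationFrame(tick)
  }
}

requestAnimationFrame(tick)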

requestIdleCallback

rAF is probably the most useful timer in the toolkit, but requestIdleCallback is worth talking about as well. The browser support isn't great, but there's a polyfill that works just fine (and it uses rAF under the hood).

In many ways rAF is similar to requestIdleCallback. (I'll abbreviate it to rIC from now on. Starting to sound like a pair of troublemakers from West Side Story, huh? "There go Rick and Raff, up to no good!")

Like rAF, rIC will naturally adapt to the browser's performance characteristics: if the device is under heavy load, rIC may be delayed. The difference is that rIC fires during the browser's "idle" state, i.e. when the browser has decided it doesn't have any tasks, microtasks, or input events to process, and you're free to do some work. It also gives you a "deadline" to track how much of your budget you're using, which is a nice feature.
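
In code, the deadline-based pattern looks something like this (a sketch – the task queue is hypothetical):

const tasks = [] // a queue of small, non-critical functions

function processTasks(deadline) {
  // keep working as long as the browser says we have idle budget left
  while (deadline.timeRemaining() > 0 && tasks.length > 0) {
    const task = tasks.shift()
    task()
  }
  if (tasks.length > 0) {
    requestIdleCallback(processTasks) // finish up during the next idle period
  }
}

requestIdleCallback(processTasks)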

Dan Abramov has a good talk from JSConf Iceland 2018 where he shows how you might use rIC. In the talk, he has a webapp that calls rIC for every keyboard event while the user is typing, and then it updates the rendered state inside of the callback. This is great because a fast typist can cause many keydown/keyup events to fire very quickly, but you don't necessarily want to update the rendered state of the page for every keypress.

Another good example of this is a “remaining character count” indicator on Twitter or Mastodon. I use rIC for this in Pinafore, because I don't really care if the indicator updates for every single key that I type. If I'm typing quickly, it's better to prioritize input responsiveness so that I don't lose my sense of flow.

Screenshot of Pinafore with some text entered in the text box and a digit counter showing the number of remaining characters

In Pinafore, the little horizontal bar and the “characters remaining” indicator update as you type.

One thing I’ve noticed about rIC, though, is that it’s a little finicky in Chrome. In Firefox it seems to fire whenever I would, intuitively, think that the browser is “idle” and ready to run some code. (Same goes for the polyfill.) In mobile Chrome for Android, though, I’ve noticed that whenever I scroll with touch scrolling, it might delay rIC for several seconds even after I’m done touching the screen and the browser is doing absolutely nothing. (I suspect the issue I’m seeing is this one.)

Update: Alex Russell from the Chrome team informs me that this is a known issue and should be fixed soon!

In any case, rIC is another great tool to add to the tool chest. I tend to think of it this way: use rAF for critical rendering work, use rIC for non-critical work.

debounce and throttle

These two functions aren’t built in to the browser, but they’re so useful that they’re worth calling out on their own. If you aren’t familiar with them, there’s a good breakdown in CSS Tricks.

My standard use for debounce is inside of a resize callback. When the user is resizing their browser window, there’s no point in updating the layout for every resize callback, because it fires too frequently. Instead, you can debounce for a few hundred milliseconds, which will ensure that the callback eventually fires once the user is done fiddling with their window size.

throttle, on the other hand, is something I use much more liberally. For instance, a good use case is inside of a scroll event. Once again, it’s usually senseless to try to update the rendered state of the app for every scroll callback, because it fires too frequently (and the frequency can vary from browser to browser and from input method to input method… ugh). Using throttle normalizes this behavior, and ensures that it only fires every x number of milliseconds. You can also tweak Lodash’s throttle (or debounce) function to fire at the start of the delay, at the end, both, or neither.

In contrast, I wouldn’t use debounce for the scrolling scenario, because I don’t want the UI to only update after the user has explicitly stopped scrolling. That can get annoying, or even confusing, because the user might get frustrated and try to keep scrolling in order to update the UI state (e.g. in an infinite-scrolling list). throttle is better in this case, because it doesn’t wait for the scroll event to stop firing.
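
Here's a sketch of both patterns using Lodash (updateLayout() and updateScrollIndicator() are hypothetical app functions):

import debounce from 'lodash/debounce'
import throttle from 'lodash/throttle'

// resize: wait until the user is done fiddling with the window size
window.addEventListener('resize', debounce(() => {
  updateLayout()
}, 300))

// scroll: update at most once every 100ms while scrolling continues
window.addEventListener('scroll', throttle(() => {
  updateScrollIndicator()
}, 100))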

throttle is a function I use all over the place for all kinds of user input, and even for some regularly-scheduled tasks like IndexedDB cleanups. It’s extremely useful. Maybe it should just be baked into the browser some day!

Conclusion

So that’s my whirlwind tour of the various timer functions available in the browser, and how you might use them. I probably missed a few, because there are certainly some exotic ones out there (postMessage or lifecycle events, anyone?). But hopefully this at least provides a good overview of how I think about JavaScript timers on the web.

Smaller Lodash bundles with Webpack and Babel

One of the benefits of working with smart people is that you can learn a lot from them through osmosis. As luck would have it, a recent move placed my office next to John-David Dalton‘s, with the perk being that he occasionally wanders into my office to talk about cool stuff he’s working on, like Lodash and ES modules in Node.

Recently we chatted about Lodash and the various plugins for making its bundle size smaller, such as lodash-webpack-plugin and babel-plugin-lodash. I admitted that I had used both projects but only had a fuzzy notion of what they actually did, or why you’d want to use one or the other. Fortunately J.D. set me straight, and so I thought it’d be a good opportunity to take what I’ve learned and turn it into a short blog post.

TL;DR

Use the import times from 'lodash/times' format over import { times } from 'lodash' wherever possible. If you do, then you don’t need the babel-plugin-lodash. Update: or use lodash-es instead.

Be very careful when using lodash-webpack-plugin to check that you’re not omitting any features you actually need, or stuff can break in production.

Avoid Lodash chaining (e.g. _(array).map(...).filter(...).take(...)), since there’s currently no way to reduce its size.

babel-plugin-lodash

The first thing to understand about Lodash is that there are multiple ways you can use the same method, but some of them are more expensive than others:

import { times } from 'lodash'   // 68.81kB  :(
import times from 'lodash/times' //  2.08kB! :)

times(3, () => console.log('whee'))

You can see the difference using something like webpack-bundle-analyzer. Here’s the first version:

Screenshot of lodash.js taking up almost the entire bundle size

Using the import { times } from 'lodash' idiom, it turns out that lodash.js is so big that you can’t even see our tiny index.js! Lodash takes up a full parsed size of 68.81kB. (In the bundle analyzer, hover your mouse over the module to see the size.)

Now here’s the second version (using import times from 'lodash/times'):

Screenshot showing many smaller Lodash modules not taking up so much space

In the second screenshot, Lodash’s total size has shrunk down to 2.08kB. Now we can finally see our index.js!

However, some people prefer the first syntax to the second, especially since it can get more terse the more you import.

Consider:

import { map, filter, times, noop } from 'lodash'

compared to:

import map from 'lodash/map'
import filter from 'lodash/filter'
import times from 'lodash/times'
import noop from 'lodash/noop'

What the babel-plugin-lodash proposes is to automatically rewrite your Lodash imports to use the second pattern rather than the first. So it would rewrite

import { times } from 'lodash'

as

import times from 'lodash/times'

One takeaway from this is that, if you’re already using the import times from 'lodash/times' idiom, then you don’t need babel-plugin-lodash.

Update: apparently if you use the lodash-es package, then you also don’t need the Babel plugin. It may also have better tree-shaking outputs in Webpack due to setting "sideEffects": false in package.json, which the main lodash package does not do.

lodash-webpack-plugin

What lodash-webpack-plugin does is a bit more complicated. Whereas babel-plugin-lodash focuses on the syntax in your own code, lodash-webpack-plugin changes how Lodash works under the hood to make it smaller.

The reason this cuts down your bundle size is that it turns out there are a lot of edge cases and niche functionality that Lodash provides, and if you’re not using those features, they just take up unnecessary space. There’s a full list in the README, but let’s walk through some examples.

Iteratee shorthands

What in the heck is an “iteratee shorthand”? Well, let’s say you want to map() an Array of Objects like so:

import map from 'lodash/map'
map([{id: 'foo'}, {id: 'bar'}], obj => obj.id) // ['foo', 'bar']

In this case, Lodash allows you to use a shorthand:

import map from 'lodash/map'
map([{id: 'foo'}, {id: 'bar'}], 'id') // ['foo', 'bar']

This shorthand syntax is nice to save a few characters, but unfortunately it requires Lodash to use more code under the hood. So lodash-webpack-plugin can just remove this functionality.

For example, let’s say I use the full arrow function instead of the shorthand. Without lodash-webpack-plugin, we get:

Screenshot showing multiple lodash modules under .map

In this case, Lodash takes up 18.59kB total.

Now let’s add lodash-webpack-plugin:

Screenshot of lodash with a very small map.js dependency

And now Lodash is down to 117 bytes! That’s quite the savings.

Collection methods

Another example is “collection methods” for Objects. This means being able to use standard Array methods like forEach() and map() on an Object, in which case Lodash gives you a callback with both the key and the value:

import forEach from 'lodash/forEach'

forEach({foo: 'bar', baz: 'quux'}, (value, key) => {
  console.log(key, value)
  // prints 'foo bar' then 'baz quux'
})

This is handy, but once again it has a cost. Let’s say we’re only using forEach for Arrays:

import forEach from 'lodash/forEach'

forEach(['foo', 'bar'], obj => {
  console.log(obj) // prints 'foo' then 'bar'
})

In this case, Lodash will take up a total of 5.06kB:

Screenshot showing Lodash forEach() taking up quite a few modules

Whereas once we add in lodash-webpack-plugin, Lodash trims down to a svelte 108 bytes:

Screenshot showing a very small Lodash forEach.js module

Chaining

Another common Lodash feature is chaining, which exposes functionality like this:

import _ from 'lodash'
const array = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
console.log(_(array)
  .map(i => parseInt(i, 10))
  .filter(i => i % 2 === 1)
  .take(5)
  .value()
) // prints '[ 1, 3, 5, 7, 9 ]'

Unfortunately there is currently no good way to reduce the size required for chaining. So you’re better off importing the Lodash functions individually:

import map from 'lodash/map'
import filter from 'lodash/filter'
import take from 'lodash/take'
const array = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']

console.log(
  take(
    filter(
      map(array, i => parseInt(i, 10)),
      i => i % 2 === 1
    ),
    5
  )
) // prints '[ 1, 3, 5, 7, 9 ]'

Using the lodash-webpack-plugin with the chaining option enabled, the first example takes up the full 68.81kB:

Screenshot showing large lodash.js dependency

This makes sense, since we’re still importing all of Lodash for the chaining to work.

Whereas the second example with chaining disabled gives us only 590 bytes:

Screenshot showing a handful of small Lodash modules

The second piece of code is a bit harder to read than the first, but it’s certainly a big savings in file size! Luckily J.D. tells me there may be some work in progress on a plugin that could rewrite the second syntax to look more like the first (similar to babel-plugin-lodash).

Edit: it was brought to my attention in the comments that this functionality should be coming soon to babel-plugin-lodash!

Gotchas

Saving bundle size is great, but lodash-webpack-plugin comes with some caveats. All of these features – shorthands for the iteratee shorthands, collections for the Object collection methods, and others – are disabled by default. Furthermore, they may break or even silently fail if you try to use them when they’re disabled.

This means that if you only use lodash-webpack-plugin in production, you may be in for a rude surprise when you test something in development mode and then find it’s broken in production. In my previous examples, if you use the iteratee shorthand:

map([{id: 'foo'}, {id: 'bar'}], 'id') // ['foo', 'bar']

And if you don’t enable shorthands in lodash-webpack-plugin, then this will actually throw a runtime error:

map.js:16 Uncaught TypeError: iteratee is not a function

In the case of the Object collection methods, it’s more insidious. If you use:

forEach({foo: 'bar', baz: 'quux'}, (value, key) => {
  console.log(key, value)
})

And if you don’t enable collections in lodash-webpack-plugin, then the forEach() method will silently fail. This can lead to some very hard-to-uncover bugs!
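
If you do rely on those features, you can opt back in via the plugin's options. Here's a sketch of what that looks like in a webpack config (the option names are taken from the lodash-webpack-plugin README):

// webpack.config.js
const LodashModuleReplacementPlugin = require('lodash-webpack-plugin')

module.exports = {
  // ...
  plugins: [
    new LodashModuleReplacementPlugin({
      shorthands: true,  // iteratee shorthands such as map(arr, 'id')
      collections: true  // methods like forEach() and map() on Objects
    })
  ]
}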

Conclusion

The babel-plugin-lodash and lodash-webpack-plugin packages are great. They’re an easy way to reduce your bundle size by a significant amount and with minimal effort.

The lodash-webpack-plugin is particularly useful, since it actually changes how Lodash operates under the hood and can remove functionality that almost nobody uses. Support for edge cases like sparse arrays (guards) and typed arrays (exotics) is unlikely to be something you’ll need.

While the lodash-webpack-plugin is extremely useful, though, it also has some footguns. If you’re only enabling it for production builds, you may be surprised when something works in development but then fails in production. It might also be hard to add to a large existing project, since you’ll have to meticulously audit all your uses of Lodash.

So be sure to carefully read the documentation before installing the lodash-webpack-plugin. And if you’re not sure whether you need a certain feature, then you may be better off enabling that feature (or disabling the plugin entirely) and just taking the ~20kB hit.

Note: if you’d like to experiment with this yourself, I put these examples into a small GitHub repo. If you uncomment various bits of code in src/index.js, and enable or disable the Babel and Webpack plugins in .babelrc and webpack.config.js, then you can play around with these examples yourself.

High-performance Web Worker messages

Update: this blog post was based on the latest browsers as of early 2016. Things have changed, and in particular the benchmark shows that recent versions of Chrome do not exhibit the performance cliff for non-stringified postMessage() messages as described in this post.

In recent posts and talks, I’ve explored how Web Workers can vastly improve the responsiveness of a web application, by moving work off the UI thread and thereby reducing DOM-blocking. In this post, I’ll delve a bit more deeply into the performance characteristics of postMessage(), which is the primary interface for communicating with Web Workers.

Since Web Workers run in a separate thread (although not necessarily a separate process), and since JavaScript environments don’t share memory across threads, messages have to be explicitly sent between the main thread and the worker. As it turns out, the format you choose for this message can have a big impact on performance.

TLDR: always use JSON.stringify() and JSON.parse() to communicate with a Web Worker. Be sure to fully stringify the message.

I first came across this tip from IndexedDB spec author and Chrome developer Joshua Bell, who mentioned offhand:

We know that serialization/deserialization is slow. It’s actually faster to JSON.stringify() then postMessage() a string than to postMessage() an object.

This insight was further confirmed by Parashuram N., who demonstrated experimentally that stringify was a key factor in making a worker-based React implementation that improved upon vanilla React. He says:

By “stringifying” all messages between the worker and the main thread, React implemented on a Web Worker [is] faster than the normal React version. The perf benefit of the Web Worker approach starts to increase as the number of nodes increases.

Malte Ubl, tech lead of the AMP project, has also been experimenting with postMessage() in Web Workers. He had this to say:

On phones, [stringifying] is quickly relevant, but not with just 3 or so fields. Just measured the other day. It is bad.

This made me curious as to where, exactly, the tradeoffs lie with stringifying messages. So I decided to create a simple benchmark and run it on a variety of browsers. My tests confirmed that stringifying is indeed faster than sending raw objects, and that the message size has a dramatic impact on the speed of worker communication.

Furthermore, the only real benefit comes if you stringify the entire message. Even a small object that wraps the stringified message (e.g. {msg: JSON.stringify(message)}) performs worse than the fully-stringified case. (These results differ between Chrome, Firefox, and Safari, but keep reading for the full analysis.)
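
Concretely, the fully-stringified pattern looks something like this (a sketch – the message shape is arbitrary):

// main.js
const worker = new Worker('worker.js')
worker.onmessage = (event) => {
  const response = JSON.parse(event.data) // parse the worker's reply
  console.log(response.sum)
}
worker.postMessage(JSON.stringify({type: 'sum', values: [1, 2, 3]}))

// worker.js
self.onmessage = (event) => {
  const message = JSON.parse(event.data)
  const sum = message.values.reduce((a, b) => a + b, 0)
  // re-stringify the reply, so only strings ever cross the thread boundary
  self.postMessage(JSON.stringify({type: 'result', sum: sum}))
}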

Test results

In this test, I ran 50,000 iterations of postMessage() (both to and from the worker) and used console.time() to measure the total time spent posting messages back and forth. I also varied the number of keys in the object between 0 and 30 (keys and values were both just Math.random()).

Clarification: the test does include the overhead of JSON.parse() and JSON.stringify(). The worker even re-stringifies the message when echoing it back.

First, here are the results in Chrome 48 (running on a 2013 MacBook Air with Yosemite):

Chrome 48 test results

And in Chrome 48 for Android (running on a Nexus 5 with Android 5.1):

Nexus 5 Chrome test results

What’s clear from these results is that full stringification beats both partial stringification and no-stringification across all message sizes. The difference is fairly stark on desktop Chrome for small message sizes, but it starts to narrow as message size increases. On the Nexus 5, there’s no such dramatic swing.

In Firefox 46 (also on the MacBook Air), stringification is still the winner, although by a smaller margin:

Firefox test results

In Safari 9, it gets more interesting. For Safari, at least, stringification is actually slower than posting raw messages:

Safari test results

Based on these results, you might be tempted to think it’s a good idea to UA-sniff for Safari, and avoid stringification in that browser. However, it’s worth considering that Safari is consistently faster than Chrome (with or without stringification), and that it’s also faster than Firefox, at least for small message sizes. Here are the stringified results for all three browsers:

Stringification results for all browsers

So the fact that Safari is already fast for small messages would reduce the attractiveness of any UA-sniffing hack. Also notice that Firefox, to its credit, maintains a fairly consistent response time regardless of message size, and starts to actually beat both Safari and Chrome at the higher levels.

Now, assuming we were to use the UA-sniffing approach, we could swap in the raw results for Safari (i.e. showing the fastest times for each browser), which gives us this:

Results with the best time for each browser

So it appears that avoiding stringification in Safari allows it to handily beat the other browsers, although it does start to converge with Firefox for larger message sizes.

On a whim, I also tested Transferables, i.e. using ArrayBuffers as the data format to transfer the stringified JSON. In theory, Transferables can offer some performance gains when sending large data, because the ArrayBuffer is instantly zapped from one thread to the other, without any cloning or copying. (After transfer, the ArrayBuffer is unavailable to the sender thread.)
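
For reference, transferring the stringified JSON as an ArrayBuffer looks something like this (a sketch using TextEncoder/TextDecoder; the worker and message are carried over from the earlier sketch):

// main.js
const buffer = new TextEncoder().encode(JSON.stringify(message)).buffer
// the second argument transfers the buffer rather than copying it;
// after this call, `buffer` is no longer usable on this thread
worker.postMessage(buffer, [buffer])

// worker.js
self.onmessage = (event) => {
  const message = JSON.parse(new TextDecoder().decode(event.data))
  // ...
}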

As it turned out, though, this didn’t perform well in either Chrome or Firefox. So I didn’t explore it any further.

Chrome test results, with arraybuffer

Firefox results with arraybuffer

Transferables might be useful for sending binary data that’s already in that format (e.g. Blobs, Files, etc.), but for JSON data it seems like a poor fit. On the bright side, they do have wide browser support, including Chrome, Firefox, Safari, IE, and Edge.

Speaking of Edge, I would have run these tests in that browser, but unfortunately my virtual machine kept crashing due to the intensity of the tests, and I didn’t have an actual Windows device handy. Contributions welcome!

Correction: this post originally stated that Safari doesn’t support Transferables. It does.

Update: Boulos Dib has graciously run the numbers for Edge 13, and they look very similar to Safari (in that raw objects are faster than stringification):

Edge 13 results

Conclusion

Based on these tests, my recommendation would be to use stringification across the board, or to UA-sniff for Safari and avoid stringification in that browser (but only if you really need maximum performance!).

Another takeaway is that, in general, message sizes should be kept small. Firefox seems to be able to maintain a relatively speedy delivery regardless of the message size, but Safari and Chrome tend to slow down considerably as the message size increases. For very large messages, it may even make sense to save the data to IndexedDB from the worker, and then simply fetch the saved data from the main thread, but I haven’t verified this idea with a benchmark.

The full results for my tests are available in this spreadsheet. I encourage anybody who wants to reproduce these results to check out the test suite and offer a pull request or the results from their own browser.

And if you’d like a simple Web Worker library that makes use of stringification, check out promise-worker.

Update: Chris Thoburn has offered another Web Worker performance test that adds some additional ways of sending messages, like MessageChannels. Here are his own browser results.

How to think about databases

As a maintainer of PouchDB, I get a lot of questions from developers about how best to work with databases. Since PouchDB is a JavaScript library, and one with fairly approachable documentation (if I do say so myself), many of these folks tend toward the more beginner-ish side of the spectrum. However, even with experienced developers, I find that many of them don’t have a clear picture of how a database should fit into their overall app structure.

The goal of this article is to lay out my perspective on the proper place for a database within your app code. My focus will be on the frontend – e.g. SQLite in an Android app, CoreData in an iOS app, or IndexedDB in a webapp – but the discussion could apply equally well to a server-side app using MongoDB, MySQL, etc.

What is a database, anyway?

I have a friend who recently went through a developer bootcamp. He’s a smart guy, but totally inexperienced with JavaScript (or any kind of coding) before he started. So I found his questions endlessly fascinating, because they reminded me what it was like learning to code.

Part of his coursework was on MongoDB, and I recall spending some time coaching him on Mongoose queries. As I was explaining the concepts to him, he got a little frustrated and asked, “What’s the point of a database, anyway? Why do I need this thing?”

For a beginner, this is a perfectly valid question. You’ve already spent a long time learning to work with data in the form of objects and arrays (or “dictionaries” and “lists,” or whatever your language calls them), and now suddenly you’re told you need to learn about this separate thing called a “database” that has similar kinds of operations, but they’re a lot more awkward. Instead of your familiar for-loops and assignments, you’re structuring queries and defining schemas. Why all the overhead?

To answer that question, let’s take a step back and remember why we have databases in the first place.

#1 goal of a database: don’t forget stuff

When you create an object or an array in your code, what you have is data:

var array = [1, 2, 3, 4, 5];

This data feels tangible. You can iterate through it, you can print it out, you can insert and remove things, and you can even .map() and .filter() it to transform it in all sorts of interesting ways. Data structures like this are the raw material your code is made of.

However, there’s an ephemeral side to this data. We happen to call the space that it lives in “memory” or “RAM” (Random Access Memory), but in fact “memory” is kind of a nasty misnomer, because as soon as your application stops, that data is gone forever.

You can imagine that if computers only had memory to work with, then computer programs would be pretty frustrating to use. If you wanted to write a Word document, you’d need to be sure to print it out before you closed Word, because otherwise you’d lose your work. And of course, once you restarted Word, you’d have to laboriously type your document back in by hand. Even worse, if you ever had a power outage or the program crashed, your data would vanish into the ether.

Thankfully, we don’t have to deal with this awful scenario, because we have hard disks, i.e. a place where data can be stored more permanently. Sometimes this is called “storage,” so for instance when you buy a new laptop with 200GB of storage but only 8GB of RAM, you’re looking at the difference between disk (or storage) and memory (or RAM). One is permanent, the other is fleeting.

So if disk is so awesome, why don’t computers just use that? Why do we have RAM at all?

Well, the reason is that there’s a pretty big tradeoff in speed between “storage” and “memory.” You’ve felt it if you’ve ever copied a large file to a USB stick, or if you’ve seen an old low-RAM machine that took a long time to switch between windows. That’s called paging, and it happens when your computer runs out of RAM and has to start swapping memory between RAM and disk.

Latency numbers every programmer should know

Latency numbers, visualized.

This performance difference cannot be overstated. If you look at a chart of latency numbers every programmer should know, you’ll see that reading 1MB sequentially from memory takes about 250 microseconds, whereas reading 1MB from disk takes 20 milliseconds – 80 times longer. If those numbers both sound small, consider the scale: if 250 microseconds were the time it took to brush your teeth (5 minutes, if you listen to your dentist!), then 20 milliseconds would be nearly 7 hours, which is enough time to fly from New York to San Francisco.

And if you think reading 1MB from SSD is much better (1 millisecond), then consider that in our toothbrush-scale, it would still be 20 minutes. That’s a quick errand rather than a cross-country flight, but it’s still a far cry from the 5 minutes it took to read from memory.

In a computer program, the kind of operations you can “get away with” in the toothbrush-scale of 5 minutes are totally different than what you can do in 20 minutes or 7 hours. This is the difference between a snappy application and a sluggish application, and it’s also at the heart of how you should be thinking about databases within your app.

Storage vs memory

Let’s move away from toothbrushes for a moment and try a different analogy. This is the one I find most useful myself when I’m writing an app.

Memory (including objects, arrays, variables, etc.) is like the counter space in your kitchen when you’re preparing a meal. You have all the tools available to you, you can quickly chop your carrots and put them into a bowl, you can mix the onions with the celery, and all of these things can be done fairly quickly without having to move around the kitchen.

Storage, on the other hand (including filesystems and databases), is like the freezer. It’s a place where you put food that you know you’re going to need later. However, when you pull it out of the freezer, there’s often a thawing period. You also don’t want to be constantly opening your freezer to pull ingredients in and out, or your electric bill is going to go through the roof! Plus, your food will probably end up tasting awful.

Probably the biggest mistake I see beginners make when working with databases is that they want to treat their freezer like their counter space. They want their application data to be perfectly mirrored in their database schemas, and they don’t want to have to think about where their food comes from – whether it’s been sitting on the counter for a few seconds, or in the freezer for a few days.

This is at the root of a lot of suffering when working with databases. You either end up constantly reading things in and out of disk, which means that your app runs slowly (and you probably blame your database!), or you have to meticulously manage your schemas and painstakingly migrate your data whenever anything in your in-memory representation changes.

Unfortunately, this idea that we can treat our databases like our RAM is a by-product of the ORM (Object-Relational Mapping) mentality, which in my opinion is one of the most toxic and destructive ideas in software engineering, because it sells you a false vision of hope. The ORM salesman promises that you can work with your in-memory objects and make them as fancy as you like, and then magically those objects will be persisted to the database (exactly as you left them!), and you’ll never even have to think about what a database is or how you’re accessing it.

In my experience, this is never how it works out with ORMs. It may seem easy at first, but eventually your usage of the database will become inefficient, and you’ll have to drop down into the murky details of the ORM layer, figure out the queries you wish it were doing, and then try to guess the incantation needed to make it perform that query. In effect, the promise of not having to think about the database is a sham, because you just end up just having to learn two layers: the database layer and the ORM layer. It’s a classic leaky abstraction.

Even if you manage to tame your ORM, you usually end up with a needlessly complex schema format, as the inflexibility of working with stored data collides with the needs of a flexible in-memory format. You might find that you wind up with a SQLite table with 20 columns, merely because your class has 20 variables – even if none of those 20 columns are ever used for querying, and in fact are just wasted space.

This partially explains the attraction of NoSQL databases, but I believe that even without rigid schemas, this problem of the “ORM mindset” remains. Mongoose is a good example of this, as it tries to mix JavaScript and MongoDB in a way that you can’t tell where one starts and the other ends. Invariably, though, this leads developers to hope that their in-memory format can exactly match their database format, which leads to irreconcilable situations (such as classes with behavior) or slowdowns (such as over-fetching or over-storing).

All of this is pretty abstract, so let me take some concrete examples from a recent app I wrote, Pokedex.org, and how I carefully modeled my database structure to maximize performance. (If you’re unfamiliar with Pokedex.org, you may want to read the introductory blog post.)

Case study: Pokedex.org

The first consideration I had to make for Pokedex.org was which database to use in the first place. Without going into the details of browser databases, I ended up choosing two:

  • LocalForage, because it has a simple key-value API that’s good for storing application state.
  • PouchDB, because it has good APIs for working with larger datasets, and can serve as an offline-first layer in front of Cloudant or CouchDB.

PouchDB can also store key-value data, so I might have used it for both. However, another benefit of LocalForage is that the bundle size is much smaller (8KB vs PouchDB’s 45KB). And in my case I had three JavaScript bundles (one for the service worker, one for the web worker, and one for the main JavaScript app), so I didn’t want to push 45KB down the wire three times. Hence I chose LocalForage for the simple stuff.

Pokedex.org database usage

You can see what kind of data I stored in LocalForage if you go into the Chrome Dev Tools on Pokedex.org and open the “Resources” tab. You’ll see I’m using it to store the ServiceWorker data version (so it knows when to update), as well as "informedOffline", which just tells me whether I’ve already shown the dialog that says, “Hey, this app works offline.” If I had more app data to store (such as the user’s favorite Pokémon, or how many times they’ve opened the app), I might store that in LocalForage.

PouchDB, however, is responsible for storing the majority of the Pokémon data – i.e. the 649 species of monsters, their stats, and their moves. So this is much more interesting.

First off, you’ll notice that as you type into the search bar, you immediately get a filtered list showing Pokémon that match your search string. This is a simple prefix search, so if you type “bu” you will see “Bulbasaur” and “Butterfree” amongst others.

 

This search bar is super fast, and it ought to be, because it’s supposed to respond to user input. There’s a debounce on the actual <input> handler, but in principle every keystroke represents a database query, meaning that there’s a lot of data flying back and forth.

I considered using PouchDB for this, but I decided it would be too slow. PouchDB does offer built-in prefix search, but I don’t want to have to go back and forth to IndexedDB (i.e. disk) for every keystroke. So instead, I wrote a simple in-memory database layer that stores Pokémon summary data, i.e. only the things that are necessary to show in the list, which happens to be their name, number, and types. (The sprite comes from a CSS class based on their number.)

To perform the search itself, I just used a sorted array of String names, with a binary search to ensure that lookups take O(log n) time. If the list were larger, I might try to condense it as a trie, but I figured that would be overkill for this app.
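
A sketch of that approach (the name list is abbreviated, and the real implementation details may differ):

var names = ['bulbasaur', 'butterfree', 'charmander' /* ... */] // sorted!

function firstIndexOfPrefix(prefix) {
  // binary search for the first name >= prefix: O(log n)
  var low = 0
  var high = names.length
  while (low < high) {
    var mid = (low + high) >> 1
    if (names[mid] < prefix) {
      low = mid + 1
    } else {
      high = mid
    }
  }
  return low
}

function prefixSearch(prefix) {
  var results = []
  for (var i = firstIndexOfPrefix(prefix); i < names.length; i++) {
    if (names[i].indexOf(prefix) !== 0) break // no longer matches the prefix
    results.push(names[i])
  }
  return results
}

prefixSearch('bu') // ['bulbasaur', 'butterfree']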

For a small amount of data, this in-memory strategy works great. However, when you click on a Pokémon, it brings up a detail page with stats, evolutions, and moves, which is much too large to keep in memory. So for this, I used PouchDB.

 

Given that I am the primary author of PouchDB map/reduce, relational-pouch, and pouchdb-find, you may be surprised to learn that I didn’t use any of them for this task. Obviously I put a lot of care into those libraries, and I do think they’re useful for beginners who are unsure how to structure their data. But from a performance standpoint, none of them can beat the potential gains from rolling your own, so that’s what I did.

In this case, I used my knowledge of IndexedDB performance subtleties to get the maximum possible throughput in the shortest amount of time. Essentially, what I did was split up my data into seven separate PouchDB databases, representing seven different IndexedDB databases on disk:

  • Monster basic data
  • Monster descriptions
  • Monster evolutions
  • Monster supplemental data (anything not covered above)
  • Types
  • Monster moves
  • Moves

The first four all use IDs based on the number of the Pokémon (e.g. Bulbasaur is 1, Ivysaur is 2, etc.), and map to data such as evolutions, stats, and descriptions. This means that tapping on a Pokémon involves a simple key-value lookup.

The reason I segmented this data into multiple databases is that IndexedDB happens to do a lot of transaction-level blocking at the database level. If you have the luxury of specifying separate IndexedDB objectStores, you can allow your database queries to run in parallel under the hood, but in the case of PouchDB all of the objectStores are predefined (due to the CouchDB-style revision semantics written on top of IndexedDB).

In practice, this usually means that read/write operations (such as the initial import of the data) will run sequentially unless you use separate PouchDB objects. Sequential is bad – we want the database to do as much work as quickly as possible – so I avoided using one large PouchDB database. (If you were using a lower-level library like Dexie, though, you could use a single database with separate objectStores and probably get a similar result.)

So when you tap on a Pokémon, the app fires off six concurrent get() requests, which the underlying IndexedDB layer is free to run in parallel. This is why you barely have to wait at all to see the Pokémon data, although it helps that I have a snazzy animation while the lookup is in progress. (Animations are a great way to mask slow operations!) The query is also run in a web worker, which is why you won’t see any UI blocking from IndexedDB during database interactions.
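
In code, the parallel lookup is just a Promise.all() over the separate databases (the names here are hypothetical):

var basicDB = new PouchDB('monsters-basic')
var descriptionsDB = new PouchDB('monsters-descriptions')
var evolutionsDB = new PouchDB('monsters-evolutions')
// ... and so on for the other databases

function getMonster(id) {
  // fire off all the get() requests at once; IndexedDB can run them in
  // parallel because each PouchDB object maps to a separate database
  return Promise.all([
    basicDB.get(id),
    descriptionsDB.get(id),
    evolutionsDB.get(id)
  ]).then(function (results) {
    return {
      basic: results[0],
      description: results[1],
      evolutions: results[2]
    }
  })
}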

Pokémon's type(s) determine its strengths/weaknesses

A Pokémon’s type(s) determine its strengths/weaknesses relative to other types

Now, two of the six requests I described above are for a Pokémon’s “type” information, which merit some explanation. Each Pokémon has up to two types (e.g. Fire and Water), and types also have strengths and weaknesses relative to each other: Fire beats Grass, Water beats Fire, etc. The “types” database contains this big rock-paper-scissors grid, which isn’t keyed by Pokémon ID like the other four, but rather by type name.

However, since the type names of each Pokémon are already available in-memory (due to the summary view), the queries for a Pokémon’s strengths and weaknesses can be fired off in parallel with the other queries. And since they’re equally simple get() requests, they take about the same amount of time to complete. This was a nice side effect of my previous in-memory optimizations.

The last two databases are a bit trickier than the others, and are quite relation-y. I called these the “monster moves” and “moves” databases, and I modeled their implementation after relational-pouch (although I didn’t feel the need to use relational-pouch itself).

 

Essentially, the “monster moves” database contains a mapping from monster IDs to a list of learned moves (e.g. Bulbasaur learns Razor Leaf at level 27), while the “moves” database contains a mapping from move IDs to information about the move (e.g. Razor Leaf has a certain power, accuracy, and description). If you’re familiar with SQL, you might recognize that I would need a JOIN to combine this data together, although in my case I just did the join explicitly myself, in JavaScript.
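
A sketch of that manual join (the field names are invented for illustration):

function getMovesForMonster(monsterId) {
  return monsterMovesDB.get(monsterId).then(function (monsterMoves) {
    // monsterMoves.moves: e.g. [{moveId: 'razor-leaf', level: 27}, ...]
    return Promise.all(monsterMoves.moves.map(function (learned) {
      return movesDB.get(learned.moveId).then(function (move) {
        // combine both sides of the relation in JavaScript
        return {level: learned.level, move: move}
      })
    }))
  })
}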

Since this is a many-to-many relationship (Pokémon can learn many moves, and moves can be learned by many Pokémon), it would be prohibitively redundant to include the “move” data inside the “monster move” database – that’s why I split it apart. However, the relational query (i.e. the JOIN) has a cost, and I saw it while developing my app – it takes nearly twice as long to fetch the full “moves” data (75ms on a Nexus 5X) as it does to fetch the more basic data (40ms – these numbers are much larger on a slow device). So what to do?

Well, I pulled off a sleight-of-hand. You’ll notice that, especially in a mobile browser, the list of Pokémon moves is “below the fold.” Thus, I can simply load the above-the-fold data first, and then lazily fetch the rest before the user has scrolled down. On a fast mobile browser, you probably won’t even notice that anything was fetched in two stages, although on a huge monitor you might be able to glimpse it. (I considered adding a loading spinner, but the “moves” data was already fast enough that I felt it was unnecessary.)

So there you have it: the queries that ought to feel “instant” are done in memory, the queries that take a bit longer are fetched in parallel (with an animation to keep the eye busy), and the queries that are super slow are slipped in below-the-fold. This is a subtle ballet with lots of carefully orchestrated movements, and the end result is an app that feels pretty seamless, despite all the work going on behind the scenes.

Conclusion

When you’re working with databases, it’s worthwhile to understand the APIs you’re dealing with, and what they’re doing under the hood. Unfortunately, databases are not magic, and there’s no abstraction in the world (I believe) that can obviate the need to learn at least a little bit about how a database works.

So when you’re using a database, be sure to ask yourself the following questions:

  1. Is this database in-memory (Redis, LokiJS, MemDOWN, etc.) or on-disk (PouchDB, LocalForage, Lovefield, etc.)? Is it a mix between the two (e.g. LevelDB)?
  2. What needs to be stored on disk? What data should survive the application being closed or crashing?
  3. What needs to be indexed in order to perform fast queries? Can I use an in-memory index instead of going to disk?
  4. How should I structure my in-memory data relative to my database data? What’s my strategy for mapping between the two?
  5. What are the query needs of my app? Does a summary view really need to fetch the full data, or can it just fetch the little bit it needs? Can I lazy-load anything?

Once you’ve answered these questions, you can write an app that is fast, responsive, and doesn’t lose user data. You’ll also make your own job easier as a programmer, if you try to maintain a good grasp of the differences between your in-memory data (your counter space) and your on-disk data (your freezer).

Nobody likes freezer burn, but spoiled meat that’s been left on the counter overnight is even worse. Understand the difference between the two, and you’ll be a master chef in no time.

Notes

Of course there are more advanced topics I could have covered here, such as indexes, sync, caching, B-trees, and on and on. (We could even extend the metaphor to talk about “tagging” food in the freezer as an analogy for indexing!) But I wanted to keep this blog post small and focused, and just communicate the bare basics of the common mistakes I’ve seen people make with databases.

I also apologize for all the abstruse IndexedDB tricks – those probably merit their own blog post. In particular, I need to provide some experimental data to back up my claim that it’s better to break up a single IndexedDB database into multiple smaller ones. This trick is based on my personal experience with IndexedDB, where I noticed a high cost of fetching and storing large monolithic documents, but I should do a more formal study to confirm it.

Thanks to Nick Colley, Chris Gullian, Jan Lehnardt, and Garren Smith for feedback on a draft of this blog post.