
Let’s learn how modern JavaScript frameworks work by building one

Hand-drawn looking JavaScript logo saying DIY JS

In my day job, I work on a JavaScript framework (LWC). And although I’ve been working on it for almost three years, I still feel like a dilettante. When I read about what’s going on in the larger framework world, I often feel overwhelmed by all the things I don’t know.

One of the best ways to learn how something works, though, is to build it yourself. And plus, we gotta keep those “days since last JavaScript framework” memes going. So let’s write our own modern JavaScript framework!

What is a “modern JavaScript framework”?

React is a great framework, and I’m not here to dunk on it. But for the purposes of this post, “modern JavaScript framework” means “a framework from the post-React era” – i.e. Lit, Solid, Svelte, Vue, etc.

React has dominated the frontend landscape for so long that every newer framework has grown up in its shadow. These frameworks were all heavily inspired by React, but they’ve evolved away from it in surprisingly similar ways. And although React itself has continued innovating, I find that the post-React frameworks are more similar to each other than to React nowadays.

To keep things simple, I’m also going to avoid talking about server-first frameworks like Astro, Marko, and Qwik. These frameworks are excellent in their own way, but they come from a slightly different intellectual tradition compared to the client-focused frameworks. So for this post, let’s only talk about client-side rendering.

What sets modern frameworks apart?

From my perspective, the post-React frameworks have all converged on the same foundational ideas:

  1. Using reactivity (e.g. signals) for DOM updates.
  2. Using cloned templates for DOM rendering.
  3. Using modern web APIs like <template> and Proxy, which make all of the above easier.

Now to be clear, these frameworks differ a lot at the micro level, and in how they handle things like web components, compilation, and user-facing APIs. Not all frameworks even use Proxies. But broadly speaking, most framework authors seem to agree on the above ideas, or they’re moving in that direction.

So for our own framework, let’s try to do the bare minimum to implement these ideas, starting with reactivity.

Reactivity

It’s often said that “React is not reactive”. What this means is that React has a pull-based rather than a push-based model. To grossly oversimplify things: in the worst case, React assumes that your entire virtual DOM tree needs to be rebuilt from scratch, and the only way to prevent these updates is to implement React.memo (or in the old days, shouldComponentUpdate).

Using a virtual DOM mitigates some of the cost of the “blow everything away and start from scratch” strategy, but it doesn’t fully solve it. And asking developers to write the correct memo code is a losing battle. (See React Forget for an ongoing attempt to solve this.)

Instead, modern frameworks use a push-based reactive model. In this model, individual parts of the component tree subscribe to state updates and only update the DOM when the relevant state changes. This prioritizes a “performant by default” design in exchange for some upfront bookkeeping cost (especially in terms of memory) to keep track of which parts of the state are tied to which parts of the UI.
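
To make “push-based” concrete, here’s a minimal sketch. This is not any particular framework’s API – setColor and the subscribers set are invented for illustration – but it shows state notifying its subscribers directly:

let color = 'blue'
const div = document.querySelector('div')

const subscribers = new Set()

subscribers.add((value) => {
  // only the DOM that depends on `color` is touched
  div.setAttribute('class', value)
})

function setColor(value) {
  color = value
  subscribers.forEach((notify) => notify(value)) // push the update out
}

setColor('red') // updates the one <div>; nothing else re-renders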

Note that this technique is not necessarily incompatible with the virtual DOM approach: tools like Preact Signals and Million show that you can have a hybrid system. This is useful if your goal is to keep your existing virtual DOM framework (e.g. React) but to selectively apply the push-based model for more performance-sensitive scenarios.

For this post, I’m not going to rehash the details of signals themselves, or subtler topics like fine-grained reactivity, but I am going to assume that we’ll use a reactive system.

Note: there are lots of nuances when talking about what qualifies as “reactive.” My goal here is to contrast React with the post-React frameworks, especially Solid, Svelte v5 in “runes” mode, and Vue Vapor.

Cloning DOM trees

For a long time, the collective wisdom in JavaScript frameworks was that the fastest way to render the DOM is to create and mount each DOM node individually. In other words, you use APIs like createElement, setAttribute, and textContent to build the DOM piece-by-piece:

const div = document.createElement('div')
div.setAttribute('class', 'blue')
div.textContent = 'Blue!'

One alternative is to just shove a big ol’ HTML string into innerHTML and let the browser parse it for you:

const container = document.createElement('div')
container.innerHTML = `
  <div class="blue">Blue!</div>
`

This naïve approach has a big downside: if there is any dynamic content in your HTML (for instance, red instead of blue), then you would need to parse HTML strings over and over again. Plus, you are blowing away the DOM with every update, which would reset state such as the value of <input>s.

Note: using innerHTML also has security implications. But for the purposes of this post, let’s assume that the HTML content is trusted.[1]

At some point, though, folks figured out that parsing the HTML once and then calling cloneNode(true) on the whole thing is pretty danged fast:

const template = document.createElement('template')
template.innerHTML = `
  <div class="blue">Blue!</div>
`
template.content.cloneNode(true) // this is fast!

Here I’m using a <template> tag, which has the advantage of creating “inert” DOM. In other words, things like <img> or <video autoplay> don’t automatically start downloading anything.

How fast is this compared to manual DOM APIs? To demonstrate, here’s a small benchmark. Tachometer reports that the cloning technique is about 50% faster in Chrome, 15% faster in Firefox, and 10% faster in Safari. (This will vary based on DOM size and number of iterations, but you get the gist.)
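
If you want a crude way to feel the difference yourself, here is a console.time sketch (not the actual Tachometer benchmark, and micro-benchmarks like this are notoriously noisy):

const template = document.createElement('template')
template.innerHTML = '<div class="blue">Blue!</div>'

console.time('clone')
for (let i = 0; i < 100000; i++) {
  template.content.cloneNode(true) // clone the pre-parsed DOM
}
console.timeEnd('clone')

console.time('manual')
for (let i = 0; i < 100000; i++) {
  const div = document.createElement('div') // build piece-by-piece
  div.setAttribute('class', 'blue')
  div.textContent = 'Blue!'
}
console.timeEnd('manual')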

What’s interesting is that <template> is a new-ish browser API, not available in IE11, and originally designed for web components. Somewhat ironically, this technique is now used in a variety of JavaScript frameworks, regardless of whether they use web components or not.

Note: for reference, here is the use of cloneNode on <template>s in Solid, Vue Vapor, and Svelte v5.

There is one major challenge with this technique, which is how to efficiently update dynamic content without blowing away DOM state. We’ll cover this later when we build our toy framework.

Modern JavaScript APIs

We’ve already encountered one new API that helps a lot, which is <template>. Another one that’s steadily gaining traction is Proxy, which can make building a reactivity system much simpler.

When we build our toy example, we’ll also use tagged template literals to create an API like this:

const dom = html`
  <div>Hello ${ name }!</div>
`

Not all frameworks use this tool, but notable ones include Lit, HyperHTML, and ArrowJS. Tagged template literals can make it much simpler to build ergonomic HTML templating APIs without needing a compiler.

Step 1: building reactivity

Reactivity is the foundation upon which we'll build the rest of the framework. Reactivity will define how state is managed, and how the DOM updates when state changes.

Let's start with some "dream code" to illustrate what we want:

const state = {}

state.a = 1
state.b = 2

createEffect(() => {
  state.sum = state.a + state.b
})

Basically, we want a “magic object” called state, with two props: a and b. And whenever those props change, we want to set sum to be the sum of the two.

Assuming we don’t know the props in advance (or have a compiler to determine them), a plain object will not suffice for this. So let’s use a Proxy, which can react whenever a new value is set:

const state = new Proxy({}, {
  get(obj, prop) {
    onGet(prop)
    return obj[prop]
  },
  set(obj, prop, value) {
    obj[prop] = value
    onSet(prop, value)
    return true
  }
})

Right now, our Proxy doesn’t do anything interesting, except give us some onGet and onSet hooks. So let’s make it flush updates after a microtask:

let queued = false

function onSet(prop, value) {
  if (!queued) {
    queued = true
    queueMicrotask(() => {
      queued = false
      flush()
    })
  }
}

Note: if you’re not familiar with queueMicrotask, it’s a newer DOM API that’s basically the same as Promise.resolve().then(...), but with less typing.
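
In other words, these two lines queue the same kind of task:

queueMicrotask(() => console.log('microtask!'))
Promise.resolve().then(() => console.log('microtask!'))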

Why flush updates? Mostly because we don’t want to run too many computations. If we update whenever both a and b change, then we’ll uselessly compute the sum twice. By coalescing the flush into a single microtask, we can be much more efficient.

Next, let’s make flush update the sum:

function flush() {
  state.sum = state.a + state.b
}

This is great, but it’s not yet our “dream code.” We’ll need to implement createEffect so that the sum is computed only when a and b change (and not when something else changes!).

To do this, let’s use an object to keep track of which effects need to be run for which props:

const propsToEffects = {}

Next comes the crucial part! We need to make sure that our effects can subscribe to the right props. To do so, we’ll run the effect, note any get calls it makes, and create a mapping between the prop and the effect.

To break it down, remember our “dream code” is:

createEffect(() => {
  state.sum = state.a + state.b
})

When this function runs, it calls two getters: state.a and state.b. These getters should trigger the reactive system to notice that the function relies on the two props.

To make this happen, we’ll start with a simple global to keep track of what the “current” effect is:

let currentEffect

Then, the createEffect function will set this global before calling the function:

function createEffect(effect) {
  currentEffect = effect
  effect()
  currentEffect = undefined
}

The important thing here is that the effect is immediately invoked, with the global currentEffect being set in advance. This is how we can track whatever getters it might be calling.

Now, we can implement the onGet in our Proxy, which will set up the mapping between the global currentEffect and the property:

function onGet(prop) {
  if (currentEffect) { // ignore gets that happen outside of an effect
    const effects = propsToEffects[prop] ??
        (propsToEffects[prop] = [])
    effects.push(currentEffect)
  }
}

After this runs once, propsToEffects should look like this:

{
  "a": [theEffect],
  "b": [theEffect]
}

…where theEffect is the “sum” function we want to run.

Next, our onSet should add any effects that need to be run to a dirtyEffects array:

const dirtyEffects = []

function onSet(prop, value) {
  if (propsToEffects[prop]) {
    dirtyEffects.push(...propsToEffects[prop])
    // ...
  }
}

At this point, we have all the pieces in place for flush to call all the dirtyEffects:

function flush() {
  while (dirtyEffects.length) {
    dirtyEffects.shift()()
  }
}

Putting it all together, we now have a fully functional reactivity system! You can play around with it yourself and try setting state.a and state.b in the DevTools console – the state.sum will update whenever either one changes.

Now, there are plenty of advanced cases that we’re not covering here (a couple of them are sketched just after this list):

  1. Using try/catch in case an effect throws an error
  2. Avoiding running the same effect twice
  3. Preventing infinite cycles
  4. Subscribing effects to new props on subsequent runs (e.g. if certain getters are only called in an if block)
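
For a flavor of what handling cases 1 and 4 might look like, here is a hedged sketch. (removeFromAllProps is a hypothetical helper that would delete the effect from every propsToEffects entry.)

function createEffect(effect) {
  const run = () => {
    // re-subscribe from scratch, so that getters hidden behind `if`
    // blocks get picked up on later runs (case 4)
    removeFromAllProps(run) // hypothetical cleanup helper
    currentEffect = run
    try {
      effect()
    } finally {
      // even if the effect throws, don't leave currentEffect dangling (case 1)
      currentEffect = undefined
    }
  }
  run()
}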

However, this is more than enough for our toy example. Let’s move on to DOM rendering.

Step 2: DOM rendering

We now have a functional reactivity system, but it’s essentially “headless.” It can track changes and compute effects, but that’s about it.

At some point, though, our JavaScript framework needs to actually render some DOM to the screen. (That’s kind of the whole point.)

For this section, let’s forget about reactivity for a moment and imagine we’re just trying to build a function that can 1) build a DOM tree, and 2) update it efficiently.

Once again, let’s start off with some dream code:

function render(state) {
  return html`
    <div class="${state.color}">${state.text}</div>
  `
}

As I mentioned, I’m using tagged template literals, à la Lit, because I found them to be a nice way to write HTML templates without needing a compiler. (We’ll see in a moment why we might actually want a compiler instead.)

We’re re-using our state object from before, this time with a color and text property. Maybe the state is something like:

state.color = 'blue'
state.text = 'Blue!'

When we pass this state into render, it should return the DOM tree with the state applied:

<div class="blue">Blue!</div>

Before we go any further, though, we need a quick primer on tagged template literals. Our html tag is just a function that receives two arguments: the tokens (array of static HTML strings) and expressions (the evaluated dynamic expressions):

function html(tokens, ...expressions) {
}

In this case, the tokens are (whitespace removed):

[
  "<div class=\"",
  "\">",
  "</div>"
]

And the expressions are:

[
  "blue",
  "Blue!"
]

The tokens array will always be exactly 1 longer than the expressions array, so we can trivially zip them up together:

const allTokens = tokens
    .map((token, i) => (expressions[i - 1] ?? '') + token)

This will give us an array of strings:

[
  "<div class=\"",
  "blue\">",
  "Blue!</div>"
]

We can join these strings together to make our HTML:

const htmlString = allTokens.join('')

And then we can use innerHTML to parse it into a <template>:

function parseTemplate(htmlString) {
  const template = document.createElement('template')
  template.innerHTML = htmlString
  return template
}

This template contains our inert DOM (technically a DocumentFragment), which we can clone at will:

const cloned = template.content.cloneNode(true)

Of course, parsing the full HTML whenever the html function is called would not be great for performance. Luckily, tagged template literals have a built-in feature that will help out a lot here.

For every unique usage of a tagged template literal, the tokens array is always the same whenever the function is called – in fact, it’s the exact same object!

For example, consider this case:

function sayHello(name) {
  return html`<div>Hello ${name}</div>`
}

Whenever sayHello is called, the tokens array will always be identical:

[
  "<div>Hello ",
  "</div>"
]

The only time tokens will be different is for completely different locations of the tagged template:

html`<div></div>`
html`<span></span>` // Different from above
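
You can verify this yourself with a tag function that just returns its tokens array:

const tag = (tokens) => tokens

const getTokens = (name) => tag`Hello ${name}`

console.log(getTokens('foo') === getTokens('bar')) // true – same call site
console.log(tag`<div></div>` === tag`<div></div>`) // false – two call sites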

We can use this to our advantage by using a WeakMap to keep a mapping of the tokens array to the resulting template:

const tokensToTemplate = new WeakMap()

function html(tokens, ...expressions) {
  let template = tokensToTemplate.get(tokens)
  if (!template) {
    // ...
    template = parseTemplate(htmlString)
    tokensToTemplate.set(tokens, template)
  }
  return template
}

This is kind of a mind-blowing concept, but the uniqueness of the tokens array essentially means that we can ensure that each call to html`...` only parses the HTML once.

Next, we just need a way to update the cloned DOM node with the expressions array (which is likely to be different every time, unlike tokens).

To keep things simple, let’s just replace the expressions array with a placeholder for each index:

const stubs = expressions.map((_, i) => `__stub-${i}__`)

If we zip this up like before, it will create this HTML:

<div class="__stub-0__">
  __stub-1__
</div>

We can write a simple string replacement function to replace the stubs:

function replaceStubs (string) {
  return string.replaceAll(/__stub-(\d+)__/g, (_, i) => (
    expressions[i]
  ))
}

And now whenever the html function is called, we can clone the template and update the placeholders:

const element = cloned.firstElementChild
for (const { name, value } of element.attributes) {
  element.setAttribute(name, replaceStubs(value))
}
element.textContent = replaceStubs(element.textContent)

Note: we are using firstElementChild to grab the first top-level element in the template. For our toy framework, we’re assuming there’s only one.

Now, this is still not terribly efficient – notably, we are updating textContent and attributes that don’t necessarily need to be updated. But for our toy framework, this is good enough.

We can test it out by rendering with different state:

document.body.appendChild(render({ color: 'blue', text: 'Blue!' }))
document.body.appendChild(render({ color: 'red', text: 'Red!' }))

This works!
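
For reference, here is one way to stitch all of the above pieces together into a single html tag function. This is a toy sketch that assumes a single top-level element and reuses the parseTemplate helper from earlier:

const tokensToTemplate = new WeakMap()

function html(tokens, ...expressions) {
  // parse once per unique tagged template, caching by the tokens array
  let template = tokensToTemplate.get(tokens)
  if (!template) {
    const stubs = expressions.map((_, i) => `__stub-${i}__`)
    const allTokens = tokens.map((token, i) => (stubs[i - 1] ?? '') + token)
    template = parseTemplate(allTokens.join(''))
    tokensToTemplate.set(tokens, template)
  }

  // clone the cached template, then swap the stubs for live expressions
  const cloned = template.content.cloneNode(true)
  const element = cloned.firstElementChild
  const replaceStubs = (string) => (
    string.replaceAll(/__stub-(\d+)__/g, (_, i) => expressions[i])
  )
  for (const { name, value } of element.attributes) {
    element.setAttribute(name, replaceStubs(value))
  }
  element.textContent = replaceStubs(element.textContent)
  return element
}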

Step 3: combining reactivity and DOM rendering

Since we already have a createEffect from the rendering system above, we can now combine the two to update the DOM based on the state:

const container = document.getElementById('container')

createEffect(() => {
  const dom = render(state)
  if (container.firstElementChild) {
    container.firstElementChild.replaceWith(dom)
  } else {
    container.appendChild(dom)
  }
})

This actually works! We can combine this with the “sum” example from the reactivity section by merely creating another effect to set the text:

createEffect(() => {
  state.text = `Sum is: ${state.sum}`
})

This renders “Sum is: 3”.

You can play around with this toy example. If you set state.a = 5, then the text will automatically update to say “Sum is: 7.”

Next steps

There are lots of improvements we could make to this system, especially the DOM rendering bit.

Most notably, we are missing a way to update content for elements inside a deep DOM tree, e.g.:

<div class="${color}">
  <span>${text}</span>
</div>

For this, we would need a way to uniquely identify every element inside of the template. There are lots of ways to do this:

  1. Lit, when parsing HTML, uses a system of regexes and character matching to determine whether a placeholder is within an attribute or text content, plus the index of the target element (in depth-first TreeWalker order).
  2. Frameworks like Svelte and Solid have the luxury of parsing the entire HTML template during compilation, which provides the same information. They also generate code that calls firstChild and nextSibling to traverse the DOM to find the element to update.

Note: traversing with firstChild and nextSibling is similar to the TreeWalker approach, but more efficient than element.children. This is because browsers use linked lists under the hood to represent the DOM.

Whether we decide to do Lit-style client-side parsing or Svelte/Solid-style compile-time parsing, what we want is some kind of mapping like this:

[
  {
    elementIndex: 0, // <div> above
    attributeName: 'class',
    stubIndex: 0 // index in expressions array
  },
  {
    elementIndex: 1, // <span> above
    textContent: true,
    stubIndex: 1 // index in expressions array
  }
]

These bindings would tell us exactly which elements need to be updated, which attribute (or textContent) needs to be set, and where to find the expression to replace the stub.
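
As a rough sketch of what an update pass over such bindings might look like (applyBindings is an invention for illustration, and querySelectorAll stands in for TreeWalker-order traversal):

function applyBindings(root, bindings, expressions) {
  // stand-in for depth-first TreeWalker traversal of the cloned template
  const elements = root.querySelectorAll('*')
  for (const binding of bindings) {
    const element = elements[binding.elementIndex]
    const value = expressions[binding.stubIndex]
    if (binding.attributeName) {
      element.setAttribute(binding.attributeName, value)
    } else {
      element.textContent = value
    }
  }
}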

The next step would be to avoid cloning the template every time, and to just directly update the DOM based on the expressions. In other words, we not only want to parse once – we want to only clone and set up the bindings once. This would reduce each subsequent update to the bare minimum of setAttribute and textContent calls.

Note: you may wonder what the point of template-cloning is, if we end up needing to call setAttribute and textContent anyway. The answer is that most HTML templates are largely static content with a few dynamic “holes.” By using template-cloning, we clone the vast majority of the DOM, while only doing extra work for the “holes.” This is the key insight that makes this system work so well.

Another interesting pattern to implement would be iterations (or repeaters), which come with their own set of challenges, like reconciling lists between updates and handling “keys” for efficient replacement.

I’m tired, though, and this blog post has gone on long enough. So I leave the rest as an exercise to the reader!

Conclusion

So there you have it. In the span of one (lengthy) blog post, we’ve implemented our very own JavaScript framework. Feel free to use this as the foundation for your brand-new JavaScript framework, to release to the world and enrage the Hacker News crowd.

Personally I found this project very educational, which is partly why I did it in the first place. I was also looking to replace the current framework for my emoji picker component with a smaller, more custom-built solution. In the process, I managed to write a tiny framework that passes all the existing tests and is ~6kB smaller than the current implementation, which I’m pretty proud of.

In the future, I think it would be neat if browser APIs were full-featured enough to make it even easier to build a custom framework. For example, the DOM Part API proposal would take out a lot of the drudgery of the DOM parsing-and-replacement system we built above, while also opening the door to potential browser performance optimizations. I could also imagine (with some wild gesticulation) that an extension to Proxy could make it easier to build a full reactivity system without worrying about details like flushing, batching, or cycle detection.

If all those things were in place, then you could imagine effectively having a “Lit in the browser,” or at least a way to quickly build your own “Lit in the browser.” In the meantime, I hope that this small exercise helped to illustrate some of the things framework authors think about, and some of the machinery under the hood of your favorite JavaScript framework.

Thanks to Pierre-Marie Dartus for feedback on a draft of this post.

Footnotes

1. Now that we’ve built the framework, you can see why the content passed to innerHTML can be considered trusted. All HTML tokens either come from tagged template literals (in which case they’re fully static and authored by the developer) or are placeholders (which are also written by the developer). User content is only set using setAttribute or textContent, which means that no HTML sanitization is required to avoid XSS attacks. Although you should probably just use CSP anyway!

A tour of JavaScript timers on the web

Pop quiz: what is the difference between these JavaScript timers?

  • Promises
  • setTimeout
  • setInterval
  • setImmediate
  • requestAnimationFrame
  • requestIdleCallback

More specifically, if you queue up all of these timers at once, do you have any idea which order they’ll fire in?

If not, you’re probably not alone. I’ve been doing JavaScript and web programming for years, I’ve worked for a browser vendor for two of those years, and it’s only recently that I really came to understand all these timers and how they play together.

In this post, I’m going to give a high-level overview of how these timers work, and when you might want to use them. I’ll also cover the Lodash functions debounce() and throttle(), because I find them useful as well.

Promises and microtasks

Let’s get this one out of the way first, because it’s probably the simplest. A Promise callback is also called a “microtask,” and it runs at the same frequency as MutationObserver callbacks. Assuming queueMicrotask() ever makes it out of spec-land and into browser-land, it will also be the same thing.

I’ve already written a lot about promises. One common misconception worth covering, though, is the idea that promises give the browser a chance to breathe. Just because you’re queuing up an asynchronous callback, that doesn’t mean that the browser can render, or process input, or do any of the stuff we want browsers to do.

For example, let’s say we have a function that blocks the main thread for 1 second:

function block() {
  var start = Date.now()
  while (Date.now() - start < 1000) { /* wheee */ }
}

If we were to queue up a bunch of microtasks to call this function:

for (var i = 0; i < 100; i++) {
  Promise.resolve().then(block)
}

This would block the browser for about 100 seconds. It’s basically the same as if we had done:

for (var i = 0; i < 100; i++) {
  block()
}

Microtasks execute immediately after any synchronous execution is complete. There’s no chance to fit in any work between the two. So if you think you can break up a long-running task by separating it into microtasks, then it won’t do what you think it’s doing.
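
If you do want to break up a long-running task, you need a macrotask (or one of the rendering-aware timers below), so the browser can squeeze in work between chunks. Here’s a hedged sketch, where processChunk stands in for your own expensive work:

function yieldToBrowser() {
  return new Promise((resolve) => setTimeout(resolve, 0))
}

async function doChunkedWork(chunks) {
  for (const chunk of chunks) {
    processChunk(chunk)    // your own expensive work
    await yieldToBrowser() // browser can render and handle input here
  }
}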

setTimeout and setInterval

These two are cousins: setTimeout queues a task to run in x number of milliseconds, whereas setInterval queues a recurring task to run every x milliseconds.

The thing is… browsers don’t really respect that milliseconds thing. You see, historically, web developers have abused setTimeout. A lot. To the point where browsers have had to add mitigations for setTimeout(/* ... */, 0) to avoid locking up the browser’s main thread, because a lot of websites tended to throw around setTimeout(0) like confetti.

This is the reason that a lot of the tricks in crashmybrowser.com don’t work anymore, such as queuing up a setTimeout that calls two more setTimeouts, which call two more setTimeouts, etc. I covered a few of these mitigations from the Edge side of things in “Improving input responsiveness in Microsoft Edge”.

Broadly speaking, a setTimeout(0) doesn’t really run in zero milliseconds. Usually, it runs in 4. Sometimes, it may run in 16 (this is what Edge does when it’s on battery power, for instance). Sometimes it may be clamped to 1 second (e.g., when running in a background tab). These are the sorts of tricks that browsers have had to invent to prevent runaway web pages from chewing up your CPU doing useless setTimeout work.

So that said, setTimeout does allow the browser to run some work before the callback fires (unlike microtasks). But if your goal is to allow input or rendering to run before the callback, setTimeout is usually not the best choice because it only incidentally allows those things to happen. Nowadays, there are better browser APIs that can hook more directly into the browser’s rendering system.

setImmediate

Before moving on to those “better browser APIs,” it’s worth mentioning this thing. setImmediate is, for lack of a better word … weird. If you look it up on caniuse.com, you’ll see that only Microsoft browsers support it. And yet it also exists in Node.js, and has lots of “polyfills” on npm. What the heck is this thing?

setImmediate was originally proposed by Microsoft to get around the problems with setTimeout described above. Basically, setTimeout had been abused, and so the thinking was that we can create a new thing to allow setImmediate(0) to actually be setImmediate(0) and not this funky “clamped to 4ms” thing. You can see some discussion about it from Jason Weber back in 2011.

Unfortunately, setImmediate was only ever adopted by IE and Edge. Part of the reason it’s still in use is that it has a sort of superpower in IE, where it allows input events like keyboard and mouseclicks to “jump the queue” and fire before the setImmediate callback is executed, whereas IE doesn’t have the same magic for setTimeout. (Edge eventually fixed this, as detailed in the previously-mentioned post.)

Also, the fact that setImmediate exists in Node means that a lot of “Node-polyfilled” code is using it in the browser without really knowing what it does. It doesn’t help that the differences between Node’s setImmediate and process.nextTick are very confusing, and even the official Node docs say the names should really be reversed. (For the purposes of this blog post though, I’m going to focus on the browser rather than Node because I’m not a Node expert.)

Bottom line: use setImmediate if you know what you’re doing and you’re trying to optimize input performance for IE. If not, then just don’t bother. (Or only use it in Node.)

requestAnimationFrame

Now we get to the most important setTimeout replacement, a timer that actually hooks into the browser’s rendering loop. By the way, if you don’t know how the browser event loop works, I strongly recommend this talk by Jake Archibald. Go watch it, I’ll wait.

Okay, now that you’re back, requestAnimationFrame basically works like this: it’s sort of like a setTimeout, except instead of waiting for some unpredictable amount of time (4 milliseconds, 16 milliseconds, 1 second, etc.), it executes before the browser’s next style/layout calculation step. Now, as Jake points out in his talk, there is a minor wrinkle in that it actually executes after this step in Safari, IE, and Edge <18, but let's ignore that for now since it's usually not an important detail.

The way I think of requestAnimationFrame is this: whenever I want to do some work that I know is going to modify the browser's style or layout – for instance, changing CSS properties or starting up an animation – I stick it in a requestAnimationFrame (abbreviated to rAF from here on out). This ensures a few things:

  1. I'm less likely to layout thrash, because all of the changes to the DOM are being queued up and coordinated.
  2. My code will naturally adapt to the performance characteristics of the browser. For instance, if it's a low-cost device that is struggling to render some DOM elements, rAF will naturally slow down from the usual 16.7ms intervals (on 60 Hertz screens) and thus it won't bog down the machine in the same way that running a lot of setTimeouts or setIntervals might.

This is why animation libraries that don't rely on CSS transitions or keyframes, such as GreenSock or React Motion, will typically make their changes in a rAF callback. If you're animating an element between opacity: 0 and opacity: 1, there's no sense in queuing up a billion callbacks to animate every possible intermediate state, including opacity: 0.0000001 and opacity: 0.9999999.

Instead, you're better off just using rAF to let the browser tell you how many frames you're able to paint during a given period of time, and calculate the "tween" for that particular frame. That way, slow devices naturally end up with a slower framerate, and faster devices end up with a faster framerate, which wouldn't necessarily be true if you used something like setTimeout, which operates independently of the browser's rendering speed.
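
Here’s a minimal sketch of what such a rAF-driven “tween” might look like (not how GreenSock or React Motion actually implement it):

function animateOpacity(element, duration) {
  var start = performance.now()
  function frame(now) {
    // progress goes from 0 to 1, based on elapsed time rather than frame count
    var progress = Math.min((now - start) / duration, 1)
    element.style.opacity = progress
    if (progress < 1) {
      requestAnimationFrame(frame) // the browser decides when the next frame is
    }
  }
  requestAnimationFrame(frame)
}

animateOpacity(document.querySelector('.fade-in'), 300)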

requestIdleCallback

rAF is probably the most useful timer in the toolkit, but requestIdleCallback is worth talking about as well. The browser support isn't great, but there's a polyfill that works just fine (and it uses rAF under the hood).

In many ways rAF is similar to requestIdleCallback. (I'll abbreviate it to rIC from now on. Starting to sound like a pair of troublemakers from West Side Story, huh? "There go Rick and Raff, up to no good!")

Like rAF, rIC will naturally adapt to the browser's performance characteristics: if the device is under heavy load, rIC may be delayed. The difference is that rIC fires on the browser "idle" state, i.e. when the browser has decided it doesn't have any tasks, microtasks, or input events to process, and you're free to do some work. It also gives you a "deadline" to track how much of your budget you're using, which is a nice feature.
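
For instance, here’s a hedged sketch of draining a queue of non-critical work during idle time (tasks and doNonCriticalWork are hypothetical):

function drainTasks(deadline) {
  // keep working only while the browser says we have budget left
  while (deadline.timeRemaining() > 0 && tasks.length) {
    doNonCriticalWork(tasks.shift())
  }
  if (tasks.length) {
    requestIdleCallback(drainTasks) // more to do – wait for the next idle period
  }
}

requestIdleCallback(drainTasks)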

Dan Abramov has a good talk from JSConf Iceland 2018 where he shows how you might use rIC. In the talk, he has a webapp that calls rIC for every keyboard event while the user is typing, and then it updates the rendered state inside of the callback. This is great because a fast typist can cause many keydown/keyup events to fire very quickly, but you don't necessarily want to update the rendered state of the page for every keypress.

Another good example of this is a “remaining character count” indicator on Twitter or Mastodon. I use rIC for this in Pinafore, because I don't really care if the indicator updates for every single key that I type. If I'm typing quickly, it's better to prioritize input responsiveness so that I don't lose my sense of flow.

Screenshot of Pinafore with some text entered in the text box and a digit counter showing the number of remaining characters

In Pinafore, the little horizontal bar and the “characters remaining” indicator update as you type.

One thing I’ve noticed about rIC, though, is that it’s a little finicky in Chrome. In Firefox it seems to fire whenever I would, intuitively, think that the browser is “idle” and ready to run some code. (Same goes for the polyfill.) In mobile Chrome for Android, though, I’ve noticed that whenever I scroll with touch scrolling, it might delay rIC for several seconds even after I’m done touching the screen and the browser is doing absolutely nothing. (I suspect the issue I’m seeing is this one.)

Update: Alex Russell from the Chrome team informs me that this is a known issue and should be fixed soon!

In any case, rIC is another great tool to add to the tool chest. I tend to think of it this way: use rAF for critical rendering work, use rIC for non-critical work.

debounce and throttle

These two functions aren’t built into the browser, but they’re so useful that they’re worth calling out on their own. If you aren’t familiar with them, there’s a good breakdown in CSS Tricks.

My standard use for debounce is inside of a resize callback. When the user is resizing their browser window, there’s no point in updating the layout for every resize callback, because it fires too frequently. Instead, you can debounce for a few hundred milliseconds, which will ensure that the callback eventually fires once the user is done fiddling with their window size.

throttle, on the other hand, is something I use much more liberally. For instance, a good use case is inside of a scroll event. Once again, it’s usually senseless to try to update the rendered state of the app for every scroll callback, because it fires too frequently (and the frequency can vary from browser to browser and from input method to input method… ugh). Using throttle normalizes this behavior, and ensures that it only fires every x number of milliseconds. You can also tweak Lodash’s throttle (or debounce) function to fire at the start of the delay, at the end, both, or neither.

In contrast, I wouldn’t use debounce for the scrolling scenario, because I don’t want the UI to only update after the user has explicitly stopped scrolling. That can get annoying, or even confusing, because the user might get frustrated and try to keep scrolling in order to update the UI state (e.g. in an infinite-scrolling list). throttle is better in this case, because it doesn’t wait for the scroll event to stop firing.
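
To put both to work, here’s a sketch using Lodash (updateLayout and updateScrollUI are stand-ins for your own code):

import debounce from 'lodash/debounce'
import throttle from 'lodash/throttle'

// re-layout only after the user has stopped resizing for 250ms
window.addEventListener('resize', debounce(updateLayout, 250))

// update scroll-linked UI at most once every 100ms while scrolling
window.addEventListener('scroll', throttle(updateScrollUI, 100))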

throttle is a function I use all over the place for all kinds of user input, and even for some regularly-scheduled tasks like IndexedDB cleanups. It’s extremely useful. Maybe it should just be baked into the browser some day!

Conclusion

So that’s my whirlwind tour of the various timer functions available in the browser, and how you might use them. I probably missed a few, because there are certainly some exotic ones out there (postMessage or lifecycle events, anyone?). But hopefully this at least provides a good overview of how I think about JavaScript timers on the web.

Smaller Lodash bundles with Webpack and Babel

One of the benefits of working with smart people is that you can learn a lot from them through osmosis. As luck would have it, a recent move placed my office next to John-David Dalton‘s, with the perk being that he occasionally wanders into my office to talk about cool stuff he’s working on, like Lodash and ES modules in Node.

Recently we chatted about Lodash and the various plugins for making its bundle size smaller, such as lodash-webpack-plugin and babel-plugin-lodash. I admitted that I had used both projects but only had a fuzzy notion of what they actually did, or why you’d want to use one or the other. Fortunately J.D. set me straight, and so I thought it’d be a good opportunity to take what I’ve learned and turn it into a short blog post.

TL;DR

Use the import times from 'lodash/times' format over import { times } from 'lodash' wherever possible. If you do, then you don’t need the babel-plugin-lodash. Update: or use lodash-es instead.

Be very careful when using lodash-webpack-plugin to check that you’re not omitting any features you actually need, or stuff can break in production.

Avoid Lodash chaining (e.g. _(array).map(...).filter(...).take(...)), since there’s currently no way to reduce its size.

babel-plugin-lodash

The first thing to understand about Lodash is that there are multiple ways you can use the same method, but some of them are more expensive than others:

import { times } from 'lodash'   // 68.81kB  :(
import times from 'lodash/times' //  2.08kB! :)

times(3, () => console.log('whee'))

You can see the difference using something like webpack-bundle-analyzer. Here’s the first version:

Screenshot of lodash.js taking up almost the entire bundle size

Using the import { times } from 'lodash' idiom, it turns out that lodash.js is so big that you can’t even see our tiny index.js! Lodash takes up a full parsed size of 68.81kB. (In the bundle analyzer, hover your mouse over the module to see the size.)

Now here’s the second version (using import times from 'lodash/times'):

Screenshot showing many smaller Lodash modules not taking up so much space

In the second screenshot, Lodash’s total size has shrunk down to 2.08kB. Now we can finally see our index.js!

However, some people prefer the first syntax to the second, especially since it can get more terse the more you import.

Consider:

import { map, filter, times, noop } from 'lodash'

compared to:

import map from 'lodash/map'
import filter from 'lodash/filter'
import times from 'lodash/times'
import noop from 'lodash/noop'

What the babel-plugin-lodash proposes is to automatically rewrite your Lodash imports to use the second pattern rather than the first. So it would rewrite

import { times } from 'lodash'

as

import times from 'lodash/times'

One takeaway from this is that, if you’re already using the import times from 'lodash/times' idiom, then you don’t need babel-plugin-lodash.

Update: apparently if you use the lodash-es package, then you also don’t need the Babel plugin. It may also have better tree-shaking outputs in Webpack due to setting "sideEffects": false in package.json, which the main lodash package does not do.

lodash-webpack-plugin

What lodash-webpack-plugin does is a bit more complicated. Whereas babel-plugin-lodash focuses on the syntax in your own code, lodash-webpack-plugin changes how Lodash works under the hood to make it smaller.
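
Wiring it up looks something like this (a sketch based on the plugin’s README; double-check the option names for your version):

const LodashModuleReplacementPlugin = require('lodash-webpack-plugin')

module.exports = {
  // ... entry, output, etc. ...
  plugins: [
    // opt back in to only the features you actually use
    new LodashModuleReplacementPlugin({
      shorthands: true,  // iteratee shorthands like map(arr, 'id')
      collections: true  // Object support in forEach(), map(), etc.
    })
  ]
}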

The reason this cuts down your bundle size is that it turns out there are a lot of edge cases and niche functionality that Lodash provides, and if you’re not using those features, they just take up unnecessary space. There’s a full list in the README, but let’s walk through some examples.

Iteratee shorthands

What in the heck is an “iteratee shorthand”? Well, let’s say you want to map() an Array of Objects like so:

import map from 'lodash/map'
map([{id: 'foo'}, {id: 'bar'}], obj => obj.id) // ['foo', 'bar']

In this case, Lodash allows you to use a shorthand:

import map from 'lodash/map'
map([{id: 'foo'}, {id: 'bar'}], 'id') // ['foo', 'bar']

This shorthand syntax is nice to save a few characters, but unfortunately it requires Lodash to use more code under the hood. So lodash-webpack-plugin can just remove this functionality.

For example, let’s say I use the full arrow function instead of the shorthand. Without lodash-webpack-plugin, we get:

Screenshot showing multiple lodash modules under .map

In this case, Lodash takes up 18.59kB total.

Now let’s add lodash-webpack-plugin:

Screenshot of lodash with a very small map.js dependency

And now Lodash is down to 117 bytes! That’s quite the savings.

Collection methods

Another example is “collection methods” for Objects. This means being able to use standard Array methods like forEach() and map() on an Object, in which case Lodash gives you a callback with both the key and the value:

import forEach from 'lodash/forEach'

forEach({foo: 'bar', baz: 'quux'}, (value, key) => {
  console.log(key, value)
  // prints 'foo bar' then 'baz quux'
})

This is handy, but once again it has a cost. Let’s say we’re only using forEach for Arrays:

import forEach from 'lodash/forEach'

forEach(['foo', 'bar'], obj => {
  console.log(obj) // prints 'foo' then 'bar'
})

In this case, Lodash will take up a total of 5.06kB:

Screenshot showing Lodash forEach() taking up quite a few modules

Whereas once we add in lodash-webpack-plugin, Lodash trims down to a svelte 108 bytes:

Screenshot showing a very small Lodash forEach.js module

Chaining

Another common Lodash feature is chaining, which exposes functionality like this:

import _ from 'lodash'
const array = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
console.log(_(array)
  .map(i => parseInt(i, 10))
  .filter(i => i % 2 === 1)
  .take(5)
  .value()
) // prints '[ 1, 3, 5, 7, 9 ]'

Unfortunately there is currently no good way to reduce the size required for chaining. So you’re better off importing the Lodash functions individually:

import map from 'lodash/map'
import filter from 'lodash/filter'
import take from 'lodash/take'
const array = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']

console.log(
  take(
    filter(
      map(array, i => parseInt(i, 10)),
    i => i % 2 === 1),
  5)
) // prints '[ 1, 3, 5, 7, 9 ]'

Using the lodash-webpack-plugin with the chaining option enabled, the first example takes up the full 68.81kB:

Screenshot showing large lodash.js dependency

This makes sense, since we’re still importing all of Lodash for the chaining to work.

Whereas the second example with chaining disabled gives us only 590 bytes:

Screenshot showing a handful of small Lodash modules

The second piece of code is a bit harder to read than the first, but it’s certainly a big savings in file size! Luckily J.D. tells me there may be some work in progress on a plugin that could rewrite the second syntax to look more like the first (similar to babel-plugin-lodash).

Edit: it was brought to my attention in the comments that this functionality should be coming soon to babel-plugin-lodash!

Gotchas

Saving bundle size is great, but lodash-webpack-plugin comes with some caveats. All of these features – shorthands for the iteratee shorthands, collections for the Object collection methods, and others – are disabled by default. Furthermore, they may break or even silently fail if you try to use them when they’re disabled.

This means that if you only use lodash-webpack-plugin in production, you may be in for a rude surprise when you test something in development mode and then find it’s broken in production. In my previous examples, if you use the iteratee shorthand:

map([{id: 'foo'}, {id: 'bar'}], 'id') // ['foo', 'bar']

And if you don’t enable shorthands in lodash-webpack-plugin, then this will actually throw a runtime error:

map.js:16 Uncaught TypeError: iteratee is not a function

In the case of the Object collection methods, it’s more insidious. If you use:

forEach({foo: 'bar', baz: 'quux'}, (value, key) => {
  console.log(key, value)
})

And if you don’t enable collections in lodash-webpack-plugin, then the forEach() method will silently fail. This can lead to some very hard-to-uncover bugs!

Conclusion

The babel-plugin-lodash and lodash-webpack-plugin packages are great. They’re an easy way to reduce your bundle size by a significant amount and with minimal effort.

The lodash-webpack-plugin is particularly useful, since it actually changes how Lodash operates under the hood and can remove functionality that almost nobody uses. Support for edge cases like sparse arrays (guards) and typed arrays (exotics) is unlikely to be something you’ll need.

While the lodash-webpack-plugin is extremely useful, though, it also has some footguns. If you’re only enabling it for production builds, you may be surprised when something works in development but then fails in production. It might also be hard to add to a large existing project, since you’ll have to meticulously audit all your uses of Lodash.

So be sure to carefully read the documentation before installing the lodash-webpack-plugin. And if you’re not sure if you need a certain feature, then you may be better off enabling that feature (or disabling the plugin entirely) and just taking the ~20kB hit.

Note: if you’d like to experiment with this yourself, I put these examples into a small GitHub repo. If you uncomment various bits of code in src/index.js, and enable or disable the Babel and Webpack plugins in .babelrc and webpack.config.js, then you can play around with these examples yourself.

A brief and incomplete history of JavaScript bundlers

Ever since I read Malte Ubl’s proposal for a JavaScript bundle syntax, I’ve been fascinated by the question: does JavaScript need a “bundle” standard?

Unfortunately that question will have to wait for another post, because it’s much more complicated than what I can cover here. But to at least make the first tentative stabs at answering it, I’d like to explore some more basic questions:

  • What is a JavaScript bundler?
  • What purpose do bundlers serve in the modern webdev stack?

To try to answer these questions, I’d like to offer my historical perspective on what are (arguably) the two most important bundlers of the last five years: Browserify and Webpack.


A bundle of bamboo, via Wikipedia

What is a bundle?

Conceptually, a JavaScript bundle is very simple: it’s a collection of multiple scripts, combined into a single file. The original bundler was called +=, i.e. concatenation, and for a long time it was all anyone really needed. The whole point was to avoid the 6-connections-per-origin limit and the built-in overhead of HTTP/1.1 connections by simply jamming all your JavaScript into a single file. Easy-peasy.

Disregarding some interesting but ultimately niche bundlers such as GWT, RequireJS, and Closure Compiler, concatenation was still the most common bundler until very recently. Even fairly modern scaffolding tools like Yeoman were still recommending concatenation as the default bundler well into 2013, using lightweight tools such as usemin.

It was only really when Browserify hit the scene in 2013 that non-concatenation bundlers started to go mainstream.

The rise of Browserify

Interestingly, Browserify wasn’t originally designed to solve the problem of bundling. Instead, it was designed to solve the problem of Node developers who wanted to reuse their code in the browser. (It’s right there in the name: “browser-ify” your Node code!)


Screenshot of the original Browserify homepage from January 2013 (via the Internet Archive)

Before Browserify, if you were writing a JavaScript module that was designed to work in both Node and the browser, you’d have to do something like this:

var MyModule = 'hello world';

if (typeof module !== 'undefined' && module.exports) {
  module.exports = MyModule;
} else {
  (typeof self !== 'undefined' ? self : window).MyModule = MyModule;
}

This works fine for single files, but if you’re accustomed to Node conventions, it becomes aggravating that you can’t do something like this:

var otherModule = require('./otherModule');

Or even:

var otherPackage = require('other-package');

By 2014, npm had already grown to over 50,000 modules, so the idea of reusing those modules within browser code was a compelling proposition. The problem Browserify solved was thus twofold:

  1. Make the CommonJS module system work for the browser (by crawling the dependency tree, reading files, and building a single bundle file).
  2. Make Node built-ins and conventions (process, Buffer, crypto, etc.) work in the browser, by implementing polyfills and shims for them.

This second point is an often-overlooked benefit of the cowpath that Browserify paved. At the time Browserify debuted, many of those 50,000 modules weren’t written with any consideration for how they might run in the browser, and Node-isms like process.nextTick() and setImmediate() ran rampant. For Browserify to “just work,” it had to solve the compatibility problem.

What this involved was a lot of effort to reimplement nearly all of Node’s standard library for the browser, tackling the inevitable issues of cross-browser compatibility along the way. This resulted in some extremely battle-tested libraries such as events, process, buffer, inherits, and crypto, among others.

If you want to understand the ridiculous amount of work that had to go into building all this infrastructure, I recommend taking a look at Calvin Metcalf’s series on implementing crypto for the browser. Or, if you’re too faint of heart, you can instead read about how he helped fix process.nextTick() to work with Sinon or avoid bugs in oldIE’s timer system. (Calvin is truly one of the unsung heroes of JavaScript. Look in your bundle, and odds are you will find his code in there somewhere!)

All of these libraries – buffer, crypto, process, etc. – are still in wide use today via Browserify, as well as other bundlers like Webpack and Rollup. They are the magic behind why new Buffer() and process.nextTick() “just work,” and are a big part of Browserify’s success story.

Enter Webpack

While Browserify was picking up steam, and more and more browser-ready modules were starting to be published to npm, Webpack rose to prominence in 2015, buoyed by the popularity of React and the endorsement of Pete Hunt.

Webpack and Browserify are often seen today as solutions to the same problem, but Webpack’s initial focus was a bit different from Browserify’s. Whereas Browserify’s goal was to make Node modules run in the browser, Webpack’s goal was to create a dependency graph for all of the assets in a website – not just JavaScript, but also CSS, images, SVGs, and even HTML.

The Webpack view of the world, with multiple types of assets all treated as part of the dependency graph

The Webpack view of the world, via “What is Webpack?”

In contrast to Browserify, which was almost dogmatic in its insistence on Node compatibility, Webpack cheerfully broke Node conventions and introduced code like this:

require('./styles.css');

Or even:

var svg = require('svg-url?limit=1024!./file.svg');

Webpack did this for a few different reasons:

  1. Once all of a website’s assets can be expressed as a dependency graph, it becomes easy to define “components” (collections of HTML, CSS, JavaScript, images, etc.) as standalone modules, which can be easily reused and even published to npm.
  2. Using a JavaScript-based module system for assets means that Hot Module Replacement is easy and natural, e.g. a stylesheet can automatically update itself by injection and replacement into the DOM via script.
  3. Ultimately, all of this is configurable using loaders, meaning you can get the benefits of an integrated module system without having to ship a gigantic JavaScript bundle to your users. (Although how well this works in practice is debatable.)

Because Browserify was originally the only game in town, though, Webpack had to undergo its own series of compatibility fixes, so that existing Browserify-targeting modules could work well with Webpack. This wasn’t always easy, as a JavaScript package maintainer of the time might have told you.

Out of this push for greater Webpack-Browserify compatibility grew ad-hoc standards like the node-browser-resolve algorithm, which defines what the "browser" field in package.json is supposed to do. (This field is an extension of npm’s own package.json definition, which specifies how modules should be swapped out when building in “browser mode” vs “Node mode.”)

Closing thoughts

Today, Browserify and Webpack have largely converged in functionality, although Browserify still tends to be preferred by old-school Node developers, whereas Webpack is the tool of choice for frontend web developers. Up-and-comers such as Rollup, splittable, and fuse-box (among many others) are also making the frontend bundler landscape increasingly diverse and interesting.

So that’s my view of the great bundler wars of 2013-2017! Hopefully in a future blog post I’ll be able to cover whether or not bundlers like Browserify and Webpack demonstrate the need for a “standard” to unite them all.

Feel free to weigh in on Twitter or on Mastodon.

How to write a JavaScript package for both Node and the browser

This is an issue that I’ve seen a lot of confusion over, and even seasoned JavaScript developers might have missed some of its subtleties. So I thought it was worth a short tutorial.

Let’s say you have a JavaScript module that you want to publish to npm, available both for Node and for the browser. But there’s a catch! This particular module has a slightly different implementation for the Node version compared to the browser version.

This situation comes up fairly frequently, since there are lots of tiny environment differences between Node and the browser. And it can be tricky to implement correctly, especially if you’re trying to optimize for the smallest possible browser bundle.

Let’s build a JS package

So let’s write a mini JavaScript package, called base64-encode-string. All it does is take a string as input, and it outputs the base64-encoded version.

For the browser, this is easy; we can just use the built-in btoa function:

module.exports = function (string) {
  return btoa(string);
};

In Node, though, there is no btoa function. So we’ll have to create a Buffer instead, and then call buffer.toString() on it:

module.exports = function (string) {
  return Buffer.from(string, 'binary').toString('base64');
};

Both of these should provide the correct base64-encoded version of a string. For instance:

var b64encode = require('base64-encode-string');
b64encode('foo');    // Zm9v
b64encode('foobar'); // Zm9vYmFy

Now we’ll just need some way to detect whether we’re running in the browser or in Node, so we can be sure to use the right version. Both Browserify and Webpack define a process.browser field that is true in the browser, whereas in Node it’s falsy. So we can simply do:

if (process.browser) {
  module.exports = function (string) {
    return btoa(string);
  };
} else {
  module.exports = function (string) {
    return Buffer.from(string, 'binary').toString('base64');
  };
}

Now we just name our file index.js, type npm publish, and we’re done, right? Well, this works, but unfortunately there’s a big performance problem with this implementation.

Since our index.js file contains references to the Node built-in process and Buffer modules, both Browserify and Webpack will automatically include the polyfills for those entire modules in the bundle.

From this simple 9-line module, I calculated that Browserify and Webpack will create a bundle weighing 24.7KB minified (7.6KB min+gz). That’s a lot of bytes for something that, in the browser, only needs to be expressed with btoa!

“browser” field, how I love thee

If you search through the Browserify or Webpack documentation for tips on how to solve this problem, you may eventually discover node-browser-resolve. This is a specification for a "browser" field inside of package.json, which can be used to define modules that should be swapped out when building for the browser.

Using this technique, we can add the following to our package.json:

{
  /* ... */
  "browser": {
    "./index.js": "./browser.js"
  }
}

And then separate the functions into two different files, index.js and browser.js:

// index.js
module.exports = function (string) {
  return Buffer.from(string, 'binary').toString('base64');
};
// browser.js
module.exports = function (string) {
  return btoa(string);
};

After this fix, Browserify and Webpack provide much more reasonable bundles: Browserify’s is 511 bytes minified (315 min+gz), and Webpack’s is 550 bytes minified (297 min+gz).

When we publish our package to npm, anyone running require('base64-encode-string') in Node will get the Node version, and anyone doing the same thing with Browserify or Webpack will get the browser version. Success!

For Rollup, it’s a bit more complicated, but not too much extra work. Rollup users will need to use rollup-plugin-node-resolve and set browser to true in the options.

For jspm there is unfortunately no support for the “browser” field, but jspm users can get around it in this case by doing require('base64-encode-string/browser') or jspm install npm:base64-encode-string -o "{main:'browser.js'}". Alternatively, the package author can specify a “jspm” field in their package.json.

Advanced techniques

The direct "browser" method works well, but for larger projects I find that it introduces an awkward coupling between package.json and the codebase. For instance, our package.json could quickly end up looking like this:

{
  /* ... */
  "browser": {
    "./index.js": "./browser.js",
    "./widget.js": "./widget-browser.js",
    "./doodad.js": "./doodad-browser.js",
    /* etc. */
  }
}

So every time you want a browser-specific module, you’d have to create two separate files, and then remember to add an extra line to the "browser" field linking them together. And be careful not to misspell anything!

Also, you may find yourself extracting individual bits of code into separate modules, merely because you wanted to avoid an if (process.browser) {} check. When these *-browser.js files accumulate, they can start to make the codebase a lot harder to navigate.

If this situation gets too unwieldy, there are a few different solutions. My personal favorite is to use Rollup as a build tool, to automatically split a single codebase into separate index.js and browser.js files. This has the added benefit of de-modularizing the code you ship to consumers, saving bytes and time.

To do so, install rollup and rollup-plugin-replace, then define a rollup.config.js file:

import replace from 'rollup-plugin-replace';
export default {
  entry: 'src/index.js',
  format: 'cjs',
  plugins: [
    replace({ 'process.browser': !!process.env.BROWSER })
  ]
};

(We’ll use that process.env.BROWSER as a handy way to switch between browser builds and Node builds.)

Next, we can create a src/index.js file with a single function using a normal process.browser condition:

export default function base64Encode(string) {
  if (process.browser) {
    return btoa(string);
  } else {
    return Buffer.from(string, 'binary').toString('base64');
  }
}

Then add a prepublish step to package.json to generate the files:

{
  /* ... */
  "scripts": {
    "prepublish": "rollup -c > index.js && BROWSER=true rollup -c > browser.js"
  }
}

The generated files are fairly straightforward and readable:

// index.js
'use strict';

function base64Encode(string) {
  {
    return Buffer.from(string, 'binary').toString('base64');
  }
}

module.exports = base64Encode;
// browser.js
'use strict';

function base64Encode(string) {
  {
    return btoa(string);
  }
}

module.exports = base64Encode;

You’ll notice that Rollup automatically converts process.browser to true or false as necessary, then shakes out the unused code. So no references to process or Buffer will end up in the browser bundle.

Using this technique, you can have any number of process.browser switches in your codebase, but the published result is two small, focused index.js and browser.js files, with only the Node-related code for Node, and only the browser-related code for the browser.

As an added bonus, you can configure Rollup to also generate ES module builds, IIFE builds, or UMD builds. For an example of a simple library with multiple Rollup build targets, you can check out my project marky.
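As a rough sketch of what a multi-target config might look like, using the Rollup config format of the era (the dist/ filenames and the MyLib global name here are hypothetical):

export default {
  entry: 'src/index.js',
  moduleName: 'MyLib', // global variable name for the UMD build
  targets: [
    { dest: 'dist/mylib.cjs.js', format: 'cjs' }, // CommonJS, for Node
    { dest: 'dist/mylib.es.js', format: 'es' },   // ES modules, for bundlers
    { dest: 'dist/mylib.umd.js', format: 'umd' }  // UMD, for <script> tags
  ]
};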

The actual project described in this post (base64-encode-string) has also been published to npm so that you can inspect it and see what makes it tick. The source code is available on GitHub.

The struggles of publishing a JavaScript library

If you’ve done any web development in the past few years, then you’ve probably typed something like this:

$ bower install jquery

Or maybe even:

$ npm install --save lodash

For anyone who remembers the dark days of combing Github for jQuery plugins, this is a miracle. But as with all software, somebody had to write that code in order for you to be able to download it. And in the case of tools like Bower and npm, somebody also had to do the legwork to publish it. This is one of their stories.

The Babelification of JavaScript

I tweeted this recently:

https://twitter.com/nolanlawson/status/653610332989059072

I got some positive feedback, but I also saw some incredulous responses from people telling me I only need to support npm and CommonJS, or more snarkily, that supporting “just JavaScript” is good enough. As a fairly active open-source JavaScript author, though, I’d like to share my thoughts on why it’s not so simple.

The JavaScript module ecosystem is a mess these days. For module definitions, we have AMD, UMD, CommonJS, globals, and ES6 modules [1]. For distribution, we have npm, Bower, and jspm, as well as CDNs like cdnjs, jsDelivr, and Github itself. For translating between Node and browser code, we have Browserify, Webpack, and Rollup.

Supporting each of these categories comes with its own headaches, but before I delve into that, here’s my take on how we got into this morass in the first place.

What is a JS module?

For the longest time, JavaScript didn’t have any commonly-accepted module system, so the most straightforward way to distribute your code was as a global variable. jQuery plugins also worked this way – they would just look for the global window.$ or window.jQuery and hook themselves onto that.

But thanks largely to Node and the influx of people who care about highfalutin computer-sciencey stuff like “not polluting the global namespace,” we now have a lot more ways of modularizing our code. npm is famous for using CommonJS, with its module.exports and require(), whereas other tools like RequireJS use an alternative format called AMD, known for its define() and asynchronous loading. (It’s never ceased to confuse me that RequireJS is the one that doesn’t use require().) There’s also UMD, which seeks to harmonize all of them (the “U” stands for “universal”).
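To make the formats concrete, here’s the same trivial module written both ways (a toy example):

// CommonJS (Node, npm)
module.exports = function shout(str) {
  return str.toUpperCase() + '!';
};

// AMD (RequireJS)
define([], function () {
  return function shout(str) {
    return str.toUpperCase() + '!';
  };
});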

In practice, though, there’s no good “universal” way to distribute your code. Many libraries try to dynamically determine at runtime what kind of environment they’re in (here’s a pretty gnarly example), but this makes modularizing your own code a headache, because you have to repeat that boilerplate anywhere you want to split up your code into separate files.
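That boilerplate usually takes the form of the classic UMD wrapper, sketched here for the hypothetical shout module above:

(function (root, factory) {
  if (typeof define === 'function' && define.amd) {
    define([], factory);        // AMD
  } else if (typeof module === 'object' && module.exports) {
    module.exports = factory(); // CommonJS
  } else {
    root.shout = factory();     // browser global
  }
}(this, function () {
  return function shout(str) {
    return str.toUpperCase() + '!';
  };
}));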

More recently, I’ve seen a lot of modules migrate to just using CommonJS everywhere, and then bundling it up for distribution with Browserify. This can be fraught with its own difficulties though, if you aren’t aware of the subtleties of how your code gets consumed. For instance, if you use Browserify’s --standalone flag (-s), then your code will get built as an AMD-ready, UMD-ready, and globals-ready bundle file, but you might not think to add it as a build step, because the stated use of the --standalone flag is to create a global variable [2].

However, my new personal policy is to use this flag everywhere, even when I can’t think of a good global variable name, because that way I don’t get issues filed on me asking for AMD support or UMD support. (Speaking of which, it still tickles me that someone had to actually open an issue asking me to support a supposedly “universal” module system. Not so universal after all, is it!)
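In practice, that just means a build step along these lines, where shout stands in for whatever global variable name you settle on:

$ browserify index.js --standalone shout > dist/shout.js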

Package managers and pseudo-package managers

So let’s say you go the CommonJS + Browserify route: now you have an interesting problem, which is that you have both a “source” version and a “distributed” version of your code. (Commonly these are organized into a src/lib folder and a dist folder, but those are just conventions.) How do you make sure your users get the right one?

npm is a package manager that expects CommonJS modules, so typically in your package.json, you set the "main" key to point to whatever your source "src/index.js" file is. Bower, however, expects a bundle file that can be directly included as a <script> tag, so in that case you’ll want to set the "main" inside the bower.json to point instead to your "dist/mypackage.js" or "dist/mypackage.min.js" file. jspm complicates things further by defaulting to npm’s package.json file while actually expecting non-CommonJS modules, but you can override that behavior by including a {"jspm": {"main": "dist/mypackage.js"}} in your package.json. Whew! We’re all done, right?
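Concretely, that means juggling entries like these across your two manifest files (the paths here are illustrative):

// package.json – npm wants CommonJS source, jspm wants the bundle
{
  "main": "src/index.js",
  "jspm": {
    "main": "dist/mypackage.js"
  }
}

// bower.json – Bower wants a <script>-ready bundle
{
  "main": "dist/mypackage.js"
}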

Not so fast. As it turns out, Bower isn’t really a package manager so much as a CLI over Github. What that means is that you actually need to check your bundle files into Git, to ensure that those dist/ files are available to Bower users. At the same time, you’ll have to be very cognizant not to check in anything you don’t want people to download, because Bower’s "ignore" list doesn’t actually avoid downloading anything; it just deletes the ignored files after they’re downloaded, which can lead to some enormous Bower downloads. Couple this with the fact that you’re probably also juggling .gitignore files and .npmignore files, and you can end up with some fairly complicated release scripts!

Of course, many users will also just download your bundle file from Github. So it’s important to be consistent with your Git tags, so that you can have a nice tidy Github releases page. As it turns out, Bower will also depend on those Git tags to determine what a “release” is – actually, it flat-out ignores the "version" field in bower.json. To make sense of all this complexity, our policy with PouchDB is to just do an explicit commit with the version tag that isn’t even a part of the project’s main master branch, purely as a “release commit” for Bower and Github.

What about CDNs?

Github discourages using their hosted JavaScript files directly from <script> tags (in fact their HTTP headers make it impossible), so often users will ask if they can consume your library via a CDN. CDNs are also great for code snippets, because you can just include a <script> tag pointing to the latest CDN release. So lots of libraries (including PouchDB) also support jsDelivr and cdnjs.

You can add your library manually, but in my experience this is a pain, because it usually involves checking out the entire source for the CDN (which can be many gigabytes) and then opening a pull request with your library’s code. So it’s better to follow their automated instructions so that they can automatically update whenever your code updates. Note that both jsDelivr and cdnjs rely on Git tags, so the above comments about Github/Bower also apply.

Correction: Both jsDelivr and cdnjs can be configured to point to npm instead of Github; my mistake! The same applies to jspm.

Browser vs Node

For anyone who’s written a popular JavaScript library, the situation inevitably arises that someone tries to use your Node-optimized library in the browser, or your browser-optimized library in Node, and invariably they run into issues.

The first trick you might employ, if you’re working with Browserify, is to add if/else switches anytime you want to do something differently in Node or the browser:

function md5(str) {
  if (process.browser) {
    return require('spark-md5').hash(str);
  } else {
    return require('crypto').createHash('md5').update(str).digest('hex');
  }
}

This is convenient at first, but it causes some unexpected problems down the line.

First off, you end up sending unnecessary Node code to the browser. And especially if the Browserified version of your dependencies is very large, this can add up to a lot of bytes. In the example above, Browserifying the entire crypto library comes out to 93KB (after uglify+gzip!), whereas spark-md5 is only 2.6KB.

The second issue is that, if you are using a tool like Istanbul to measure your code coverage, then running coverage in Node can lead to a lot of /* istanbul ignore next */ comments all over the place, so that you don’t get penalized for browser code that never runs.

My personal method to avoid this conundrum is to prefer the "browser" field in package.json to tell Browserify/Webpack which modules to swap out when building. This can get pretty complicated (here’s an example from PouchDB), but I prefer to complicate my configuration code rather than my JavaScript code. Another option is to use Calvin Metcalf’s inline-process-browser, which can automatically strip out process.browser switches [3].

You’ll also want to be careful when using Browserify transforms in your code; any transforms need to be a regular dependency rather than a devDependency, or else they can cause problems for library users.

Wait, you tried to run my code where?

After you’ve solved Node/browser switching in your library, the next hurdle you’ll likely encounter is that there is some unexpected bug in an exotic environment, often due to globals.

One way this might manifest itself is that you expect a global window variable to exist in the browser – but oh no, it’s not there in a web worker! So you check for the web worker’s self as well. Aha, but NW.js has both a Node-style global and browser-style window as global variables, so you can’t know in advance which other globals (such as Promise or console) are attached to which! Then you can get into even stranger environments like iOS’s JSCore (which is used by React Native), or Electron, or Qt WebKit, or Rhino/Nashorn, or JavaFX WebView, or Adobe AIR…

If you want to see what kind of a mess this can create, check out these lines of code from Lodash, and weep for poor John-David Dalton!

My own solution to this issue is to never ever check for window or global or anything like that if I can avoid it, and instead use typeof whatever === 'undefined' to check. For instance, here’s my typical Promise shim:

function PromiseShim() {
  if (typeof Promise !== 'undefined') {
    return Promise;
  }
  return require('lie');
}

Trying to access a global variable that doesn’t exist is a runtime error in most JavaScript environments, but using the typeof check will prevent the error.
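The same typeof trick extends to locating the global object itself, on the rare occasion you truly need it. A sketch that should cover browsers, web workers, and Node:

var globalObject =
  typeof self !== 'undefined' ? self :     // browsers and web workers
  typeof window !== 'undefined' ? window : // older browsers
  typeof global !== 'undefined' ? global : // Node
  this;                                    // last resort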

Browserify vs Webpack

Most library authors I know tend to prefer Browserify for building JavaScript modules, but especially with the rise of React and Flux, Webpack is increasingly becoming a popular option.

Webpack is mostly consistent with Browserify, but there are points of divergence that can lead to unexpected errors when people try to require() your library from Webpack. The best way to test is to simply run webpack on your source CommonJS file and see if you get any errors.
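With the Webpack CLI of the era, that smoke test can be a one-liner (the output path is hypothetical):

$ webpack index.js dist/webpack-test-bundle.js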

In the worst case, if you have a dependency that doesn’t build with Webpack, you can always tell users to specify a custom loader to work around the issue. Webpack tends to give more control to the end-user than Browserify does, so the best strategy is to just let them build up your library and dependencies however they need to.

Enter ES6

This whole situation I’ve described above is bad enough, but once you add ES6 to the mix, it gets even more complicated. ES6 modules are the “future-proof” way of authoring JavaScript, but as it stands, very few tools can consume ES6 directly, and most versions of Node are not among them.

(Yes, even if you are using Node 4.x with its many lovely ES6 features like Promises and arrow functions, there are still some missing features, like spread arguments and destructuring, that are not supported by V8 yet.)

So, what many ES6 authors will do is add a "prepublish" script to build the ES6 source into a version consumable by Node/npm (here’s an example). (Note that your "main" field in package.json must point to the Node-ready version, not the ES6 version!) Of course, this adds a huge amount of additional complexity to your build script, because now you have three versions of your code: 1) source, 2) Node version, and 3) browser version.
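Such a prepublish setup might look roughly like this in package.json, assuming babel-cli, with ES6 source in src/ compiled out to lib/ (the paths are illustrative):

{
  /* ... */
  "main": "lib/index.js",
  "scripts": {
    "prepublish": "babel src --out-dir lib"
  }
}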

When you add an ES6 module bundler like Rollup, it gets even hairier. Rollup is a really cool bundler that offers some big benefits over Browserify and Webpack (such as smaller bundle sizes), but to use it, your library’s dependencies need to be exported in the ES6 format.

Now, because npm normally expects CommonJS, not ES6 modules, there is an informal “jsnext:main” field that some libraries use to point to their ES6 source. Usage is not very widespread, though, so if any of your dependencies don’t use ES6 or don’t have a "jsnext:main", then you’ll need to use Rollup’s --external flag when bundling them so that it knows to ignore them.
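A library opting in declares both entry points – the CommonJS build for npm/Node, and the ES6 source for Rollup – along these lines:

{
  /* ... */
  "main": "lib/index.js",
  "jsnext:main": "src/index.js"
}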

"jsnext:main" is a nice hack, but it also brings up a host of unanswered questions, such as: which features of ES6 are supported? Is it a particular stage of recommendation for the spec, ala Babel? What about popular ES7 features that are already starting to creep into codebases that use Babel, such as async/await? It’s not clear, and I don’t think this problem will be resolved until npm takes a stance one way or the other.

Making sense of this mess

At the end of the day, if your users want your code bad enough, then they will find a way to consume it. In the worst case scenario, they can just copy-paste your code from Github, which is how JavaScript was consumed for many years anyway. (StackOverflow was a decent package manager long before cooler kids like npm and Bower came along!)

Many folks have advised me to just support npm and CommonJS, and honestly, for my smaller modules I’m doing just that. It’s simply too much work to try to support everything at once. As an example of how complicated it is, I’ve created a hello-javascript module that only contains the code you need to support all the environments above. Hopefully it will help someone trying to figure out how to publish to multiple targets.

If you happen to be thinking about hopping into the world of JavaScript library authorship, though, I recommend starting with npm’s publishing guide and working your way up from there. Trying to support every JavaScript user on the planet is an ambitious proposition, and you don’t want to wear yourself out when you’re having enough trouble testing, writing documentation, checking code coverage, triaging issues, and hey – at some point, you’ll also need to write some code!

But as with everything in software, the best advice is to focus on the user and all else will follow. Don’t listen to the naysayers who tell you that Bower users are “wrong” and you’re doing them a favor by “educating” them [4]. Work with your users to try to support their use case, and give them alternatives if they’re unsatisfied with your current publishing approach. (I really like wzrd.in for on-demand Browserification.)

To me, this is somewhat like accessibility. Some users only know Bower, not npm, or maybe they don’t even understand the difference between the two! Others might be unfamiliar with the command line, and in that case, a big reassuring “Download” button on a github.io page might be the best way to accommodate them. Still others might be power users who will try to include your ES6 code directly and then Browserify it themselves. (Ask those users for a pull request!)

At the end of the day, you are giving away your labor for free, so you shouldn’t feel obligated to bend over backwards for anybody. But if your driving motivation is to make your code as usable as possible for other people, then I’d say you can’t go wrong by supporting the two most popular options: direct downloads for casual users, and npm/CommonJS for power users. If your library grows in popularity, you can always worry about the thousand and one other methods later [5].

Thanks to Calvin Metcalf, Nick Colley, and Colin Skow for providing feedback on a draft of this post.

Footnotes


1. I’ve seen no compelling reason to call it “ES2015,” except to signal my own status as a smarty-pants. So I don’t.

2. Another handy tool is derequire, which can remove all require()s from your bundle to ensure it doesn’t get re-interpreted as a CommonJS module.

3. Calvin Metcalf pointed out to me that you can also work around this issue by using crypto sub-modules, e.g. require('crypto-hash'), or by fooling Browserify via require('cryp' + 'to').

4. With npm 3, many developers are starting to declare Bower to be obsolete. I think this is mostly right, but there are still a few areas where Bower beats npm. First off, for isomorphic libraries like PouchDB, an npm install can be more time-consuming and error-prone than a bower install, due to native LevelDB dependencies that you’ll never need if you’re only using PouchDB on the frontend. Second, not all libraries are publishing their dist/ code to npm, meaning that former Bower users would have to learn the whole Browserify/Webpack stack rather than just include a <script> tag. Third, not all Bower modules are even on npm – Ionic framework is a popular one that springs to mind. Fourth, there’s the social cost of migrating folks from Bower to npm, throwing away a wealth of tutorials and accumulated knowledge in the process. It’s not so simple to just tell people, “Okay, now start using npm instead of Bower.”

5. I’ve ragged a lot on the JavaScript community in this post, but I still find authoring for JavaScript to be a very pleasurable experience. I’ve been a consumer of Python, Java, and Perl modules, as well as a publisher of Java modules, and I still find npm to be the nicest to work with. The fact that my publish process is as simple as npm version patch|minor|major plus a npm publish is a real dream compared to the somewhat bureaucratic process for asking permission to publish to Maven Central. (If I ever have to see the Sonatype Nexus web UI again, I swear I’m going to hurl.)

JavaScript development and the paradox of choice

There are a lot of folks trying to sell their miracle cure for the problem of writing efficient, testable, maintainable JavaScript. And there’s an equal number of folks decrying the proliferation of almost-there libraries and flash-in-the-pan frameworks.

Bootstrap. Backbone. Handlebars. Angular. I’ve spent so much time hearing snatches of conversation about these tools, and trying to make sense of them, that after a while it all starts to sound like some crazy beat poetry.

Listen:

angular backbone bootstrap cordova handlebars lawnchair underscore jasmine karma testacular grunt yeoman blueprint ember bower require sencha dojo mootools phonegap modernizr prototype meteor…

If you shouted that on a street corner while wielding a bottle of bourbon, you wouldn’t look out of place. I’ve seen the best minds of my generation destroyed trying to understand this mess.

Police Chief Wiggum and a raving derelict, from Simpsons episode 3F02.

Pictured: a seasoned JavaScript developer.

A good JavaScript is hard to find

Part of the reason there are so many snake-oil salesmen is that the cure is so badly needed. Web development is both 1) hard and 2) absolutely crucial. Facebook and Gmail have set the bar high enough that nowadays everyone expects beautiful, responsive, browser-based applications that take milliseconds to download and work on every rectangular-shaped device you can throw at them. It’s a tall order.

And the reason it feels like snake oil is that none of these tools solves the entire problem. I’ve tried many of them, hoping that I had finally found the JavaScript silver bullet, and I’ve always felt vaguely disappointed afterwards. The medicine tastes good going down, I get excited watching YouTube tutorials and reading GitHub pages and coding in a new paradigm, and then afterwards I still end up sweating feverishly over the Chrome Developer Tools, trying to center a disobedient div or figure out why my event isn’t firing. I exchange one set of problems for another.

And then I lie awake at night wondering, “Well, maybe instead of JQuery UI, I should have used YUI or Bootstrap or…” Then it’s back for another dose of the same old medicine.

Grandpa Simpson selling some 'revitalizing tonic,' from Simpsons episode 2F07.

Step right up and put some fury in your JQuery, some zest in your CSS!

Another world is possible

This situation really frustrates me, primarily because I come from a Java background. And in Java Land, the platform is mature enough that there’s a basic suite of components that have emerged as the brain-dead, obvious solutions to common problems.

  • Need to test your app? Duh, use JUnit.
  • Need basic HTTP operations? Double duh, use Apache HTTP Client.
  • Need ORM? What are you, stupid? Use Hibernate.
  • Need a package manager? Cripes, it’s the 21st century: use Maven. Or Ivy, if you want something even simpler.

And if you use modern Java-based frameworks like Android or Grails, you’ll see that a lot of these third-party tools are already baked in: e.g. JUnit and HTTP Client for Android; Ivy, Hibernate, and JUnit for Grails. New Java developers pick up stuff like JUnit without thinking about it, as if it were just part of the language. And it practically is.

Even Java itself is mature enough that I’ve honestly felt satisfied since Java 6, and haven’t seen much need to upgrade. String switches in Java 7? Yawn, I’ve been using Enums since Java 5. Lambdas in Java 8? No need, Google Guava has me covered.

No silver JS bullet

JavaScript, on the other hand, is anything but mature. There is no “obvious” choice for third-party components – with the exception of JQuery, which is so omnipresent nowadays that it almost is JavaScript.

But aside from JQuery, there’s no one-stop solution that everyone rallies behind. For each of my “easy” questions for Java above, you get a forest of forked decision trees in JavaScript:

Mr. Burns contemplates Ketchup vs. Catsup, from Simpsons episode 2F07.

Ketchup or catsup? Karma or Selenium?

The paradox of choice

When you’ve got dozens of popular frameworks, many of them with overlapping or even conflicting goals, the choices can be overwhelming. And even after you choose one, it’s easy to end up second-guessing yourself and fretting endlessly over your decision. It’s a familiar case of the phenomenon popularly known as the paradox of choice.

So what’s a poor JavaScript developer to do?

Let’s say, for instance, that your boss tells your team that you need to write a mobile webapp. Do you choose JQuery Mobile, Sencha Touch, or Dojo Mobile? And what if you need to write a regular data-driven Ajax app? Do you choose Angular, Ember, or Backbone? Each of them has a snazzy self-laudatory website and fierce partisans on Stack Overflow. Looks like you’ve got some reading to do!

I’m new to web development, but I’ve come to believe that the only surefire solution to the problem of competing frameworks is to try them all. Not for a mission-critical project, of course – instead, you should just write a stub app. That way, you’ll discover each framework’s strengths and shortcomings, you’ll understand the problems it’s trying to solve, and you’ll be able to make an informed decision when it really counts.

In my opinion, it’s better to have three developers on your team take a week to write stub apps in three different frameworks, rather than blindly embark down a single path based on the attractiveness of a documentation page or the charisma of a YouTube evangelist.

My own stub app

I decided to try this approach recently with three frameworks I was curious about – Angular, Bootstrap, and PhoneGap. They seemed to have orthogonal goals, so in theory they should play nicely together.

My objective was to write a webapp with nice MVC features (Angular) that would look pretty (Bootstrap) and could work as a native Android or iOS app (PhoneGap). For the task itself, I chose to write an end-of-game score calculator for one of my favorite Euro-style board games, Imperial. This had the benefit of being a well-defined problem that scratched a personal itch, and plus it gave me something to show off to my board gamer buddies.

For the feature specifications, the usual suspects applied. I needed to persist user data, because presumably users would want to see their saved games, or resume a game if they accidentally closed the tab. The design had to be responsive in order to accommodate multiple screen sizes, because you could imagine using this app in your browser as well as on a smartphone. It had to support deep-linking, because what if you wanted to share the game results with your friends? And of course, the UI had to present the data in a useful way: who’s in first place, who came in second, are there any ties, etc.

When I first described this project to one of my coworkers, his reaction was “that sounds like way more than a stub app.” Which is true – as soon as you exceed a certain level of complexity, you run into interesting problems, for which the frameworks are supposed to provide useful solutions. This is exactly the point of writing the app.

The end result of this experiment is the Imperial Score Calculator. It’s available as both a mobile-friendly webapp and an Android app (iOS version coming soon). And of course the source code is on GitHub.

Imperial Score Calculator desktop-sized screenshot

Imperial Score Calculator

I’ve learned something today

In the end, I’m very satisfied with the project. Not because the app itself is the best I’ve ever written (it’s not), but because it taught me some hard lessons that I’ll take with me to my next web project. For instance, here are some of the lessons learned:

  • Bootstrap does not magically make everything responsive. Do not design for the desktop and then hope that when you resize the viewport everything will “just work.” Some assembly required.
  • Angular is a godsend. It’s as if someone stepped out of a time machine and showed us what HTML6 will look like, today. I initially wrote the app in JQuery; a naïve Angular rewrite resulted in about 20% less code.
  • That being said, Angular does not instantly replace JQuery, unless you really grok directives. I still had to fall back on the good ol’ $ from time to time.
  • Lawnchair is a cool idea, but it’s poorly documented, and the asynchronous approach means you can’t save user data in the onbeforeunload event. In the end, I just went with LocalStorage (see the sketch after this list).
  • PhoneGap is awesome. But man oh man, do not try debugging it without Weinre, unless you like pulling your hair out.
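For the curious, the reason LocalStorage wins in that scenario is that it’s synchronous, so a last-gasp save can actually finish before the page goes away. A minimal sketch (currentGame is a hypothetical app object):

window.onbeforeunload = function () {
  // localStorage writes are synchronous, so this completes before unload;
  // an async store's callback would never get the chance to run
  localStorage.setItem('imperial-game', JSON.stringify(currentGame));
};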

These are all opinions that I hold after working on this app. And I don’t expect you, dear reader, to swallow any of them just based on my say-so. The only way you can learn these lessons is to build a stub app yourself.

And perhaps you’ll have a totally different experience and come to totally different conclusions. Your mileage may vary. But you won’t know until you take the car out for a test drive.

Imperial Score Calculator mobile-sized screenshot

I ended up using a completely different layout for the mobile version.

Conclusion

JavaScript development is hard. The community is going through some growing pains, with everyone defending their cherished framework. The only solution to this problem of fragmentation and “There’s More Than One Million Ways To Do It” is time.

I do see some rays of hope in projects like Meteor and Yeoman, which are very opinionated meta-frameworks that attempt to combine multiple “best of class” JavaScript solutions into one easy package for web developers. In a sense, they’re trying to solve the problem that’s already been solved in Java Land.

But since Java Land is an increasingly irrelevant, fading power next to the ascendant hegemony that is the People’s Republic of JavaScript, the solution can’t come soon enough. In the meantime, I’ll keep writing stub apps.