Archive for the ‘performance’ Category

Building a modern carousel with CSS scroll snap, smooth scrolling, and pinch-zoom

Recently I had some fun implementing an image carousel for Pinafore. The requirements were pretty simple: users should be able to swipe horizontally through up to 4 images, and also pinch-zoom to get a closer look.

The finished product looks like this:

 

Often when you’re building something like this, it’s tempting to use an off-the-shelf solution. The problem is that this often adds a large dependency size, or the code is inflexible, or it’s framework-specific (React, Vue, etc.), or it may not be optimized for performance and accessibility.

Come on, it’s 2019. Isn’t there a decent way to build a carousel with native browser APIs?

As it turns out, there is. My carousel implementation uses a few simple building blocks:

  1. CSS scroll snap
  2. scrollTo() with smooth behavior
  3. The <pinch-zoom> custom element

CSS scroll snap

Let’s start off with CSS scroll snap. This is what makes the scrollable element “snap” to a certain position as you scroll it.

The browser support is pretty good. The only trick is that you have to write one implementation for the modern scroll snap API (supported by Chrome and Safari), and another for the older scroll snap points API (supported by Firefox[1]).

You can detect support using @supports (scroll-snap-align: start). As usual for iOS Safari, you’ll also need to add -webkit-overflow-scrolling: touch to make the element scrollable.

But lo and behold, we now have the world’s simplest carousel implementation. It doesn’t even require JavaScript – just HTML and CSS!

Note: for best results, you may want to view the above pen in full mode.

The benefit of having all this “snapping” logic inside of CSS rather than JavaScript is that the browser is doing the heavy lifting. We don’t have to use touchmove listeners or requestAnimationFrame to try to get the pixel-perfect snapping behavior with the right animation curve – the browser handles all of it for us, in native code.

And unlike touchmove, this scroll-snapping works for any method of scrolling – touchpad, touchscreen, scrollbar, you name it.

scrollTo() with smooth scrolling

The next piece of the puzzle is that most carousels have little indicator buttons that let you navigate between the items in the list.

Screenshot of a carousel containing an image of a cat with indicator buttons below showing 1 filled circle and 3 unfilled circles

For this, we will need a little bit of JavaScript. We can use the scrollTo() API with {behavior: 'smooth'}, which tells the browser to smoothly scroll to a given offset:

function scrollToItem(itemPosition, numItems, scroller) {
  scroller.scrollTo({
    scrollLeft: Math.floor(
      scroller.scrollWidth * (itemPosition / numItems)
    ),
    behavior: 'smooth'
  })
}

The only trick here is that Safari doesn’t support smooth scroll behavior and Edge doesn’t support scrollTo() at all. But we can detect support and fall back to a JavaScript implementation, such as this one.

Here is my technique for detecting native smooth scrolling:

function testSupportsSmoothScroll () {
  var supports = false
  try {
    var div = document.createElement('div')
    div.scrollTo({
      top: 0,
      get behavior () {
        supports = true
        return 'smooth'
      }
    })
  } catch (err) {}
  return supports
}

Being careful to set aria-labels and aria-pressed states for the buttons, and adding a debounced scroll listener to update the pressed state as the user scrolls, we end up with something like this:

View in full mode

You can also add generic “go left” and “go right” buttons; the principle is the same.

Hiding the scrollbar (optional)

Now, the next piece of the puzzle is that most carousels don’t have a scrollbar, and depending on the browser and OS, you might not like how the scrollbar appears.

Also, our carousel already includes all the buttons needed to scroll left and right, so it effectively has its own scrollbar. So we can consider removing the native one.

To accomplish this, we can start with overflow-x: auto rather than overflow-x: scroll, which ensures that at least if there’s only one image (and thus no possibility of scrolling left or right), the scrollbar doesn’t show.

Beyond that, we may be tempted to add overflow-x: hidden, but this actually makes the list entirely unscrollable. Bummer.

So we can use a little hack instead. Here is some CSS to remove the scrollbar, which works in Chrome, Edge, Firefox, and Safari:

.scroll {
  scrollbar-width: none;
  -ms-overflow-style: none;
}
.scroll::-webkit-scrollbar {
  display: none;
}

And it works! The scrollbar is gone:

View in full mode

Admittedly, though, this is a bit icky. The only standards-based CSS here is scrollbar-width, which is currently only supported by Firefox. The -webkit-scrollbar hack is for Chrome and Safari, and the -ms-overflow-style hack is for Edge/IE.

So if you don’t like vendor-specific CSS, or if you think scrollbars are better for accessibility, then you can just keep the scrollbar around. Follow your heart!

Pinch-zoom

For pinch-zooming, this is one case where I allowed myself an indulgence: I use the <pinch-zoom> element from Google Chrome Labs.

I like it because it’s extremely small (5.2kB minified) and it uses Pointer Events under the hood, meaning it supports mobile touchscreens, touchpads, touchscreen laptops, and any device that supports pinch-zooming.

However, this element isn’t totally compatible with a scrollable list, because dragging your finger left and right causes the image to move left and right, rather than scroll left and right.

 

I thought this was actually a nice touch, though, since it allows you to choose which part of the image to zoom in on. So I decided to keep it.

To make this work inside a scrollable carousel, though, I decided to add a separate mode for zooming. You have to tap the magnifying glass to enable zooming, at which point dragging your finger moves the image itself rather than the carousel.

Toggling the pinch-zoom mode was as simple as removing or adding the <pinch-zoom> element to toggle it [2]. I also decided to add some explicit “zoom in” and “zoom out” buttons for the benefit of users who don’t have a device that supports pinch-zooming.

 

Of course, I could have implemented this myself using raw Pointer Events, but <pinch-zoom> offers a small footprint, a nice API, and good browser compatibility (e.g. on iOS Safari, where Pointer Events are not supported). So it felt like a worthy addition.

Intrinsic sizing

The last piece of the puzzle (I promise!) is a way to keep the images from doing a re-layout when they load. This can lead to janky-looking reflows, especially on slow connections.

 

Assuming we know the dimensions of the images in advance, we can fix this by using the intrinsicsize attribute. Unfortunately this isn’t supported in any browser yet, but it’s coming soon to Chrome! And it’s way easier than any other (hacky) solution you may think of.

Here it is in Chrome 72 with the “experimental web platform features” flag enabled:

 

Notice that the buttons don’t jump around while the image loads. Much nicer!

Accessibility check

Looking over the WAI Carousel Concepts document, there are a few good things to keep in mind when implementing this carousel:

  1. To make the carousel more keyboard-navigable, you may add keyboard shortcuts, for instance and to navigate left and right. (Note though that a scrollable horizontal list can already be focused and scrolled with the keyboard.)
  2. Use <ul> and <li> elements instead of <div>s, so that a screen reader announces it as a list.
  3. The smooth-scrolling can be distracting or nausea-inducing for some folks, so respect prefers-reduced-motion or provide an option to turn it off.
  4. As mentioned previously, use aria-label and aria-pressed for the indicator buttons.

Compatibility check

But what about IE support? I can hear your boss screaming at you already.

If you’re one of the unfortunate souls who still has to maintain IE11 support, rest assured: a scroll-snap carousel is just a normal horizontal-scrolling list on IE. It doesn’t work exactly the same, but hey, does it need to? IE11 users probably don’t expect the best anymore.

Conclusion

So that’s it! I decided not to publish this as a library, and I’m leaving the pinch-zooming and intrinsic sizing as an exercise for the reader. I think the core building blocks are simple enough that folks really ought to just take the native APIs and run with them.

Any decisions I could bake into a library would only limit the flexibility of the carousel, and leave its users high-and-dry when they need to tweak something, because I’ve taught them how to use my library instead of the native browser API.

At some point, it’s just better to go native.

Footnotes

1. For whatever reason, I couldn’t get the old scroll snap points spec to work in Edge. Sarah Drasner apparently ran into the same issue. On the bright side, though, a horizontally scrolling list without snap points is just a regular horizontally scrolling list!

2. The first version of this blog post recommended using pointer-events: none to toggle the zoom mode on or off. It turns out that this breaks right-clicking to download an image. So it seems better to just remove or add the <pinch-zoom> element to toggle it.

The cost of small modules

Update (30 Oct 2016): since I wrote this post, a bug was found in the benchmark which caused Rollup to appear slightly better than it would otherwise. However, the overall results are not substantially different (Rollup still beats Browserify and Webpack, although it’s not quite as good as Closure anymore), so I’ve merely updated the charts and tables. Additionally, the benchmark now includes the RequireJS and RequireJS Almond bundlers, so those have been added as well. To see the original blog post without these edits, check out this archived version.

Update (21 May 2018): This blog post analyzed older versions of Webpack, Browserify, and other module bundlers. Later versions of these bundlers added support for features like module concatenation and flat packing, which address most of the concerns raised in this blog post. You can get an idea for the performance improvement from these methods in these PRs.

About a year ago I was refactoring a large JavaScript codebase into smaller modules, when I discovered a depressing fact about Browserify and Webpack:

“The more I modularize my code, the bigger it gets. 😕”
Nolan Lawson

Later on, Sam Saccone published some excellent research on Tumblr and Imgur‘s page load performance, in which he noted:

“Over 400ms is being spent simply walking the Browserify tree.”
Sam Saccone

In this post, I’d like to demonstrate that small modules can have a surprisingly high performance cost depending on your choice of bundler and module system. Furthermore, I’ll explain why this applies not only to the modules in your own codebase, but also to the modules within dependencies, which is a rarely-discussed aspect of the cost of third-party code.

Web perf 101

The more JavaScript included on a page, the slower that page tends to be. Large JavaScript bundles cause the browser to spend more time downloading, parsing, and executing the script, all of which lead to slower load times.

Even when breaking up the code into multiple bundles – Webpack code splitting, Browserify factor bundles, etc. – the cost is merely delayed until later in the page lifecycle. Sooner or later, the JavaScript piper must be paid.

Furthermore, because JavaScript is a dynamic language, and because the prevailing CommonJS module system is also dynamic, it’s fiendishly difficult to extract unused code from the final payload that gets shipped to users. You might only need jQuery’s $.ajax, but by including jQuery, you pay the cost of the entire library.

The JavaScript community has responded to this problem by advocating the use of small modules. Small modules have a lot of aesthetic and practical benefits – easier to maintain, easier to comprehend, easier to plug together – but they also solve the jQuery problem by promoting the inclusion of small bits of functionality rather than big “kitchen sink” libraries.

So in the “small modules” world, instead of doing:

var _ = require('lodash')
_.uniq([1,2,2,3])

You might do:

var uniq = require('lodash.uniq')
uniq([1,2,2,3])

Rich Harris has already articulated why the “small modules” pattern is inherently beginner-unfriendly, even though it tends to make life easier for library maintainers. However, there’s also a hidden performance cost to small modules that I don’t think has been adequately explored.

Packages vs modules

It’s important to note that, when I say “modules,” I’m not talking about “packages” in the npm sense. When you install a package from npm, it might only expose a single module in its public API, but under the hood it could actually be a conglomeration of many modules.

For instance, consider a package like is-array. It has no dependencies and only contains one JavaScript file, so it has one module. Simple enough.

Now consider a slightly more complex package like once, which has exactly one dependency: wrappy. Both packages contain one module, so the total module count is 2. So far, so good.

Now let’s consider a more deceptive example: qs. Since it has zero dependencies, you might assume it only has one module. But in fact, it has four!

You can confirm this by using a tool I wrote called browserify-count-modules, which simply counts the total number of modules in a Browserify bundle:

$ npm install qs
$ browserify node_modules/qs | browserify-count-modules
4

What’s going on here? Well, if you look at the source for qs, you’ll see that it contains four JavaScript files, representing four JavaScript modules which are ultimately included in the Browserify bundle.

This means that a given package can actually contain one or more modules. These modules can also depend on other packages, which might bring in their own packages and modules. The only thing you can be sure of is that each package contains at least one module.

Module bloat

How many modules are in a typical web application? Well, I ran browserify-count-modules on a few popular Browserify-using sites, and came up with these numbers:

For the record, my own Pokedex.org (the largest open-source site I’ve built) contains 311 modules across four bundle files.

Ignoring for a moment the raw size of those JavaScript bundles, I think it’s interesting to explore the cost of the number of modules themselves. Sam Saccone has already blown this story wide open in “The cost of transpiling es2015 in 2016”, but I don’t think his findings have gotten nearly enough press, so let’s dig a little deeper.

Benchmark time!

I put together a small benchmark that constructs a JavaScript module importing 100, 1000, and 5000 other modules, each of which merely exports a number. The parent module just sums the numbers together and logs the result:

// index.js
var total = 0
total += require('./module_0')
total += require('./module_1')
total += require('./module_2')
// etc.
console.log(total)
// module_0.js
module.exports = 0
// module_1.js
module.exports = 1

(And so on.)

I tested five bundling methods: Browserify, Browserify with the bundle-collapser plugin, Webpack, Rollup, and Closure Compiler. For Rollup and Closure Compiler I used ES6 modules, whereas for Browserify and Webpack I used CommonJS, so as not to unfairly disadvantage them (since they would need a transpiler like Babel, which adds its own overhead).

In order to best simulate a production environment, I used Uglify with the --mangle and --compress settings for all bundles, and served them gzipped over HTTPS using GitHub Pages. For each bundle, I downloaded and executed it 15 times and took the median, noting the (uncached) load time and execution time using performance.now().

Bundle sizes

Before we get into the benchmark results, it’s worth taking a look at the bundle files themselves. Here are the byte sizes (minified but ungzipped) for each bundle (chart view):

100 modules 1000 modules 5000 modules
browserify 7982 79987 419985
browserify-collapsed 5786 57991 309982
webpack 3955 39057 203054
rollup 1265 13865 81851
closure 758 7958 43955
rjs 29234 136338 628347
rjs-almond 14509 121612 613622

And the minified+gzipped sizes (chart view):

100 modules 1000 modules 5000 modules
browserify 1650 13779 63554
browserify-collapsed 1464 11837 55536
webpack 688 4850 24635
rollup 629 4604 22389
closure 302 2140 11807
rjs 7940 19017 62674
rjs-almond 2732 13187 56135

What stands out is that the Browserify and Webpack versions are much larger than the Rollup and Closure Compiler versions (update: especially before gzipping, which still matters since that’s what the browser executes). If you take a look at the code inside the bundles, it becomes clear why.

The way Browserify and Webpack work is by isolating each module into its own function scope, and then declaring a top-level runtime loader that locates the proper module whenever require() is called. Here’s what our Browserify bundle looks like:

(function e(t,n,r){function s(o,u){if(!n[o]){if(!t[o]){var a=typeof require=="function"&&require;if(!u&&a)return a(o,!0);if(i)return i(o,!0);var f=new Error("Cannot find module '"+o+"'");throw f.code="MODULE_NOT_FOUND",f}var l=n[o]={exports:{}};t[o][0].call(l.exports,function(e){var n=t[o][1][e];return s(n?n:e)},l,l.exports,e,t,n,r)}return n[o].exports}var i=typeof require=="function"&&require;for(var o=0;o<r.length;o++)s(r[o]);return s})({1:[function(require,module,exports){
module.exports = 0
},{}],2:[function(require,module,exports){
module.exports = 1
},{}],3:[function(require,module,exports){
module.exports = 10
},{}],4:[function(require,module,exports){
module.exports = 100
// etc.

Whereas the Rollup and Closure bundles look more like what you might hand-author if you were just writing one big module. Here’s Rollup:

(function () {
        'use strict';
        var module_0 = 0
        var module_1 = 1
        // ...
        total += module_0
        total += module_1
        // etc.

The important thing to notice is that every module in Webpack and Browserify gets its own function scope, and is loaded at runtime when require()d from the main script. Rollup and Closure Compiler, on the other hand, just hoist everything into a single function scope (creating variables and namespacing them as necessary).

If you understand the inherent cost of functions-within-functions in JavaScript, and of looking up a value in an associative array, then you’ll be in a good position to understand the following benchmark results.

Results

Update: as noted above, results have been re-run with corrections and the addition of the r.js and r.js Almond bundlers. For the full tabular data, see this gist.

I ran this benchmark on a Nexus 5 with Android 5.1.1 and Chrome 52 (to represent a low- to mid-range device) as well as an iPod Touch 6th generation running iOS 9 (to represent a high-end device).

Here are the results for the Nexus 5:

Nexus 5 results

And here are the results for the iPod Touch:

iPod Touch results

At 100 modules, the variance between all the bundlers is pretty negligible, but once we get up to 1000 or 5000 modules, the difference becomes severe. The iPod Touch is hurt the least by the choice of bundler, but the Nexus 5, being an aging Android phone, suffers a lot under Browserify and Webpack.

I also find it interesting that both Rollup and Closure’s execution cost is essentially free for the iPod, regardless of the number of modules. And in the case of the Nexus 5, the runtime costs aren’t free, but they’re still much cheaper for Rollup/Closure than for Browserify/Webpack, the latter of which chew up the main thread for several frames if not hundreds of milliseconds, meaning that the UI is frozen just waiting for the module loader to finish running.

Note that both of these tests were run on a fast Gigabit connection, so in terms of network costs, it’s really a best-case scenario. Using the Chrome Dev Tools, we can manually throttle that Nexus 5 down to 3G and see the impact:

Nexus 5 3G results

Once we take slow networks into account, the difference between Browserify/Webpack and Rollup/Closure is even more stark. In the case of 1000 modules (which is close to Reddit’s count of 1050), Browserify takes about 400 milliseconds longer than Rollup. And that 400ms is no small potatoes, since Google and Bing have both noted that sub-second delays have an appreciable impact on user engagement.

One thing to note is that this benchmark doesn’t measure the precise execution cost of 100, 1000, or 5000 modules per se, since that will depend on your usage of require(). Inside of these bundles, I’m calling require() once per module, but if you are calling require() multiple times per module (which is the norm in most codebases) or if you are calling require() multiple times on-the-fly (i.e. require() within a sub-function), then you could see severe performance degradations.

Reddit’s mobile site is a good example of this. Even though they have 1050 modules, I clocked their real-world Browserify execution time as much worse than the “1000 modules” benchmark. When profiling on that same Nexus 5 running Chrome, I measured 2.14 seconds for Reddit’s Browserify require() function, and 197 milliseconds for the equivalent function in the “1000 modules” script. (In desktop Chrome on an i7 Surface Book, I also measured it at 559ms vs 37ms, which is pretty astonishing given we’re talking desktop.)

This suggests that it may be worthwhile to run the benchmark again with multiple require()s per module, although in my opinion it wouldn’t be a fair fight for Browserify/Webpack, since Rollup/Closure both resolve duplicate ES6 imports into a single hoisted variable declaration, and it’s also impossible to import from anywhere but the top-level scope. So in essence, the cost of a single import for Rollup/Closure is the same as the cost of n imports, whereas for Browserify/Webpack, the execution cost will increase linearly with n require()s.

For the purposes of this analysis, though, I think it’s best to just assume that the number of modules is only a lower bound for the performance hit you might feel. In reality, the “5000 modules” benchmark may be a better yardstick for “5000 require() calls.”

Conclusions

First off, the bundle-collapser plugin seems to be a valuable addition to Browserify. If you’re not using it in production, then your bundle will be a bit larger and slower than it would be otherwise (although I must admit the difference is slight). Alternatively, you could switch to Webpack and get an even faster bundle without any extra configuration. (Note that it pains me to say this, since I’m a diehard Browserify fanboy.)

However, these results clearly show that Webpack and Browserify both underperform compared to Rollup and Closure Compiler, and that the gap widens the more modules you add. Unfortunately I’m not sure Webpack 2 will solve any of these problems, because although they’ll be borrowing some ideas from Rollup, they seem to be more focused on the tree-shaking aspects and not the scope-hoisting aspects. (Update: a better name is “inlining,” and the Webpack team is working on it.)

Given these results, I’m surprised Closure Compiler and Rollup aren’t getting much traction in the JavaScript community. I’m guessing it’s due to the fact that (in the case of the former) it has a Java dependency, and (in the case of the latter) it’s still fairly immature and doesn’t quite work out-of-the-box yet (see Calvin’s Metcalf’s comments for a good summary).

Even without the average JavaScript developer jumping on the Rollup/Closure bandwagon, though, I think npm package authors are already in a good position to help solve this problem. If you npm install lodash, you’ll notice that the main export is one giant JavaScript module, rather than what you might expect given Lodash’s hyper-modular nature (require('lodash/uniq'), require('lodash.uniq'), etc.). For PouchDB, we made a similar decision to use Rollup as a prepublish step, which produces the smallest possible bundle in a way that’s invisible to users.

I also created rollupify to try to make this pattern a bit easier to just drop-in to existing Browserify projects. The basic idea is to use imports and exports within your own project (cjs-to-es6 can help migrate), and then use require() for third-party packages. That way, you still have all the benefits of modularity within your own codebase, while exposing more-or-less one big module to your users. Unfortunately, you still pay the costs for third-party modules, but I’ve found that this is a good compromise given the current state of the npm ecosystem.

So there you have it: one horse-sized JavaScript duck is faster than a hundred duck-sized JavaScript horses. Despite this fact, though, I hope that our community will eventually realize the pickle we’re in – advocating for a “small modules” philosophy that’s good for developers but bad for users – and improve our tools, so that we can have the best of both worlds.

Bonus round! Three desktop browsers

Normally I like to run performance tests on mobile devices, since that’s where you see the clearest differences. But out of curiosity, I also ran this benchmark on Chrome 54, Edge 14, and Firefox 48 on an i5 Surface Book using Windows 10 RS1. Here are the results:

Chrome 54

Chrome 54 Surfacebook results

Edge 14 (tabular results)

Edge 14 Surfacebook results

Firefox 48 (tabular results)

Firefox 48 Surfacebook results

The only interesting tidbits I’ll call out in these results are:

  1. bundle-collapser is definitely not a slam-dunk in all cases.
  2. The ratio of network-to-execution time is always extremely high for Rollup and Closure; their runtime costs are basically zilch. ChakraCore and SpiderMonkey eat them up for breakfast, and V8 is not far behind.

This latter point could be extremely important if your JavaScript is largely lazy-loaded, because if you can afford to wait on the network, then using Rollup and Closure will have the additional benefit of not clogging up the UI thread, i.e. they’ll introduce less jank than Browserify or Webpack.

Update: in response to this post, JDD has opened an issue on Webpack. There’s also one on Browserify.

Update 2: Ryan Fitzer has generously added RequireJS and RequireJS with Almond to the benchmark, both of which use AMD instead of CommonJS or ES6.

Testing shows that RequireJS has the largest bundle sizes but surprisingly its runtime costs are very close to Rollup and Closure. (See updated results above for details.)

Update 3: I wrote optimize-js, which alleviates some of the performance costs of parsing functions-within-functions.