High-performance input handling on the web

Update: In a follow-up post, I explore some of the subtleties across browsers in how they fire input events.

There is a class of UI performance problems that arises from the following situation: an input event is firing faster than the browser can paint frames.

Several events can fit this description:

  • scroll
  • wheel
  • mousemove
  • touchmove
  • pointermove
  • etc.

Intuitively, it makes sense why this would happen. A user can jiggle their mouse and deliver precise x/y updates faster than the browser can paint frames, especially if the UI thread is busy and thus the framerate is being throttled (also known as “jank”).

Screenshot of Chrome Dev Tools showing that a long frame of 546ms can contain as many as four pointermove events

In the above screenshot, pointermove events are firing faster than the framerate can keep up.[1] This can also happen for scroll events, touch events, etc.

Update: In Chrome, pointermove is actually supposed to align/throttle to requestAnimationFrame automatically, but there is a bug where it behaves differently with Dev Tools open.

The performance problem occurs when the developer naïvely chooses to handle the input directly:

element.addEventListener('pointermove', () => {
  doExpensiveOperation()
})

In a previous post, I discussed Lodash’s debounce and throttle functions, which I find very useful for these kinds of situations. Recently, however, I found a pattern I like even better, so I want to discuss that here.

Understanding the event loop

Let’s take a step back. What exactly are we trying to achieve here? Well, we want the browser to do only the work necessary to paint the frames that it’s able to paint. For instance, in the case of a pointermove event, we may want to update the x/y coordinates of an element rendered to the DOM.

The problem with Lodash’s throttle()/debounce() is that we would have to choose an arbitrary delay (e.g. 20 milliseconds or 50 milliseconds), which may end up being faster or slower than the browser is actually able to paint, depending on the device and browser. So really, we want to throttle to requestAnimationFrame():

element.addEventListener('pointermove', () => {
  requestAnimationFrame(doExpensiveOperation)
})

With the above code, we are at least aligning our work with the browser’s event loop, i.e. firing right before style and layout are calculated.

However, even this is not really ideal. Imagine that a pointermove event fires three times for every frame. In that case, we will essentially do three times the necessary work on every frame:

Chrome Dev Tools screenshot showing an 82 millisecond frame where there are three pointermove events queued by requestAnimationFrame inside of the frame

This may be harmless if the code is fast enough, or if it’s only writing to the DOM. However, if it’s both writing to and reading from the DOM, then we will end up with the classic layout thrashing scenario,[2] and our rAF-based solution is actually no better than handling the input directly, because we recalculate the style and layout for every pointermove event.

Chrome Dev Tools screenshot of layout thrashing, showing two pointermove events with large Layout blocks and the text "Forced reflow is a likely performance bottleneck"

Note the style and layout recalculations in the purple blocks, which Chrome marks with a red triangle and a warning about “forced reflow.”
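As a rough illustration of the difference (a sketch, not code from the UI described in this post), the first function below interleaves reads and writes, so each offsetWidth read that follows a write forces a synchronous layout. The second does all of its reads before any writes. The function names and the "half the width" rule are made up for the example, and elements is assumed to be a plain array, e.g. from Array.from(document.querySelectorAll(...)).

```javascript
// Hypothetical example: set each element's height to half of its width.

// Interleaved: every iteration reads layout and then dirties it again,
// so the next read has to force a synchronous style/layout pass.
function resizeInterleaved (elements) {
  for (const el of elements) {
    const width = el.offsetWidth // read (forces layout if the DOM is dirty)
    el.style.height = `${width / 2}px` // write (dirties layout)
  }
}

// Batched: all reads happen first, against a clean layout, then all writes.
function resizeBatched (elements) {
  const widths = elements.map(el => el.offsetWidth) // reads
  elements.forEach((el, i) => {
    el.style.height = `${widths[i] / 2}px` // writes
  })
}
```

With n elements, the interleaved version can force up to n - 1 extra layouts inside a single handler, which is the pattern Chrome flags as "forced reflow."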

Throttling based on framerate

Again, let’s take a step back and figure out what we’re trying to do. If the user is dragging their finger across the screen, and pointermove fires 3 times for every frame, then we actually don’t care about the first and second events. We only care about the third one, because that’s the one we need to paint.

So let’s only run the final callback before each requestAnimationFrame. This pattern will work nicely:

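function throttleRAF () {
  let queuedCallback
  return callback => {
    // Only schedule a frame if one is not already pending
    if (!queuedCallback) {
      requestAnimationFrame(() => {
        // Clear the slot before running, so the next call can queue a new frame
        const cb = queuedCallback
        queuedCallback = null
        cb()
      })
    }
    // Always keep the most recent callback; earlier ones are dropped
    queuedCallback = callback
  }
}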

We could also use cancelAnimationFrame for this, but I prefer the above solution because it’s calling fewer DOM APIs. (It only calls requestAnimationFrame() once per frame.)
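For comparison, here is a minimal sketch of what the cancelAnimationFrame variant might look like. The name throttleRAFWithCancel is made up, and the raf/caf parameters exist only so that the function can be exercised outside a browser; by default they are the real browser APIs.

```javascript
// Alternative sketch: cancel the pending frame and schedule a new one
// every time the throttled function is called.
function throttleRAFWithCancel (
  raf = requestAnimationFrame,
  caf = cancelAnimationFrame
) {
  let handle = null
  return callback => {
    if (handle !== null) {
      caf(handle) // drop the previously queued callback
    }
    handle = raf(() => {
      handle = null
      callback()
    })
  }
}
```

The end result is the same (only the last callback runs), but if the event fires three times within a frame, this version calls requestAnimationFrame three times and cancelAnimationFrame twice, rather than calling requestAnimationFrame once.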

This is nice, but at this point we can still optimize it further. Recall that we want to avoid layout thrashing, which means we want to batch all of our reads and writes to avoid unnecessary recalculations.

In “Accurately measuring layout on the web”, I explore some patterns for queuing a timer to fire after style and layout are calculated. Since writing that post, a new web standard called requestPostAnimationFrame has been proposed, and it fits the bill nicely. There is also a good polyfill called afterframe.
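Until requestPostAnimationFrame is available, one way to approximate it is to queue a task from inside a rAF callback, so that the task should run after the browser has done its style and layout work for that frame. The sketch below uses setTimeout for that task. It is a simplified illustration of the idea, not the afterframe source, and createPostFrameTimer and its parameters are names invented for this example.

```javascript
// Rough stand-in for requestPostAnimationFrame (an approximation only).
// `raf` and `queueTask` are injectable so the function can be exercised
// outside a browser; by default they use the real browser APIs.
function createPostFrameTimer (
  raf = requestAnimationFrame,
  queueTask = cb => setTimeout(cb, 0)
) {
  return callback => {
    raf(() => {
      // rAF runs just before style/layout; the queued task runs afterwards
      queueTask(callback)
    })
  }
}

// In a browser:
// const postFrame = createPostFrameTimer()
// postFrame(() => { /* DOM reads go here */ })
```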

To best align our DOM updates with the browser’s event loop, we want to follow these simple rules:

  1. DOM writes go in requestAnimationFrame().
  2. DOM reads go in requestPostAnimationFrame().

This works because we write to the DOM right before the browser needs to calculate style and layout (in rAF), and then we read from the DOM once the calculations have been made and the DOM is “clean” (in rPAF).

If we do this correctly, then we shouldn’t see any warnings in the Chrome Dev Tools about “forced reflow” (i.e. a forced style/layout outside of the browser’s normal event loop). Instead, all layout calculations should happen during the regular event loop cycle.

Chrome Dev Tools screenshot showing one pointermove per frame and large layout blocks with no "forced reflow" warning

In the Chrome Dev Tools, you can tell the difference between a forced layout (or “reflow”) and a normal one because of the red triangle (and warning) on the purple style/layout blocks. Note that above, there are no warnings.

To accomplish this, let’s make our throttler more generic, and create one that can handle requestPostAnimationFrame as well:

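// `timer` is any function that schedules a callback, e.g. requestAnimationFrame
function throttle (timer) {
  let queuedCallback
  return callback => {
    // Only start a timer if one is not already pending
    if (!queuedCallback) {
      timer(() => {
        // Clear the slot before running, so the next call can start a new timer
        const cb = queuedCallback
        queuedCallback = null
        cb()
      })
    }
    // Always keep the most recent callback; earlier ones are dropped
    queuedCallback = callback
  }
}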

Then we can create multiple throttlers based on whether we’re doing DOM reads or writes:[3]

const throttledWrite = throttle(requestAnimationFrame)
const throttledRead = throttle(requestPostAnimationFrame)

element.addEventListener('pointermove', e => {
  throttledWrite(() => {
    doWrite(e)
  })
  throttledRead(() => {
    doRead(e)
  })
})

Effectively, we have implemented something like fastdom, but using only requestAnimationFrame and requestPostAnimationFrame!
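One caveat (see footnote 3) is that each independent piece of work needs its own throttler. The sketch below shows why, using a manually flushed queue as a stand-in for requestAnimationFrame so that it can run anywhere. The throttle function is repeated so the snippet runs on its own, and the "tooltip" and "drag handle" labels are hypothetical.

```javascript
function throttle (timer) {
  let queuedCallback
  return callback => {
    if (!queuedCallback) {
      timer(() => {
        const cb = queuedCallback
        queuedCallback = null
        cb()
      })
    }
    queuedCallback = callback
  }
}

// Stand-in for requestAnimationFrame: callbacks run when we call flush()
const frames = []
const fakeFrame = cb => frames.push(cb)
const flush = () => frames.splice(0).forEach(cb => cb())

const log = []

// Footgun: two unrelated updates share one throttler
const shared = throttle(fakeFrame)
shared(() => log.push('tooltip'))
shared(() => log.push('drag handle'))
flush()
// log is now ['drag handle']; the tooltip update was silently dropped

// Fix: one throttler per independent update
const throttleTooltip = throttle(fakeFrame)
const throttleDragHandle = throttle(fakeFrame)
throttleTooltip(() => log.push('tooltip'))
throttleDragHandle(() => log.push('drag handle'))
flush()
// log is now ['drag handle', 'tooltip', 'drag handle']
```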

Pointer event pitfalls

The last piece of the puzzle (at least for me, while implementing a UI like this) was to avoid the pointer events polyfill. I found that, even after implementing all the above performance improvements, my UI was still janky in Firefox for Android.

After some digging with WebIDE, I found that Firefox for Android currently does not support Pointer Events, and instead only supports Touch Events. (This is similar to the current version of iOS Safari.) After profiling, I found that the polyfill itself was taking up a lot of my frame budget.

Screenshot of Firefox WebIDE showing a lot of time spent in pointer-events polyfill

So instead, I switched to handling pointer/mouse/touch events myself. Hopefully in the near future this won’t be necessary, and all browsers will support Pointer Events! We’re already close.
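A minimal sketch of what handling these events by hand might look like is below. This is an illustration under assumptions, not the code from my UI: getMoveEventNames and getPoint are hypothetical helpers, and a real implementation would also need to handle the start/end events and multi-touch.

```javascript
// Pick native move events without a polyfill. `win` is normally `window`.
function getMoveEventNames (win) {
  return 'PointerEvent' in win
    ? ['pointermove']
    : ['mousemove', 'touchmove']
}

// Normalize pointer, mouse, and touch events to a single { x, y } shape.
function getPoint (event) {
  const source = event.touches && event.touches.length
    ? event.touches[0] // touch events keep coordinates on each Touch
    : event // pointer and mouse events carry clientX/clientY directly
  return { x: source.clientX, y: source.clientY }
}

// In a browser:
// for (const name of getMoveEventNames(window)) {
//   element.addEventListener(name, e => handleMove(getPoint(e)))
// }
```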

Here is the before-and-after of my UI, using Firefox on a Nexus 5:

 

When handling very performance-sensitive scenarios, like a UI that should respond to every pointermove event, it’s important to reduce the amount of work done on each frame. I’m sure that this polyfill is useful in other situations, but in my case, it was just adding too much overhead.

One other optimization I made was to delay updates to the store (which trigger some extra JavaScript computations) until the user’s drag had completed, instead of on every drag event. The end result is that, even on a resource-constrained device like the Nexus 5, the UI can actually keep up with the user’s finger!
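The idea of deferring the store update can be sketched roughly as follows. The render and commit callbacks are placeholders for whatever the UI and store actually do; createDragSession is a name made up for this example.

```javascript
// Sketch: do cheap visual updates on every move, and the expensive
// store update only once, when the drag ends.
function createDragSession (render, commit) {
  let latestPoint = null
  return {
    move (point) {
      latestPoint = point
      render(point) // cheap per-frame visual update
    },
    end () {
      if (latestPoint !== null) {
        commit(latestPoint) // expensive store update, once per drag
        latestPoint = null
      }
    }
  }
}
```

Here move() would be called from the (throttled) move handler, and end() from the pointerup/mouseup/touchend handler.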

Conclusion

I hope this blog post was helpful for anyone handling scroll, touchmove, pointermove, or similar input events. Thinking in terms of how I’d like to align my work with the browser’s event loop (using requestAnimationFrame and requestPostAnimationFrame) was useful for me.

Note that I’m not saying to never use Lodash’s throttle or debounce. I use them all the time! Sometimes it makes sense to just let a timer fire every n milliseconds – e.g. when debouncing window resize events. In other cases, I like using requestIdleCallback – for instance, when updating a non-critical part of the UI based on user input, like a “number of characters remaining” counter when typing into a text box.
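As a rough sketch of the requestIdleCallback case, the helper below coalesces updates to a “characters remaining” counter and renders only the latest value when the browser is idle. createRemainingCounter, render, and scheduleIdle are hypothetical names; the setTimeout fallback is an assumption for browsers where requestIdleCallback is not available.

```javascript
function createRemainingCounter (maxLength, render, scheduleIdle) {
  // Prefer the injected scheduler, then requestIdleCallback, then setTimeout
  const schedule = scheduleIdle ||
    (typeof requestIdleCallback === 'function'
      ? requestIdleCallback
      : cb => setTimeout(cb, 0))
  let latestLength = 0
  let scheduled = false
  return length => {
    latestLength = length // always remember the most recent value
    if (scheduled) return // an idle callback is already pending
    scheduled = true
    schedule(() => {
      scheduled = false
      render(maxLength - latestLength)
    })
  }
}

// In a browser:
// const update = createRemainingCounter(280, n => { counter.textContent = n })
// input.addEventListener('input', () => update(input.value.length))
```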

In general, though, I hope that once requestPostAnimationFrame makes its way into browsers, web developers will start to think more purposefully about how they do UI updates, leading to fewer instances of layout thrashing. fastdom was written in 2013, and yet its lessons still apply today. Hopefully when rPAF lands, it will be much easier to use this pattern and reduce the impact of layout thrashing on web performance.

Footnotes

1. In the Pointer Events Level 2 spec, it says that pointermove events “may be coalesced or aligned to animation frame callbacks based on UA decision.” So hypothetically, a browser could throttle pointermove to fire only once per rAF (and if you need precise x/y events, e.g. for a drawing app, you can use getCoalescedEvents()). It’s not clear to me, though, that any browser actually does this. Update: see comments below, some browsers do! In any case, throttling the events to rAF in JavaScript accomplishes the same thing, regardless of UA behavior.

2. Technically, the only DOM reads that matter in the case of layout thrashing are DOM APIs that force style/layout, e.g. getBoundingClientRect() and offsetLeft. If you’re just calling getAttribute() or classList.contains(), then you’re not going to trigger style/layout recalculations.

3. Note that if you have different parts of the code that are doing separate reads/writes, then each one will need its own throttler function. Otherwise one throttler could cancel the other one out. This can be a bit tricky to get right, although to be fair the same footgun exists with Lodash’s debounce/throttle.
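As a sketch of the getCoalescedEvents() approach mentioned in footnote 1: if the browser coalesces pointermove events, the individual points can still be recovered inside the handler. The getPoints helper is a made-up name, and the fallback to the event itself is an assumption for browsers that don’t implement the method.

```javascript
// Return every x/y point represented by a pointermove event.
function getPoints (event) {
  const coalesced = typeof event.getCoalescedEvents === 'function'
    ? event.getCoalescedEvents()
    : []
  // Fall back to the event itself if nothing was coalesced
  const events = coalesced.length > 0 ? coalesced : [event]
  return events.map(e => ({ x: e.clientX, y: e.clientY }))
}

// In a browser, e.g. for a drawing app:
// canvas.addEventListener('pointermove', e => {
//   for (const point of getPoints(e)) drawTo(point)
// })
```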

16 responses to this post.

  1. […] Performance: • Google PageSpeed specifics: improving a site’s score and its search ranking • Largest Contentful Paint (LCP). A new performance metric that helps measure the load time of a web page’s main content. • Time to First Byte: what it is and why it matters • Native lazy loading on the web • Native lazy-loading already works in Chrome 76! • Establish network connections early to improve perceived page speed • High-performance input handling on the web […]

  2. Posted by jaffathecake on August 11, 2019 at 10:28 PM

    Chrome and Firefox already sync move and touch movement to the render steps. Here’s a demo http://event-timing.glitch.me/

    • Posted by jaffathecake on August 11, 2019 at 10:33 PM

      Oh and scroll events are supposed to be sync’d with raf, but I haven’t tested.

      • Thanks for the feedback! When you say “synced with raf,” do you mean that it fires exactly one pointermove event per rAF?

        The screenshots of the Dev Tools I took above were actually on Chrome for Android v76 (on a Nexus 5), and unless I’m reading the timeline wrong, it looks like it’s firing up to 4 pointermove events per frame.

        Edit OK interestingly, your demo seems to show that Firefox 68 is indeed only firing one pointermove per rAF, whereas Chrome 76 is firing multiple (screenshot). I’m testing on Ubuntu 18.04 with a mouse BTW.

        Edit 2 OK, I see that your demo can optionally show coalesced events as separate dots on the timeline, and then when it’s unchecked, Chrome only fires once per rAF. Looks like I have some more research to do on this!

      • Posted by jaffathecake on August 12, 2019 at 8:56 AM

        Huh, your Chrome screenshot does indeed show that mouse callbacks are happening out of sync with requestAnimationFrame. The callback time should be exactly on the frame boundary. It’s working as expected on OSX. I’ll dig into it

      • Posted by jaffathecake on August 12, 2019 at 10:06 AM

        Yeah, we’ve been able to recreate the bug. I think it’s limited to Linux. https://bugs.chromium.org/p/chromium/issues/detail?id=992954

      • In the screenshot I provided, I had “Show uncoalesced events” checked, which may be why it’s showing extra events. That screenshot is indeed from Ubuntu, though.

        As for the Dev Tools screenshot, that one was on Chrome for Android (debugged via chrome:inspect on Ubuntu). Maybe it’s just an artifact of the Dev Tools? When I run your event-timing demo on Chrome for Android, it does appear to correctly align with rAF.

  3. Posted by haggen on August 12, 2019 at 7:48 AM

    Nice roundup, thank you! Could you clarify something for me though, when you present the algorithm for batching event triggers you save and call “queuedCallback” inside a test for “not queuedCallback”, i.e. only when it’s not a function. Is that intended or am I reading this wrong? Thanks again!

    • Yes that is intended. If the queuedCallback is null or undefined, then that means we haven’t queued a timer yet. The goal is to queue exactly one timer regardless of how many times the function is called.

      Note that queuedCallback is also set back to null once the timer is called, so the process can start all over again. Maybe not the clearest way I could have written that function! :)

  4. On Chrome I noticed pointer events firing faster than rAF cycle when using Wacom stylus (my default pointer device) and touch but not mouse.

  5. Posted by Lu Nelson on August 14, 2019 at 3:54 AM

    These are fascinating insights. I’m curious to know what you think of:

    (A) whether there might be an advantage to having a “Pre(Pre)AnimationFrame” hook? I realize this is what rAF already is, so I mean effectively: a way of splitting rAF into two queues, one of which runs before the other? I’m thinking of scroll-event handlers where you want an as-accurate-as-possible current measurement of window.scrollY right before you calculate in JS. But maybe I’m not quite understanding where the ideal points in the cycle are for different kinds of manipulations…🤷🏻‍♂️

    (B) a pattern which synchronizes events to the AF loop, both those which fire too fast and those which fire too slow (i.e. would otherwise miss frames). I know different browsers fire UI events at different rates and it seems to me one might want to ensure updates happen on every frame even if they are coming at a slower rate. There’s a sort of inversion-of-control possible here, that drives updates from an AF-loop and just uses the UI events to debounce the de-activation of the loop, I could dig it up

    • A) I haven’t tested, but I suspect that browsers that align input events to rAF are already effectively using a “pre-rAF” hook. (You might check out my follow-up post where I explore some of these details.) In general, though, I think you can calculate window.scrollY in your case anytime within rAF. Keep in mind it hardly matters if you can get a slightly more accurate measure before rAF, because you can’t render until after rAF anyway. (Unless your goal is for analytics or something non-UI related. :) )

      B) Are you saying that if the input events fire more slowly than rAF, that you may want to calculate a “tween” between the two and render the UI state at that point? Yeah, I can see use cases for that. I think game engines actually do something similar. Having a rAF callback that runs on every turn of the event loop and does these calculations can probably fit the bill for what you want (although IMO it would probably be too much overhead for anything but the most precision-sensitive of scenarios, e.g. a game engine).

      • Posted by Lu Nelson on August 15, 2019 at 2:35 AM

        Thanks for those replies Nolan.

        Re (A) what I had in mind was combining a synchronous emitter with a self-looping rAF callback, to run functions in a certain order within the overall AF callback, by ’emitting’ sub-frame hooks; but I’m not sure if it’s useful at all 🤷🏻‍♂️

        Re (B) yes what i have in mind is a bit like a game engine approach, but not a loop that runs continuously, rather it starts looping on the first event and debounces the event to determine when to stop. The timeout window on this debouncing is effectively the minimum event frequency. For rapid-firing events it inverts the control such that all actions are triggered in the loop and the debounced event-stream merely toggles the loop/dont-loop state. An implementation, combining your technique of adding a PostAnimationFrame callback, looks like this:

  6. Posted by Alexander Farkas on August 18, 2019 at 9:46 AM

    I find your additional use of requestPostAnimationFrame quite interesting. I started to use the rAF + setTimeout 0 pattern in 2015 as a safe-to-read pattern and additionally moved all my DOM changes into rAF.

    However, because you normally want to read layout first and then react with a DOM write, and rPAF moves you to the next frame, I considered this pattern to be suitable only for things where you are fine with a certain latency. Now I realize that you will either only lose the first frame of your pointermove cycle (not a big deal, and the user shouldn’t notice it) or that you can already do the style reads in the pointerdown event. So thanks for this rethinking.

    However, my general answer to this, until now, was that I simply throttle only my DOM changes with rAF and leave my layout reads untreated inside of the event handler. From my testing this already did the performance trick, because scheduling the DOM changes with rAF fully solved the main issue of layout thrashing, and then calling getBoundingClientRect or clientX/Y multiple times didn’t really make a difference compared to calling a throttled method instead. (I’m curious whether you can test this with your setup; of course any layout-neutral stuff should go into the throttled rAF part.)

    In the end, from a pure layout thrashing standpoint (not the throttling one): by moving all DOM-changing code into a rAF, every other place becomes a safe reading zone, which makes rPAF only needed if you have to deal with “bad” third-party scripts that do not use rAF to change the DOM.

    About your comparison to fastdom: this is a little bit misleading, because fastdom does something fundamentally different here. Instead of moving you directly after the browser’s style/layout calculation, as rPAF/afterframe does, fastdom moves you inside of a rAF callback and makes sure that the read/measure queue is simply executed before the write/mutate queue. This makes fastdom.read quite useless in most cases and maximizes layout thrashing if used with third-party scripts that are not using fastdom themselves.

    Consider you have two different scripts that are doing reads and writes and do hook into the scroll event.

    Let’s say the first executed script “a” is a third-party script that does most things the right way and schedules its DOM changes with rAF. Now the later executed script “b” wants to do everything right and uses fastdom.read. Because fastdom.read schedules a rAF at a later point, the read callback of script b is executed after script a has changed the DOM (inside the rAF!!!) and therefore produces layout thrashing. If script b did not use fastdom.read and only used fastdom.write, it would never thrash layout.

    Or assume that the third-party script is not that well written and doesn’t schedule DOM changes using rAF. In that case script b with fastdom.read will always produce layout thrashing, no matter which of those scripts is imported/executed first. If script b doesn’t use fastdom.read, on the other hand, it depends on the order in which they are executed.

    Or assume both scripts are using fastdom: if they are using fastdom.write the right way, they will never thrash layout, no matter whether they are using fastdom.read or executing code immediately in the event listener.

    If you think this through with different use cases and different code patterns, the usage of fastdom.read/measure has either a neutral or a bad influence on layout thrashing, simply because it moves all layout reads right before the layout is calculated. This is a big, big difference from your afterframe implementation, because with afterframe/rPAF you should always be safe to read (until some idiot uses rPAF to change the CSSOM or DOM).

    About throttling scroll events with rAF, as opposed to touchmove/mousemove/pointermove: I consider this unnecessary as long as all my DOM-changing code is already bound to rAF and mostly throttled. The scroll event should be synced with the vsync, and if it is not, that is quite a big browser bug that should be fixed by the browser. I’m also not fully convinced by the tests in your newer article that seem to show this bug occurring in Safari, because testing this in Safari is nearly impossible: that browser’s rAF implementation is broken and is executed too late. In the end, if your tests are right, this is something that should be fixed by Safari in the first place.

  7. Excellent article, thanks for the deep dive and for mentioning useful links, like which methods will cause sync layout, requestPostAnimationFrame, and your follow-up article on the coalesced events by browser.

    Nice to know about the technique of using only the last queued rAF callback, without cancelling the previously scheduled rAF event handlers.
