Dialogs and shadow DOM: can we make it accessible?

Posted June 14, 2022 by Nolan Lawson in accessibility, Web. Tagged: shadow dom. 4 Comments

Last year, I wrote about managing focus in the shadow DOM, and in particular about modal dialogs. Since the <dialog> element has now shipped in all browsers, and the inert attribute is starting to land too, I figured it would be a good time to take another look at getting dialogs to play nicely with shadow DOM.

This post is going to get pretty technical, especially when it comes to the nitty-gritty details of accessibility and web standards. If you’re into that, then buckle up! The ride may be a bit bumpy.

Quick recap

Shadow DOM is weird. On paper, it doesn’t actually change what you can do in the DOM – with open mode, at least, you can access any element on the page that you want. In practice, though, shadow DOM upends a lot of web developer expectations about how the DOM works, and makes things much harder.

Image of Lisa Simpson in front of a sign saying "Keep out. Or enter, I'm a sign not a cop."

I credit Brian Kardell for this description of open shadow DOM, which is maybe the most perfect distillation of how it actually works.

For accessibility reasons, modal dialogs need to implement a focus trap. However, the DOM doesn’t have an API for “give me all the elements on the page that the user can Tab through.” So web developers came up with creative solutions, most of which amount to:

dialog.querySelectorAll('button, input, a[href], ...')

Unfortunately, this is exactly the thing that doesn’t work in the shadow DOM. querySelectorAll only finds elements in the same document or shadow root as the node it is called on; it doesn’t descend into nested shadow roots.

Like a lot of things with shadow DOM, there is a workaround, but it requires some gymnastics. These gymnastics are hard, and have a complexity and (probably) performance cost. So a lot of off-the-shelf modal dialogs don’t handle shadow DOM properly (e.g. a11y-dialog does not).
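To give a rough idea of what these gymnastics involve, here is a minimal sketch of a deep traversal. It is not how any particular library does it: `collectCandidatesDeep` and `CANDIDATE_SELECTOR` are hypothetical names, the selector is deliberately incomplete, and slot assignment is ignored, so the result only approximates the real Tab order.

```javascript
// Hypothetical, deliberately short selector. A real focus trap would use a
// longer list and also filter out disabled, hidden, and tabindex="-1" elements.
const CANDIDATE_SELECTOR = 'button, input, a[href]';

// Walk the tree in document order, descending into OPEN shadow roots.
// For closed and user-agent shadow roots, `el.shadowRoot` is null, so their
// contents are simply never seen. Slots are ignored (an assumption made to
// keep this short), so the result is only an approximation of Tab order.
function collectCandidatesDeep(root, selector = CANDIDATE_SELECTOR, found = []) {
  for (const el of root.children) {
    if (el.matches(selector)) {
      found.push(el);
    }
    if (el.shadowRoot) {
      collectCandidatesDeep(el.shadowRoot, selector, found);
    }
    collectCandidatesDeep(el, selector, found);
  }
  return found;
}
```

In a browser, you would call `collectCandidatesDeep(dialog)` in place of the `dialog.querySelectorAll(...)` call. Note that it visits every element rather than letting the browser's selector engine do the work, which is where the probable performance cost comes from.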

Note: My goal here isn’t to criticize a11y-dialog. I think it’s one of the best dialog implementations out there. So if even a11y-dialog doesn’t support shadow DOM, you can imagine a lot of other dialog implementations probably don’t, either.

Update: After this post was published, a11y-dialog added support for open shadow DOM!

A constructive dialog

“But what about <dialog>?”, you might ask. “The dang thing is called <dialog>; can’t we just use that?”

If you had asked me a few years ago, I would have pointed you to Scott O’Hara’s extensive blog post on the subject, and said that <dialog> had too many accessibility gotchas to be a practical solution.

If you asked me today, I would again point you to the same blog post. But this time, there is a very helpful 2022 update, where Scott basically says that <dialog> has come a long way, so maybe it’s time to give it a second chance. (For instance, the issue with returning focus to the previously-focused element is now fixed, and the need for a polyfill is much reduced.)

Note: One potential issue with <dialog>, mentioned in Rob Levin’s recent post on the topic, is that it doesn’t close when you click outside of it. This behavior has been proposed for the <dialog> element, but the WAI ARIA Authoring Practices Guide doesn’t actually stipulate it, so it seems like optional behavior to me.

To be clear: <dialog> still doesn’t give you 100% of what you’d need to implement a dialog (e.g. you’d need to lock the background scroll), and there are still some lingering discussions about how to handle initial focus. For that reason, Scott still recommends just using a battle-tested library like a11y-dialog.
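As an illustration of the scroll-locking part, here is a minimal sketch. `showModalWithScrollLock` is a hypothetical helper, and toggling `overflow` on a scroll container is only one of several possible approaches; `showModal()` and the `close` event are the standard <dialog> APIs.

```javascript
// Open `dialog` modally and stop `scroller` (for example
// document.documentElement) from scrolling until the dialog closes.
function showModalWithScrollLock(dialog, scroller) {
  const previousOverflow = scroller.style.overflow;
  scroller.style.overflow = 'hidden';

  // The "close" event fires however the dialog is dismissed,
  // including via the Esc key.
  dialog.addEventListener(
    'close',
    () => {
      scroller.style.overflow = previousOverflow;
    },
    { once: true }
  );

  dialog.showModal();
}
```

Restoring the previous value, rather than clearing it, avoids clobbering an `overflow` style that the page had already set inline.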

As always, though, shadow DOM makes things more complicated. And in this case, <dialog> actually has some compelling superpowers:

- It automatically limits focus to the dialog, with correct Tab order, even in shadow DOM.
- It works with closed shadow roots as well, which is impossible in userland solutions.
- It also works with user-agent shadow roots. (E.g. you can Tab through the buttons in a <video controls> or <audio controls>.) This is also impossible in userland, since these elements function effectively like closed shadow roots.
- It correctly returns focus to the previously-focused element, even if that element is in a closed shadow root. (This is possible in userland, but you’d need an API contract with the closed-shadow component.)
- The Esc key correctly closes the modal, even if the focus is in a user-agent shadow root (e.g. the pause button is focused when you press Esc). This is also not possible in userland.

So should everybody just switch over to <dialog>? Not so fast: it actually doesn’t perfectly handle focus, per the WAI ARIA Authoring Practices Guide (APG), because it allows focus to escape to the browser chrome. Here’s what I mean:

You reach the last tabbable element in the dialog and press Tab.
- Correct: focus moves to the first tabbable element in the dialog.
- Incorrect (<dialog>): focus goes to the URL bar or somewhere else in the browser chrome.
You reach the first tabbable element in the dialog and press Shift+Tab.
- Correct: focus moves to the last tabbable element in the dialog.
- Incorrect (<dialog>): focus goes to the URL bar or somewhere else in the browser chrome.
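The “correct” wrap-around behavior is easy to express once you have an ordered list of the tabbable elements. The sketch below assumes such a list already exists, which is exactly the hard part with shadow DOM; `nextFocusIndex` and `trapTab` are hypothetical names, not part of any library.

```javascript
// Given the index of the currently focused element in an ordered list of
// `count` tabbable elements, return the index that should receive focus next.
function nextFocusIndex(currentIndex, count, shiftKey) {
  if (count === 0) return -1; // nothing to focus
  if (currentIndex < 0) return shiftKey ? count - 1 : 0; // focus was outside the list
  const step = shiftKey ? -1 : 1;
  return (currentIndex + step + count) % count;
}

// Hypothetical keydown handler: `tabbables` is the ordered array of tabbable
// elements and `active` is the element that currently has focus.
function trapTab(event, tabbables, active) {
  if (event.key !== 'Tab' || tabbables.length === 0) return;
  const next = nextFocusIndex(tabbables.indexOf(active), tabbables.length, event.shiftKey);
  event.preventDefault(); // keep focus from leaving for the browser chrome
  tabbables[next].focus();
}
```

For simplicity, this version takes over every Tab press inside the dialog. A gentler variant would only intervene when focus is on the first or last element and leave the rest to the browser.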

This may seem like a really subtle difference, but the consensus of accessibility experts seems to be that the WAI ARIA APG is correct, and <dialog> is wrong.

So we’ve reached (yet another!) tough decision with <dialog>. Do we accept <dialog>, because at least it gets shadow DOM right, even though it gets some other stuff wrong? Do we try to build our own thing? Do we quit web development entirely and go live the bucolic life of a potato farmer?

Inert matter

While I was puzzling over this recently, it occurred to me that inert may be a step forward to solving this problem. For those unfamiliar, inert is an attribute that can be used to mark sections of the DOM as “inert,” i.e. untabbable and invisible to screen readers:

<main inert></main>
<div role="dialog"></div>
<footer inert></footer>

In this way, you could mark everything except the dialog as inert, and focus would be trapped inside the dialog.
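In a real page the dialog is rarely a direct child of the body, so you would need to walk up from the dialog and mark the siblings at each level. Here is a minimal sketch of that idea, assuming the `inert` property that reflects the attribute; `setInertOutside` is a hypothetical helper, and it ignores slotted content for brevity.

```javascript
// Mark every sibling of `dialog`, and every sibling of each of its ancestors,
// as inert (or not inert). Returns the elements that were touched so the
// caller can undo the change later.
function setInertOutside(dialog, inert = true) {
  const changed = [];
  let node = dialog;
  while (node && node.parentNode) {
    const parent = node.parentNode;
    for (const sibling of parent.children) {
      if (sibling !== node) {
        sibling.inert = inert;
        changed.push(sibling);
      }
    }
    // nodeType 11 is a DocumentFragment; one with a `host` is a shadow root.
    // Hop from the shadow root to its host so the walk continues outside it.
    const isShadowRoot = parent.nodeType === 11 && parent.host;
    node = isShadowRoot ? parent.host : parent;
  }
  return changed;
}
```

Calling `setInertOutside(dialog)` when the dialog opens and `setInertOutside(dialog, false)` when it closes would toggle the trap. A more careful version would remember which elements were already inert beforehand and leave those alone.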

As it turns out, this works perfectly for tabbing through elements in the shadow DOM, just like <dialog>! Unfortunately, it has exactly the same problem with focus escaping to the browser chrome. This is no accident: the behavior of <dialog> is defined in terms of inert.

Can we still solve this, though? Unfortunately, I’m not sure it’s possible. I tried a few different techniques, such as listening for Tab events and checking if the activeElement has moved outside of the modal, but the problem is that you still, at some point, need to figure out what the “first” and “last” tabbable elements in the dialog are. To do this, you need to traverse the DOM, which means (at the very least) traversing open shadow roots, which doesn’t work for closed or user-agent shadow roots. And furthermore, it involves a lot of extra work for the web developer, who has probably lost focus at this point and is daydreaming about that nice, quiet potato farm.
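For reference, the “has focus left the modal?” check can be sketched as follows. Both functions are hypothetical helpers and only see through open shadow roots: for a closed or user-agent shadow root, `activeElement` stops at the host element, which is the limitation described above.

```javascript
// Follow activeElement through nested OPEN shadow roots to find the element
// that really has focus. `root` is a document or a shadow root.
function getDeepActiveElement(root) {
  let active = root.activeElement;
  while (active && active.shadowRoot && active.shadowRoot.activeElement) {
    active = active.shadowRoot.activeElement;
  }
  return active;
}

// Like container.contains(node), but keeps climbing across shadow boundaries.
function isInsideDeep(container, node) {
  while (node) {
    if (node === container) return true;
    const parent = node.parentNode;
    // nodeType 11 with a `host` is a shadow root: continue from its host.
    node = parent && parent.nodeType === 11 && parent.host ? parent.host : parent;
  }
  return false;
}
```

A Tab listener could then test `isInsideDeep(dialog, getDeepActiveElement(document))`. Detecting the escape is not the hard part, though: deciding where to send focus next still requires knowing the first and last tabbable elements.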

Note: inert also, sadly, does not help with the Esc key in user-agent shadow roots, or returning focus to closed shadow roots when the dialog is closed, or setting initial focus on an element in a closed shadow root. These are <dialog>-only superpowers. Not that you needed any extra convincing.

Conclusion

Until the spec and browser issues have been ironed out (e.g. browsers change their behavior so that focus doesn’t escape to the browser chrome, or they give us some entirely different “focus trap” primitive), I can see two reasonable options:

- Use something like a11y-dialog, and don’t use shadow DOM or user-agent shadow components like <video controls> or <audio controls>. (Or do some nasty hacks to make it partially work.)
- Use shadow DOM, but don’t bother solving the “focus escapes to the browser chrome” problem. Use <dialog> (or a library built on top of it) and leave it at that.

For my readers who were hoping that I’d drop some triumphant “just npm install nolans-cool-dialog and it will work,” I’m sorry to disappoint you. Browsers are still rough around the edges in this area, and there aren’t a lot of great options. Maybe there is some mad-science way to actually solve this, but even that would likely involve a lot of complexity, so it wouldn’t be ideal.

Alternatively, maybe some of you are thinking that I’m focusing too much on closed and user-agent shadow roots. As long as you’re only using open shadow DOM (which, recall, is like the sign that says “I’m a sign, not a cop”), you can do whatever you want. So there’s no problem, right?

Personally, though, I like using <video controls> and <audio controls> (why ship a bunch of JavaScript to do something the browser already does?). And furthermore, I find it odd that if you put a <video controls> inside a <dialog>, you end up with something that’s impossible to make accessible per the WAI ARIA APG. (Is it too much to ask for a little internal consistency in the web platform?)

In any case, I hope this blog post was helpful for others tinkering around with the same problems. I’ll keep an eye on the browsers and standards space, and update this post if anything promising emerges.

4 responses to this post.

Posted by Martin on June 15, 2022 at 4:14 AM

Another technique that I’ve been using successfully is to add two divs to the DOM, one before the first focusable element of the dialog, and the other after the last. Make them focusable, then on focus they redirect respectively to the last and the first focusable elements. Then you don’t need to figure out all the focusable elements on the page…

- Posted by Nolan Lawson on June 15, 2022 at 6:18 AM
  
  Yep, this is the “nasty hack” I mentioned. The problem is that it’s not perfect: if you Shift+Tab from the last “bookend” element, then you can’t reach the last tabbable element inside of the closed or user-agent shadow root (e.g. the last button in the <video controls>).
  
  Technically you also can’t focus to the first element inside of a closed shadow root, but in the case of video or audio controls it doesn’t matter because the whole thing is the first tabbable element.
  
Posted by Chris on July 7, 2022 at 5:52 AM

I do not agree with this statement at all:
“Incorrect (<dialog>): focus goes to the URL bar or somewhere else in the browser chrome.”

This is just default browser behaviour; you get the same thing on the rest of the page. If browsers trapped the focus inside the document, it would be a horrible user experience. How would you change the URL if you are a keyboard user who is used to tabbing and doesn’t know about shortcuts like Ctrl+L?
So this will never change, and it will never be implemented in browsers.

- Posted by Nolan Lawson on July 7, 2022 at 6:51 AM
  
  That is actually exactly the conclusion of the majority of the folks in the thread I linked to: that the default browser behavior should be for the focus to never escape to the browser chrome through tabbing.
  
  I don’t consider myself an accessibility expert, but I did note that there was some disagreement on this topic. And I agree that it seems unlikely that browsers will change this.
  

Read the Tea Leaves: Software and other dark arts, by Nolan Lawson