Sometimes you have to use illegal WAI-ARIA to make stuff work

In this blog post, I’d like to recap an experience I just had while trying to apply some accessibility enhancements to the NoodleApp client.

The problem

NoodleApp uses keyboard shortcuts to allow users to switch back and forth between posts, messages etc. that are displayed on the screen. Using the j and k keys, one can move down and up through the lists respectively. However, this will only change a visual indicator, done in CSS, but not give any indication that a real focus change occurred. If one presses tab, for example, focus will move to the next item depending on where keyboard focus last was, and not where the j and k shortcuts took the user.

This is not new: Twitter uses similar shortcuts, too, and even Gmail has them, allowing users to move among message threads.

All of these implementations, however, only change a visual indicator. They neither adjust keyboard focus, nor do they communicate a focus change to screen readers. In addition, at least the screen readers on Windows would not immediately be able to use these keyboard shortcuts anyway, since their virtual buffers and quick navigation keys would be captured before they reached the web application.

The easy part

The easy part of the solution to the problem is this:

  1. Add tabindex="0" to the ol element that comprises the whole list to make it keyboard focus-able and include it in the tab order at the position determined by the flow of elements. Since its default location is appropriate, 0 is the correct value here.
  2. Add tabindex="-1" to each child li element, each of which contains a single post. This makes them focus-able without including them in the tab order individually; such extra tab stops are unnecessary here.
  3. In the handling of the j and k keys, call .focus() on the next reachable post to set focus to that element, or on the first post when none is focused yet and the user presses j or k.

These give us keyboard focus-ability. What one can do now is press j or k to move through the list of posts, and then press tab to actually move into the post details and onto items such as the user name link or one of the reply, re-post etc. actions. Very handy if one only uses the keyboard to work NoodleApp.
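
To make the three steps above concrete, here is a minimal sketch in TypeScript. It is not NoodleApp's actual code: the names Focusable, nextIndex and moveFocus are hypothetical, and stopping at the ends of the list instead of wrapping around is an assumption. The Focusable interface stands in for the li elements carrying tabindex="-1", so the logic can also be run outside a browser.

```typescript
// Stand-in for an li element with tabindex="-1" (hypothetical name);
// in a page, any HTMLElement satisfies this shape.
interface Focusable {
  focus(): void;
}

// Returns the index to move to. `current` is -1 when no post is focused yet.
// Returns -1 for an empty list and leaves `current` unchanged for other keys.
function nextIndex(current: number, key: string, length: number): number {
  if (length === 0) return -1;
  if (key !== "j" && key !== "k") return current;
  if (current < 0) return 0; // nothing focused yet: start at the first post
  const step = key === "j" ? 1 : -1; // j moves down, k moves up
  // Assumption: stop at the first and last post rather than wrapping.
  return Math.min(length - 1, Math.max(0, current + step));
}

// Moves real keyboard focus to the target post and returns its index.
function moveFocus(posts: Focusable[], current: number, key: string): number {
  const target = nextIndex(current, key, posts.length);
  if (target >= 0 && target !== current) {
    posts[target]?.focus();
  }
  return target;
}
```

In a page, the caller would keep the returned index between key presses and pass the list's li elements as posts, wired up from a keydown handler.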

The tricky, AKA screen reader part

All the above does not yet give us any speech when it comes to screen readers. Generic HTML list items, normally non-focus-able, are not something screen readers would speak on focus. Moreover, the list would not be treated as a widget anyway yet.

The latter is easily solved by adding an appropriate role="listbox" to the ol element we already added the tabindex="0" attribute to above. This causes screen readers on Windows to identify this list as a widget one can enter focus or forms mode on, allowing keys to pass directly to the browser and web app instead of being captured by the screen reader’s virtual buffer.

And here is where it gets nasty. According to the documentation on the listbox role, child elements have to be of role option.

OK, I thought. Great, let’s just add role="option" to each li element, then.
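
For illustration, the structure at this stage can be sketched as a small TypeScript helper that produces the markup. The function name renderListbox is hypothetical, and the inner HTML of each post is passed in as an already-built string, which is an assumption made to keep the sketch short.

```typescript
// Builds the spec-conformant structure: a focus-able listbox whose
// children are options that can receive focus but are not tab stops.
// `itemsHtml` holds the inner HTML of each post (assumed to be trusted).
function renderListbox(itemsHtml: string[]): string {
  const options = itemsHtml
    .map((html) => `  <li role="option" tabindex="-1">${html}</li>`)
    .join("\n");
  return `<ol role="listbox" tabindex="0">\n${options}\n</ol>`;
}
```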

In Firefox and NVDA, this worked nicely. Granted, there was not any useful speech yet, since the list item spoke all text contained within, giving me the user name and such a couple of times, but hey, for a start, that was not bad at all! NVDA switched into focus mode when it was supposed to, tabbing gave me the child accessibles, all was well.

And then came my test with Safari and VoiceOver on Mac OS X.

And what I found was that role="option", despite it being said that this could contain images, caused all child items to disappear. The text was concatenated, but the child accessibles were all flattened away. Tabbing yielded silence, VoiceOver could interact with text that it then found was not there anyway, etc., etc.

So, while my solution worked great on Windows with Firefox and NVDA, Safari and VoiceOver, a popular combination among blind people, failed miserably.

The solution

I then tried some things to see what effect they would have on VoiceOver:

  • I just added an aria-label to the existing code to see if that would make things better. It did not.
  • I tried the tree and treeitem roles. Result: List was gone completely. Apparently VoiceOver and Safari do not support tree views at present.

Out of desperation, I then thought of the group role. Those list items are essentially grouping elements for several child widgets. So I changed role="option" to role="group" and made an aria-label (the name has to be specified by the author) containing the user name, post text and relative time stamp.
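
A minimal sketch of how such a label could be assembled, in TypeScript. This is not the code from my commit: the Post shape, the function names, and the comma used as a separator are all assumptions made for illustration. Only the order of the parts (user name, post text, relative time stamp) comes from the description above.

```typescript
// Hypothetical shape of the data behind one list item.
interface Post {
  userName: string;
  text: string;
  relativeTime: string;
}

// Joins user name, post text and relative time stamp into one label,
// collapsing whitespace runs so the screen reader gets a clean string.
function buildLabel(post: Post): string {
  return [post.userName, post.text, post.relativeTime]
    .map((part) => part.trim().replace(/\s+/g, " "))
    .filter((part) => part.length > 0)
    .join(", "); // separator is an assumption
}

// Escapes a value for use inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Attributes for each li: a group instead of an option, with an
// author-supplied name.
function groupAttributes(post: Post): string {
  return `role="group" tabindex="-1" aria-label="${escapeAttr(buildLabel(post))}"`;
}
```

Because the group role takes its name from the author and not from its contents, the label has to be rebuilt whenever the relative time stamp shown in the post changes.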

And miraculously, it works! It works in both Firefox and NVDA, and Safari and VoiceOver combinations. Screen reader users now get speech when they navigate through the list with j and k, after they have switched their screen reader to focus or forms mode.

Yes, I know it is illegal to have group elements as child elements of a listbox role. But the problem is: neither WAI-ARIA nor HTML5 give me an equivalent to a rich list item known, for example, to XUL. And there is no other equivalent. Grid, gridrow, treegrid, rowgroup etc. are all not applicable, since we are not dealing with tabular, editable content.

Moreover, I cannot even be sure which of the browser/screen reader combinations is right with regards to flattening or not flattening content in role="option". The spec is not a hundred percent clear, so either could be right or wrong.

So, to have a solution that works now in popular browser/screen reader combinations, I had to resort to this admittedly illegal construct. Fortunately, it works nicely! Next step is obviously to advocate for a widget type either in HTML or WAI-ARIA that is conceptually an option item, but can hold rich compound child content.

What am I doing this for, anyway?

You may ask yourself: “If I can just read through with my virtual cursor, why do I want to use the other navigation method?”

The answer is: Yes, you can read through your list of posts using the virtual cursor. Problem is: Once the view is refreshed, either because you’ve reached the bottom and it loads older posts, or because there were new posts arriving at the top and the view needed refreshing, you lose your place. Using forms/focus mode and the j and k keys will remember your choice even if you load older posts or newer posts arrive after you started reading. You can also use other quick keys like r to reply, f to follow a user, and more, documented at the top of the page of NoodleApp once you open it for the first time. This is not unimportant for efficient reading, to have a means for the screen reader to keep track of where you are. And the virtual buffer concept does not always make this easy with dynamic content.

If you have suggestions

Please feel free to comment if you feel that I’m going about this the wrong way altogether, or if you think there are existing roles more suitable for the task than what I’ve chosen. Just remember that it has to meet the above stated criteria of focus-ability and interact-ability on the browser, not the virtual buffer level.

The code

If you’re interested in looking at the actual code, my commit can be found on GitHub.


9 thoughts on “Sometimes you have to use illegal WAI-ARIA to make stuff work”

  1. As developers, I think it must be par for the course that we use stuff against the spec to work around existing bugs, knowing that if eventually the bugs get fixed (or the buggy versions’ usage stats drop below some certain magical number) we would then be responsible for going back and rewriting that code. I don’t like to think how often that doesn’t happen.

    I’d rather not write crappy code to get around bugs, but if it’s got to work *today* with everyone you expect to be involved, then it has to be done. 🙁

  2. Interesting. One question, though. You note:

    Moreover, I cannot even be sure which of the browser/screen reader combinations is right with regards to flattening or not flattening content in role="option". The spec is not a hundred percent clear, so either could be right or wrong.

    Where in the spec is the possibility of flattening suggested? I saw nothing that would even hint at this, which leads me to think VoiceOver is just wrong here.

  3. “They neither adjust keyboard focus, nor do they communicate a focus change to screen readers”

    I am a Gmail user, and a keyboard user, so the above statement raises questions. With the keyboard shortcuts on in Gmail, by default the focus is on the first mail item. I hit O (“oh”) or Enter, and I open the first e-mail. I hit J, then hit Enter, and I open the second message. I wonder if focus is actually moved for me, or if it is some facade and the messages are loaded in some virtual space.

  4. Marco,

    I have begun to look at Safari on IOS and MacOSX. Up to now I have spent the bulk of my time testing EVERY CR test case on Firefox. I had assumed that Apple would do the testing on their platforms. If you think MacOSX has problems try Safari on IOS. They don’t even map aria-label to a name. At IBM we have begun to report a lot of bugs to Apple. Developers should not be expected to work around bugs that are basic like these.

    I have no idea why Safari is having this many bugs this late in the game. So, my recommendation is that we start reporting bugs at We are doing this at IBM. We are looking to see if Apple fixes the bugs. They have said they will look at them.

    The problems we see extend beyond ARIA. On IOS, VoiceOver moves focus without an author setting the tab index. Focus events do get generated as you gesture for moves to the next item. This draws the visual focus but when you move focus to content obscured by the window border, Safari draws the focus rectangle outside the window forcing authors to monitor focus changes and having to scroll the window to keep the focus ring in the window. Adding to the problem, the active element is still on the body tag. These are things windows desktop browsers take care of. We were quite surprised to see problems of this magnitude.

    I am very concerned about the costs that could be incurred by our development community trying to support accessible IOS web content. I am hoping Apple fixes these issues quickly.

  5. Hi Marco,

    Thanks for sharing this solution. It’s a useful discovery, if a somewhat sad expression of the state of support for these things.

    I did some research on treeviews recently and VoiceOver does in fact support tree views, but only if they use the aria-activedescendant approach, which involves keyboard and focus management not entirely dissimilar to what you are after, if I understand correctly. Some quick tests show that the aria-activedescendant approach works pretty well for VoiceOver with list boxes, too. Unfortunately, it doesn’t appear to help the concatenation or flattening problem, and VoiceOver doesn’t seem to like aria-label (or aria-labelledby) on elements with role="option" or role="treeitem". In fact, adding aria-label to a treeitem node, believe it or not, actually turns the whole treeview into a listbox as far as VoiceOver/Safari are concerned. So it looks like role="group" will have to do. For simpler listboxes, though, aria-activedescendant might be a way to go, as support for it is quite good in most other screen reader and browser combos.

    Did you happen to test with NVDA or JAWS and Internet Explorer at all? Would be interesting to know if the differences between screen readers in Firefox and IE with list boxes align with those they show for tree views.

  6. Hi jason,

    oh yes I did test with IE 10 and NVDA, but the listbox was exposed as if the ARIA attributes weren’t there at all, so it was an ordered list without any numbering. No forms mode, and the items appeared like in the older version without my additions. I then decided to not even include comments on that here, because I know IE forces screen readers to do their own parsing, in essence making them another ARIA user agent. I was planning on talking to Mick and Jamie about this, though.

    I didn’t use aria-activedescendant because that requires IDs. Since NoodleApp doesn’t use IDs at all, I would have had to add them to each item and then manage the active descendant on the parent list box. Using the real focus and tabindex values was much more straight-forward, and made sure that keyboard users without screen readers benefitted from this, too. 🙂
