Review of the WebVisum Firefox extension

Today, the WebVisum Firefox extension was announced in a post to the newsgroup. The features described in this post and on the WebVisum homepage sound almost too good to be true. Among the features are:

  • Ability to tag graphics, form fields, links, and other page elements. While some or all of these features have been available in some screen readers already, this feature is unique in that it works across platforms. It also sends the data back to the WebVisum web service so other members of the community can benefit from the labels someone provided.
  • Optical Character Recognition (OCR) to try and identify those images that absolutely won’t tell us through their SRC what they’re all about.
  • Visual page enhancements such as a high-contrast profile.
  • Suppression of automatic page refreshes or Flash content.
  • And most astonishingly: CAPTCHA solving!

A few days ago, the WebVisum development team asked me whether I would consider beta testing their extension. So, I had a bit of a head start with this tool, and I was very surprised when I started testing some of the features.

The tests

From my main screen reader, I already knew the capability to label graphics or HTML form elements that have missing alt text or labels. Instead of using those techniques, I applied a few labels to the main navigation images on the CakeWalk homepage using WebVisum. After labelling the graphics from my Windows computer, I fired up my Linux box, installed the extension there and surfed to the same page. Orca immediately picked up the labels I had given the graphics and used them as the link text.

I then went ahead and labelled the Search combobox on the German Heise Newsticker site. Again, after visiting the page from the other computer, the label for the combobox was read aloud.

And then I actually tried a CAPTCHA. I chose digg as my first target since I know they also offer an audio CAPTCHA. Of course this is not a 100% satisfactory solution because deaf-blind people are still left dead in the water with this, but it gave me a good reference to compare the results. I went into the new account creation process on digg, and when it came to the CAPTCHA, I let WebVisum do its magic. In less than 30 seconds, I got a result back, placed on my clipboard by the extension, ready to paste in. I compared it to what the audio CAPTCHA told me, and the results matched!

I repeated this step two more times because I had first chosen a user name that was already taken, and then goofed up something else in the form, and each time, the result was correct. Totally stunning!

I tried the same on Technorati, who also offer an audio CAPTCHA, and got the same result: The CAPTCHA was correctly solved.

As my third target, I chose MozillaZine, who, despite a couple of attempts on my part, still do not offer an audio CAPTCHA for registration or sending a reply to a forum without being logged in. Without this fall-back mechanism, this is a real-world scenario that visually impaired people are being faced with on an almost daily basis. And I’ll be darned, it worked out! I could register with the MozillaZine forums without any sighted assistance.

The conclusion

There are actually a couple of conclusions, concerns and questions that this extension raises.

The educational aspect

So here we are, having tried to educate web developers all over the world to use the W3C accessibility authoring guidelines, comply with Section 508 and what not, and now an accessibility extension comes along that allows for labelling controls, providing alternative text for graphics, and even sharing this with the community. So was all this educational endeavor in vain?

The answer can only be a firm and resolute: “No, it wasn’t!” While this extension allows users to correct obvious mistakes like a missing alt attribute on an image, it cannot make a site meet all the requirements for Section 508 compliance. And it should not! On the contrary: Every mistake that has to be corrected should count toward a “Wall of shame” kind of statistic that ranks the sites requiring the most corrections. Similar to the Firefox “Report a broken website” feature, which in Firefox 3.0 also has a “Disability Access” component for reporting an inaccessible web site, this data should be used to advocate for better accessibility in a future relaunch of that particular site.
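
To make the “Wall of shame” idea a little more concrete, here is a minimal sketch of how such a ranking could be computed. It assumes each community correction is recorded as a (site, kind of fix) pair; the function name, the record format, and the example sites are all made up for illustration and do not describe how WebVisum stores its data.

```python
from collections import Counter


def wall_of_shame(corrections, top=3):
    """Rank sites by how many user-supplied corrections they needed.

    `corrections` is a hypothetical list of (site, kind_of_fix) pairs.
    """
    counts = Counter(site for site, _kind in corrections)
    # most_common() returns (site, count) pairs, highest count first.
    return counts.most_common(top)


# Made-up sample data, purely for illustration.
sample = [
    ("shop.example", "missing alt text"),
    ("shop.example", "unlabelled form field"),
    ("shop.example", "missing alt text"),
    ("news.example", "unlabelled combobox"),
    ("news.example", "missing alt text"),
    ("blog.example", "missing alt text"),
]

for site, count in wall_of_shame(sample):
    print(f"{site}: {count} corrections")
```

A plain count is the simplest possible measure; a real statistic would probably want to weigh corrections by page views or by the severity of the problem.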

Furthermore, there are so many websites that are part of the so-called web 2.0 that are not publicly owned or run by a big company, but which are just as compelling to participate in. These usually either cannot be bothered or cannot afford to become 100% Section 508 compliant. Having the possibility to enhance these pages will make the web 2.0 an even more compelling place in the future.

The CAPTCHA solver

This is probably the most controversial feature. The fact alone that WebVisum is able to solve the CAPTCHAs will probably send shivers up and down the spine of many web developers, website administrators, blog owners etc. that have to fight spam every day. The fact that WebVisum can do it probably means that spambots will sooner or later also be able to do it. Even worse, some could argue that the WebVisum service may be abused by spammers to get CAPTCHA resolution for free.

The WebVisum developers assured me that they’ll make sure that only real people will be able to use their service. Furthermore, the number of CAPTCHAs that can be solved per day per site is limited.
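
I do not know how the WebVisum service enforces this limit internally. Purely as an illustration of the idea, a per-day, per-site cap could look roughly like the sketch below. The class name, the default limit of 5, and the choice to count per user are my assumptions, not details the developers gave me.

```python
from collections import defaultdict
from datetime import date
from typing import Optional


class DailySiteLimiter:
    """Caps how many CAPTCHA requests a user may make per site per day.

    Hypothetical sketch; not the actual WebVisum implementation.
    """

    def __init__(self, max_per_day: int = 5):
        self.max_per_day = max_per_day
        # (user, site, day) -> number of requests already granted
        self._counts = defaultdict(int)

    def allow(self, user: str, site: str, today: Optional[date] = None) -> bool:
        """Return True and record the request if the user is under the cap."""
        today = today or date.today()
        key = (user, site, today)
        if self._counts[key] >= self.max_per_day:
            return False
        self._counts[key] += 1
        return True


limiter = DailySiteLimiter(max_per_day=2)
day = date(2008, 7, 1)
print(limiter.allow("marco", "digg.com", day))  # first request of the day
print(limiter.allow("marco", "digg.com", day))  # second request
print(limiter.allow("marco", "digg.com", day))  # over the cap for this site
```

Because the day is part of the key, the counter effectively resets at midnight without any cleanup job, and a limit reached on one site does not block requests for another.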

While it is right to advocate for alternatives to visual CAPTCHAs, the reality is that audio CAPTCHAs, the most common alternative, cannot be used by everyone. I already mentioned deaf-blind surfers. But people who have a hearing impairment and have difficulty deciphering the distorted audio also have trouble with this alternative. The CAPTCHA resolution feature solves the problem for these people, and also for anyone who is not visually impaired but has trouble reading or hearing the text.

Also, this allows access to those private sites and blogs that are under no pressure government- or image-wise to implement an audio CAPTCHA. It definitely lowers the barrier for participation in the web 2.0 world!

Aside from all that, CAPTCHAs only offer a false sense of security. There are much more effective ways of fighting spam than imposing these things upon everybody. My blog, for example, has no CAPTCHA entry for commenting, and still my spam fighting measures have kept this blog clean for as long as it has been in existence. But the sad reality is that CAPTCHAs are an “evil” we currently have to cope with, and WebVisum certainly helps a lot in circumventing these artificial barriers.

My hope is that the WebVisum folks manage to keep their user base spambot-free and that there won’t be any other way to abuse the feature for unsolicited activities.

A few wishes for the future

I see for this extension the potential to become much more than “just” a web helper for the visually impaired. For example, I can imagine this being enhanced to allow hearing-enabled people to provide a textual transcription of an audio clip for deaf surfers, sighted people giving a textual description of not just an image, but a video clip or the like, and other similar cross-impairment possibilities. After all, any hearing-enabled blind person could provide such a textual transcription of an audio clip for a sighted deaf person.

Aside from this larger-scale vision of mine, a few more basic features, such as an undo feature that allows users to revoke a server-submitted enhancement, will hopefully make their way into near-future versions of the extension.

So: To be able to make up your mind for yourself, go check out the website and extension at


52 thoughts on “Review of the WebVisum Firefox extension”

  1. This sounds very interesting, but I want to draw your attention to one
    scenario I encounter on my home banking page, “Volksbank/Raiffeisenbank”:
    Since they’ve shifted to the I-tan procedure, they provide the I-tan number
    you have to enter in a CAPTCHA-like graphic; however, you have to read it
    instead of copying it. Then you have to search for that particular I-tan in
    the paper form sent by the bank.
    Do you think it’s possible to identify this with WebVisum? You cannot play
    around with this, since that particular I-tan is gone once it has been
    placed on the home banking form, even if you don’t take any action.
    Moreover: This CAPTCHA-like graphic contains more information. Your date
    of birth is given as a “Wasserzeichen” [watermark] (sorry, I don’t know the
    word in English and have no dictionary at hand). This may confuse the
    WebVisum feature.
    All this should be tested in a simulation scenario, but in my experience
    the bank people are not that cooperative.
    Sorry for this longer text, but if we want to extend this amazing feature
    we have to deal with real-life scenarios.

  2. Hermann, thanks for your feedback! The way WebVisum works is this: You focus the field the captcha solution needs to go in, press a hot key, normally Ctrl+6, and the captcha is found and submitted to the WebVisum server for analysis. The result is later transferred back and placed on your clipboard. It is also shown in the alert at the top of the page so you immediately know what the text is. Instead of pasting it in, you could look up the iTan from your paperwork and type the relevant number in instead.

    I guess the only way to find out whether this works is by trying it out. You can then coordinate with the WebVisum folks to see if you get a result back and if these image types give their service a problem.

  3. In answer to comment 1:
    Fortunately the banking site offers a demonstration of their accounts. So I
    checked it out, and I have to report that no CAPTCHA was recognized.
    I started the demo version and sent money to an imaginary person. It
    looks like the real transaction form, including the demand for a special
    I-Tan. The control graphic is shown, including the message to check its
    content and enter the desired I-Tan.
    When pressing Control+6, I hear “No CAPTCHAs found on this page”.
    You can try it by visiting:
    I think you should land on the demo page.

  5. This extension is amazing. I started using it after seeing a post someone made to a certain list. 😀 Love it! If only there was a way to undo a label completely. I’ve accidentally labeled stuff wrong on a site. I’ve fixed it to the best of my ability, but you know… It certainly will help a lot of people, and I hope this spreads far and wide throughout the connected blind community.
    My username is serrebi on the extension.

  6. I saw this add-on last week and am also very impressed with the possibilities. There is also plenty of opportunity to be malicious, such as changing a web page’s title to some nasty message about being infected with a virus etc. I think there needs to be a new option to use your own modifications first and then, if you have none, use the database or ignore it completely. Other than that I am very impressed.

  7. @Mishu70: these are important points which we have considered as well. Very soon we will launch several features that can help the community “police” itself. We would also possibly have stuff like warning users who misbehave and even temporarily (or permanently) banning them. I don’t think anybody would like to get banned from the service and therefore most should behave.

    This stuff is not ready yet as we have more important things that we need to finish, first.


  9. Hello,
    I have sent a few things to the WebVisum guys using their contact us form. I see they are also discussing here, and I prefer to have things shared so they can be discussed better. Here are some ideas and issues I have thought about:
    1) It is not possible to use accented letters to name form controls and graphics properly. When naming a control using some non-English letters, it’s correctly reported, but when coming back to a site where I have previously labeled a control, some of the accented characters are not printed correctly. This might have something to do with the character encoding of a particular website, or I don’t know, because results vary between sites.
    2) There are keyboard layouts which can’t be used to produce some of the WebVisum shortcut keys e.g. ctrl + ctrl 2 etc. I have tested with Czech and Slovak keyboard layouts under Windows. The upper row on a US keyboard is used to type numbers. With these 2 layouts the upper row prints some national characters, and numbers are typed by holding down the Shift key and pressing the corresponding key in the upper row of keys.
    3) On my system I have got several other Firefox extensions installed besides WebVisum. I think while Video download helper and WebVisum are installed together it takes a very long time to start Firefox. Shall I rather try reporting this issue to the Video download helper author?
    4) I have seen a post from Aaron Leventhal in the blind programming mailing list that WebVisum is looking for contributors. If you would like to localize the extension to various languages then please count me in for the Slovak language if you like. I believe the extension language might be synced with the localization of Firefox where available.

    Other than these issues I am amazed. The most helpful is the captcha solver, so I wish it keeps functioning properly for the maximum possible period.
    Please note all my tests were done with Firefox 3.0 and NVDA as my screen reader of choice.

  10. In answer to comment 10:
    Have you tried to activate Webvisum from within the tools menu? There’s a
    submenu which contains an “activate” button.
    I wonder if Webvisum works with FF before 3.0. And I think FF3 should work
    with older Windows versions, but I might be wrong. Which Windows do you

  11. By default FF3 won’t work with non-NT based systems at all. There are some unofficial projects adding some functionality to the Win 9x kernel just to make FF3 run, though.

  12. @DJC: Please follow Hermann’s advice. Also, you can try the open source screen reader, NVDA, which is totally free and works quite well with Firefox: take version 0.6p1 as 0.5 will not work well.

    @pvagner: We’re already addressing all of your excellent and well detailed feedback. Hopefully by the next version it will all be corrected. We’ll send you the files to translate over email. Thanks for the offer.

  13. The pull-down menu in the tools area of Firefox is grayed out and that’s why I can’t get to it. I’ll try NVDA and see if I can turn it back on that way. Running WinXP Corporate.

  14. Well I’m sorry to report that I still can’t get this working. For some reason when I pressed Control-Shift-F2 it disabled the add-on even though Firefox says it’s enabled. Does WebVisum write the login info to an ini file or something that I can delete to get the add-on to behave so I can log in with it, and if so where should I look?

  15. @djc the only way I know how to see the individual settings is to go to the about:config page and enter webvisum into the filter. They can be removed though.

  16. Ok I finally have WebVisum working. I had to install Firefox 3, which resolved the issue. The menu that was grayed out in Firefox 2 is now working, so I was able to enable WebVisum and it’s working correctly. Thanks to everyone who tried to help.

  17. @djc: Congratulations! Our next version will require Firefox 3.0 or later – this should make it easier on some people who happen to be using Firefox 2.

  18. Hi, I think this is a fantastic add-on. I’ve had no luck however with my bank’s website. I tried to use the CAPTCHA feature and nothing happened. I’m using Orca and FF3. Worse than that, I labeled the actual CAPTCHA image and apparently it no longer shows the actual CAPTCHA text, so even with sighted assistance I can’t log in with WebVisum enabled. Another odd thing is that on that particular site, when I disable WebVisum I am unable to re-enable it. I’m sure these are all just kinks though and will get sorted out. Well done guys!!

  19. Marco’s blog is not a support forum 🙂 It’s best that you contact us through the contact form on the site.

    I believe this issue will be fixed by our next release and you should be able to solve this CAPTCHA.

    Let us know how it goes if you’re still unable to work it out with the new release.


  21. Hi, I just found this blog via the WebVisum website. I’m glad to find another accessibility blog! I am a Talking Books Librarian and blog about various resources for people with disabilities and also about library-related issues. Feel free to check it out at

    (Sorry if this is a double post, I realized I had the address wrong in the previous submission!)

  22. I would have loved to try this out. Unfortunately Firefox 3 won’t run on my system, I have xp Home with SP2. When I looked at the ff forums there were so many reasons why ff3 would crash at startup and so many complicated things to try to fix it, and so many people for whom no suggestion worked that I just gave up in disgust. I’d love to have that earlier version that DJC was using that worked with FF2. As it is I’ve signed up with Social Accessibility and will stick with that until a more stable and less finicky version of Firefox comes along. FF3 apparently crashes on Macs and systems running Linux as well. Before I gave up I ruled out any plugins or add ons causing the problem, and I don’t have Google Desktop or the roboform plugin or the Yahoo Ap that was said to cause the problem on some systems. I also tried ff3.1 beta with no luck.

  23. Hi.
    I noticed that WebVisum does not work well on CAPTCHAs that require typing both capital and lowercase letters. But it behaved well on lots of sites that were inaccessible before. However I had difficulties for captchas in or in the site
    I also cannot change the shortcut keys from a Portuguese keyboard, because it does not accept anything different from the standard or Shift+Tab. So I use the context menu, and that solves it.
    Oh! I can help with the translation into Brazilian Portuguese.

  24. I have some questions about the WebVisum engine technology.
    I looked around the site and I didn’t find technical information about the engine, only about the plug-in.

    some link?

  25. This plugin really sounds unbelievable. On the browser side there was not much done in the past for blind and visually impaired people because they were not the main target group. So I would like to thank the developers who are considering these people and who are enhancing accessibility and usability!

  26. It is a real dilemma to know what to do with Captcha. Even as a sighted user I find it frustrating to use because some are much less readable than others (Google’s aren’t particularly easy to read for example).
    Other ways around it (apart from audio versions which have limitations as mentioned previously) can work but can have their own difficulties. One way we have used is to have a code that can get around the Captcha which is published on the site and read out by a screen reader. It has the advantage of being easy to use and not immediately obvious to a non-human spammer, but I guess it is only a matter of time before the spammers start using artificial intelligence to be able to get around these things.

  27. I am interested in using WebVisum to enable visually impaired staff at social service agencies to use our software out of the box. Is it possible to use this technology in languages other than English from the same web site?
    Is there someone I can speak with that has this experience?

  28. Dear Marco,
    I’m a new Firefox user and have also been seduced by Webvisum. The captcha solver has been a really incredible breakthrough and I hope it lasts.
    However, I have tried to use the OCR function but not succeeded. I wonder if I understood it well, but pointing at a Flash animation and performing OCR only gave me the honk error sound and nothing else.
    There is also a problem I would like to mention here which concerns blind musicians who want to use MySpace. This site has been criticised quite a bit for not being accessible, and well it should be. But I wonder if WebVisum could do something about it. There are quite a few blind musicians in the world who wonder why the few buttons on the music player present on every MySpace music profile are not shown by JAWS. I even asked Freedom Scientific, who said there was nothing they could do. Do you think there could be a solution?
    All the best,

  29. Hi JPR,

    Unfortunately, Flash is out of the scope of WebVisum currently. The OCR part only works on images (GIF, JPG, PNG etc.) included in web pages, not on Flash content. Also, the labelling feature does not currently work on Flash objects, either. Whether this will happen in a future version or if this is even technically possible I do not know.

  30. Dear Marco,
    I have tried to contact Webvisum a number of times and never had any reply. I am logged in with my user name and used the “contact us” form for my enquiries.
    I have no idea if they do not get my messages or if they feel they don’t have to reply to them.
    I wonder if by any chance my messages might be landing in their spam box.
    I don’t necessarily want to make this question public, but You are my only hope, as I think my questions have an interest.
    You can of course reply to me personally if you want.
    Thank you for your help,

  31. Dear Marco,
    some websites still seem to be more accessible with Internet Explorer than with Firefox. When I navigate on my MySpace page, the music player link is hardly accessible with IE, but it still is if you use the JAWS cursor. When I do the same with Firefox, I cannot reach this link no matter what I try. Curiously some of my friends claim that they can reach this link with Firefox 2, and so I wonder if you could try it for yourself and tell me the results. As always, you can answer me personally and not publish this comment if you think that’s best.

  32. Hi JPR, the reason might be MySpace’s heavy use of Flash. It is known that many screen readers, especially the commercial ones, don’t interact with the Adobe Flash plugin for Firefox as well as they do with the one for Internet Explorer. To see if this is really a Firefox issue, which I don’t believe it is, you could try out NVDA and their support for Flash. See this blog post of mine for more info:


  33. This is getting off-topic, but just a short note that I know the name, but have never seen Cobra in action. Don’t even know if it’s in a release state at all yet. I also know that it does not support Firefox, at least I’ve never been contacted by any Cobra developer about Mozilla accessibility questions.

  34. Just a note on CAPTCHA solving: In countries with low wages, human CAPTCHA solving is a big biz, and at 30 secs per CAPTCHA much slower. Until this tech becomes widely available, and fast, nothing will change, and if captcha solving in bulk becomes common and fast then finally maybe web designers will start using another, and hopefully more blind friendly alt to bleeping captcha.

  36. Hellooo!

    I need the invitation…

    I have tried on the site, but nothing…

    Sorry, I don’t speak English!

    Help me!

    Give me a invitation, pleaseee!



  37. Raquel, you can only get an invitation if you’re blind. Find someone you know who has WebVisum and get an invitation from that person, please.

  38. I’m very sad to report that WebVisum is a dead and now dysfunctional project. The website has expired and gives you an expiration notice when you visit, and the add-on fails to work at all now, giving users an “unable to log in” message.

    This means that most of WebVisum will no longer work, including the automatic shared labels, and most importantly, there’s now no way at all for blind people to solve captchas with no audio, at least that I know of.
    It’s a shame it had to just die like this. If anyone can get in contact with a WebVisum developer, let me know on Twitter @KevanGC. Sorry, don’t want to give my email out publicly on here. I’ll gladly ask around the blind community and see if anyone’s got any interest in keeping the CAPTCHA service alive, I’d do it myself if I knew how to code at all.
    What would be awesome is a stand-alone Windows application to solve CAPTCHAs, only available to registered users of course.
