As far back as August 2008, I’ve been aware of Ian Hickson’s fallacious arguments that elements and attributes of HTML4 should not be included in HTML5 merely because web authors misuse them. In the case of the longdesc attribute for img, similar arguments go back even farther.
My feeling on such an approach has been rather clear: the mere fact that web authors misuse or abuse something in HTML isn’t an argument for removing it from the spec. It is an argument for more education of web authors. The web has an extraordinarily low barrier to entry, which has contributed significantly to its growth. The fact that authors make errors when creating markup is a poor argument against including features in the language, a point recently acknowledged by Sam Ruby, co-chair of the HTML Working Group.
Is Longdesc Misused?
I recently decided to start gathering “global” data on accessibility and reporting on my findings. The first such inquiry looks at the use of the longdesc attribute. What I found was pretty interesting:
- Across 10,200 URLs, I found 129 instances of the longdesc attribute
- Of those, 89% of the instances were non-empty
- 21% of the longdesc attributes were incorrect (read as: not of type URI, per the HTML 4.01 Recommendation)
- 79% of the longdesc attributes were correct; in other words, they did point to a URI
I did not look any further into the accuracy of the URI pointer, whether the long description was sufficient, or even whether the image was complicated enough to warrant a longdesc in the first place. Those are subjective judgments. The issue I was looking into was primarily whether the attribute is, as Ian Hickson has repeatedly asserted, misused at such a rate as to warrant leaving it out of HTML5.
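For readers who want to run a similar tally on their own pages, here is a minimal sketch in Python. It is not the tooling used for the figures above, which this post does not describe. The class and function names (`LongdescCollector`, `classify`, `tally`) are hypothetical, and the "not a URI" test is an assumed heuristic: a value containing whitespace is treated as descriptive prose rather than a URI.

```python
from html.parser import HTMLParser


class LongdescCollector(HTMLParser):
    """Collects the raw longdesc value of every <img> that carries one."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "img":
            return
        for name, value in attrs:
            if name == "longdesc":
                # A valueless attribute arrives as None; treat it as empty.
                self.values.append(value or "")


def classify(value):
    """Assumed heuristic: internal whitespace suggests prose, not a URI."""
    stripped = value.strip()
    if not stripped:
        return "empty"
    if any(ch.isspace() for ch in stripped):
        return "not_uri"
    return "uri"


def tally(html):
    """Count longdesc values in one HTML document by category."""
    collector = LongdescCollector()
    collector.feed(html)
    collector.close()
    counts = {"empty": 0, "not_uri": 0, "uri": 0}
    for value in collector.values:
        counts[classify(value)] += 1
    return counts
```

Calling `tally()` on each fetched page and summing the results would give counts comparable in shape to the list above, though a stricter URI check could shift the numbers.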
The Vast Majority of longdesc Values Are Conforming
As my data show, the vast majority of longdesc values are conforming, which contradicts claims of widespread author misuse. The longdesc attribute, like every other HTML element and attribute, does get misused at times by web authors who do not understand how to develop web sites correctly. This trait is not unique to longdesc, however, and I’d certainly argue that a 79% conformance rate is a pretty good indication that misuse is not prevalent enough to justify excluding the attribute from future versions of HTML.
Download the Excel spreadsheet of examples