Grauw’s blog

Well-formedness

June 17th, 2005

Eric Meyer wrote a post on his weblog arguing that being liberal in what you accept, as browsers currently are with HTML, is bad, and that it would have been better if browsers threw errors instead of trying to make something out of messy HTML. I of course wholeheartedly agree.

XHTML

As you perhaps know, XHTML web pages should really be served as application/xhtml+xml when the browser supports it. This causes the browser to read the page with its XML parser instead of its HTML parser. Because XML parsers are strict about errors, they give you immediate feedback, in the form of a really big error message, when the XHTML is not well-formed (well-formed meaning that all tags are closed properly, attributes are quoted, etc.).

These kinds of messages make you fix your code, and help keep the code of your page correct and cross-browser compatible. At least as far as well-formedness goes.
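
To make "when the browser supports it" a bit more concrete, here is a minimal sketch in Python of how a server could pick the content type from the request's Accept header. The function name choose_content_type is my own invention for illustration, and the parsing is deliberately simplified: it only checks whether application/xhtml+xml is listed with a non-zero q value, and does not weigh it against the q value of text/html as full content negotiation would.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick a content type for an XHTML page (hypothetical helper).

    Returns application/xhtml+xml if the client explicitly lists it with
    q > 0, and falls back to text/html otherwise.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        q = 1.0  # a media type without an explicit q parameter counts as q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not accepted"
        if q > 0:
            return "application/xhtml+xml"
    return "text/html"
```

A browser that lists application/xhtml+xml in its Accept header would get the strict XML parser this way, and everything else would still get text/html.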

One argument that is particularly often used against such strict error handling is that, had such error checking been there in HTML, the web would not be as popular as it is today. However, that’s just plain nonsense. Eric Meyer says the following about that argument, and he is absolutely right:

After all, it isn’t like well-formed HTML is substantially more difficult to grasp than is tag-soup HTML. It’s a pretty simple language either way; I’d even argue it’s simpler with rigor than without.

Hear hear.

SVG

But let’s look at another example of where liberalness in what you accept causes the well-known interoperability mess: SVG. Take a look at the following simple SVG snippet, which is a typical example of what can be found on the web, created for use with Adobe’s SVG plugin, the most widely used SVG browser-plugin:

<svg width="100px" height="100px" viewBox="-10 -10 20 20">
   <circle cx="0" cy="0" r="5" xlink:href="http://www.grauw.nl/">
</svg>

This, however, is not well-formed XML. The proper code should be:

<svg width="100px" height="60px" viewBox="-10 -6 20 12"
   xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
   <circle cx="0" cy="0" r="5" xlink:href="http://www.grauw.nl/" />
</svg>

As in this example, there is a lot of broken SVG out there on web pages now, because Adobe’s XML parser is non-compliant and accepts non-well-formed XML, documents without the SVG namespace, undeclared namespace prefixes (xlink:), unclosed tags (in this example, the missing / in circle), etc.
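
To see how a strict parser treats each of these problems on its own, here is a small sketch using Python's standard xml.etree.ElementTree module. The check function and the three sample strings are made up for illustration; each sample contains exactly one of the defects mentioned above. Note that a missing SVG namespace is not a well-formedness error as such: the document parses, but its root element is not in the SVG namespace, so it is simply not SVG.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Three illustrative samples, each with a single defect.
UNCLOSED_TAG = f'<svg xmlns="{SVG_NS}"><circle cx="0" cy="0" r="5"></svg>'
UNDECLARED_PREFIX = (
    f'<svg xmlns="{SVG_NS}">'
    '<circle r="5" xlink:href="http://www.grauw.nl/" /></svg>'
)
MISSING_NAMESPACE = '<svg><circle cx="0" cy="0" r="5" /></svg>'


def check(source: str) -> str:
    """Classify a document the way a strict, namespace-aware parser sees it."""
    try:
        root = ET.fromstring(source)
    except ET.ParseError:
        return "not well-formed"
    if root.tag != f"{{{SVG_NS}}}svg":
        return "well-formed, but not SVG"
    return "ok"


print(check(UNCLOSED_TAG))       # not well-formed
print(check(UNDECLARED_PREFIX))  # not well-formed
print(check(MISSING_NAMESPACE))  # well-formed, but not SVG
```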

However, there are now browsers around which have a native SVG implementation (Firefox 1.1 alpha, Opera 8) which follows the standards, and in those browsers the broken SVG doesn’t work. That’s a perfect example of why being liberal in what you accept (like the Adobe SVG plugin does) is harmful to the interoperability of the content it processes, even though it didn’t create the content itself. Once you do liberal processing, people will create invalid content, and more than you would think.

But what’s so bad about that? As long as it works, right? No, the point of XML is that it is an interchange format. That means that it can be used and processed by many different tools (browsers, data miners, transformers, etc.), and that those tools can re-use any generic XML parser for reading the data. But if it isn’t well-formed, all that goes down the drain.
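
As an illustration of that re-use, here is a sketch of a tiny "data miner" in Python that knows nothing about SVG rendering and simply collects the link targets from a document with the generic parser in xml.etree.ElementTree. The extract_links function and the WELL_FORMED_SVG constant are hypothetical names of mine; the SVG itself is the corrected example from above.

```python
import xml.etree.ElementTree as ET

# ElementTree spells a namespaced attribute as {namespace-uri}local-name.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

WELL_FORMED_SVG = """\
<svg width="100px" height="60px" viewBox="-10 -6 20 12"
   xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
   <circle cx="0" cy="0" r="5" xlink:href="http://www.grauw.nl/" />
</svg>
"""


def extract_links(svg_source: str) -> list[str]:
    """Return every xlink:href value in the document, in document order."""
    root = ET.fromstring(svg_source)  # raises ET.ParseError on broken input
    return [
        element.get(XLINK_HREF)
        for element in root.iter()
        if element.get(XLINK_HREF) is not None
    ]


print(extract_links(WELL_FORMED_SVG))  # ['http://www.grauw.nl/']
```

Feed the same function the broken snippet and the parser raises an error before a single link is found, so a tool like this would need its own tag-soup recovery code to get anything out of it.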

To be interoperable, other SVG viewers will have to create complex error-handling routines which mimic the behaviour of the competition as precisely as possible. Especially when that competition is the market leader, they may be forced to do so. In the end, it’s a lot of wasted effort and an unnecessary complication of what used to be and should be very simple.

Fortunately, there is a bright side to this. Those browsers require the SVG to be well-formed for it to work, and Firefox 1.1 will be the first major browser distribution with SVG built in, so its users will probably form the majority of SVG-enabled users. Given that, I think Firefox will be able to force this problem in the right direction, and cause web site owners to fix their SVG.

So hopefully with this, we will at least avoid having SVG as yet another tag-soup language.

Grauw
