Grauw’s blog

Why innerHTML is bad

May 22nd, 2008

One of my pet peeves is the use of innerHTML. I find it an abomination, and would be happy if it disappeared from the internet. Preferably, you should use DOM operations to create node trees.

Use of innerHTML and string concatenation of HTML is bad practice because one often forgets to escape values properly. This leads to bugs that are not easily discovered while testing, such as the one that I had to fix today. The reason for this is that HTML usually (silently) recovers from these kind of errors, but not always in the way that was intended. Some examples of edge cases that you wouldn’t normally encounter while testing:

div.innerHTML = "Tom & Peter" – Shows as intended
div.innerHTML = "Tom&Peter," – Shows as intended
div.innerHTML = "Tom&Peter" – Shows as “Tom” (in Internet Explorer only)
div.innerHTML = "Tom<Peter" – Shows as “Tom”

Also note that when the data is editable and can (unintentionally) contain HTML, it is easy for a malicious user to add a line like <img src="http://www.grauw.nl/images/sitelogo.png" onload="alert('steal login information')"/>, which if visible to other users would be a big security issue.

When you use DOM operations to create node trees, such as document.createTextNode('Tom & Peter'), these type of issues can not occur, because the escaping of strings is built-in the DOM methods, and using them will always create a well-formed tree.

This is by the way also a prime example of why the ‘non-draconian error handling’ of HTML that certain people laud so much (compared to XML’s) leads to unrobust applications, as many indications of serious errors are not caught while testing.

Grauw

Comments

in defense of by Reggie Drake at 2008-06-07 00:25

At this point, a lot of things (client-side templating, refreshing part of a page with an xmlhttprequest) can only be done in a quick and portable way by using innerHTML, so blanket statements condeming innerHTML are pointless.

Of course you should not build your HTML with ad-hoc string concatenation, but it is not hard to build ‘safe’ HTML-generating helpers that will take care of proper escaping for you.

Re: in defense of by Grauw at 2008-06-09 15:07

Of course innerHTML has its uses, but actually with your two examples, I disagree :). I would do both using XHTML, that is, XML.

For client-side templating you can add XML tags in the template (e.g. <ex:textnode value="dataSource.name">) which you can quickly find and replace using DOM methods such as getElementsByTagNameNS and replaceChild. Or even better, use use XSLT instead of a custom templating language, which is lightning-fast, because you can transform directly from an XML datasource to HTML, the browser does it for you in a few milliseconds.

Refreshing a part of a page is as easy as using importNode on the already-parsed responseXML from the XMLHTTPRequest, and then appending it with appendChild. After all, the X in AJAX stands for XML :).

I’d say the main use of innerHTML is when you are getting some HTML string input from something (e.g. the Google API). Basically, when you are handed HTML by a source that you can not control.

When you say you should not build your HTML with ad-hoc string concatenation, that is actually exactly the reason why I think use of innerHTML should be discouraged, because people do that all the time. Even with helper functions, ones that use the DOM are by default safer, although with sufficient scrutiny and escaping, it is true that innerHTML can be made just as safe.