Mostly practical advices

Is XHTML good enough?

There are many false benefits of XHTML promoted on the Web. Let’s clear up some of them at a glance:

  • XHTML does not promote separation of content and presentation any more than HTML does. XHTML has all of the same elements and attributes (including presentational ones) that HTML has, and it doesn’t offer any additional CSS features. Semantic markup and separation of content and presentation are entirely possible in HTML, and just as easy. In terms of semantics, HTML 4.01 and XHTML 1.0 are exactly the same.
  • Most XHTML pages on the Web are not parsed as XML by today’s web browsers. The vast majority of XHTML pages on the Web cannot be parsed as XML. Even many valid XHTML pages cannot be parsed as XML. See the Validity and Well-Formedness article for details and examples.
  • HTML is not deprecated and is not being phased out at this time. In fact, the World Wide Web Consortium recently renewed the HTML working group which is working to develop HTML 5. The developers of Firefox, Opera, and Safari have pushed very hard for the development of HTML 5 and have largely ignored the development of XHTML 2. The Safari development team has even opted to take no part in the XHTML 2 development process.
  • XHTML 1.x is not “future-compatible”. XHTML 2, currently in the drafting stages, is not backwards-compatible with XHTML 1.x. XHTML 2 will have lots of major changes to the way documents are written and structured, and even if you already have your site written in XHTML 1.1, a complete site rewrite will usually be necessary in order to convert it to proper XHTML 2. A simple XSL transformation will not be sufficient in most cases, because some semantics won’t translate properly.
  • HTML 4.01 is actually more future-compatible. An HTML 4.01 document written to modern support levels will be valid HTML 5, and HTML 5 is where browser developers and the W3C are focusing most of their attention.
  • XHTML does not have good browser support. In typical setups, most browsers simply pretend that your XHTML pages are regular HTML (which presents a number of problems). Some major browsers like Firefox, Opera, and Safari may attempt to handle the page as proper XHTML if and only if you serve it with a special HTTP header (a Content-Type of application/xhtml+xml). However, when you do so, Internet Explorer and a number of other user agents will choke on it and won’t display the page at all. Even when the page is handled as XHTML, the supporting browsers have a number of additional bugs.
  • Most browsers do not parse valid XHTML dramatically faster than valid HTML, even when they’re parsing XHTML correctly. This is partly because most browsers only support a small subset of the HTML/SGML standard to begin with, so the real complexities of proper HTML parsing are mostly ignored anyway. The only major additional complexity of HTML that is well supported is tag omission, but most browsers use hardcoded rules specific to HTML in order to cheat through that with minimal performance impact. The browser can lose some minor shorthand logic with XML, but it now has to use extra logic to confirm that the document is well-formed. Although XHTML, when parsed with an XML parser, may be slightly faster to parse than typical HTML, the difference isn’t very significant in most cases. And either way, download speed is usually the bottleneck when it comes to document parsing. Whether it’s HTML or XHTML, by the time the page finishes downloading, the whole thing is already parsed. The users won’t notice any speed difference.
  • XHTML is not extensible if you hope to support Internet Explorer or the many other user agents which can’t parse XHTML as XML. They will handle the document as HTML and you will have no extensibility benefit.
  • XHTML source is not necessarily any “cleaner” than HTML source. If you prefer using lower-case tag names and attribute names, you can do so in HTML. If you prefer having quotes around all attribute values, you may do so in HTML. If you prefer making sure all of your non-empty elements have end tags, you may use end tags in HTML, too. In fact, these are considered best-practice markup principles in HTML. The only real markup differences between an HTML document following best practices and an XHTML document following the legacy compatibility guidelines are the doctype, the attributes on the html tag, and the /> empty element tag ends (which are actually just SGML shorthand constructs). It’s strange that so many people seem to think shorthand constructs in HTML cause the markup to be “unclean”, while many of the same people seem to love these shorthand constructs in XML. There’s no objective reason behind it; it’s just a matter of perception.
  • Using XHTML does not encourage better support by web browsers, and it is not “a vote for a better Web” if you are still supporting Internet Explorer, various search engines, and other user agents which require text/html. If you serve it with the typical text/html content type, you are giving all browsers a thumbs-up to treat it exactly like classic HTML, meaning absolutely no progress is made. Even if you use only application/xhtml+xml and shut out Internet Explorer and various other user agents entirely, it won’t mean anything: Microsoft already plans to support real XHTML in an upcoming release of Internet Explorer; they just want to make sure they support it correctly from the initial launch. Even so, XHTML 1.x is a dead-end standard, since it’s completely incompatible with XHTML 2.0 and all other future HTML/XHTML standards, as explained above, and since the majority of XHTML content on the Web today cannot be safely parsed as XML.
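
To make the well-formedness point a bit more concrete, here is a minimal Python sketch (not part of the original argument) that feeds three small fragments to the strict XML parser in the standard library. The helper name is_well_formed and the sample fragments are made up for illustration, and real browsers use their own parsers, so treat this only as a rough stand-in for what happens when a page is actually handled as XML rather than as HTML.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup.

    Hypothetical helper for illustration: it checks XML well-formedness
    only, not validity against any XHTML DTD.
    """
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# XHTML-style fragment: every element closed, attributes quoted.
xhtml_style = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Demo</title></head>"
    "<body><p>First<br />line</p></body>"
    "</html>"
)

# HTML-style fragment: <br> left open and </p> omitted.
# Fine for an HTML parser, fatal for an XML parser.
html_style = (
    "<html><head><title>Demo</title></head>"
    "<body><p>First<br>line</body></html>"
)

# Looks like XHTML, but the raw "&" in the URL is not escaped as "&amp;".
bare_ampersand = (
    '<p xmlns="http://www.w3.org/1999/xhtml">'
    '<a href="page?a=1&b=2">link</a></p>'
)

for label, sample in [
    ("xhtml_style", xhtml_style),
    ("html_style", html_style),
    ("bare_ampersand", bare_ampersand),
]:
    print(label, is_well_formed(sample))
```

Only the first fragment is accepted. The third one is the kind of page that displays without complaint when served as text/html but stops with a parse error as soon as it is handled as real XML.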



