Towards an HCMC policy for HTML5
Posted by mholmes on 10 Nov 2010 in R & D, Activity log, Documentation
I've spent a while over the last few days working on HTML5, with the hope of developing a policy for HCMC with regard to its adoption and use. These are my conclusions so far:
- For a lot of the sites we develop, HTML5 will be a definite advantage, because of a variety of useful new features such as media support, navigation and menu tags, etc.
- HTML5 also lets us avoid the difficulties associated with XHTML vs. HTML MIME types, which have always afflicted us with XHTML 1.1, through the use of a polyglot document. This enables us to generate well-formed and valid XHTML5 code, but serve it with a text/html MIME type if necessary. This has several important implications:
- "Processing Instructions and the XML Declaration are both forbidden in polyglot markup", so we must dispense with the XML declaration and avoid processing instructions.
- "When polyglot markup uses UTF-8, it must not include a BOM." We will always be using UTF-8, so we must never include BOMs.
- The charset should probably be declared in the HTTP header if possible (easy with PHP and with Cocoon). We should also include the <meta charset="utf-8"/> tag, although this is only meaningful in the context of HTML, not XML.
- We should use the minimal doctype declaration: <!DOCTYPE html>.
- The polyglot markup specification imposes a number of recommendations which we already tend to favour:
- Elements which CANNOT contain content MUST be self-closing (<br/>).
- Elements which CAN contain content must NOT be self-closing, even if empty (<p></p>).
- Only the five core named entities (quote, ampersand, angle brackets, apostrophe) are used; all others must be numeric, and SHOULD be hexadecimal.
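To make the rules above concrete, here is a minimal sketch of what a polyglot page might look like. The title, language code and body text are placeholders I have made up for illustration. The xmlns attribute and the paired lang/xml:lang attributes are my reading of the polyglot markup specification rather than something discussed above, so check them against the spec before relying on them.

```html
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <!-- No XML declaration, no processing instructions, no BOM. -->
    <!-- The charset is declared here as well as in the HTTP header. -->
    <meta charset="utf-8"/>
    <title>Polyglot test page (placeholder title)</title>
  </head>
  <body>
    <!-- Void elements are self-closing; the e-acute is a hexadecimal
         numeric reference; the ampersand is one of the five permitted
         named entities. -->
    <p>Caf&#xE9; &amp; bar<br/>second line</p>
    <!-- Elements that can hold content get an explicit end tag,
         even when they are empty. -->
    <p></p>
  </body>
</html>
```

A file like this should parse as XML and should also be handled normally by a browser when served as text/html, which is the whole point of the polyglot approach.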
For new projects, or projects undergoing a rewrite, which have a short development horizon, this is the recommended approach:
- Create HTML5 polyglot documents, as above. See the linked article for more detail on the rules for namespaces, elements, etc.
- From among the available HTML5 elements and attributes, choose only those which are unchanged from XHTML 1.1, UNLESS there is a need to use a new HTML5 feature. This should allow older browsers to display pages unchanged, but will allow the inclusion of new HTML5 features in future without reworking the whole document. Here are simple lists of elements and attributes.
- Serve as text/html, allowing support for IE7 and 8.
For longer-term projects and those which will benefit significantly from HTML5 features, and for which it can be argued that support for older browsers (IE7 and 8) is not important, these are the recommendations:
- Again, create polyglot documents. I can see no reason why we should not do this. We have all the transformability of XML, should we need it, but we get reliable browser behaviour. If a compelling reason emerges for switching to application/xhtml+xml in the future, it's a simple change which will not require alteration of the actual documents.
- Wherever possible, use elements and attributes which already have broad browser support. There are two reasons for this: first, you can test your code on a wider variety of browsers, and identify any differences in behaviour which might indicate instability in the specification (still a big danger), and second, your collaborators will be able to test features without having to depend on a specific browser. Wikipedia has a good browser support matrix page.
- For video and audio, include both MP4/H.264 and Theora, or MP3 and OGG, versions of files on the site. Firefox and Opera support the open standards but not the closed, and vice versa with other browsers. See my earlier post on this.
- Where unusual character ranges are used on the page, or where a particular font is preferred for its display characteristics, use a WOFF file and block of CSS to make it available to the browser without the user needing to do anything:
@font-face {
  font-family: "GentiumPlus";
  src: url(GentiumPlus-R.woff);
}
@font-face {
  font-family: "GentiumPlus";
  font-style: italic;
  src: url(GentiumPlus-I.woff);
}
Needless to say, make sure you have the right to distribute the font in this way.
- Use CSS3 features, but don't depend on them if they don't yet have broad support. I have not been able to find a well-maintained browser support matrix for CSS3, so trial and error is our best guide at this point, although the Acid3 test can help with this. After running the test, click on the A in the graphic to get detailed results.
- Use SVG with care and testing. It should be most useful for small images which function as part of a scalable GUI; browser support for animation and other features is less extensive, so will need more testing.
- Use MathML only if necessary. IE does not support it natively. IE9 can support it through a plugin, but that support is known to be buggy.
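As an illustration of the video and audio recommendation above, this is roughly how the two versions of each file could be offered in polyglot-friendly markup. The file names and dimensions are hypothetical, and the MIME types shown are the ones I would expect to use; browsers should pick the first source they are able to play, and the nested paragraph is only displayed by browsers that do not understand the element at all.

```html
<!-- Hypothetical file names; substitute the real media files. -->
<video controls="controls" width="640" height="360">
  <!-- Closed format, for browsers other than Firefox and Opera -->
  <source src="interview.mp4" type="video/mp4"/>
  <!-- Open format (Theora), for Firefox and Opera -->
  <source src="interview.ogv" type="video/ogg"/>
  <p>Your browser does not support HTML5 video.
    <a href="interview.mp4">Download the MP4 file</a> instead.</p>
</video>

<audio controls="controls">
  <source src="interview.mp3" type="audio/mpeg"/>
  <source src="interview.ogg" type="audio/ogg"/>
  <p>Your browser does not support HTML5 audio.
    <a href="interview.mp3">Download the MP3 file</a> instead.</p>
</audio>
```

Note that the boolean controls attribute is written out with a value and the source elements are self-closing, in keeping with the polyglot rules described earlier.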