XHTML 5, vnu and validation

01/09/17

02:15:39 pm, by mholmes
Categories: Academic; Mins. worked: 180

Today I worked through a stack of issues in building and validating the site, and I now have some recommendations and insights worth recording.

First, I determined that vnu was parsing our documents as HTML because they had the .html extension. The HTML parser normalizes its input before validation (for example, it lower-cases custom data attributes), which we would prefer to avoid. I also discovered that the XHTML output method in Saxon was, paradoxically, adding a meta tag to the head that specified the content type as text/html, which also pushed vnu into treating the documents as HTML rather than XHTML. Solutions:

  1. Use this for the xsl:output element:
    <xsl:output method="xhtml" include-content-type="no" encoding="UTF-8" omit-xml-declaration="yes"
        normalization-form="NFC"/>
    
    The xhtml method gives you correct results in terms of not producing things like self-closed empty div tags. The include-content-type="no" value suppresses the unwanted meta tag with the wrong content type. If you also want to suppress unused namespace declarations, put exclude-result-prefixes="#all" on the xsl:stylesheet element; it is not an xsl:output attribute.
  2. Do the HTML5 doctype like this:
    <xsl:text disable-output-escaping="yes">&lt;!DOCTYPE html&gt;
        </xsl:text>
    
    It's ugly but it works.
  3. Always include the charset meta tag:
    <meta charset="UTF-8"/>
    
  4. Before validating, copy only the HTML files to a fresh empty directory and validate them there. This works around the --skip-non-html problem described in step 5.
  5. For validation using vnu.jar, set this Java system property on the command line:
    -Dnu.validator.client.content-type=application/xhtml+xml
    
    In an ant task, it looks like this:
       <java jar="utilities/vnu/vnu.jar" failonerror="true" fork="true">
          <jvmarg value="-Dnu.validator.client.content-type=application/xhtml+xml"/>
          <arg value="--format"/>
          <arg value="text"/>
          <arg value="--skip-non-html"/>
          <arg value="tmpValidation/"/>
        </java>
    
    A -D option sets a system property on the JVM, so in ant it goes in a jvmarg element rather than an arg, and each space-separated option needs its own arg element. The problem I saw is that once the content type is set, the --skip-non-html flag no longer seems to work; vnu sets about validating every JPEG and JavaScript file in the tree. I think this must be a vnu bug, but I haven't tested thoroughly yet.
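
Steps 1 to 3 can be combined into one small stylesheet. What follows is only a sketch, assuming XSLT 2.0 under Saxon; the source element names (doc, title, p) are hypothetical and stand in for whatever your own input uses. Here exclude-result-prefixes sits on xsl:stylesheet, which is where XSLT defines that attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="#all">

  <!-- Step 1: XHTML serialization without the generated content-type meta tag. -->
  <xsl:output method="xhtml" include-content-type="no" encoding="UTF-8"
      omit-xml-declaration="yes" normalization-form="NFC"/>

  <!-- Hypothetical root template: "doc" and its "title" child are
       assumptions made for this example only. -->
  <xsl:template match="/doc">
    <!-- Step 2: emit the HTML5 doctype; the markup must be escaped in the source. -->
    <xsl:text disable-output-escaping="yes">&lt;!DOCTYPE html&gt;&#x0a;</xsl:text>
    <html>
      <head>
        <!-- Step 3: always declare the charset. -->
        <meta charset="UTF-8"/>
        <title><xsl:value-of select="title"/></title>
      </head>
      <body>
        <xsl:apply-templates select="p"/>
      </body>
    </html>
  </xsl:template>

  <!-- Hypothetical paragraph template. -->
  <xsl:template match="p">
    <p><xsl:value-of select="."/></p>
  </xsl:template>

</xsl:stylesheet>
```

The default namespace declaration puts the literal result elements in the XHTML namespace, which the xhtml output method needs in order to treat them as XHTML elements.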

Following these steps should produce good XHTML5 (assuming your XSLT is right) and validate it as XHTML.
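
Steps 4 and 5 could be wrapped in a single ant target along these lines. This is a sketch, not a tested build file: the site.dir property and its value "site" are assumptions, as is the target name. The -D setting is passed with jvmarg because it is a JVM system property, and --skip-non-html is left out because the temporary directory holds only HTML files.

```xml
<project name="validation-sketch" default="validate" basedir=".">

  <!-- Assumed locations; "site" is a hypothetical build output directory. -->
  <property name="site.dir" value="site"/>
  <property name="tmp.dir" value="tmpValidation"/>

  <target name="validate">
    <!-- Start from an empty directory so nothing stale gets validated. -->
    <delete dir="${tmp.dir}"/>
    <mkdir dir="${tmp.dir}"/>

    <!-- Step 4: copy only the HTML files, keeping the directory layout. -->
    <copy todir="${tmp.dir}">
      <fileset dir="${site.dir}" includes="**/*.html"/>
    </copy>

    <!-- Step 5: run vnu with the content type set as a system property. -->
    <java jar="utilities/vnu/vnu.jar" failonerror="true" fork="true">
      <jvmarg value="-Dnu.validator.client.content-type=application/xhtml+xml"/>
      <arg value="--format"/>
      <arg value="text"/>
      <arg value="${tmp.dir}/"/>
    </java>
  </target>

</project>
```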
