The problem:
You want to document your encoding practices in a TEI document with lots
of example code, and you want to render it into HTML for display on the
web, with all the code nicely indented and syntax-highlighted.
The solution:
Mark up your code with the specialist TEI element <egXML>, and use the
fragments of XSLT and CSS code below to render the embedded XML into
attractively indented and coloured output.
What is <egXML> and how do I use it?
Many ordinary TEI users will never have come across the <egXML> element,
but whenever you look at the TEI Guidelines, you're seeing the results
of it. Every time a piece of TEI code is shown in the Guidelines, it's
embedded in an <egXML> element.
<egXML> is special in that it's not in the normal TEI namespace; it's
always in its own special namespace, which is
http://www.tei-c.org/ns/Examples. Inside <egXML>, you can place any
well-formed fragment of TEI code you like, like this:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<p>
This is a paragraph.
</p>
</egXML>
Now, you would think that the <p> tag inside the <egXML> is a TEI <p>,
but it's not. It's in the http://www.tei-c.org/ns/Examples namespace.
That means two things:
- When you validate your XML file against a TEI P5 schema, this <p>
will not cause problems. It's outside the TEI namespace.
- When you write XSLT to process your example TEI code, this <p> is
completely distinct from your regular P5 <p>, because it's in a
different namespace.
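To make the second point concrete, here is a minimal sketch of a stylesheet with one template for each kind of <p>. The tei prefix (bound to the regular TEI namespace, http://www.tei-c.org/ns/1.0) and the HTML that each template produces are assumptions for illustration only; they are not part of the code further down this page.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    xmlns:teix="http://www.tei-c.org/ns/Examples"
    exclude-result-prefixes="tei teix">

  <!-- A regular TEI paragraph: rendered as an ordinary HTML paragraph. -->
  <xsl:template match="tei:p">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- An example paragraph inside an egXML: never matched by the template
       above, so it can be displayed as code instead. -->
  <xsl:template match="teix:p">
    <code><xsl:text>&lt;p&gt;</xsl:text><xsl:apply-templates/><xsl:text>&lt;/p&gt;</xsl:text></code>
  </xsl:template>

</xsl:stylesheet>
```

Because the two match patterns name different namespaces, neither template can ever fire on the other kind of paragraph.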
Now shut up explaining things, and give me the code already.
Point taken, here it is:
First, there's some CSS. Put this in your site stylesheet, or in a
separate stylesheet if you're afraid it might be infectious. It was
written as CSS3 for an XHTML5 web page, but it should work fine with
earlier versions of XHTML.
/* Handling of example XML code embedded in pages. */
pre.teiCode{
  white-space: pre-wrap;
}
/* We want our XML code to look like code. */
.xmlTag, .xmlAttName, .xmlAttVal, .teiCode{
  font-family: monospace;
}
/* We want our XML code text to be bold. */
.xmlTag, .xmlAttName, .xmlAttVal{
  font-weight: bold;
}
/* We want syntax highlighting. */
/* I think I stole these colour values from oXygen. Sorry George! */
.xmlTag{
  color: #000099;
}
.xmlAttName{
  color: #f5844c;
}
.xmlAttVal{
  color: #993300;
}
Next, there's some XSLT.
The first thing you need to do is to add two things to the root element
of your XSLT file, a namespace declaration and an attribute:
xmlns:teix="http://www.tei-c.org/ns/Examples"
[That's the TEI examples namespace.]
exclude-result-prefixes="xs xd xhtml hcmc exist teix"
[You're going to have to customize that a bit for yourself. What that's
saying is: don't output namespace declarations for these prefixes. In
this example from my project, I'm suppressing unwanted namespaces from a
range of different domains. You'll probably want to do something
similar, but your list of namespace prefixes will be different. Every
prefix in the list must be declared on the root element, or the
processor will report an error.]
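Put together, the root element might look something like the sketch below. The particular prefixes here (xs and tei, alongside teix) and the choice of XHTML as the default output namespace are assumptions; substitute whatever your own stylesheet actually declares.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    xmlns:teix="http://www.tei-c.org/ns/Examples"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xs tei teix">

  <!-- Each prefix named in exclude-result-prefixes is declared above.
       Your templates go here. -->

</xsl:stylesheet>
```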
Now you need to add a couple of templates to the XSLT file(s) that
process your TEI XML.
I always use XSLT 2.0 with Saxon 9+. Most of this would work perfectly
well with XSLT 1.0, but the text() template calls replace(), which is an
XSLT 2.0 function. Add the following to your stylesheet. Bear in mind
that most of the output finds itself inside an XHTML5 <pre> element,
where whitespace matters, so please forgive the absence of
human-readable whitespace. More competent XSLT coders will certainly
find ways of making the code more readable without screwing up the
whitespace. This could probably be categorized as a crude hack, but it
works, and I don't have time to refine it right now.
Note: the key here is that all the TEI elements found inside an <egXML>
element are in the Examples namespace.
<!-- Handling of <egXML> elements in the TEI example namespace. -->
<xsl:template match="teix:egXML">
<pre class="teiCode">
<xsl:apply-templates/>
</pre>
</xsl:template>
<!-- Escaping all tags and attributes within the teix (examples)
namespace except for the containing egXML. -->
<xsl:template match="teix:*[not(local-name(.) = 'egXML')]">
<!-- Indent based on the number of ancestor elements. The space has to be
inside xsl:text, or the processor strips it from the stylesheet. -->
<xsl:variable name="indent"><xsl:for-each
select="ancestor::teix:*"><xsl:text> </xsl:text></xsl:for-each></xsl:variable>
<!-- Indent before every opening tag if not inside a paragraph. -->
<xsl:if test="not(ancestor::teix:p)"><xsl:value-of
select="$indent"/></xsl:if>
<!-- Opening tag, including any attributes. -->
<span class="xmlTag">&lt;<xsl:value-of
select="name()"/></span><xsl:for-each select="@*"><span
class="xmlAttName"><xsl:text> </xsl:text><xsl:value-of
select="name()"/>=</span><span class="xmlAttVal">"<xsl:value-of
select="."/>"</span></xsl:for-each><span class="xmlTag">&gt;</span>
<!-- Return before processing content, if not inside a p. -->
<xsl:if test="not(ancestor::teix:p)"><xsl:text>
</xsl:text></xsl:if><xsl:apply-templates select="* | text() | comment()"/>
<!-- Closing tag, following indent if not in a p. Uses name(), like the
opening tag, so the two always match. -->
<xsl:if test="not(ancestor::teix:p)"><xsl:value-of
select="$indent"/></xsl:if><span class="xmlTag">&lt;/<xsl:value-of
select="name()"/>&gt;</span>
<!-- Return after closing tag, if not in a p. -->
<xsl:if test="not(ancestor::teix:p)"><xsl:text>
</xsl:text></xsl:if>
</xsl:template>
<!-- For good-looking tree output, we need to include a return after any
text content, assuming we're not inside a paragraph tag. Ampersands are
doubly escaped so that entity references display as source code. -->
<xsl:template match="teix:*/text()">
<xsl:if test="not(ancestor::teix:p)"><xsl:for-each
select="ancestor::teix:*"><xsl:text> </xsl:text></xsl:for-each></xsl:if><xsl:value-of
select="replace(., '&amp;', '&amp;amp;')"/><xsl:if test="not(ancestor::teix:p) or
not(following-sibling::* or following-sibling::text())"><xsl:text>
</xsl:text></xsl:if>
</xsl:template>
Further customization
This is a relatively simple hack, and it's crude in its assumptions. For
instance, I assume that content inside a TEI <p> tag is "inline"
content, and other tag content is not. Obviously, in the context of a
real project, you would want to make such assumptions more explicit and
elaborate.
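One way to make that assumption explicit, sketched here as a fragment rather than a finished solution, is to keep the list of "inline" containers in a single variable and test against it. The variable name inlineContainers and the element names in it are hypothetical, and the sequence syntax needs XSLT 2.0.

```xml
<!-- Top-level variable: hypothetical list of elements whose content
     should be kept on one line. Adjust it to suit your project. -->
<xsl:variable name="inlineContainers" select="('p', 'head', 'item')"/>

<!-- Inside the templates, each test of the form
       not(ancestor::teix:p)
     would then become a test against the list, for example: -->
<xsl:if test="not(ancestor::teix:*[local-name() = $inlineContainers])">
  <xsl:value-of select="$indent"/>
</xsl:if>
```

The comparison is true as soon as the local name of any ancestor equals any item in the list, so extending the behaviour to another element only means adding its name to the variable.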
Also note that if you use the <gi>, <att> and <val> tags in your
documentation, you can give their output the same class attributes, so
that element names, attribute names and attribute values mentioned in
your prose are styled and coloured like the code examples.
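A minimal sketch of such templates follows. It assumes the tei prefix is bound to http://www.tei-c.org/ns/1.0 in your stylesheet; the angle brackets, @ sign and quotation marks added around the content are a presentational choice, not something the CSS requires.

```xml
<!-- Element names: <gi>p</gi> is displayed as <p>. -->
<xsl:template match="tei:gi">
  <span class="xmlTag"><xsl:text>&lt;</xsl:text><xsl:apply-templates/><xsl:text>&gt;</xsl:text></span>
</xsl:template>

<!-- Attribute names: <att>type</att> is displayed as @type. -->
<xsl:template match="tei:att">
  <span class="xmlAttName"><xsl:text>@</xsl:text><xsl:apply-templates/></span>
</xsl:template>

<!-- Attribute values: <val>poem</val> is displayed as "poem". -->
<xsl:template match="tei:val">
  <span class="xmlAttVal"><xsl:text>"</xsl:text><xsl:apply-templates/><xsl:text>"</xsl:text></span>
</xsl:template>
```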