A TEI Project

How to edit the TEI Guidelines


This document is intended to set out the way things are currently managed in the editing of the TEI Guidelines. General notes on the rationale for this state -- why it is the way it is -- may be added here later. The intention is to provide information for Council members wishing to contribute actively to the continued development and maintenance of the text of the Guidelines.

1. Logical organisation of the Guidelines

It cannot have escaped your notice that each chapter (almost) of the Guidelines defines a distinct module. In theory at least, each chapter is organised in more or less the same way:

The only chapters not organised in this way are those which do not introduce or define particular modules.

2. Physical organization: the ODD files

Each element, class, and macro defined in the Guidelines is declared within its own XML file, containing an <elementSpec>, <classSpec>, or <macroSpec> as appropriate. These files are in the directory Source/Specs. For example, the file Source/Specs/abbr.xml contains the element spec for the <abbr> element.

Note that all translations share a single file in Specs. As a general rule, don't update a translation for any language of which you are not a native speaker. If you feel confident enough to adjust the translation, leave the @versionDate attribute unchanged in order to ensure the translation will be reviewed eventually.

Each chapter of the Guidelines is stored in a file called Source/Guidelines/xx/YY-name.xml, where xx is the language (currently only en or fr), YY is the two-letter identifier for the chapter (see 7.1. Chapter codes), and name is the name of the module defined by that chapter.
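As a rough illustration of this naming pattern, the following Python sketch assembles a chapter path from its three parts. The function name chapter_path and the validation rules are assumptions made for this example; the sketch covers only files that follow the YY-name.xml pattern, not front-matter files such as TitlePageVerso.xml.

```python
from pathlib import PurePosixPath

# Languages for which chapter files currently exist, according to the text.
KNOWN_LANGUAGES = {"en", "fr"}


def chapter_path(lang: str, code: str, name: str) -> PurePosixPath:
    """Build Source/Guidelines/xx/YY-name.xml (hypothetical helper)."""
    if lang not in KNOWN_LANGUAGES:
        raise ValueError(f"unknown language: {lang!r}")
    # Assumption: chapter codes are exactly two uppercase letters.
    if len(code) != 2 or not (code.isalpha() and code.isupper()):
        raise ValueError(f"chapter code must be two uppercase letters: {code!r}")
    return PurePosixPath("Source", "Guidelines", lang, f"{code}-{name}.xml")


print(chapter_path("en", "ND", "NamesDates"))
# Source/Guidelines/en/ND-NamesDates.xml
```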

The file Source/guidelines-xx.xml (where xx is either en or fr) is the ‘driver file’ for the whole shebang. It contains system entity declarations for each of the documents making up the P5 source. These entities are then referenced throughout the source to embed the required component at the right place. [1]

Hence, to add a new element (say <saintName>) you might proceed as follows:
  1. Write a new file saintName.xml containing an <elementSpec> for your new element and add it to the Specs folder.
  2. Add a declaration like this to the existing driver file:
    <!ENTITY saintName SYSTEM "Specs/saintName.xml">
  3. Edit the source of the relevant chapter (presumably ND-NamesDates.xml) to include a reference to the element spec (like this: &saintName;), and also some discussion of its usage. The reference can appear anywhere, but good practice is to include it in an alphabetic list of such declarations near the end of the relevant section. You can also use a <specList> to reference the description from your new spec within the body of the text, like this:
    <p>This module also defines the following canonical element:
      <specList>
        <specDesc key="saintName"/>
      </specList>
    </p>
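One easy slip in this procedure is to add the spec file but forget its entity declaration. The Python sketch below shows one way such an omission could be spotted. It is not part of the TEI build: the helper names and the regular expression are assumptions for illustration, and the driver file content is passed in as a string so that the example is self-contained.

```python
import re

# Matches declarations of the form shown above:
#   <!ENTITY saintName SYSTEM "Specs/saintName.xml">
ENTITY_DECLARATION = re.compile(
    r'<!ENTITY\s+([\w.-]+)\s+SYSTEM\s+"Specs/[^"]+"\s*>'
)


def entity_declaration(ident: str) -> str:
    """Return the declaration line for a spec file (hypothetical helper)."""
    return f'<!ENTITY {ident} SYSTEM "Specs/{ident}.xml">'


def undeclared_specs(spec_idents, driver_text):
    """List spec identifiers that have no entity declaration in the driver."""
    declared = set(ENTITY_DECLARATION.findall(driver_text))
    return sorted(set(spec_idents) - declared)


driver = '<!ENTITY abbr SYSTEM "Specs/abbr.xml">\n'
print(undeclared_specs(["abbr", "saintName"], driver))  # ['saintName']

driver += entity_declaration("saintName") + "\n"
print(undeclared_specs(["abbr", "saintName"], driver))  # []
```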

3. Style Notes

3.1. General

The Guidelines are a reference manual, not a tutorial. You should not talk down to the reader, but assume they have a reasonably well-informed knowledge of the subjects under discussion. Make copious use of cross references, rather than repetition.

Bear in mind however that your reader may not have English as their first language. Avoid needlessly complex sentences and unnecessarily obscure terminology. Make sure that technical terms are glossed on their first appearance: this should be in the chapter on XML in the case of XML-related terminology. If you want to provide other references, do so as footnotes, using the <note> element.

Provide bibliographic citations for any other standards (etc) referenced, following the existing style. Do not introduce bibliographic citations simply in order to demonstrate your learning.

See the Style Guide for Editing the TEI Guidelines, which attempts to state preferred practice on vexed issues of spelling, punctuation, etc. The goal of these rules is to avoid inconsistency, and also (wherever possible) to avoid producing text which is markedly either British or American English.

3.2. Examples

The purpose of an example is to illustrate a specific element or feature. Do not include irrelevant encoding which does not contribute to this primary goal. If such encoding is unavoidable (eg to make your example valid), then it must be explained in the supporting text.

Wherever possible, choose your examples from real documents and provide bibliographic citations for them in the file BIB-Bibliography.xml. Use the @corresp attribute on the <egXML> element to link an example to its source note. Note that the @xml:lang attribute is mandatory on <exemplum>: this is to ensure that the ODD processor knows which examples to choose in a given context.

All examples should be valid against a modified TEI schema in which any element can act as a root element: this validity is checked during the build process.

3.3. Good encoding practice

Good encoding practice will ensure not only valid but also highly functional Guidelines.

When referring to figures and to other sections of the Guidelines, use <ptr>, not <ref>, to ensure that the title and number of the referenced item are automatically inserted when the Guidelines are compiled.

The build process validates cross-references. Since the Guidelines is compiled into a single XML document at build time, IDs must be unique across the text and the examples. Consequently, any @xml:id attribute values appearing in your examples must be unique within the text of the whole of the Guidelines. Furthermore, any @target (etc.) values which do not point to anything in the source will be flagged with a warning during the build process.
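To make these two checks concrete, here is a minimal Python sketch that looks for duplicate @xml:id values and for local @target pointers with no matching identifier. It is only an approximation of what the build does: the function name, the sample identifiers, and the restriction to pointers of the form #id are all assumptions for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def check_ids_and_targets(xml_text):
    """Return (duplicate ids, dangling local pointers) for one document."""
    root = ET.fromstring(xml_text)
    counts = Counter(
        el.attrib[XML_ID] for el in root.iter() if XML_ID in el.attrib
    )
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    dangling = set()
    for el in root.iter():
        # @target may hold several whitespace-separated pointers.
        for pointer in el.get("target", "").split():
            if pointer.startswith("#") and pointer[1:] not in counts:
                dangling.add(pointer)
    return duplicates, sorted(dangling)


# Hypothetical sample: one id is reused, and one pointer leads nowhere.
SAMPLE = """<body>
  <div xml:id="STIN"><p>See <ptr target="#STMA"/> and <ptr target="#NOWHERE"/>.</p></div>
  <div xml:id="STMA"><p xml:id="STIN">An example that reuses an id.</p></div>
</body>"""

print(check_ids_and_targets(SAMPLE))  # (['STIN'], ['#NOWHERE'])
```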

4. Making a change to the Guidelines

Error messages may appear at any stage. Please do not leave the source in an invalid state (it makes life unnecessarily difficult for others). If you cannot immediately fix a validity error, revert your change while you think about it.

The Jenkins servers monitor the Subversion repository, and when they detect a change, they check it out and commence building several targets, just as you would build them on your local machine. There are a couple of advantages to letting the Jenkins servers check your build for you:

If you submit a change, and later get an email from one of the Jenkins servers telling you that the build failed, it will provide a link to the build information on the server. Here's what to do:

Error messages appearing during the make test phase (the ‘TEIP5-Test’ job on Jenkins) usually indicate that your changes are in conflict with the Birnbaum Doctrine, which decrees that changes in the Guidelines schemas should not invalidate existing documents. You may wish to discuss the specific issue with other Council members.

5. Adding Schematron constraints to specifications

The TEI ODD system is primarily concerned with generating schemas in the form of RelaxNG or XML Schema. However, there are often circumstances in which you want to apply constraints to elements and attributes which cannot easily be captured by normal XML schemas. For instance, you might want to apply a co-occurrence constraint on some attributes. The @targetLang attribute is a good example. @targetLang is an optional attribute which ‘specifies the language of the content to be found at the destination referenced by @target, using a ‘language tag’ generated according to BCP 47.’ Obviously, there is no point in using @targetLang if you're not also using @target. However, many such co-occurrence constraints are difficult to express in RelaxNG schemas, and may not survive conversion to other schema formats such as XML Schema or DTD.

For this reason, we often use ISO Schematron to express constraints like this. If you look in att.pointing.xml, where the @targetLang attribute is defined, you'll find this constraint, inside the <attDef> for @targetLang:

<constraintSpec ident="targetLang" scheme="isoschematron"
    xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <constraint>
    <sch:rule context="tei:*[not(self::tei:schemaSpec)][@targetLang]">
      <sch:assert test="count(@target)">@targetLang can only be used if @target is specified.</sch:assert>
    </sch:rule>
  </constraint>
</constraintSpec>

This Schematron rule asserts that if @targetLang is used, @target must also be present. <constraintSpec> has an attribute @scheme (normally set to isoschematron). Inside <constraintSpec>, each <constraint> holds Schematron rules whose <assert> elements have @test attributes containing XPath expressions; if the XPath tests false, the assertion fires, and its contents appear on the console when you build or validate. There is also a <report> element, which is similar but fires when the test is true. In Roma, you can also generate a Schematron schema against which to test your documents. That schema is essentially a compilation in Schematron of all the TEI constraints.
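The difference between an assert and a report, and the effect of the @targetLang rule, can be mimicked in a few lines of Python. This is a sketch of the logic only, not a Schematron processor: the function names and the sample document are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
MESSAGE = "@targetLang can only be used if @target is specified."


def fires(kind: str, test_result: bool) -> bool:
    """An assert fires when its test is false; a report fires when it is true."""
    if kind == "assert":
        return not test_result
    if kind == "report":
        return test_result
    raise ValueError(f"unknown kind: {kind!r}")


def target_lang_messages(xml_text):
    """Emulate the rule: context tei:*[not(self::tei:schemaSpec)][@targetLang]."""
    messages = []
    for el in ET.fromstring(xml_text).iter():
        in_context = (
            el.tag.startswith(TEI)
            and el.tag != TEI + "schemaSpec"
            and "targetLang" in el.attrib
        )
        # The test count(@target) is true when @target is present.
        if in_context and fires("assert", "target" in el.attrib):
            messages.append(MESSAGE)
    return messages


# Hypothetical sample: the second <ref> has @targetLang but no @target.
SAMPLE = """<p xmlns="http://www.tei-c.org/ns/1.0">
  <ref target="http://example.org/fr" targetLang="fr">acceptable</ref>
  <ref targetLang="de">triggers the assertion</ref>
</p>"""

print(target_lang_messages(SAMPLE))
# ['@targetLang can only be used if @target is specified.']
```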

<constraintSpec> can appear as a child of <attDef>, <classSpec>, <elementSpec>, <macroSpec>, and <schemaSpec>. We'll go through the process of adding a constraint like this. The constraint we're going to add relates to dating elements (<date>, <birth> etc.) and the @calendar attribute. @calendar ‘indicates the system or calendar to which the date represented by the content of this element belongs.’ In other words, @calendar should only be used if the dating element has textual content. This makes sense (assuming that @calendar points at a valid <calendar> element):

<date calendar="#julian">January, 1622</date>
whereas this does not:
<date when="1622" calendar="#julian"/>
because the <date> element has no textual content to which the @calendar attribute could apply. We're going to express this in the form of a Schematron constraint, along the lines of the one we've examined above. First, we open the att.datable.xml file, and find the <attDef> element which defines @calendar. We can add the <constraintSpec> element immediately after the <datatype> element, like this:
<constraintSpec ident="calendar" scheme="isoschematron"
    xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <constraint>
    <sch:rule context="tei:*[@calendar]">
      <sch:assert test="string-length(.) gt 0">@calendar indicates the system or calendar to which the date represented by the content of this element belongs, but this element has no textual content.</sch:assert>
    </sch:rule>
  </constraint>
</constraintSpec>
(Obviously, by the time you're reading this, the <constraintSpec> is already in the TEI source, so you'll see it there.) We'll also have to make sure we add the Schematron namespace to the <classSpec> root element, so that the sch: prefix is defined: xmlns:sch="http://purl.oclc.org/dsdl/schematron". Then we commit our changes, and let the TEI build process build all the products, and make sure that we didn't get anything wrong.
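Before committing, it can help to reason about which elements the new rule will catch. The Python sketch below approximates the test string-length(.) gt 0 by joining all the text inside an element, which corresponds to its XPath string value. The function name and the sample document are assumptions for this example. Note that whitespace counts towards the length, so an element containing only a space would pass.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"


def calendar_failures(xml_text):
    """Return TEI elements that carry @calendar but have no textual content."""
    failures = []
    for el in ET.fromstring(xml_text).iter():
        if el.tag.startswith(TEI) and "calendar" in el.attrib:
            # The XPath string value is all descendant text, concatenated.
            string_value = "".join(el.itertext())
            if len(string_value) == 0:
                failures.append(el)
    return failures


SAMPLE = """<p xmlns="http://www.tei-c.org/ns/1.0">
  <date calendar="#julian">January, 1622</date>
  <date when="1622" calendar="#julian"/>
  <date when="2012-09-06"/>
</p>"""

failed = calendar_failures(SAMPLE)
print([el.get("when") for el in failed])  # ['1622']
```

Only the second <date> is reported: the first has content, and the third has no @calendar and so falls outside the rule's context.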

That should do the job. However, it's quite difficult for us to test whether this constraint is in fact doing exactly what it should be, unless we build a new copy of Roma and use it to generate a Schematron schema, then validate a test document against it. This is probably not practical for most of us. Fortunately, the TEI build system provides a way for us to do this; in fact, we can put in place a couple of tests that will always be run whenever P5 is built, checking that our schematron constraint is intact and functioning as we expect.

The first thing we're going to do is add a couple of tests that should pass. We'll add a dating element which has both @calendar and some textual content, as well as an empty dating element with no textual content. If these tests pass, then we know that our constraint is not doing anything wrong. (We don't yet know whether it's doing anything at all, of course; that comes later.)

If you look at trunk/P5/Test, you'll see there is a whole folder full of files whose purpose is to test various aspects of the TEI build process and products. We want to add our tests to one of these files. The question is: which one? We'll add it to the basic test file, which is testbasic.xml; this is tested against schemas generated from testbasic.odd, which should contain all the dating features we're interested in testing. If we look at that file, we find there are already several date elements in there, so we can try adding our calendar attribute to one of those. Let's choose the date of 1685 on a dictionary entry sense:

<sense>
  <date calendar="http://en.wikipedia.org/wiki/Julian_calendar">1685</date>
  <form>
    <orth>pamplemousse </orth>
  </form>
</sense>
We could go to the trouble of adding <calendarDesc> and <calendar> to the header of the file so we can point to a calendar element in the same document, but since @calendar is a data.pointer, we can point to an external source of calendar information.

We also want to add, somewhere, a date element which has no textual content and no @calendar attribute. We might as well do this in the header, by adding a simple <revisionDesc> element, which gives us the added bonus of being able to describe our change:

<revisionDesc>
  <change>
    <date when="2012-09-06"/>MDH: Added @calendar to one date, and the date
    element in here, for testing a new Schematron constraint.</change>
</revisionDesc>
Now we can commit our change, and see if the build of TEIP5-Test completes successfully on our Jenkins servers.

If that build successfully completes, we haven't broken anything. But we still don't know whether our constraint will actually fire when something is wrong. In order to do that, we have to use the ‘detest’ system. In trunk/P5/Test, you'll find the following files: detest.odd and detest.xml are test files like the ones we've seen above, but the purpose of the ‘detest’ files is to introduce deliberate errors and make sure that the testing process throws up the expected error results. What happens is basically this:

So what we need to do is to add some new markup to detest.xml which is designed to fail our Schematron test. The problem is that we cannot reliably predict how it will fail; in other words, we can't know in advance what the resulting detest.log file should look like, because we can't know in what order the tests will run, and what the precise error messages might be. We could find this out if we had a working local build environment of our own, but it's far simpler to let Jenkins do the job for us. So this is what we'll do:

We'll add this div to the detest.xml file:
<div>
  <p>Added by MDH. This tests the Schematron constraint that any element with @calendar must have some textual content.</p>
  <p>
    <date when="2012-09-06" calendar="http://en.wikipedia.org/wiki/Gregorian_calendar"/>
  </p>
</div>
Now we commit the change to SVN, and Jenkins will start building. The build should fail, and it does. If we now go to the Jenkins workspace here: http://bits.nsms.ox.ac.uk:8080/jenkins/job/TEIP5-Test/ws/Test/ we'll see a file called detest.log, and if we look inside it, we'll find this bit, generated by our constraint: ‘@calendar indicates the system or calendar to which the date represented by the content of this element belongs, but this element has no textual content. (string-length(.) gt 0)’ This line is obviously missing from expected-results/detest.log, so the build failed when the two files were compared. We can fix that very simply:
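As a rough picture of the comparison step described above (not of the fix itself), the following Python sketch reports lines that differ between an expected log and an actual one. How the build really compares the two files is not stated here, so the use of difflib, the function name, and the placeholder first log line are all assumptions for illustration.

```python
import difflib

CALENDAR_MESSAGE = (
    "@calendar indicates the system or calendar to which the date "
    "represented by the content of this element belongs, but this element "
    "has no textual content. (string-length(.) gt 0)"
)


def log_differences(expected_lines, actual_lines):
    """Return lines present in only one log, marked '- ' or '+ '."""
    return [
        line
        for line in difflib.ndiff(expected_lines, actual_lines)
        if line.startswith(("+ ", "- "))
    ]


# Hypothetical log contents: the first line stands in for existing errors.
expected = ["(an error that was already expected)"]
actual = expected + [CALENDAR_MESSAGE]

for line in log_differences(expected, actual):
    print(line)
# Prints one line: "+ " followed by the @calendar message.
```

An empty result would mean the two logs agree, which is the state the build expects.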

6. Building the release

Note: the original content of this section has been removed, because a longer document dedicated to documenting the release process has been created. Please refer to TCW22: Building a TEI Release.

7. Reference section

7.1. Chapter codes

Following a lengthy debate in the Council as to whether the two-character codes originally used to identify individual chapters should be dropped in favour of longer, more human-readable names, a compromise solution was reached in which the two-character codes were retained as prefixes to longer human-readable names. The same two-character codes are also used to identify the HTML and PDF files generated during the release process.

The following table shows the correspondence between the printed organization of the Guidelines and the corresponding filenames. The order is determined by the driver file Source/guidelines-xx.xml, from which the table is derived.
Section | Title | Filename
[i] | Releases of the TEI Guidelines | TitlePageVerso.xml
[ii] | Dedication | Dedication.xml
[iii] | Preface and Acknowledgments | FM1-IntroductoryNote.xml
[iv] | About These Guidelines | AB-About.xml
[v] | A Gentle Introduction to XML | SG-GentleIntroduction.xml
[vi] | Languages and Character Sets | CH-LanguagesCharacterSets.xml
[1] | The TEI Infrastructure | ST-Infrastructure.xml
[2] | The TEI Header | HD-Header.xml
[3] | Elements Available in All TEI Documents | CO-CoreElements.xml
[4] | Default Text Structure | DS-DefaultTextStructure.xml
[5] | Representation of Non-standard Characters and Glyphs | WD-NonStandardCharacters.xml
[6] | Verse | VE-Verse.xml
[7] | Performance Texts | DR-PerformanceTexts.xml
[8] | Transcriptions of Speech | TS-TranscriptionsofSpeech.xml
[9] | Dictionaries | DI-PrintDictionaries.xml
[10] | Manuscript Description | MS-ManuscriptDescription.xml
[11] | Representation of Primary Sources | PH-PrimarySources.xml
[12] | Critical Apparatus | TC-CriticalApparatus.xml
[13] | Names, Dates, People, and Places | ND-NamesDates.xml
[14] | Tables, Formulæ, and Graphics | FT-TablesFormulaeGraphics.xml
[15] | Language Corpora | CC-LanguageCorpora.xml
[16] | Linking, Segmentation, and Alignment | SA-LinkingSegmentationAlignment.xml
[17] | Simple Analytic Mechanisms | AI-AnalyticMechanisms.xml
[18] | Feature Structures | FS-FeatureStructures.xml
[19] | Graphs, Networks, and Trees | GD-GraphsNetworksTrees.xml
[20] | Non-hierarchical Structures | NH-Non-hierarchical.xml
[21] | Certainty, Precision, and Responsibility | CE-CertaintyResponsibility.xml
[22] | Documentation Elements | TD-DocumentationElements.xml
[23] | Using the TEI | USE.xml
[A1] | Model Classes | REF-CLASSES-MODEL.xml
[A2] | Attribute Classes | REF-CLASSES-ATTS.xml
[A3] | Elements | REF-ELEMENTS.xml
[A4] | Attributes | REF-ATTRIBUTES.xml
[A5] | Datatypes and Other Macros | REF-MACROS.xml
[A6] | Bibliography | BIB-Bibliography.xml
[A7] | Prefatory Notes | PrefatoryNote.xml
[A8] | Colophon | COL-Colophon.xml

In most chapters, the two-character code is also used as a prefix for the @xml:id values given to each <div> element. Note that every <div> element carries an @xml:id value, whether or not it is actually referenced explicitly elsewhere in the Guidelines.

Note that files with names beginning REF contain only <divGen> elements: their content, which provides the reference documentation (sections A1 to A5 inclusive), is automatically generated during the build process.

7.2. Naming conventions

TEI naming conventions have evolved over time, but remain fairly consistent.

generic identifiers
Element and attribute identifiers should be a single natural-language word in lowercase if possible. If more than one word is conjoined to form a name, then the first letter of the second and any subsequent word should be uppercased. Hyphens, underscores, dots, etc. are not used within element or attribute names.
class names
Class names are made up of three parts: a name, constructed like an element name, with a prefix and optionally a suffix. The prefix is one of model. or att. and indicates whether this is a model or an attribute class. The suffix, if present, is used to indicate subclassing: for example, att.linking.foo is the foo subclass of the attribute class att.linking.
xml:id values
The conventions for these vary somewhat. Most of the older chapters of the Guidelines have consistently constructed identifiers, derived from the individual section headings. Identifiers must be provided for:
  • every <div>, whether or not it is explicitly linked to elsewhere
  • every bibliographic reference in the BIB-Bibliography.xml file
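The conventions for generic identifiers and class names lend themselves to a quick mechanical check. The Python sketch below is a heuristic reading of the rules as stated in this section, not an official validator: the regular expressions and function names are assumptions, and existing names that predate or depart from the conventions (such as TEI itself) will not match.

```python
import re

# A lowercase word, optionally followed by capitalised words (camelCase);
# hyphens, underscores and dots are not allowed.
NAME = r"[a-z][a-z0-9]*(?:[A-Z][a-z0-9]*)*"
GENERIC_IDENTIFIER = re.compile(NAME)

# prefix "." name, optionally followed by "." suffix, as described above.
CLASS_NAME = re.compile(rf"(model|att)\.({NAME})(?:\.({NAME}))?")


def is_generic_identifier(name: str) -> bool:
    """Check an element or attribute name against the stated convention."""
    return GENERIC_IDENTIFIER.fullmatch(name) is not None


def split_class_name(name: str):
    """Return (prefix, name, suffix), or None if the pattern does not fit."""
    match = CLASS_NAME.fullmatch(name)
    return match.groups() if match else None


print(is_generic_identifier("saintName"))   # True
print(is_generic_identifier("saint_name"))  # False
print(split_class_name("att.linking.foo"))  # ('att', 'linking', 'foo')
print(split_class_name("att.linking"))      # ('att', 'linking', None)
```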

7.3. File release structure

Currently, the organisation of the /usr/share/xml/tei and /usr/share/doc/tei-* directories on the TEI web site is as follows:
tei
|-- Test
|-- custom
|   |-- odd
|   |-- schema
|   |   |-- dtd
|   |   |-- relaxng
|   |   `-- xsd
|   `-- templates
|-- odd
|   |-- Exemplars
|   |-- ReleaseNotes
|   |-- Source
|   |   |-- Guidelines
|   |   |   |-- en
|   |   |   |   `-- Images
|   |   |   `-- fr
|   |   |       `-- Images
|   |   |-- Images
|   |   `-- Specs
|   |       |-- 18decembre
|   |       `-- exemples
|   |-- Utilities
|   `-- webnav
|       `-- icons
|-- schema
|   |-- dtd
|   `-- relaxng
|-- stylesheet
|   |-- common
|   |-- common2
|   |-- docx
|   |   |-- from
|   |   |   |-- dynamic
|   |   |   |   `-- tests
|   |   |   |       `-- xspec
|   |   |   |-- graphics
|   |   |   |-- lists
|   |   |   |-- marginals
|   |   |   |-- maths
|   |   |   |-- paragraphs
|   |   |   |-- pass0
|   |   |   |-- pass2
|   |   |   |-- tables
|   |   |   |-- templates
|   |   |   |-- textruns
|   |   |   |-- utils
|   |   |   `-- wordsections
|   |   |-- misc
|   |   |-- to
|   |   |   |-- docxfiles
|   |   |   |-- drama
|   |   |   |-- dynamic
|   |   |   |-- graphics
|   |   |   |-- lists
|   |   |   |-- maths
|   |   |   |-- templates
|   |   |   `-- wordsections
|   |   `-- utils
|   |       |-- graphics
|   |       |-- identity
|   |       |-- maths
|   |       `-- verbatim
|   |-- epub
|   |-- fo
|   |-- fo2
|   |-- html
|   |-- latex
|   |-- latex2
|   |-- nlm
|   |-- odds
|   |-- odds2
|   |-- oo
|   |-- profiles
|   |   |-- bodley
|   |   |   `-- epub
|   |   |-- default
|   |   |   |-- csv
|   |   |   |-- docbook
|   |   |   |-- docx
|   |   |   |-- dtd
|   |   |   |-- epub
|   |   |   |-- fo
|   |   |   |-- html
|   |   |   |-- latex
|   |   |   |-- lite
|   |   |   |-- oddhtml
|   |   |   |-- oo
|   |   |   |-- p4
|   |   |   `-- relaxng
|   |   |-- enrich
|   |   |   |-- docx
|   |   |   |-- fo
|   |   |   |-- html
|   |   |   `-- latex
|   |   |-- iso
|   |   |   |-- docx
|   |   |   |   `-- model
|   |   |   |-- epub
|   |   |   |-- fo
|   |   |   |-- html
|   |   |   |-- latex
|   |   |   |-- schema
|   |   |   `-- tbx
|   |   |-- ota
|   |   |   |-- epub
|   |   |   `-- html
|   |   |-- oucs
|   |   |   |-- docx
|   |   |   |-- epub
|   |   |   `-- p4
|   |   |-- oucscourses
|   |   |   `-- docx
|   |   |-- podcasts
|   |   |   |-- docx
|   |   |   `-- epub
|   |   `-- tei
|   |       `-- epub
|   |-- slides
|   |-- slides2
|   |-- tite
|   |-- tools2
|   |-- xhtml
|   `-- xhtml2
`-- xquery

Some other (mostly superseded) documents on the topic

  1. TEI ED W9, Points of Style For Drafts of TEI Guidelines, 2 Mar 1990 (Waterloo Script format)
  2. TEI ED W11, Notes on House Style, 14 Sep 1992 (Waterloo Script formatted text)
  3. TEI ED W55, Form for Draft Chapters of the TEI Guidelines, 5 June 1996 (TEI P2, HTML, and ODD formats)
  4. TEI ED W57, Procedures for Correcting Errors in the TEI Guidelines, 23 July 1994 (TEI P2 and HTML formats)
Notes
[1] At one point we considered using XInclude to do this instead, but decided against it for reasons which now escape me.
Lou Burnard. Date: 2011