This document is intended to set out the way things are currently managed in the editing of the TEI Guidelines. General notes on the rationale for this state -- why it is the way it is -- may be added here later. The intention is to provide information for Council members wishing to contribute actively to the continued development and maintenance of the text of the Guidelines.
The only chapters not organised in this way are those which do not introduce or define particular modules.
Each element, class, and macro defined in the Guidelines is declared within its own XML file, containing an <elementSpec>, <classSpec>, or <macroSpec> as appropriate. These files are in the directory Source/Specs. For example, the file Source/Specs/abbr.xml contains the element spec for the <abbr> element.
Note that all translations share a single file in Specs. As a general rule, don't update a translation for any language of which you are not a native speaker. If you feel confident enough to adjust the translation, leave the @versionDate attribute unchanged in order to ensure the translation will be reviewed eventually.
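As a rough sketch of what this looks like in practice, a spec file might carry its English text and its translations side by side, each with its own @versionDate. The dates and the French wording below are illustrative assumptions, not copied from Source/Specs/abbr.xml:

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" module="core" ident="abbr">
  <!-- English text: the version you would normally edit -->
  <gloss xml:lang="en" versionDate="2007-06-12">abbreviation</gloss>
  <!-- Translation: if you adjust it, leave versionDate exactly as it was -->
  <gloss xml:lang="fr" versionDate="2007-06-12">abréviation</gloss>
  <desc xml:lang="en" versionDate="2007-06-12">contains an abbreviation of any sort.</desc>
  <!-- classes, content model, examples, and remarks omitted -->
</elementSpec>
```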
Each chapter of the Guidelines is stored in a file called Source/Guidelines/xx/YY-name.xml, where xx is the language (currently only en or fr), YY is the two-letter identifier for the chapter (see 7.1. Chapter codes), and name is the name of the module being defined by that chapter.
The file Source/guidelines-xx.xml (where xx is either en or fr) is the ‘driver file’ for the whole shebang. It contains system entity declarations for each of the documents making up the P5 source. These entities are then referenced throughout the source to embed the required component at the right place. [1]
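A minimal sketch of the mechanism, assuming the driver file sits in Source alongside the Specs directory. The entity name and the relative path are illustrative and may not match the real declarations:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE TEI [
  <!-- one system entity per component file (name and path assumed) -->
  <!ENTITY abbr SYSTEM "Specs/abbr.xml">
]>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- header and surrounding chapter structure omitted -->
  <!-- the reference below is replaced by the content of Specs/abbr.xml;
       in the real source it would sit inside the relevant chapter file -->
  &abbr;
</TEI>
```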
&saintName;), and also some discussion of its usage. The former can appear anywhere, but good practice is to include it in an alphabetical list of such declarations near the end of the relevant section. You can also use a <specList> to reference the description from your new spec within the body of the text, like this:
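A minimal sketch of such a list, assuming a new element called saintName (the name is taken from the entity reference above and is otherwise hypothetical):

```xml
<specList>
  <!-- key names the spec whose description should be pulled in -->
  <specDesc key="saintName"/>
  <!-- atts optionally lists attributes to be described as well -->
  <specDesc key="abbr" atts="type"/>
</specList>
```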
The Guidelines are a reference manual, not a tutorial. You should not talk down to the reader, but assume they have a reasonably well-informed knowledge of the subjects under discussion. Make copious use of cross references, rather than repetition.
Bear in mind, however, that your reader may not have English as their first language. Avoid needlessly complex sentences and unnecessarily obscure terminology. Make sure that technical terms are glossed on their first appearance: this should be in the chapter on XML in the case of XML-related terminology. If you want to provide other references, do so as footnotes, using the <note> element.
Provide bibliographic citations for any other standards (etc) referenced, following the existing style. Do not introduce bibliographic citations simply in order to demonstrate your learning.
See the Style Guide for Editing the TEI Guidelines, which attempts to state preferred practice on vexed issues of spelling, punctuation, etc. The goal of these rules is to avoid inconsistency, and also (wherever possible) to avoid producing text which is markedly either British or American English.
The purpose of an example is to illustrate a specific element or feature. Do not include irrelevant encoding which does not contribute to this primary goal. If such encoding is unavoidable (e.g. to make your example valid), then it must be explained in the supporting text.
Wherever possible, choose your examples from real documents and provide bibliographic citations for them in the file BIB-Bibliography.xml. Use the @corresp attribute on the <egXML> element to link an example to its source note. Note that the @xml:lang attribute is mandatory on <exemplum>: this is to ensure that the ODD processor knows which examples to choose in a given context.
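The following sketch shows the shape of such an example. The content is invented for illustration rather than taken from a real document, and the identifier #EX-src-01 is a hypothetical stand-in for the @xml:id of an entry in BIB-Bibliography.xml:

```xml
<exemplum xml:lang="en">
  <!-- corresp links the example to its bibliographic source (id assumed) -->
  <egXML xmlns="http://www.tei-c.org/ns/Examples" corresp="#EX-src-01">
    <abbr>SPQR</abbr>
  </egXML>
</exemplum>
```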
All examples should be valid against a modified TEI schema in which any element can act as a root element: this validity is checked during the build process.
Good encoding practice will ensure not only valid but also highly functional Guidelines.
When referring to figures and to other sections of the Guidelines, use <ptr>, not <ref>, to ensure that the title and number of the referenced item are automatically inserted when the Guidelines are compiled.
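For instance, a cross-reference to the chapter on the TEI Header might be sketched as follows, assuming that the chapter's top-level <div> carries the identifier HD:

```xml
<!-- preferred: title and number are generated at build time -->
<p>The header is described in <ptr target="#HD"/>.</p>

<!-- avoid: the hand-written label can drift out of date -->
<p>The header is described in <ref target="#HD">chapter 2</ref>.</p>
```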
The build process validates cross-references. Since the Guidelines is compiled into a single XML document at build time, IDs must be unique across the text and the examples. Consequently, any @xml:id attribute values appearing in your examples must be unique within the text of the whole of the Guidelines. Furthermore, any @target (etc.) values which do not point to anything in the source will be flagged with a warning during the build process.
Error messages may appear at any stage. Please do not leave the source in an invalid state (it makes life unnecessarily difficult for others). If you cannot immediately fix a validity error, revert your change while you think about it.
The Jenkins servers monitor the Subversion repository, and when they detect a change, they check it out and commence building several targets, just as you would build them on your local machine. There are a couple of advantages to letting the Jenkins servers check your build for you:
If you submit a change, and later get an email from one of the Jenkins servers telling you that the build failed, it will provide a link to the build information on the server. Here's what to do:
Error messages appearing during the make test phase (the ‘TEIP5-Test’ job on Jenkins) usually indicate that your changes are in conflict with the Birnbaum Doctrine, which decrees that changes in the Guideline schemas should not invalidate existing documents. You may wish to discuss the specific issue with other Council members.
The TEI ODD system is primarily concerned with generating schemas in the form of RelaxNG or XML Schema. However, there are often circumstances in which you want to apply constraints to elements and attributes which cannot easily be captured by normal XML schemas. For instance, you might want to apply a co-occurrence constraint on some attributes. The @targetLang attribute is a good example. @targetLang is an optional attribute which ‘specifies the language of the content to be found at the destination referenced by @target, using a ‘language tag’ generated according to BCP 47.’ Obviously, there is no point in using @targetLang if you're not also using @target. However, many such co-occurrence constraints are difficult to express in RelaxNG schemas, and may not survive conversion to other schema formats such as XML Schema or DTD.
For this reason, we often use ISO Schematron to express constraints like this. If you look in att.pointing.xml, where the @targetLang attribute is defined, you'll find this constraint, inside the <attDef> for @targetLang:
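A sketch of what such a constraint looks like is given here. The rule context, the ident value, and the wording of the message are assumptions, and the exact text in att.pointing.xml may differ:

```xml
<attDef xmlns="http://www.tei-c.org/ns/1.0"
        xmlns:sch="http://purl.oclc.org/dsdl/schematron"
        ident="targetLang" usage="opt">
  <!-- desc, datatype, etc. omitted -->
  <constraintSpec ident="targetLang" scheme="isoschematron">
    <constraint>
      <!-- applies to any TEI element carrying @targetLang -->
      <sch:rule context="tei:*[@targetLang]">
        <!-- fires when @target is absent -->
        <sch:assert test="@target">@targetLang should only be used if @target is also present.</sch:assert>
      </sch:rule>
    </constraint>
  </constraintSpec>
</attDef>
```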
This Schematron rule is an assertion that if @targetLang is used, @target should also be present. <constraintSpec> has an attribute @scheme (normally set to isoschematron). Inside <constraintSpec>, <constraint> elements contain <assert> elements, each with a @test attribute holding an XPath expression; if the XPath evaluates to false, the assertion fires, and its contents appear on the console when you build or validate. There is also a <report> element, which is similar but fires when the test is true. In Roma, you can also generate a Schematron schema against which to test your documents. This schema is essentially a compilation in Schematron of all the TEI constraints.
<constraintSpec> can appear as a child of <attDef>, <classSpec>, <elementSpec>, <macroSpec>, and <schemaSpec>. We'll go through the process of adding a constraint like this. The constraint we're going to add relates to dating elements (<date>, <birth> etc.) and the @calendar attribute. @calendar ‘indicates the system or calendar to which the date represented by the content of this element belongs.’ In other words, @calendar should only be used if the dating element has textual content. This makes sense (assuming that @calendar points at a valid <calendar> element):
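To make the contrast concrete, here is a sketch of both cases, followed by one possible form of the constraint. The #julian pointer, the date, the ident value, and the rule context are assumptions; the test expression and the message are those quoted from the log output further on in this document:

```xml
<!-- sensible: the content is what @calendar describes -->
<date calendar="#julian">11 February 1731</date>

<!-- not sensible: there is no content for @calendar to describe -->
<date calendar="#julian"/>

<!-- a constraint expressing this -->
<constraintSpec xmlns="http://www.tei-c.org/ns/1.0"
                xmlns:sch="http://purl.oclc.org/dsdl/schematron"
                ident="calendar" scheme="isoschematron">
  <constraint>
    <sch:rule context="tei:*[@calendar]">
      <sch:assert test="string-length(.) gt 0">@calendar indicates the system or calendar to
        which the date represented by the content of this element belongs, but this
        element has no textual content.</sch:assert>
    </sch:rule>
  </constraint>
</constraintSpec>
```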
xmlns:sch="http://purl.oclc.org/dsdl/schematron". Then we commit our changes, let the TEI build process build all the products, and make sure that we didn't get anything wrong.
That should do the job. However, it's quite difficult for us to test whether this constraint is in fact doing exactly what it should be, unless we build a new copy of Roma and use it to generate a Schematron schema, then validate a test document against it. This is probably not practical for most of us. Fortunately, the TEI build system provides a way for us to do this; in fact, we can put in place a couple of tests that will always be run whenever P5 is built, checking that our Schematron constraint is intact and functioning as we expect.
The first thing we're going to do is add a couple of tests that should pass. We'll add a dating element which has both @calendar and some textual content, as well as an empty dating element with no textual content. If these tests pass, then we know that our constraint is not doing anything wrong. (We don't yet know whether it's doing anything at all, of course; that comes later.)
If you look at trunk/P5/Test, you'll see there is a whole folder full of files whose purpose is to test various aspects of the TEI build process and products. We want to add our tests to one of these files. The question is: which one? We'll add it to the basic test file, which is testbasic.xml; this is tested against schemas generated from testbasic.odd, which should contain all the dating features we're interested in testing. If we look at that file, we find there are already several date elements in there, so we can try adding our @calendar attribute to one of those. Let's choose the date of 1685 on a dictionary entry sense:
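The change might look something like the following sketch. The surrounding markup and the #gregorian pointer are assumptions; only the date of 1685 comes from the test file as described:

```xml
<sense n="1">
  <!-- rest of the sense omitted -->
  <!-- @calendar plus textual content: the constraint should stay silent -->
  <date calendar="#gregorian">1685</date>
</sense>
```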
We also want to add, somewhere, a date element which has no textual content and no @calendar attribute. We might as well do this in the header, by adding a simple <revisionDesc> element, which gives us the added bonus of being able to describe our change:
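A minimal sketch of such an addition; the date value and the wording of the change note are placeholders:

```xml
<revisionDesc>
  <change>
    <!-- empty date with no @calendar: the constraint should stay silent -->
    <date when="2012-01-01"/>
    Added @calendar to the 1685 date to exercise the new constraint.
  </change>
</revisionDesc>
```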
The files involved are detest.odd, detest.xml, and expected-results/detest.log:
Schemas are generated from detest.odd.
detest.xml is validated against those schemas.
The results are written to detest.log (in the Test directory).
That file is compared with the detest.log file in the expected-results subdirectory.
We now need something in detest.xml which is designed to fail our Schematron test. The problem is that we cannot reliably predict how it will fail. In other words, we can't know in advance what the resulting detest.log file should look like, because we can't know in what order the tests will run, or what the precise error messages might be. We could find this out if we had a working local build environment of our own, but it's far simpler to let Jenkins do the job for us. So this is what we'll do:
Add the failing test to detest.xml.
Find the resulting detest.log on Jenkins, and copy it to our local expected-results/detest.log.
This is what we add to the detest.xml file:
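A minimal sketch of such a failing test, assuming a <date> inside a paragraph and a hypothetical #gregorian pointer; the element actually added to detest.xml may have differed:

```xml
<!-- @calendar but no textual content, so the assertion should fire -->
<p>Deliberately invalid: <date calendar="#gregorian"/></p>
```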
The build on Jenkins produces a new detest.log, and if we look inside it, we'll find this bit, generated by our constraint:
‘@calendar indicates the system or calendar to which the date represented by the content of this element belongs, but this element has no textual content. (string-length(.) gt 0)’
This line is obviously missing from expected-results/detest.log, so the build failed when the two files were compared. We can fix that very simply:
Retrieve the detest.log file from the TEIP5-Test workspace on the Jenkins server (job/TEIP5-Test/ws/Test/).
Copy it over our local expected-results/detest.log.
Note: the original content of this section has been removed, because a longer document dedicated to documenting the release process has been created. Please refer to TCW22: Building a TEI Release.
Following a lengthy debate in the Council as to whether the two-character codes originally used to identify individual chapters should be dropped in favour of longer, more human-readable names, a compromise was reached in which the two-character codes were retained as prefixes to the longer names. The same two-character codes are also used to identify the HTML and PDF files generated during the release process.
Section | Title | filename |
[i] | Releases of the TEI Guidelines | TitlePageVerso.xml |
[ii] | Dedication | Dedication.xml |
[iii] | Preface and Acknowledgments | FM1-IntroductoryNote.xml |
[iv] | About These Guidelines | AB-About.xml |
[v] | A Gentle Introduction to XML | SG-GentleIntroduction.xml |
[vi] | Languages and Character Sets | CH-LanguagesCharacterSets.xml |
[1] | The TEI Infrastructure | ST-Infrastructure.xml |
[2] | The TEI Header | HD-Header.xml |
[3] | Elements Available in All TEI Documents | CO-CoreElements.xml |
[4] | Default Text Structure | DS-DefaultTextStructure.xml |
[5] | Representation of Non-standard Characters and Glyphs | WD-NonStandardCharacters.xml |
[6] | Verse | VE-Verse.xml |
[7] | Performance Texts | DR-PerformanceTexts.xml |
[8] | Transcriptions of Speech | TS-TranscriptionsofSpeech.xml |
[9] | Dictionaries | DI-PrintDictionaries.xml |
[10] | Manuscript Description | MS-ManuscriptDescription.xml |
[11] | Representation of Primary Sources | PH-PrimarySources.xml |
[12] | Critical Apparatus | TC-CriticalApparatus.xml |
[13] | Names, Dates, People, and Places | ND-NamesDates.xml |
[14] | Tables, Formulæ, and Graphics | FT-TablesFormulaeGraphics.xml |
[15] | Language Corpora | CC-LanguageCorpora.xml |
[16] | Linking, Segmentation, and Alignment | SA-LinkingSegmentationAlignment.xml |
[17] | Simple Analytic Mechanisms | AI-AnalyticMechanisms.xml |
[18] | Feature Structures | FS-FeatureStructures.xml |
[19] | Graphs, Networks, and Trees | GD-GraphsNetworksTrees.xml |
[20] | Non-hierarchical Structures | NH-Non-hierarchical.xml |
[21] | Certainty, Precision, and Responsibility | CE-CertaintyResponsibility.xml |
[22] | Documentation Elements | TD-DocumentationElements.xml |
[23] | Using the TEI | USE.xml |
[A1] | Model Classes | REF-CLASSES-MODEL.xml |
[A2] | Attribute Classes | REF-CLASSES-ATTS.xml |
[A3] | Elements | REF-ELEMENTS.xml |
[A4] | Attributes | REF-ATTRIBUTES.xml |
[A5] | Datatypes and Other Macros | REF-MACROS.xml |
[A6] | Bibliography | BIB-Bibliography.xml |
[A7] | Prefatory Notes | PrefatoryNote.xml |
[A8] | Colophon | COL-Colophon.xml |
In most chapters, the two-character code is also used as a prefix for the @xml:id values given to each <div> element. Note that every <div> element carries an @xml:id value, whether or not it is actually referenced explicitly elsewhere in the Guidelines.
Note that files with names beginning REF contain only <divGen> elements: their content, which provides the reference documentation (sections A1 to A5 inclusive), is automatically generated during the build process.
TEI naming conventions have evolved over time, but remain fairly consistent.
A class name begins with the prefix model. or att., which indicates whether this is a model or an attribute class. The suffix, if present, is used to indicate subclassing: for example att.linking.foo is the foo subclass of the attribute class att.linking.
/usr/share/xml/tei and /usr/share/doc/tei-* directories on the TEI web site is as follows:
xInclude to do this instead, but decided against it for reasons which now escape me.