Plan for markup revision
Posted by mholmes on 23 May 2008 in Activity log
CC and I met to plan the next couple of weeks of work. These are some of the decisions:
Transcription
- At the base level, transcription of text on the images should be diplomatic. At present the transcription mixes diplomatic forms with some regularizations; these should be undone, then re-done methodically (see below).
- Superscript and smallcaps need to be included, using <hi> elements, and then reproduced in the rendering.
- Typographical features should be regularized using a <choice> element, containing <orig> and <reg>. Three features to which this will apply are:
  - long s
  - u which is typographically identical to v
  - tilde used to indicate an abbreviation (typically of a word suffix)
- These <choice> elements should be applied at the word level, so that searching can more easily handle variants.
- A modernized version of the text should also be created. This would be in a separate category, but both would be (in the coming TEI version) linked through @facs, as opposed to @corresp. (A new version of the IMT will handle this, in a month or two. In the meantime, category ids will be sufficient anyway.) The modernized transcription would be keyed to the original using identifying attributes (probably @n).
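
As a rough illustration of the transcription decisions above, the fragment below shows what word-level <choice> markup might look like. The sample words and the rend values on <hi> are assumptions made up for this sketch, not forms taken from the images or an encoding we have settled on.

```xml
<!-- Illustrative sketch only: the sample words and the rend values
     are assumptions, not the project's settled encoding. -->
<p>
  <!-- long s and u-for-v, regularized on the whole word -->
  <choice><orig>ſouuent</orig><reg>souvent</reg></choice>
  <!-- tilde marking an abbreviation -->
  <choice><orig>quãd</orig><reg>quand</reg></choice>
  <!-- superscript and smallcaps kept in the diplomatic text with <hi> -->
  M<hi rend="superscript">r</hi>
  <hi rend="smallcaps">Paris</hi>
</p>
```

Wrapping the whole word, rather than the single affected character, means a search can match either the original or the regularized form as a complete token.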
Object markup
We began working on a plan for formalizing, focusing and regularizing the markup of objects, creatures, weapons etc. in the images:
- Make sure gestures are moved from the Objets to the Gestes category. Where a gesture incorporates an object, use two annotations.
- A new category is needed for Créatures.
- The title of an annotation must include, unambiguously, the identifier we're using for that species of object.
- Where the object is used as a weapon (a common feature), it should be titled like this: "Baton comme arme".
- The following items are already of special interest, and must be marked up wherever they appear:
  - corne
  - crochet (link to corne; symbolic)
  - clé
  - balai
  - chaise (especially as weapon)
  - verre
  - pichet
  - bourse
  - argent
  - pot
  - seau
  - croix
  - marque (badge -- see for example Invention)
  - chapeau
We'll work next week in production-line mode: I'll add and regularize object markup, while CC proofs, corrects and diplomatizes transcription (and probably expands on my object markup by adding some content to the annotation texts).