Made the changes that needed to be made based on FL's report.
I'm posting this message for France, because she wasn't able to add it to the blog yesterday.
Here is the complete message about the work done yesterday. If you
have any questions, please don't hesitate to email me or to post a
message on this blog. Today I looked through the home page and checked
for typos and errors:
ACCUEIL:
line 1: (starting with) débuts...la Concile: change to (=) le Concile
line 3: en France ...auparavent = auparavant
Under "AU SUJET"
SCHÉMA:
Do we have more than one schéma here? Should we add an 's' to the title
SCHEMA?
INSTRUCTIONS
line 4: On peut accéder... ù: change to (=) où
...ils sont listés;on = listés; on (add a space after semi-colon)
...peut cliquer sur les entêtes = en-têtes
ENCODAGE
2. TEXTES AUTONOMES
LES TEXTES
line 1: Les textes autonomes...documents tel que = tels ...grand texte
telle = tel...
PRINCIPES DE BALISAGE
line 2: <div>: ... et d'autres <div>s = should we keep the s? The
plural form sounds weird to me.
... Les éléments <div> element = '<div> element' I wonder if we should
italicise <div> element or put it in quotation marks?
line 3: l'attribut n = 'n' This n looks smaller than it is supposed to be;
also, do we need to put it in quotation marks or italicise it?
line 4: L'élément <head> est utilisée = utilisé
line 9: Pour des structures...des éléments peuvent être insérées =
insérés
line 15: <hi>:...un texte en caractères tels que = caractères
spécifiques tels que... Maybe we need to add 'spécifiques'?
line 17: L'étiquette est utilisé = utilisée
2.1. Pour les notes = 2. Pour les notes
line 12: <list> et <item>: ... aparaissent = apparaissent
ANALYSE DE TEXTES
line 1: L'index... aux versions texte = versions 'texte' or italicised
line 3: plus...et nous fournirons prochainement des liens pertinents
aux textes = plus... et des liens pertinents aux textes seront
disponibles prochainement. I think it sounds better this way.
I also noticed that when we go to DOCUMENT, then the 'gravure' page, and point
to the 1st image, the title that appears for a few seconds has a typo:
Présage malheurux = malheureux
Pushed the latest XML files into the db, and then noticed that apostrophes in title attributes in the output were being double-escaped. Edited teiGeneral.xsl to remove escaping from the output of <reg> elements as title attributes. We'll have to watch this, though; the escaping may have been there for a reason. Not clear whether the XSLT processor will handle escaping appropriately or not.
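For anyone unfamiliar with the symptom, here is a minimal Python sketch of what double-escaping does to an apostrophe in an attribute value. It is only an illustration, not the code in teiGeneral.xsl: the sample title and the helper names `escape_attr` and `looks_double_escaped` are made up for the example, and the detection is a rough heuristic.

```python
from html import escape, unescape


def escape_attr(value: str) -> str:
    """Escape a raw string once, as needed for an HTML attribute value."""
    return escape(value, quote=True)


def looks_double_escaped(value: str) -> bool:
    """Heuristic: True if the value can still be unescaped twice."""
    once = unescape(value)
    return once != value and unescape(once) != once


# Hypothetical title text, not taken from the project data.
raw = "l'image"
escaped_once = escape_attr(raw)            # correct: apostrophe becomes a character reference
escaped_twice = escape_attr(escaped_once)  # the bug: the '&' of that reference is escaped again
```

If both the stylesheet and the serializer escape the same string, the browser ends up showing the character reference itself in the tooltip rather than the apostrophe. That is why the escaping should happen in only one of the two places.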
I've got more than half-way through now, although some images (Fornaux, for instance) are difficult to interpret, which makes it hard to identify Personnages. I'm noticing that bâtons, éventails, marteaux, balais and clefs are very important.
After an hour or so of looking at the first few images, it became clear that we also needed a category called "Personnages", so I've added both the Créatures and Personnages categories to all of the image files (using XSLT). I've also worked through a dozen or so of the images, adding markup of objects, creatures and people (often empty placeholders, but enough for searching purposes), and moving other annotations from categories they didn't belong in (such as people in the Gestes category).
Found and corrected a lot of typos in the process, but CC is taking the images I've finished with and doing a proper proofing of them, also adding the typographical regularization tags discussed last week.
Also found a couple of annoyances in the IMT, and determined that I really want the UniSynEdit syntax highlighting that was planned for version 1.8, so I think I'm going to add that today and tomorrow (assuming I don't hit any horrible roadblocks with it).
CC and I met to plan the next couple of weeks of work. These are some of the decisions:
Transcription
- At the base level, transcription of text on the images should be diplomatic. At present it is a mixture, with some regularizations; these should be undone, then re-done methodically (see below).
- Superscript and smallcaps need to be included, using <hi> elements, and then reproduced in the rendering.
- Typographical features should be regularized using a <choice> element, containing <orig> and <reg>. Three features to which this will apply are:
  - long s
  - u which is typographically identical to v
  - tilde used to indicate an abbreviation (typically of a word suffix)
- These <choice> elements should be applied at the word level, so that searching can more easily handle variants.
- A modernized version of the text should also be created. This would be in a separate category, but both would be (in the coming TEI version) linked through @facs, as opposed to @corresp. (A new version of the IMT will handle this, in a month or two. In the meantime, category ids will be sufficient anyway.) The modernized transcription would be keyed to the original using identifying attributes (probably @n).
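The word-level <choice> decision can be sketched roughly as follows. This is a Python illustration under stated assumptions, not part of our actual workflow: it handles only the long s (the u/v and tilde cases need editorial judgment), it splits words on single spaces so attached punctuation stays inside the word, and the function names and sample phrase are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from xml.sax.saxutils import escape

LONG_S = "\u017f"  # the long s character


def regularize(word: str) -> str:
    """Return the regularized form of a word (long s only in this sketch)."""
    return word.replace(LONG_S, "s")


def wrap_word(word: str) -> Optional[ET.Element]:
    """Build <choice><orig/><reg/></choice> for a word, or None if unchanged."""
    reg = regularize(word)
    if reg == word:
        return None
    choice = ET.Element("choice")
    ET.SubElement(choice, "orig").text = word
    ET.SubElement(choice, "reg").text = reg
    return choice


def mark_up(text: str) -> str:
    """Wrap each word that needs regularizing; leave other words as escaped text."""
    parts = []
    for word in text.split(" "):
        element = wrap_word(word)
        if element is None:
            parts.append(escape(word))
        else:
            parts.append(ET.tostring(element, encoding="unicode"))
    return " ".join(parts)


# Hypothetical sample phrase, not a transcription from the images.
sample = mark_up("le " + LONG_S + "age mari")
```

Wrapping the whole word, rather than just the single character, means a search can match either the original or the regularized form as a complete token.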
Object markup
We began working on a plan for formalizing, focusing and regularizing the markup of objects, creatures, weapons etc. in the images:
- Make sure gestures are moved from the Objets to the Gestes category. Where a gesture incorporates an object, use two annotations.
- A new category is needed for Créatures.
- The title of an annotation must include, unambiguously, the identifier we're using for that species of object.
- Where the object is used as a weapon (a common feature), it should be titled like this: "Baton comme arme".
- The following items are already of special interest, and must be marked up wherever they appear:
  - corne
  - crochet (link to corne; symbolic)
  - clé
  - balai
  - chaise (especially as weapon)
  - verre
  - pichet
  - bourse
  - argent
  - pot
  - seau
  - croix
  - marque (badge -- see for example Invention)
  - chapeau
We'll work next week in production-line mode: I'll add and regularize object markup, while CC proofs, corrects and diplomatizes transcription (and probably expands on my object markup by adding some content to the annotation texts).
Anticipating that at least one of us, and hopefully both CC and I, will be doing lots of object markup in the files next week, I've installed the latest version of the IMT, 1.7.1.1, on the two markup computers.
I've finished editing the mariage stylesheets, as mentioned in my previous post. There are now 4 styles available from the style menu section. The new one is brown and is titled Bois.
I have also tidied up all of the stylesheets so they are consistent in terms of format and commentary.
Bug reports welcome.
I've finished a new stylesheet for the site - it's green and goes by the name Pelouse.
I've also begun cleaning up the sheets so that colours only get declared in <colourname>.css files. All other sheets only deal with layout, typography etc. This keeps the site looking the same in terms of layout no matter the colour scheme.
I'll finish the de-crufting of layout sheets next and begin working on a brown sheet (Bois?) after that. The last step will be tidying up the original and versailles sheets.