Transcription and annotation of Blanchon's Stances du mariage have been completed. The text is now published on the site; however, it appears under the category "Satire", which is a problem because it is not a satire. CC suggests creating a new category, possibly "Éloges du mariage". We'll have to discuss this with MH when he returns from holiday.
Category: "Activity log"
...spent over the last few days on moving EG-B to Spartan, timesheets, and discussing various markup issues.
I've started work on a text-output format suitable for text-analysis work. The first stage is normalizing spacing and re-connecting hyphenated words, which I seem to have working OK, although I need to do some more testing. Next we need to look at what normalization strategies we're going to undertake, from the obvious (replacing long s) to the less obvious (such as correcting all variant spellings). I've started learning a little about Mallet, and I've run some of their sample data, but it's obvious that I'm going to have to do a lot of learning before I can make worthwhile use of it.
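The first stage described above might look something like the following. This is only a minimal sketch in Python, not the project's actual output code: the function name `normalize_text` is hypothetical, and it assumes that every hyphen at the end of a line marks a broken word (which would wrongly join genuine hyphenated compounds that happen to fall at a line end). The long-s replacement is included as the one "obvious" normalization mentioned.

```python
import re


def normalize_text(raw: str) -> str:
    """Rejoin words hyphenated across line breaks, normalize spacing,
    and replace long s. A sketch under the assumptions stated above."""
    # "pou-\nuoir" -> "pouuoir": drop a hyphen that sits between two word
    # characters separated only by a line break (and optional spaces/tabs).
    text = re.sub(r"(\w)-[ \t]*\r?\n[ \t]*(\w)", r"\1\2", raw)
    # Collapse every run of whitespace, including remaining line breaks,
    # into a single space, and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Replace long s (U+017F) with a regular s.
    return text.replace("\u017f", "s")
```

Less obvious normalizations, such as correcting variant spellings, would need a lookup table or similar and are deliberately left out here.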
An issue arose today from a collision between the XSLT templates that handle <cit> and <quote> elements. Since our practice has evolved rather than been planned, we've ended up with a complex range of different encoding strategies, involving the use of @type on both cit and quote. As it stands, everything seems to work, but we really should take a close look at the relevant templates in teiGeneral.xsl, as well as the encoding practice, so that we can at least document where and when the various strategies are being used.
28 instances in two documents, done on instructions from the French team. Checked them all before making the change.
The references list is now divided into alphabetical groups by first letter, and there is a list of the letters at the top of the page which you can click on to jump to the appropriate place in the file.
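For illustration only, the grouping logic can be sketched as follows. The site itself presumably does this in XSLT; this Python version is an assumption-laden stand-in, and the names `initial` and `group_by_initial` are hypothetical. It treats an accented initial such as "É" as its base letter, which may or may not match what the site does.

```python
import unicodedata
from itertools import groupby


def initial(entry: str) -> str:
    """Return the upper-cased base letter that starts an entry."""
    # NFD decomposition splits "É" into "E" plus a combining accent,
    # so the first character is the unaccented base letter.
    return unicodedata.normalize("NFD", entry.strip())[0].upper()


def group_by_initial(entries):
    """Sort non-empty entries and group them under their first letter."""
    cleaned = [e for e in entries if e.strip()]
    ordered = sorted(
        cleaned,
        key=lambda e: unicodedata.normalize("NFD", e.strip()).casefold(),
    )
    # groupby only merges adjacent items, hence the sort beforehand.
    return {letter: list(items) for letter, items in groupby(ordered, key=initial)}
```

The keys of the returned dictionary are the letters that would appear in the jump list at the top of the page.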
EDIT: SEE COMMENTS BY MH BELOW.
Working on varin.xml, I discovered a problem when trying to use <cit type="blockquote"><quote type="noMarks">. Although the document validated, it caused an error when I tried to view the XSLT. The error was related to XML lines 3218 to 3226 (/TEI/text[1]/body[1]/div[35]/p[4]/cit[1]/quote[1]). I tried to fix the error by changing <cit type="blockquote" rend="margin: 0"><quote type="noMarks"><foreign xml:lang="la"> to simply <cit type="blockquote"><quote><foreign xml:lang="la">; however, this displayed guillemets around the quoted text. I ended up having to use <cit type="blockquote" rend="margin: 0"><quote type="italics"><foreign xml:lang="la"> to fix the error and make the text display properly, in italics only and without guillemets.
MH will probably need to do some more tweaking of block quote handlers.
Investigation by MH
I think this issue arises out of a basic misunderstanding. The handling for blockquotes depends on <cit type="block">, NOT type="blockquote". If you use <cit type="block">, then no quotes (guillemets) will be applied anyway; by default, blockquotes do not have quotation marks. In fact, the @type attribute on the <quote> element is ignored in the case of <cit type="block">.
So if the objective is to have an italicized blockquote, with the italicization due to its being in a foreign language, you can do this:
<cit type="block" rend="margin: 0; margin-top: 0; margin-bottom: 0;"><quote><foreign xml:lang="la"> I’ay pour le dot beaucoup d’argent receu,<lb/> Et mon pouuoir ce faiſant i’ay vendu.</foreign></quote></cit>
There are 24 instances of the erroneous <cit type="blockquote"> in the following documents: forest_nuptiale, sonnet_1609, and varin. There are 14 instances of the correct formulation, <cit type="block">, which occur in sonnet_1609 and varin. I think we need to look at all the erroneous ones and fix them, making any other adjustments required to make them display properly.
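A tally of this kind could be reproduced with a short script. The following is only a sketch in Python, not the method actually used for the counts above: `count_cit_types` is a hypothetical helper, and it assumes the documents are in the standard TEI namespace.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: the project's files use the standard TEI P5 namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def count_cit_types(xml_text: str) -> Counter:
    """Count <cit> elements in one document by their @type value.

    <cit> elements with no @type attribute are counted under None.
    """
    root = ET.fromstring(xml_text)
    return Counter(cit.get("type") for cit in root.iter(f"{{{TEI_NS}}}cit"))
```

Run over each document in turn, the `"blockquote"` count gives the number of instances to review and `"block"` the number already using the supported value. The script only reports; each erroneous instance would still need checking by hand, since other adjustments may be required.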
I found a way to automate the addition of @xml:id attributes where these are required because of linking from the TOC, so that's been done in Le Bon Mariage, Le Blanc and Ville-Thierry.
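The post does not say how the automation was done. Purely as an illustration, one way to approach it is sketched below in Python. Everything specific here is an assumption: the TEI namespace, the idea that a TOC `<ref type="pageNum">` and its target `<fw>` contain the same page-number text, the `page_` id prefix, and the helper name `add_page_ids`.

```python
import xml.etree.ElementTree as ET

# Assumptions: standard TEI namespace; ids of the form "page_<number>".
TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def add_page_ids(root: ET.Element, prefix: str = "page_") -> int:
    """Give an @xml:id to each <fw> that a TOC page reference points at.

    Returns the number of attributes added. Only the first <fw> found for
    a given page number is treated as the link target, so ids stay unique.
    """
    # Page numbers that the TOC links to.
    wanted = {
        "".join(ref.itertext()).strip()
        for ref in root.iter(f"{{{TEI_NS}}}ref")
        if ref.get("type") == "pageNum"
    }
    added = 0
    for fw in root.iter(f"{{{TEI_NS}}}fw"):
        page = "".join(fw.itertext()).strip()
        if page not in wanted:
            continue
        wanted.discard(page)  # later <fw> with the same number are ignored
        if fw.get(XML_ID) is None:  # leave existing ids untouched
            fw.set(XML_ID, prefix + page)
            added += 1
    return added
```

Because existing ids are left alone, running the function twice on the same document adds nothing the second time. Page numbers containing characters that are not legal in an XML name would need extra handling, which this sketch omits.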
Started working on rendering of TOC code, as described here. These are the changes I've made:
- I made substantial edits to Le Bon Mariage, Ville-Thierry and Le Blanc to fix minor inconsistencies and missing attributes.
- I added a special handler for <ref type="pageNum"> which processes a TOC page number into a link to the <fw> containing that page number. Note, though, that none of the <fw> tags in question actually has an @xml:id attribute yet, with the exception of one that I added to Le Bon Mariage for testing. I'll probably try to automate the addition of those attributes.
- I added an appropriate CSS ruleset for the page numbers.
Basically everything seems to be working, although the page width of the Ville-Thierry is still wrong; I need to work on that a little. But someone else will take over the markup of that document at some point, so it could be left to them.