I've made substantial changes to the Le Blanc, adding many linebreaks, moving all right-floated labels further into the margin, and reducing the page size. There are still problems: one issue is that the <quote type="italics"> inline quotes often actually have a smaller font size than the surrounding text, but @rend attributes on <quote> are not carried through to the XHTML yet. That needs to be fixed in the XSLT, and then the relevant <quote> elements (there are 394 of them) need to be examined to see if they need the @rend to reduce their font size.
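A minimal sketch of how the @rend pass-through might look in the XSLT. It assumes the source is in the TEI namespace, that @rend values are CSS declarations that can be copied straight into an XHTML style attribute, and that inline quotes are output as spans. The class name "quote" is hypothetical, and none of this is taken from the actual stylesheet.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- Assumption: every quote becomes an XHTML span; "quote" is a hypothetical class name. -->
  <xsl:template match="quote">
    <span class="quote">
      <!-- Assumption: @rend holds CSS declarations such as "font-size: 80%",
           so it can be copied unchanged into a style attribute. -->
      <xsl:if test="@rend">
        <xsl:attribute name="style" select="@rend"/>
      </xsl:if>
      <xsl:apply-templates/>
    </span>
  </xsl:template>

</xsl:stylesheet>
```

If @rend turns out to hold keywords rather than CSS, the same template could map it to a class attribute instead.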
Category: "Activity log"
I've now switched all files over to using the new schema, and standardized them all on the xml-model processing instruction, replacing the oxygen and oasis-schema PIs wherever they occurred. I've also trimmed out a bunch of old unnecessary namespace declarations from root elements.
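For illustration, a switch like this could be scripted as an identity transform along the following lines. This is only a sketch under assumptions: the PI names are the ones used in this entry ("oxygen" and "oasis-schema"), the schema is RELAX NG, and the schemaPath parameter and its default value are hypothetical. It does not attempt the namespace clean-up.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: path to the project's RELAX NG schema. -->
  <xsl:param name="schemaPath" select="'schema/project.rng'"/>

  <!-- Emit a single xml-model PI ahead of everything else in the document. -->
  <xsl:template match="/">
    <xsl:processing-instruction name="xml-model">
      <xsl:text>href="</xsl:text>
      <xsl:value-of select="$schemaPath"/>
      <xsl:text>" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"</xsl:text>
    </xsl:processing-instruction>
    <xsl:text>&#x0a;</xsl:text>
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Drop the old schema-association PIs (names assumed), plus any existing
       xml-model PI so that the output never carries two of them. -->
  <xsl:template match="processing-instruction('oxygen')
                     | processing-instruction('oasis-schema')
                     | processing-instruction('xml-model')"/>

  <!-- Identity template: copy everything else unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```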
Following that, I've worked extensively on the Le Blanc, which had a disastrous document model: some chapters were identified as such and some not; chapters were embedded within chapters within other chapters; other divs were left hanging outside chapters; and the structure took no account at all of the volume's organization into four books. That's all now fixed, along with many typos and layout issues I came across while working on it, but it will need proofing again eventually. I'm going to spend a little more time going through LCC's list of issues while I'm focused on that text, though.
I've done some major refactoring of Le Blanc, which is my pilot text for using <label> instead of <argument> for the marginal headings. That feature is working well, but I've had to do a huge amount of work to fix wildly proliferating <div>s in the original markup; they're still a bit of a mess, with chapter <div>s interspersed with other unidentified <div>s that have no apparent purpose, while some subsections in chapters are nested <div>s and others aren't. But a few more hours should sort it.
I've also generated a new schema, which includes <argument> in all the right places as before, but also includes <label> and adds the @type attribute to it (so we can distinguish <label>s used for marginal headings from those in e.g. TOCs).
In the process of this work, I discovered some aberrant behaviour in oXygen which threw me for a while. The old oasis-schema PI, linking to an RNG file, was not actually working, now that oXygen has switched to xml-model; instead, it was silently switching to validation with tei_all, which meant that many things would validate under the oasis-schema heading but fail to validate when xml-model was used.
EDIT by MDH 2011-12-22: I believe this task has been obviated by the proposed (and partially-implemented) change from using <argument> to using <label>. Setting this task to Completed.
Note to self: I thought I had made substantial amendments to the schema that would permit us to use <argument> inside e.g. <p>, because that's where our marginal arguments most often show up. However, in the Le Blanc text, that seems to be failing, so I need to revisit it. The text has been marked up in such a way that open ps and divs are being closed before an argument and opened again afterwards, and when I try to refactor appropriately, the code turns out not to be valid. Either the validation process is failing somehow, or the schema is not the most recent, or perhaps my modifications didn't go far enough.
I've spent quite a while this morning fixing page sizing and margin issues with the Le Blanc; this arises out of the fact that (as far as I can tell) the original page dimensions were measured from the PDF page width rather than the width of the page-image sitting inside the PDF. I think I have that problem basically solved (I had to recalculate about 400 margin-right settings).
However, we have another transformational nesting issue which is exemplified 45 times in Le Blanc, but also shows up in three other texts in the collection: <cit>/<quote> elements cross page boundaries, and therefore contain <fw> and <pb>, but in this context, <quote type="italic"> was converted into an XHTML <i> tag. Initially, I thought this was another case where we needed to split out the contents before the main transformation, but I began to re-think, because there's such a range of related phenomena, including the use of <blockquote>.
I've decided to go for even more divs and spans, eschewing more specialized XHTML elements, because it's easier to get valid structures in highly complicated documents this way; I can then control the display with CSS, making <div>s inline and <span>s block where required. This makes for valid documents (most of them validate now), which ensures that the rest of the CSS gets applied correctly.
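The approach can be sketched roughly as follows. This is an illustration rather than the project's actual templates: it assumes TEI-namespaced input, and the choice of elements and the use of the TEI element name as the class are hypothetical. A span is used because it is valid wherever inline content is allowed, including inside a paragraph, and CSS then decides how it is displayed.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- Map several TEI elements to generic spans instead of specialized
       XHTML elements such as blockquote or i. The class is simply the
       TEI element name (a hypothetical convention).
       The matching CSS might then read, for example:
         span.cit { display: block; }
         span.fw  { display: block; }
  -->
  <xsl:template match="cit | fw | pb">
    <span class="{local-name()}">
      <xsl:apply-templates/>
    </span>
  </xsl:template>

</xsl:stylesheet>
```

Because the span stays valid even when a <cit> or <quote> straddles a page boundary and contains <fw> and <pb>, the nesting problem moves out of the markup and into the stylesheet.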
LCC left a detailed list of issues to address in the Le Blanc text, and I've started on the first one, which involves the markup and rendering of the title page. Some fixes will be required in XSLT, and also the markup needs a bit of reconfiguring. <titlePart> elements do not seem to be processed as expected; instead of getting <h2> and <h3> tags, they're getting block-display <span>s, and I can't yet figure out how that's happening. It means their rend attributes are not being passed through.
Five of the engravings contain unusual glyphs used for abbreviation, and now the same has appeared in one of the regular texts, so I've documented the markup process for these glyphs, and passed the document on to LSPW so she can merge it into her growing markup documentation.
Tracked down a bunch of bugs, which EGB fixed, and the output now validates, after coming through our two-stage transform. Yay! From 406 errors to zero in a few days.
I've written an intermediary stylesheet to split out <list> elements containing <pb> and <fw> children into separate lists with the <pb>s and <fw>s outside them. This is processed prior to the main transformation, and has solved the problem for lists. I may revisit it for lg/l at some point. For the record, here's the template (the only non-pass-through template in an identity transform):
<!-- Split out interrupted lists into a sequence of lists with the interruptions
     between them. -->
<xsl:template match="list[child::fw or child::pb]">
  <!-- Stash a reference to the current list so we can easily refer to it in later nested code. -->
  <xsl:variable name="thisList" select="."/>
  <!-- Items with the same number of preceding fw/pb siblings form one uninterrupted run. -->
  <xsl:for-each-group select="child::item" group-adjacent="count(preceding-sibling::fw) + count(preceding-sibling::pb)">
    <list>
      <!-- Each output list carries the attributes of the original list. -->
      <xsl:for-each select="$thisList/@*">
        <xsl:copy/>
      </xsl:for-each>
      <xsl:text>&#x0a;</xsl:text>
      <xsl:for-each select="current-group()">
        <xsl:copy-of select="."/>
        <xsl:text>&#x0a;</xsl:text>
      </xsl:for-each>
    </list>
    <xsl:text>&#x0a;</xsl:text>
    <xsl:variable name="lastItem" select="current-group()[position() = last()]"/>
    <!-- Copy the fw/pb elements that directly follow this run. The "is" operator tests node
         identity; "=" would compare string values and could match a different item with the same text. -->
    <xsl:for-each select="$lastItem/following-sibling::fw[preceding-sibling::item[1] is $lastItem] | $lastItem/following-sibling::pb[preceding-sibling::item[1] is $lastItem]">
      <xsl:copy-of select="."/>
      <xsl:text>&#x0a;</xsl:text>
    </xsl:for-each>
    <xsl:text>&#x0a;</xsl:text>
  </xsl:for-each-group>
</xsl:template>
Created a thumbnail image for the title page of Maladies des Femmes.