Le Blanc reveals more issues involving other documents
I've spent quite a while this morning fixing page sizing and margin issues with the Le Blanc; these arise because (as far as I can tell) the original page dimensions were measured from the PDF page width rather than from the width of the page image sitting inside the PDF. I think I have that problem basically solved (I had to recalculate about 400 margin-right settings).
However, we have another transformational nesting issue, which is exemplified 45 times in Le Blanc but also shows up in three other texts in the collection: <cit>/<quote> elements cross page boundaries, and therefore contain <fw> and <pb>, but in this context <quote type="italic"> was converted into an XHTML <i> tag. Initially, I thought this was another case where we needed to split out the contents before the main transformation, but I began to rethink, because there's such a range of related phenomena, including the use of <blockquote>.
I've decided to go for even more divs and spans, eschewing more specialized XHTML elements, because it's easier to get valid structures in highly complicated documents this way; I can then control the display with CSS, making <div>s inline and <span>s block where required. This makes for valid documents (most of them validate now), which ensures that the rest of the CSS gets applied correctly.
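To make the idea concrete, here is a minimal sketch in Python of the kind of mapping described above. It is not the project's actual transformation, which isn't shown in this post. The element names come from the examples above, but the `GENERIC` table, the `to_generic` function and the class-name scheme (original element name plus its `type` value) are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source element names to generic XHTML elements.
# The element names come from the post; the mapping itself is an assumption.
GENERIC = {
    "cit": "div",
    "quote": "span",
    "fw": "span",
    "pb": "span",
}


def to_generic(elem):
    """Recursively copy elem, replacing mapped elements with div/span."""
    out = ET.Element(GENERIC.get(elem.tag, elem.tag))
    if elem.tag in GENERIC:
        # Keep the original element name (and any type value) as CSS classes,
        # so the stylesheet can still tell the elements apart.
        classes = [elem.tag]
        if elem.get("type"):
            classes.append(elem.get("type"))
        out.set("class", " ".join(classes))
    else:
        # Unmapped elements are copied through with their attributes.
        out.attrib.update(elem.attrib)
    out.text = elem.text
    for child in elem:
        new_child = to_generic(child)
        new_child.tail = child.tail  # preserve text that follows the child
        out.append(new_child)
    return out


# An invented quotation that crosses a page boundary.
source = ET.fromstring(
    '<cit><quote type="italic">first page of the quotation'
    '<fw>running header</fw><pb/>'
    'continued on the next page</quote></cit>'
)
result = ET.tostring(to_generic(source), encoding="unicode")
print(result)
```

With output along these lines, the nesting stays valid however deep it goes, and the display decisions move into the stylesheet: a rule such as `span.fw { display: block; }` or `div.cit { display: inline; }` (again, hypothetical class names) would then do the job the specialized elements used to do.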