Another fix for hyphen-removal, and style stuff
We're beginning to get to grips with the nitty-gritty of the primary source document styling, and finding that a few things about the original XML encoding of style are either slightly wrong or incomplete. That means we'll have to make minor changes to some of the CSS in the original documents. I've done some work on Candale 1. The issue we're going to hit is that these changes may have unpredictable effects on the display of these documents on the current site. So I'm checking with CC to see if we can call a halt to uploading them to the current version of the site, and focus on the new site.
I now have quite a nice view of the original text, with linebreaks and hyphens removed, long s's switched out, and so on. But one possible problem is that there may be cases where a hyphenated compound happens to fall on a linebreak. I don't know how often that happens in these texts, if it happens at all, but if it does, we need to avoid deleting that hyphen when we remove the linebreak. That means finding all those instances and tagging them with a specific attribute that makes it possible to preserve the hyphen.
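To make the rule concrete, here is a minimal sketch in Python of joining lines while keeping only the hyphens that have been flagged. It is not the routine used on the project: it works on plain-text lines rather than the XML, and the function name join_lines and the keep_hyphen_at set of line numbers are hypothetical stand-ins for whatever attribute ends up marking a hyphen as one to preserve.

```python
def join_lines(lines, keep_hyphen_at=frozenset()):
    """Join physical lines into running text.

    A line-final hyphen is dropped and the word is closed up, unless the
    1-based line number is listed in keep_hyphen_at (a hypothetical stand-in
    for the attribute that tags a hyphen as part of a compound).
    """
    pieces = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        if text.endswith("-"):
            if number not in keep_hyphen_at:
                text = text[:-1]  # ordinary line-end hyphen: remove it
            pieces.append(text)  # no space: the word continues on the next line
        else:
            pieces.append(text + " ")  # ordinary linebreak becomes a space
    return "".join(pieces).strip()


if __name__ == "__main__":
    sample = ["sa bonne-", "foy, & par", "neceſſai-", "res."]
    # Line 1 is flagged as a real compound, line 3 is not.
    print(join_lines(sample, keep_hyphen_at={1}))
```

Without the flag on line 1, the same input would come out with the compound closed up, which is exactly the case we need to avoid.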
Obviously we don't want to be reading the whole document set again to find all those things, so I thought I'd write a routine that lists all the hyphenated linebreaks with their line numbers so CC can easily scan them and identify those that need to be tagged:
...il la laissa sur sa bonne-foy, & par cette maniere... (line xxx)
...lettres ſur ce neceſſai-res. A ces cauſes voulant... (line yyy)
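A listing of this kind could be produced along the following lines. This is only a sketch in Python under simplifying assumptions, not the actual routine: it treats the text as a list of plain-text lines, and the function name list_hyphenated_breaks and the context parameter are made up for illustration. The real documents are XML, so the real version would need to find the linebreaks in the markup instead.

```python
def list_hyphenated_breaks(lines, context=30):
    """Return (line_number, snippet) for every line that ends in a hyphen.

    Line numbers are 1-based. The snippet joins the end of the hyphenated
    line directly to the start of the next one, keeping the hyphen, so a
    reader can judge whether it belongs to a compound.
    """
    hits = []
    # The last line has nothing following it, so it is not checked.
    for index, line in enumerate(lines[:-1]):
        current = line.rstrip()
        if not current.endswith("-"):
            continue
        following = lines[index + 1].strip()
        snippet = "..." + current[-context:] + following[:context] + "..."
        hits.append((index + 1, snippet))
    return hits


if __name__ == "__main__":
    sample = [
        "il la laissa sur sa bonne-",
        "foy, & par cette maniere",
        "lettres ſur ce neceſſai-",
        "res. A ces cauſes voulant",
    ]
    for number, snippet in list_hyphenated_breaks(sample):
        print(f"{snippet} (line {number})")
```

Keeping the hyphen visible in the snippet is deliberate: the point of the list is to let someone decide by eye which hyphens are real, not to guess automatically.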
I've written to see if that will be a workable approach.