Following discussion with AC, added a new diagnostic to catch cases where pseudonyms have possibly been omitted from the db record. Also added a complete listing page for all historical people; added more info to their individual person pages; and switched on rendering of non-transcribed poems, since these pages all link to them. Need to do more work on making those pages look more useful, but it's a start. Also discussed creating a file full of linkGrp elements, as MoEML has. 240 minutes.
A number of changes arising out of discussions yesterday and today:
Documentation and rendering for choice/sic/corr implemented and tested.
Link to XML on poem page rendering (needs prettying up).
Discussion on how to encode elision initiated on TEI-L.
Fixes to rendering to handle previously-unexpected scenarios in which lgs appear within notes and/or epigraphs, but should not be processed or counted as lines in the main poem.
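A minimal sketch of the line-counting rule in the last item, written in Python rather than in the project's own rendering code. It assumes TEI-namespaced markup and treats everything inside a note or epigraph as outside the main poem; the function name and the sample fragment are invented for illustration.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
# Subtrees whose lines must not be counted as part of the main poem.
SKIP = {TEI + "note", TEI + "epigraph"}

def main_poem_lines(root):
    """Return the <l> elements that belong to the main poem."""
    lines = []

    def walk(element):
        for child in element:
            if child.tag in SKIP:
                continue  # ignore the whole note or epigraph subtree
            if child.tag == TEI + "l":
                lines.append(child)
            walk(child)

    walk(root)
    return lines

# Invented sample: an epigraph and a note both contain their own <lg>.
SAMPLE = """<div xmlns="http://www.tei-c.org/ns/1.0">
  <epigraph><lg><l>Quoted line one</l><l>Quoted line two</l></lg></epigraph>
  <lg>
    <l>First line of the poem</l>
    <l>Second line<note><lg><l>A line quoted in a note</l></lg></note></l>
  </lg>
</div>"""

root = ET.fromstring(SAMPLE)
poem_lines = main_poem_lines(root)
print(len(poem_lines))
```

Skipping whole subtrees rather than filtering lines afterwards means nested line groups are handled at any depth.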
I've been working on generating JSON files in support of the search, mostly with the aim of supporting the search filters, as we do in other projects like Keats. The work is going well, but what I'm not really clear about yet is where to draw the line between convenience and file size; for example, I can include details of the periodical title and folder path with every poem retrieved under any category, or I could include only an id and have a small JSON file for periodical information which I use to look that info up when needed. Still thinking about all of this.
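To make the trade-off concrete, here is a rough Python sketch comparing the two layouts. The poem ids, periodical ids and folder paths are all invented for illustration, and the real search code may structure its JSON quite differently.

```python
import json

# Hypothetical lookup table: ids and folder paths are invented for illustration.
periodicals = {
    "p1": {"title": "Once a Week", "folder": "periodicals/once_a_week"},
    "p2": {"title": "Chambers", "folder": "periodicals/chambers"},
}

# Compact poem records: each carries only its own id and a periodical id.
compact = [
    {"id": f"poem{i}", "periodical": "p1" if i % 2 else "p2"}
    for i in range(1, 201)
]

def resolve(poem, lookup):
    """Expand a compact record into the full record the search page needs."""
    return {"id": poem["id"], **lookup[poem["periodical"]]}

# Option 1: repeat the periodical details inside every poem record.
inline = [resolve(poem, periodicals) for poem in compact]

# Option 2: ship compact records plus one small lookup file.
size_inline = len(json.dumps(inline))
size_lookup = len(json.dumps(compact)) + len(json.dumps(periodicals))

print(size_inline, size_lookup)
```

The lookup layout is smaller because each title and path is stored once, at the cost of an extra file to fetch and a resolve step whenever a result is displayed.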
Today's meeting was productive, and following it I implemented a process for reporting on the main rhyme-scheme of a poem and variant stanzas that don't follow it. This threw up a couple of issues which I fixed with Schematron and cleanup. I've also re-worked the way the page-image pops up when you mouse over it, and parameterized the URLs used in building poems so that on Jenkins we should (if all goes well) get proper relative links to other site pages, whereas on the local build for encoders, you get fixed links to the Jenkins build. 360 minutes.
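The rhyme-scheme report can be sketched roughly as follows. This assumes each stanza's scheme has already been reduced to a string of rhyme labels such as "abab", and it takes the most frequent scheme as the main one; both are assumptions for illustration, not a description of the actual process.

```python
from collections import Counter

def report_rhyme_scheme(stanza_schemes):
    """Return (main_scheme, variants) for a list of per-stanza scheme strings.

    variants is a list of (stanza_number, scheme) pairs, numbered from 1,
    for every stanza that does not follow the main scheme.
    """
    if not stanza_schemes:
        return None, []
    # The most frequent scheme is treated as the main one; on a tie,
    # Counter keeps the scheme that was encountered first.
    main, _ = Counter(stanza_schemes).most_common(1)[0]
    variants = [
        (number, scheme)
        for number, scheme in enumerate(stanza_schemes, start=1)
        if scheme != main
    ]
    return main, variants

# Invented example: four stanzas, the third of which departs from the pattern.
schemes = ["abab", "abab", "abcb", "abab"]
main, variants = report_rhyme_scheme(schemes)
print(main, variants)
```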
In our usual weekly meeting we talked about a few things, including the common font-style: italics typo, for which I added a Schematron rule; the asterisk-line encoding pattern, which was hastily conceived by me and basically silly, so I've replaced it with a saner and more extensible approach, with changed rendering; and various issues with rhyme. I also fixed a bunch of first-line issues in the db and re-worked the diagnostic that discovers those problems so that it triggers far fewer false positives. Finally, I rebuilt the TEI for Once a Week, which has lots of new content. 210 minutes.
Macs can't display the black triangle Unicode characters, so I switched to plus and minus signs for the TOC.
I've now finished and documented the rhyme-finding tool, and the team are testing it. Meanwhile, we need to be able to nimbly merge some components of the metadata db into a small subset of the TEI files (specifically, a single year for a single periodical), so that indexing fixes and updates can be propagated on a folder-by-folder basis and encoding can proceed without running the whole massive operation. I've therefore modularized that process so that it can be called with parameters for periodical folder and year, and I tested the result successfully with Chambers 1840, which is next on the encoding list. I did the same thing to the OCR process, which usually needs to be run after the db merge process anyway. This will make life easier going forward. 240 minutes.
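As an illustration of the parameterized call, here is a small Python sketch that narrows a run to one periodical folder and one year. The directory layout (periodical/year/*.xml), the option names and the helper functions are all hypothetical; the real process is not shown in this log.

```python
import argparse
import tempfile
from pathlib import Path

def build_parser():
    """Command-line options for a single-periodical, single-year run."""
    parser = argparse.ArgumentParser(
        description="Merge db metadata into one periodical year."
    )
    parser.add_argument("--periodical", required=True, help="periodical folder name")
    parser.add_argument("--year", required=True, type=int, help="year to process")
    return parser

def select_tei_files(root, periodical, year):
    """Return the TEI files for one periodical and year (assumed layout)."""
    folder = Path(root) / periodical / str(year)
    return sorted(folder.glob("*.xml"))

# Demo against a throwaway directory tree with invented file names.
with tempfile.TemporaryDirectory() as tmp:
    for rel in (
        "chambers/1840/a.xml",
        "chambers/1840/b.xml",
        "chambers/1841/c.xml",
        "once_a_week/1860/d.xml",
    ):
        path = Path(tmp) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("<TEI/>")

    args = build_parser().parse_args(["--periodical", "chambers", "--year", "1840"])
    selected = [p.name for p in select_tei_files(tmp, args.periodical, args.year)]

print(selected)
```

Only the files in the requested folder are touched, so the same selection step could feed both the db merge and the OCR run.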
Had to meld together four long poems into one, as a result of an indexing error that had to be corrected. In the process, I worked out a way to use the What Rhymes With functionality to find candidate existing tagged rhymes inside the poem you're currently working on, which should help speed up the rhyme labelling for longer poems. I'll show the team tomorrow. 240 minutes.
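The general idea of looking for candidate rhymes among lines already tagged in the same poem can be sketched as below. This is not how What Rhymes With actually works: it swaps in a crude spelling-based match on the last few letters of the line-end word, and the function names, labels and verse lines are invented for illustration.

```python
from collections import defaultdict

def last_word(line):
    """Return the final word of a line, lower-cased and without punctuation."""
    words = [w.strip(".,;:!?\"'()").lower() for w in line.split()]
    words = [w for w in words if w]
    return words[-1] if words else ""

def candidate_labels(tagged, new_line, suffix_len=3):
    """Suggest existing rhyme labels for new_line.

    tagged is a list of (line, label) pairs from the poem being encoded.
    A tagged line is a candidate when its end word shares its final
    suffix_len letters with the end word of new_line.
    """
    target = last_word(new_line)[-suffix_len:]
    candidates = defaultdict(list)
    for line, label in tagged:
        word = last_word(line)
        if target and word.endswith(target):
            candidates[label].append(word)
    return dict(candidates)

# Invented lines and labels, standing in for a partly tagged long poem.
tagged = [
    ("The sun went down upon the night", "a"),
    ("And all the stars were cold", "b"),
    ("We watched the slowly fading light", "a"),
]
suggestions = candidate_labels(tagged, "Till morning brought a fairer sight.")
print(suggestions)
```

A spelling match like this will miss rhymes that are spelled differently and suggest some that only look alike, so the output is a list of candidates for the encoder to confirm.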