HCMC Journal

Ben Jonson 2026-02-23 to 2026-02-26

to : Illya Nokhrin
Minutes: 360

For the rebuild: Got an email from MB about unlisted texts. Relocated most of them to the appropriate folders and updated the indices so that they are now listed on the rebuild site. One more file needs MB’s input. The indices are somewhat clunky to change because each genre has a hard-coded numerical listing of its texts. But given that we’re unlikely to change them often, manually editing them seems fine.

For LEMDO conversion: Made some changes to the processing of speech xml:ids and corresps. Also created an XSL file to update the corresp values for speeches in The Alchemist, since a fair amount of hand remediation has been done on that file and we don’t want to go back to a regenerated version of it.
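
To illustrate the kind of in-place update described above, here is a minimal Python sketch. It is not the XSL file itself. It assumes (hypothetically) that speeches are un-namespaced sp elements, that each corresp holds a single "#id" pointer, and that a mapping from old ids to new ids is available; the function name update_corresp and the id_map parameter are made up for this example.

```python
import xml.etree.ElementTree as ET


def update_corresp(root, id_map):
    """Rewrite corresp pointers on <sp> elements in place.

    id_map is a hypothetical dict of old id -> new id (without the '#').
    Everything else in the tree, including hand-remediated content,
    is left untouched. Returns the number of speeches changed.
    """
    changed = 0
    for sp in root.iter("sp"):
        old = sp.get("corresp")
        if old is None:
            continue  # speech has no pointer to update
        key = old.lstrip("#")
        if key in id_map:
            sp.set("corresp", "#" + id_map[key])
            changed += 1
    return changed


# Assumed sample input; real files would carry the TEI namespace.
sample = '<body><sp corresp="#old1">A</sp><sp corresp="#keep">B</sp></body>'
tree_root = ET.fromstring(sample)
count = update_corresp(tree_root, {"old1": "new1"})
```

The point of working this way is that only the attribute values change, so none of the manual fixes in the file are lost.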

Also made some changes to the processing of OSP files for LEMDO conversion. The main change was converting the encoding of textual variants in OSP files to double-endpoint encoding so that they align with LEMDO standards. Smaller tweaks included changes to castList processing (especially for lists of actors at the end of plays), adjustments to the processing of prologue, act, scene, and section divs, and a patch to speech numbering for plays with inductions.
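
For readers unfamiliar with double-endpoint encoding, the following Python sketch shows the general idea. It is an assumption-laden illustration, not the actual conversion code: it assumes the input uses inline app/lem/rdg elements with plain-text lemmas, ignores the TEI namespace, and invents the anchor id scheme (a1s, a1e, and so on) and the function name to_double_endpoint.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def to_double_endpoint(root):
    """Replace each inline <app> with a pair of <anchor> elements.

    Returns a list of standoff <app> elements whose from/to attributes
    point at the anchors. Assumes lemmas contain plain text only and
    that <app> elements are not nested.
    """
    standoff = []
    counter = 0
    # Snapshot the elements first because the tree is mutated below.
    for parent in list(root.iter()):
        for app in list(parent):
            if app.tag != "app":
                continue
            counter += 1
            start_id, end_id = f"a{counter}s", f"a{counter}e"
            lem = app.find("lem")
            start = ET.Element("anchor", {XML_ID: start_id})
            # The lemma text stays in the running text between the anchors.
            start.tail = (lem.text or "") if lem is not None else ""
            end = ET.Element("anchor", {XML_ID: end_id})
            end.tail = app.tail  # text that followed the inline <app>
            idx = list(parent).index(app)
            parent.remove(app)
            parent.insert(idx, start)
            parent.insert(idx + 1, end)
            new_app = ET.Element(
                "app", {"from": f"#{start_id}", "to": f"#{end_id}"}
            )
            new_app.extend(app.findall("rdg"))
            standoff.append(new_app)
    return standoff


# Assumed sample line with one variant reading.
line = ET.fromstring(
    '<l>To be <app><lem>or</lem><rdg wit="#Q2">and</rdg></app> not</l>'
)
apparatus = to_double_endpoint(line)
```

After the call, the line keeps its base text with two empty anchors marking the span, and the readings live in the separate apparatus list, which is what lets the variants be stored apart from the text.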

Running a validation check on the converted files, I get just under 750 errors, which, with 52 input files, works out to under 15 errors per file on average (750 / 52 ≈ 14.4). I think that is about as good as it’s going to get for programmatic fixes at this point, though it is possible that more issues will become programmatically fixable as we change the revisionDesc status.