DVPP 2024-10-01 to 2024-10-04
to : Martin Holmes
Minutes: 215
On Tuesday, met with KF and AC to discuss sonnet tagging. Then processed four volumes of Evergreen and four of Guardian into XML, and fixed a whole pile of fallout from page image URLs which (I initially assumed) were not properly encoded in the spreadsheets.
On Wednesday, OCRed one year each from Evergreen and Guardian. Also looked into the problem of page image URLs, and discovered that the URLs were actually perfectly fine: the XSLT is written in such a way that as long as there is any whitespace delimiting multiple images, each one should be handled correctly. So the cause of Tuesday’s fallout is still a bit of a mystery. The next time I have a new spreadsheet to deal with, I’ll do some proper debugging.
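For that debugging, a minimal sketch of a pre-flight check on a spreadsheet cell might look like the following. It is in Python rather than the project's XSLT, and it is only an approximation of the behaviour described above: it splits a cell on any run of whitespace and flags entries that look badly encoded. The function names are hypothetical, and it assumes the page image URLs are absolute http(s) URLs, which may not hold for every spreadsheet.

```python
import re

# Characters allowed in a URI by RFC 3986 (unreserved, reserved, and '%').
_URI_CHARS = re.compile(r"^[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*$")
# A '%' that is not followed by two hex digits is a broken percent-escape.
_BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def split_image_urls(cell):
    """Split a spreadsheet cell into URLs on any run of whitespace."""
    return cell.split()


def suspicious_urls(cell):
    """Return the URLs in a cell that look wrongly encoded.

    Assumption: every page image URL is absolute and uses http or https.
    """
    problems = []
    for url in split_image_urls(cell):
        if (
            not url.startswith(("http://", "https://"))
            or not _URI_CHARS.match(url)
            or _BAD_ESCAPE.search(url)
        ):
            problems.append(url)
    return problems
```

Running `suspicious_urls` over each image cell of a new spreadsheet before conversion would at least show whether the problem is in the data or somewhere later in the processing.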
On Thursday, brief check-in with AC.