Ben Jonson 2025-10-14 to 2025-10-17
to : Martin Holmes
Minutes: 205
On Tuesday, continued my slow cleanup of HTML problems in the crawled site. Ended up writing some Perl to do multi-line search-and-replace operations, since ill-formed files can’t be processed with XSLT (not easily, anyway), and Oxygen is not good at multi-line search-and-replace across multiple files. I believe I’ve finished handling undeclared entities, and I’m now working on the last of the ill-formedness issues, which should allow me to get to processable XHTML. Then I can use XSLT to make it valid.
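For reference, a minimal sketch of this kind of multi-line search-and-replace across multiple files. It is written in Python rather than the Perl actually used, and the function names, the `*.html` glob, and the example pattern (unwrapping a `<font>` element that spans a line break) are all hypothetical, not taken from the real cleanup.

```python
import re
from pathlib import Path


def replace_in_text(text, pattern, replacement):
    """Apply a regex that may match across line breaks.

    re.DOTALL lets '.' match newlines, so one pattern can span lines.
    Returns (new_text, number_of_replacements).
    """
    regex = re.compile(pattern, flags=re.DOTALL)
    return regex.subn(replacement, text)


def replace_in_tree(root, pattern, replacement, glob="*.html"):
    """Run the replacement over every matching file under root.

    Files are read whole (not line by line) and rewritten only when
    something changed. Returns the list of changed paths.
    Assumes UTF-8 input; that is an assumption, not a known fact
    about the crawled site.
    """
    changed = []
    for path in sorted(Path(root).rglob(glob)):
        original = path.read_text(encoding="utf-8")
        updated, count = replace_in_text(original, pattern, replacement)
        if count:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed


if __name__ == "__main__":
    # Hypothetical example: unwrap <font> elements, keeping their content.
    sample = '<p><font size="2">one\ntwo</font></p>'
    print(replace_in_text(sample, r"<font[^>]*>(.*?)</font>", r"\1"))
```

Reading each file as a single string is what makes the multi-line match possible; a line-by-line loop would never see a pattern that straddles a line break.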