Endings/staticSearch 2024-10-07 to 2024-10-11
to : Martin Holmes
Minutes: 95
On Monday, did some more ticket triage and discovered two things. First, my fix for the extra slash in issue #261 broke the Windows build, so I added a fix for that; despite this, the Windows build is still broken for other reasons, so I raised a separate ticket for those. Second, although we were expecting the release of Saxon 12.5 to come with an updated Unicode database, that doesn’t seem to have been the case, so the special hack for the Sinological Dot (issue #300) is still required.
On Wednesday, following work on the Moses project, discovered that the undertie character (U+203F) was causing a tokenization break inside words; it actually belongs to the Unicode general category Connector Punctuation (Pc), which Perl includes in its word character (\w) class. That’s a strong argument for handling it as a word character, so I raised an issue on staticSearch and created a PR with a fix and a test.
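
To illustrate the undertie problem, here is a minimal Python sketch, not the actual staticSearch fix (which lives in the project's own code). The function names and the `include_connectors` flag are hypothetical; the sketch assumes a tokenizer that keeps only runs of "word" characters and shows how counting Connector Punctuation (Pc) as a word character stops the break.

```python
import unicodedata

UNDERTIE = "\u203f"  # U+203F UNDERTIE, Unicode general category Pc


def is_word_char(ch: str, include_connectors: bool) -> bool:
    """Letters, marks and decimal digits always count as word characters;
    Connector Punctuation (Pc) counts only when include_connectors is True."""
    cat = unicodedata.category(ch)
    if cat[0] in ("L", "M") or cat == "Nd":
        return True
    return include_connectors and cat == "Pc"


def tokenize(text: str, include_connectors: bool = True) -> list[str]:
    """Split text into runs of word characters (hypothetical helper)."""
    tokens: list[str] = []
    current: list[str] = []
    for ch in text:
        if is_word_char(ch, include_connectors):
            current.append(ch)
        elif current:
            # A non-word character ends the current token.
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


word = "ab" + UNDERTIE + "cd"
print(unicodedata.category(UNDERTIE))            # general category of the undertie
print(tokenize(word, include_connectors=False))  # undertie treated as a separator
print(tokenize(word, include_connectors=True))   # undertie kept inside the word
```

With `include_connectors=False` the word is split in two at the undertie, which is the symptom described above; with `include_connectors=True` it survives as a single token, matching the behaviour of Perl's \w.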