With JT, made slight edits to our paper for DH to include a reference to our diagnostics GitHub project and to add a figure. Now submitted.
Category: "Activity log"
Finished building the spreadsheet for ContentDM for JF. Found and fixed some breakage caused by old ANSI characters in the process.
RE fixed the ACL issue that was blocking our access, and although it's still nailed down to the HCMC VLAN, it's working well. Fixed a couple of bugs in the controller.
Wrote a simple new utility XSLT to generate data for the Library, following JF's requirements; also fixed an encoding issue revealed by this process.
JT's new super-fast regex-based approach is working well for Linux and Mac, but one small function in it breaks things on Windows; I think it's an XPath regex which needs double-escaping instead of single-escaping. Meanwhile, the process itself is throwing up huge numbers of errors we can work on in MoEML.
We are aiming to produce a product for general use by the time of the DH conference presentation, and to add info about it to an edited version of the conf paper due at the end of the month, so pushing forward with more straightforward stuff this morning. Added file-existence checking for XML and text files; experimented with methods of checking for the existence of binary files, but so far I've only found things that work in Saxon PE or EE. This approach might work, though.
Following up on a hint from RC at Oxygen, I figured out how to use a Swing JFileChooser dialog to let the user select their project directory, directly from ant; this is neat and very useful. It took a while to nail down exactly how best to do it, and I still have some cleanup to do to make the process fail more elegantly if the user cancels out of the dialog, but we're nearly there. Haven't tested on Windows yet either.
In the morning, I did a pile of work to get the output page JavaScript working in Windows. It was substituting Windows line-endings in the script, which the serializer was then escaping to numeric entities; took some wrangling to prevent that.
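The general idea of normalizing line endings before the script reaches the serializer can be sketched in Python. This is only an illustration, not the fix actually used in the project (which runs in XSLT and ant); the function name normalize_newlines is hypothetical.

```python
def normalize_newlines(script: str) -> str:
    """Convert Windows (CRLF) and bare CR line endings to LF.

    Hypothetical helper: once no carriage returns remain in the
    script text, a serializer has nothing to escape as a numeric
    character reference.
    """
    # Replace CRLF pairs first so that each pair yields a single LF,
    # then deal with any stray CR characters left over.
    return script.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    windows_script = "function go() {\r\n  return 1;\r\n}\r\n"
    cleaned = normalize_newlines(windows_script)
    print("\r" in cleaned)
```

Replacing the CRLF pairs before the bare CRs matters: doing it the other way round would turn each Windows line ending into two line feeds.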
This is defined as a document type so that it opens in Author mode and has the ant scenario associated with it, so users can carry out the action as they're reading the instructions. Fixed a couple of other bugs arising out of changing from @target only to full any-attribute mode.
Today I wrote code to process the IANA Language Subtag Registry into an almighty regex that can validate @xml:lang values, and then built that into the diagnostics. It seems to work well. The regex is the largest I've ever used.
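As a rough illustration of the technique (not the code written for the diagnostics, which is not shown here), the following Python sketch parses a registry-style text into records and builds one regex from the language, script, and region subtags. It assumes the registry's record layout: records separated by "%%", "Key: value" fields, and folded continuation lines that start with whitespace. It deliberately ignores extlang, variant, extension, and private-use subtags and skips range records such as "qaa..qtz", so it is far simpler than a full validator. All function names and the SAMPLE data are hypothetical.

```python
import re

# Tiny hand-written sample in the registry's record format (assumed data,
# not the real registry file).
SAMPLE = """File-Date: 2024-01-01
%%
Type: language
Subtag: en
Description: English
%%
Type: language
Subtag: fr
Description: French
%%
Type: language
Subtag: qaa..qtz
Description: Private use
%%
Type: script
Subtag: Latn
Description: Latin
%%
Type: region
Subtag: CA
Description: Canada
"""


def parse_registry(text):
    """Group Subtag values by their Type field."""
    subtags = {}
    for record in text.split("%%"):
        fields = {}
        for line in record.strip().splitlines():
            # Skip folded continuation lines and anything without a key.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields.setdefault(key.strip(), value.strip())
        kind, subtag = fields.get("Type"), fields.get("Subtag")
        # Range records (e.g. "qaa..qtz") are skipped in this sketch.
        if kind and subtag and ".." not in subtag:
            subtags.setdefault(kind, []).append(subtag)
    return subtags


def alternation(values):
    """Join escaped values with '|', longest first."""
    ordered = sorted(values, key=len, reverse=True)
    return "|".join(re.escape(v) for v in ordered)


def build_lang_regex(subtags):
    """Build a case-insensitive regex: language[-script][-region]."""
    lang = alternation(subtags.get("language", []))
    script = alternation(subtags.get("script", []))
    region = alternation(subtags.get("region", []))
    pattern = rf"(?:{lang})(?:-(?:{script}))?(?:-(?:{region}))?"
    return re.compile(pattern, re.IGNORECASE)


if __name__ == "__main__":
    regex = build_lang_regex(parse_registry(SAMPLE))
    for candidate in ["en", "fr-CA", "en-Latn-CA", "xx", "en-ZZ"]:
        print(candidate, bool(regex.fullmatch(candidate)))
```

Using fullmatch rather than search means a value is accepted only if the whole string is a valid sequence of subtags, which is what validating an @xml:lang value requires.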
I've tweaked quite a bit so that we don't check attributes that definitely can't be pointers, and so that when looking for a prefixDef we look first in the host file. I ran the process against the whole Mariage dataset and got some interesting results that were very helpful, causing me to go and fix the Mariage consistency checks, and write some other stuff for Mariage that fixes common linking errors not previously being reported.