RE fixed the ACL issue that was blocking our access, and although it's still nailed down to the HCMC VLAN, it's working well. Fixed a couple of bugs in the controller.
Wrote a simple new utility XSLT to generate data for the Library, following JF's requirements; also a fix to an encoding issue revealed by this process.
JT's new super-fast regex-based approach is working well for Linux and Mac, but one small function in it breaks things on Windows; I think it's an XPath regex which needs double-escaping instead of single-escaping. Meanwhile, the process itself is throwing up huge numbers of errors we can work on in MoEML.
We are aiming to have a version ready for general use by the time of the DH conference presentation, and to add info about it to an edited version of the conf paper due at the end of the month, so I'm pushing forward with more straightforward stuff this morning. Added file-existence checking for XML and text files; experimented with methods of checking for the existence of binary files, but so far I've only found things that work in Saxon PE or EE. This approach might work, though.
Following up on a hint from RC at Oxygen, I figured out how to use a Swing JFileChooser dialog to let the user select their project directory, directly from ant; this is neat and very useful. It took a while to nail down exactly how best to do it, and I still have some cleanup to do to make the process fail more gracefully if the user cancels out of the dialog, but we're nearly there. I haven't tested it on Windows yet either.
This is defined as a document type so that it opens in Author mode and has the ant scenario associated with it, so users can carry out the action as they're reading the instructions. Fixed a couple of other bugs arising out of changing from @target only to full any-attribute mode.
Today I wrote code to process the IANA Language Subtag Registry into an almighty regex that can validate @xml:lang values, and then built that into the diagnostics. It seems to work well. The regex is the largest I've ever used.
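The idea of turning the registry into a regex can be illustrated with a rough Python sketch. This is not the project's own code (which presumably lives in the XSLT/ant pipeline); it assumes the registry's plain-text record format (records separated by `%%`, with `Type:` and `Subtag:` fields), uses a tiny made-up excerpt rather than the real file, and only covers language, script and region subtags. The function names `parse_subtags` and `build_lang_regex` are hypothetical.

```python
import re
from collections import defaultdict

# A tiny hand-made excerpt in the registry's record format, for illustration
# only; the File-Date value is made up and the real registry is far larger.
SAMPLE_REGISTRY = """\
File-Date: 2020-01-01
%%
Type: language
Subtag: en
Description: English
%%
Type: language
Subtag: fr
Description: French
%%
Type: language
Subtag: qaa..qtz
Description: Private use
%%
Type: script
Subtag: Latn
Description: Latin
%%
Type: region
Subtag: CA
Description: Canada
%%
Type: region
Subtag: FR
Description: France
"""


def parse_subtags(registry_text):
    """Collect subtags by type ('language', 'script', 'region', ...)."""
    subtags = defaultdict(set)
    for record in registry_text.split("%%"):
        fields = {}
        for line in record.strip().splitlines():
            # Skip folded continuation lines and anything without a field name.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields.setdefault(key.strip(), value.strip())
        subtag = fields.get("Subtag")
        # Ranges such as qaa..qtz (private use) are ignored in this sketch.
        if "Type" in fields and subtag and ".." not in subtag:
            subtags[fields["Type"]].add(subtag)
    return subtags


def build_lang_regex(subtags):
    """Build language[-script][-region]; a simplification of real tag syntax."""
    def alternation(kind):
        return "|".join(sorted(re.escape(s) for s in subtags[kind]))

    pattern = "(?:{})(?:-(?:{}))?(?:-(?:{}))?".format(
        alternation("language"), alternation("script"), alternation("region")
    )
    # Language tags are case-insensitive.
    return re.compile(pattern, re.IGNORECASE)


LANG_REGEX = build_lang_regex(parse_subtags(SAMPLE_REGISTRY))
```

Using `LANG_REGEX.fullmatch(value)` on each @xml:lang value then accepts `fr-CA` and rejects `en-ZZ`. A complete version would also need extlang, variant, extension, private-use and grandfathered tags, which is part of why the real regex gets so large.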
I've tweaked the code quite a bit so that we don't check attributes that definitely can't be pointers, and so that when looking for a prefixDef we look first in the host file. I ran the process against the whole Mariage dataset and got some very helpful results, which prompted me to fix the Mariage consistency checks and to write some other stuff for Mariage that catches common linking errors not previously being reported.
Fairly major overhaul of the original code, to do this:
- Handle all attributes, rather than trying to figure out which might be pointers.
- Determine instead whether a token value in the attribute is some kind of pointer.
- Resolve prefixDef/private URI schemes.
- Handle paths on Windows. This took some wrestling.
- Calculate the date in the ant script instead of getting it from Oxygen.
- Format the output much more effectively.
- Add tests for simple cases.
Lots more to do, of course, but this is coming along nicely.
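The prefixDef step in the list above could be sketched roughly as follows. This is a Python illustration rather than the XSLT actually used, and it assumes the standard TEI prefixDef attributes (@ident, @matchPattern, @replacementPattern). The file name `personography.xml`, the `psn` prefix and the function names are made up for the example, and it assumes the match pattern happens to be valid in Python's regex syntax as well as XPath's.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical host document declaring one private URI scheme.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><encodingDesc><listPrefixDef>
    <prefixDef ident="psn" matchPattern="(.+)"
               replacementPattern="personography.xml#$1"/>
  </listPrefixDef></encodingDesc></teiHeader>
</TEI>"""


def load_prefix_defs(tei_xml):
    """Map each prefix to its (matchPattern, replacementPattern) pair.

    Simplification: if a prefix is declared more than once, the last wins.
    """
    root = ET.fromstring(tei_xml)
    return {
        pd.get("ident"): (pd.get("matchPattern"), pd.get("replacementPattern"))
        for pd in root.iter(TEI_NS + "prefixDef")
    }


def resolve_token(token, prefix_defs):
    """Expand a prefixed pointer; return anything else unchanged."""
    prefix, sep, rest = token.partition(":")
    if not sep or prefix not in prefix_defs:
        return token  # not a declared private scheme (e.g. "#id", "http:...")
    match_pattern, replacement = prefix_defs[prefix]
    match = re.fullmatch(match_pattern, rest)
    if match is None:
        return token
    # Convert XPath-style $1 group references to Python's \g<1> form.
    py_replacement = re.sub(r"\$(\d)", r"\\g<\1>", replacement)
    return match.expand(py_replacement)


PREFIX_DEFS = load_prefix_defs(SAMPLE_TEI)
```

With these definitions, `resolve_token("psn:SMITH1", PREFIX_DEFS)` yields an ordinary relative pointer that the normal file and id checks can then handle, while tokens with no declared prefix pass through untouched.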
JT started the diagnostics repo for projectEndings on GitHub, and did all the initial work to get something that will process a folder full of XML and check internal links. I tested and raised a few issue tickets, then set about solving some of them. We now have a process that checks three kinds of link: internal links within the same document (on the assumption that RELAX NG validation does not check these unless DTD Compatibility is turned on); links to external documents; and links to specific ids inside external documents. I think this is a good initial selection, and when run against Mariage, it does indeed find some bugs, which I will fix. There's a lot to do in terms of making it prettier and more user-friendly, but the basics are definitely there.
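For anyone wanting a feel for the three checks, here is a minimal Python sketch. It is not the repo's implementation, just an illustration under some assumptions: only @target attributes are examined, targets are whitespace-separated tokens, and anything containing a colon (absolute URIs, private schemes) is skipped. The function names and the file names in `demo` are invented.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def xml_ids(tree):
    """Return the set of xml:id values in a parsed document."""
    return {el.get(XML_ID) for el in tree.iter() if el.get(XML_ID)}


def check_links(path):
    """Return (target, problem) pairs for broken @target pointers in one file."""
    problems = []
    base_dir = os.path.dirname(os.path.abspath(path))
    tree = ET.parse(path)
    local_ids = xml_ids(tree)
    for el in tree.iter():
        for token in el.get("target", "").split():
            doc, _, frag = token.partition("#")
            if ":" in doc:
                continue  # absolute URI or private scheme: out of scope here
            if not doc:
                # Kind 1: internal link within the same document.
                if frag not in local_ids:
                    problems.append((token, "no such id in this document"))
                continue
            target_path = os.path.join(base_dir, doc)
            if not os.path.isfile(target_path):
                # Kind 2: link to an external document.
                problems.append((token, "file not found"))
            elif frag:
                # Kind 3: link to a specific id inside an external document.
                try:
                    target_ids = xml_ids(ET.parse(target_path))
                except ET.ParseError:
                    problems.append((token, "target is not well-formed XML"))
                    continue
                if frag not in target_ids:
                    problems.append((token, "no such id in target document"))
    return problems


def demo():
    """Build two small files in a temporary folder and check the first one."""
    with tempfile.TemporaryDirectory() as folder:
        a_path = os.path.join(folder, "a.xml")
        with open(a_path, "w", encoding="utf-8") as f:
            f.write(
                '<doc><p xml:id="p1"/>'
                '<ref target="#p1"/><ref target="#nope"/>'
                '<ref target="b.xml"/><ref target="b.xml#x1"/>'
                '<ref target="b.xml#missing"/><ref target="c.xml"/>'
                '<ref target="http://example.com/"/></doc>'
            )
        with open(os.path.join(folder, "b.xml"), "w", encoding="utf-8") as f:
            f.write('<doc><p xml:id="x1"/></doc>')
        return check_links(a_path)
```

Calling `demo()` reports the three deliberately broken pointers (`#nope`, `b.xml#missing` and `c.xml`) and leaves the valid ones alone. Re-parsing the target file for every pointer is wasteful on a big collection; caching the id sets per file would be the obvious next step.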