I have my simplification process almost complete, with all links being handled quite straightforwardly without reference to their type. The XSLT still needs to be cleaned up a bit, because it was written to handle the old stuff, but it's not a big job. I'm also continually refining the XSLT to handle the wide range of different document structures which have proliferated over the years; all of these will have to be fixed in the end, but in the meantime I have to allow for them. I think I'll write an identity transform to fix the most egregious structural errors (such as <group> elements with only a single <text> in them, which are common).
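
A minimal sketch of what such an identity transform might look like. It rests on a few assumptions that are not confirmed above: that "fixing" a single-child <group> means unwrapping it so the lone <text> takes its place, that the elements are in no namespace (a namespaced document would need a prefix declared and used in the match pattern), and that any attributes on the discarded <group> can safely be dropped.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy every node and attribute through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumed fix: a <group> whose only child element is a single <text>
       is unwrapped. Its children are processed, but the <group> element
       itself (and its attributes) is not copied to the output. -->
  <xsl:template match="group[count(*) = 1 and text]">
    <xsl:apply-templates select="node()"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the second template has a predicate, its default priority is higher than that of the identity template, so it wins for the matching <group> elements and everything else is copied as-is. Other structural fixes could be added as further templates in the same stylesheet.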
I've also eliminated the <milestone/> tags I was using to generate hr tags in the output, and also the <hr/> tags; I'm achieving the same effect with borders on <h2> and <h3>.
Leaving for the UK tomorrow, but taking the project with me so I can keep working on it.