Started a tutorial based on SNOW1 (for the moment), and in the process of writing the first section of it, came up against many annoyances in the rendering of egXML blocks; fixed those rendering issues (in three places: site, redesign, and codesharing. Grrr).
Added rendering handling for <sp>, <speaker>, and <p> within <sp>; the <stage> tag isn't handled yet. Rolled out the changes to both the site and redesign codebases.
Since SNOW1 was a bit of a mess at the beginning (the encoders had been following obsolete examples), I've manually encoded the title page as an example.
Also found a problem with METR1 which was neither really a bug nor an encoding invalidity: a body element which goes straight to content (e.g. a head) with no intervening div is not invalid, but it triggered rendering problems because it was completely unexpected. As it happens, the encoding should not have been that way (other divs appear later in the body), but it wasn't technically wrong, so it would be good to figure out a way to prevent this through the schema or, more likely, through Schematron. We could, of course, change the content model of body so that it can only contain divs.
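A Schematron rule along these lines might catch the case (this is only a sketch; the exact list of disallowed direct children is an assumption — the real rule would need to cover everything in body's content model other than div):

```xml
<pattern xmlns="http://purl.oclc.org/dsdl/schematron"
         xmlns:tei="http://www.tei-c.org/ns/1.0">
  <!-- only fire on bodies that contain divs at all -->
  <rule context="tei:body[tei:div]">
    <!-- flag non-div content sitting alongside the divs -->
    <assert test="not(tei:head | tei:p)">
      A body that contains divs should not also have head or p
      elements as direct children; wrap them in a div.
    </assert>
  </rule>
</pattern>
```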
Did some tasks from yesterday and some new ones:
<group> elements have now been converted to <div>s. (The only exception is stow_1633, which probably does need
I have these tasks coming out of the team meeting today:
I've spent the whole day working on getting a more flexible and successful build system for eXist. This is what I've added to Greg's script:
Found a number of problems with eXist, which I've reported, including a bad one that appears once the webapp is running: you can no longer call transform:transform with a relative path to the XSLT file; if you do, you get an error. A full path from /db seems to work.
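For example, a call of this shape works for me (both the document and stylesheet paths below are hypothetical, for illustration only):

```xquery
(: a relative stylesheet path now raises an error;
   a full path from /db seems to work :)
let $doc := doc('/db/data/snow_1633.xml')
return transform:transform($doc, '/db/xsl/tei-to-html.xsl', ())
```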
I've now figured out how to create an extension module for eXist, following the instructions here. These are some things I've learned:
You can build just the extension modules with build.sh extension-modules, then drop the resulting jar into an existing eXist instance (although if the new jar was built from a substantially different version of the code than the rest of the instance, there could well be problems).
The module is registered in eXist's conf.xml by adding <module uri="http://hcmc.uvic.ca/ns/usm" class="org.exist.xquery.modules.unisimmetric.UniSimMetricModule" /> along with the other modules.
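Once registered, the module should be importable by its namespace alone, with no location needed. The function name below is purely hypothetical, standing in for whatever the module actually exposes:

```xquery
(: the namespace URI comes from the conf.xml registration;
   usm:similarity is a hypothetical function name :)
import module namespace usm = "http://hcmc.uvic.ca/ns/usm";

usm:similarity('first string', 'second string')
```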
I'm not yet happy with my module, and I'm still working on it. In particular, I'm not happy with the scores it's generating, and I think this may have to do with other data included in the GZIP stream, such as the header; if I can figure out how big those extra bytes are, I can remove them from the calculation. The highest difference I seem to get is around 0.53 even with completely dissimilar strings, so it seems as though the results are being compressed into a range much narrower than 0-1.
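For reference, the score in question is normalized compression distance: NCD(x,y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed size of s. A gzip stream carries a fixed 10-byte header and an 8-byte CRC/length trailer, so one possible correction is to subtract a constant 18 bytes from each measured size before applying the formula (the constant assumes no optional header fields are present):

```xquery
(: fixed per-stream gzip overhead: 10-byte header + 8-byte trailer;
   an assumption that no optional header fields are in play :)
declare variable $local:OVERHEAD := 18;

(: NCD from three raw compressed sizes: C(x), C(y), and C(xy) :)
declare function local:ncd($cx as xs:integer,
                           $cy as xs:integer,
                           $cxy as xs:integer) as xs:double {
    let $x  := $cx  - $local:OVERHEAD
    let $y  := $cy  - $local:OVERHEAD
    let $xy := $cxy - $local:OVERHEAD
    return ($xy - min(($x, $y))) div max(($x, $y))
};
```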
Team meeting, at which we discussed the use of ISE's facsimile viewer in MoEML (which will be easy enough to do, although it's based on a traditional db, and we'll have to replace that with proper TEI facsimile encoding).
People also asked me to clarify how the EEBO linking works, so I've done that in the transcriptions documentation file, and I've also implemented the display of little page-images linking to the EEBO pages. Also today, <addrLine> elements were added to the schema, with some basic display rendering.
Lucene-based fuzzy matching seems to be very broken in the build of eXist I'm using, and in any case it's based on Levenshtein distance, so I've implemented a crude version of the USM/NCD algorithm in XQuery. It's a long way from ideal, though, because it operates on base64 versions of strings rather than compressing the actual strings (that's all eXist's exposed gzip access allows); using zip instead seems punitive because it would require creating a file on the filesystem or in the db and compressing that. I think a simpler approach would be to take my Java class, strip out all the command-line handling it contains, and call it directly from XQuery (see the xqSearchUtils Java project and the way it's called from the Despatches XQuery for an example). A jar file with a simple XQuery module interface might be very handy indeed.
I've been using the opportunity of the redesign (which gives me a complete new incarnation of the web application working alongside the current one) to fix a whole raft of problems and annoyances going back a long time. Among those completed so far:
declare variable $dataDoc :=
    if (collection('/db/data')//TEI[@xml:id = $fileId])
    then collection('/db/data')//TEI[@xml:id = $fileId]
    else
        let $dummy := response:set-status-code(404)
        return collection('/db/data')//TEI[@xml:id = 'missing'];
<li> elements now have a class="active" attribute where their target URL matches the current URL.
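The check can be sketched in XQuery along these lines ($menu, $menuItem, and the item/@href structure are all hypothetical names for illustration; only request:get-uri() is a real eXist function):

```xquery
(: emit class="active" on the <li> whose target matches the page URL :)
let $currentUrl := request:get-uri()
for $menuItem in $menu//item
return
    <li>{
        if ($menuItem/@href = $currentUrl)
        then attribute class { 'active' }
        else (),
        <a href="{$menuItem/@href}">{ string($menuItem/label) }</a>
    }</li>
```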
Since the ids of <div>s are often auto-generated with generate-id() during the XSLT transformation, they cannot be matched for linking in any other way.
This project allows literary and scholarly works (primary and secondary) to be associated with locations in London, providing the reader with a richer understanding of the works.