List of what morphemes link to
Posted by mholmes on 22 Aug 2013 in Activity log
As part of the preparation for work on the auto-hyphenator, I've generated a list of all the distinct forms of morphemes and what they link to, initially using this XQuery (which takes a long while to run):
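xquery version "1.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";
(: For each distinct morpheme form, list the distinct @corresp targets it links to. :)
for $mText in distinct-values(//m[@corresp])
(: Match on the full string value (.), which is what distinct-values() atomized above; text() would miss any m with child elements. :)
return concat('morpheme form: ', $mText, ' links to ', string-join(distinct-values(//m[. = $mText]/@corresp), ', '))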
and then trimming the results to remove every form that links only to m:UNASSIGNED, as well as removing m:UNASSIGNED from the link lists of the remaining forms (I should have built that filtering into the XQuery). We can now use this list to spot candidates for auto-assignment, starting with those forms which only ever link to a single morpheme entry.
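For the record, here is a rough sketch of how the m:UNASSIGNED trimming could be folded into the query itself rather than done afterwards. It has not been run against the data. It assumes the placeholder is always written exactly as m:UNASSIGNED and that each @corresp holds a single pointer rather than a space-separated list; the $unassigned and $targets variables and the sort order are my own additions for illustration.

```xquery
xquery version "1.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";

(: Assumption: the placeholder pointer is always written exactly like this. :)
declare variable $unassigned := 'm:UNASSIGNED';

for $mText in distinct-values(//m[@corresp])
(: Distinct link targets for this form, with the placeholder filtered out. :)
let $targets := distinct-values(//m[. = $mText]/@corresp)[. != $unassigned]
(: Forms that only ever link to the placeholder have no targets left, so skip them. :)
where exists($targets)
(: Fewest targets first, so single-target forms come at the top of the list. :)
order by count($targets), $mText
return concat('morpheme form: ', $mText, ' links to ', string-join($targets, ', '))
```

Sorting by the number of targets is only a convenience: the forms that link to a single morpheme entry, which are the first candidates for auto-assignment, end up grouped at the start of the output. It does nothing for the running time, since the query still rescans every m element for each distinct form.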