Root-based index sort order

February 6th, 2019

SK reported an issue with the sort order of entries in the root-based index. I dug into it and found that the main Moses-to-English entries appear to be sorted in the correct order. First, a sort key is created like this:

<xsl:variable name="sortKey" select="if (descendant::orth) then normalize-space(descendant::orth[1]) else normalize-space(string-join(for $s in descendant::pron[seg[@type='p']]/descendant::seg[@type='p'] return hcmc:createOrth($s), ''))"/>

In other words, if there's an orth it uses the orth, and if not, it creates an orth from all the descendant phonemic prons. Then it sorts the entries using the orthographic collation:

<xsl:sort select="@sortKey" collation=""/>

When it comes to processing the root-based index, we were doing something slightly different:

<xsl:sort select="if (descendant::orth) then descendant::orth[1] else hcmc:createOrth(descendant::pron[seg[@type='p']][1]/descendant::seg[@type='p'][1])" collation=""/>

In other words, we were using the phonemic collation, and building the key from only the first phonemic seg rather than all of them. I can't remember when/where/why we have both phonemic and orthographic collations -- there must have been a reason -- but I've now switched the root-based index sort so that it uses the orthographic one. That appears to fix the problem, but SK will check for any unwanted fallout.
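
One way to stop the two sorts drifting apart again would be to build the key in a single shared function and call it from both places. This is only a rough sketch, not what is in the build: hcmc:sortKey and $orthCollation are made-up names (the latter standing for a parameter holding the orthographic collation URI), the xs prefix is assumed to be bound to the XML Schema namespace, and hcmc:createOrth is the existing function used above.

```xml
<!-- Hypothetical helper: hcmc:sortKey does not exist in the current stylesheets. -->
<xsl:function name="hcmc:sortKey" as="xs:string">
  <xsl:param name="entry" as="element()"/>
  <!-- Same logic as the main-entry sort key: use the first orth if there is one;
       otherwise build an orth from all the descendant phonemic segs. -->
  <xsl:sequence select="if ($entry/descendant::orth)
                        then normalize-space($entry/descendant::orth[1])
                        else normalize-space(string-join(
                               for $s in $entry/descendant::pron[seg[@type='p']]/descendant::seg[@type='p']
                               return hcmc:createOrth($s), ''))"/>
</xsl:function>

<!-- Inside the xsl:for-each or xsl:apply-templates that builds either index.
     $orthCollation is a hypothetical parameter; collation is an attribute value template. -->
<xsl:sort select="hcmc:sortKey(.)" collation="{$orthCollation}"/>
```

With something like this, the main entries and the root-based index would by construction use the same key and the same collation.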

Meeting and rewrite of presentation slides

January 11th, 2019

Discussed our first draft at length, and I then rewrote my slides.

Tweak to order of root-based index component sorting

November 16th, 2018

Per SK, switched the order of two morphemes and rebuilt the PDF. Took a while to figure out where to make the change, though.

Worked with SK on diagnostics code to replace old Python stuff

August 24th, 2018

ED's convoluted Python/NLTK stuff for diagnostics just doesn't work on the new Jenkins server, and in any case, as we look at it, it seems it could perfectly well have been done in XSLT. SK and I have therefore made a start on figuring out how it works and converting it. It'll take a while, but lesson learned: don't let people use stuff just because they like it; keep the range of tech limited for any given project.

Added diagnostics build scenario to XPR file

December 12th, 2017

So that ECH can work remotely without needing a network connection, we've added a build scenario for the diagnostics to the Oxygen project file, so that running the default scenario on any XML document actually runs the diagnostic process. It takes nearly ten minutes, but it's still a bit quicker than waiting for Jenkins and it can be done offline.

Meeting with app developer

September 25th, 2017

Met with AP, linguist and app developer, and shared ideas on dictionary interfaces, data-entry, and outputs.

Ant task to build the PDF

July 7th, 2017

Our PDF build is dependent on XEP, which is installed on my desktop, so up to now I've needed to be here to build it (since it was a scenario run from inside Oxygen). I've now converted this into an Ant task which can be run remotely at the command line. Lesson learned at some cost of time: you can't use this:

  <arg value="-fo ${foFile}"/>

Instead, you have to use this:

  <arg line="-fo ${foFile}"/>

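With value, Ant passes the whole string, space included, as a single argument; with line, it splits the string on the space into two arguments. Otherwise XEP doesn't find the FO file, and assumes the FO is coming from stdin; it then complains that the root element is not fo:root.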
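
Because line splits on spaces, it would presumably trip over a path that itself contains a space. An alternative that should avoid that is to give the flag and the file as two separate value arguments. This is a sketch only: the exec wrapper and the xepCommand property are hypothetical stand-ins for however XEP is actually invoked in the build file.

```xml
<!-- Sketch only: the exec element and the xepCommand property are hypothetical. -->
<exec executable="${xepCommand}" failonerror="true">
  <!-- Each value attribute is passed as exactly one argument,
       so the flag and the path arrive separately and a space in the path is safe. -->
  <arg value="-fo"/>
  <arg value="${foFile}"/>
</exec>
```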


June 28th, 2017

With ED and SK, porting diagnostics that have now been cleared over to Schematron where possible.

SVN conflict resolution

June 20th, 2017

Working with the Moses team to help resolve a conflict in SVN and helping E with XSLT.

Created metadata.xml

June 1st, 2017

Attempted to run the improved Endings diagnostics code against Moses and realized that the "psn" and "m" prefixes weren't defined at all. After consulting with MH, SK, and ED, I created metadata.xml and defined the prefixes as best I could. This surfaced a few errors with psn pointers; I fixed what I could and then added diagnostics to the Moses build.