Nxaʔamxcín (Moses) Dictionary Blog

December 17, 2012

Discussions with LR

Posted by on 17 Dec 2012 in Activity log

Skyped with LR about the possible next steps for TEI as an LMF serialization.

December 13, 2012

Autophonemicizer deployed and tested

Posted by on 13 Dec 2012 in Activity log

Tweaked the XSLT for actual use (adding utility templates to preserve whitespace between PIs etc.), and SK has now tested it with real data. She has some suggestions for improvements; one of them might be achievable, although it would require a structural rewrite. Results are good so far: most deviations from the desired results occur where there is no mechanically detectable context that could be used to change the outcome. Can this now be extended to pron/segs, where there is even less context because there is no hyphenation?

December 12, 2012

LMF vs TEI

Posted by on 12 Dec 2012 in Activity log

Still working on this, with a slow but fruitful discussion on the LMF list helping me to confirm that what I thought were limitations in LMF interoperability really are so. I'm coming round to LR's view that TEI would be a better serialization format.

Autophonemicizer now working

Posted by on 12 Dec 2012 in Activity log

Much frustration involved in my belated (re?)discovery that neither word-boundaries nor lookarounds are supported in the XPath implementation of regular expressions. Grrr. But now working fine, with lots of help and a test set from SK. We can start testing it on whole files tomorrow.
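A minimal sketch of the usual workaround for the missing word-boundary and lookaround support: capture the delimiter on each side of the word and write it back out in the replacement. It is written in Python rather than XSLT so that it can be run on its own, and it deliberately avoids `\b` and lookarounds even though Python supports them. The function name and sample strings are hypothetical and are not taken from the autophonemicizer.

```python
import re

def replace_whole_word(text: str, word: str, replacement: str) -> str:
    """Replace `word` only where it stands alone, without \\b or lookarounds."""
    # Group 1: start of string or one non-word character before the word.
    # Group 2: one non-word character or end of string after the word.
    pattern = r"(^|\W)" + re.escape(word) + r"(\W|$)"
    # Each match consumes its delimiters, so two occurrences that share a
    # single delimiter ("a a") cannot both match in one pass. A second pass
    # picks up the occurrences skipped by the first (assuming the replacement
    # does not itself contain the word).
    for _ in range(2):
        text = re.sub(
            pattern,
            lambda m: m.group(1) + replacement + m.group(2),
            text,
        )
    return text

print(replace_whole_word("a cat sat", "a", "X"))
print(replace_whole_word("a a a a", "a", "b"))
```

The same capture-and-re-emit idea should carry over to the XPath `replace()` function with `$1` and `$2` in the replacement string, though that is an assumption about how the XSLT could be written, not a description of what it does.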

English homophones in the Eng-Nx word list

Posted by on 12 Dec 2012 in Tasks

Note to selves for the future: how will we deal with English homophones for sorting the Eng-Nx word list? If we remember, let's list the ones we come across here. So far we have:

fire - "flames" vs. "dismiss employee"

fast - "quick" vs. "abstain from food"

hide - "skin" vs. "conceal"

cold - adj or noun "coldness" vs. "illness"

saw - "tool" vs. past tense of see (replaced with <gloss subtype="i">see</gloss>, SMK 21May13)

close (near) vs. close (shut)

stern (back of boat)

game (recreation)

watch (observe) vs. watch (wristwatch)

back (body part)

top (toy) vs. top (of something)

fish vs. fish (catch fish)
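One possible approach to the sorting question, sketched in Python purely as an illustration and not as a decision recorded here: treat each English key as a pair of headword and sense label, and sort on both. The data structure and the `sort_key` function are hypothetical; the sample pairs come from the list above.

```python
# Each item is (headword, sense label). The pairing is an assumed
# representation, not the project's actual markup.
KEYS = [
    ("fire", "dismiss employee"),
    ("fast", "quick"),
    ("fire", "flames"),
    ("fast", "abstain from food"),
]

def sort_key(item):
    """Sort by headword first, then by sense label, ignoring case."""
    headword, sense = item
    return (headword.casefold(), sense.casefold())

for headword, sense in sorted(KEYS, key=sort_key):
    print(f"{headword} ({sense})")
```

With a key like this, identical headwords stay adjacent and their relative order is deterministic, whatever order the entries were read in.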

gloss tagging and the search engine

Posted by on 12 Dec 2012 in Activity log

ECH's goal for the search engine in the web database is that, if a user searches for "fat", s/he will get results including fat, fatten, fattening, fatty.

Our current settings, and our policies for adding inferred glosses, seem to be accomplishing this nicely. An entry which has "fatty" in its def is found by a search for "fat", because it also has an inferred gloss "fat".

Searching for "fat*" also returns defs including fat, fatten, fattening, fatty ... but also fatal, fathom, father.
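The difference between the two kinds of search can be sketched in Python. This is only an illustration under stated assumptions: the entry identifiers, the dictionary layout, and both function names are made up, and the real search engine may work quite differently.

```python
from fnmatch import fnmatchcase

# Hypothetical entries: words from the def, plus gloss keys
# (explicit and inferred) attached to the entry.
ENTRIES = {
    "entry1": {"definition": "fatty", "glosses": ["fatty", "fat"]},
    "entry2": {"definition": "father", "glosses": ["father"]},
    "entry3": {"definition": "fatal", "glosses": ["fatal"]},
    "entry4": {"definition": "fattening", "glosses": ["fattening", "fatten", "fat"]},
}

def search_gloss(term):
    """Return ids of entries that carry `term` as a gloss key."""
    return sorted(eid for eid, e in ENTRIES.items() if term in e["glosses"])

def search_wildcard(pattern):
    """Return ids of entries whose def contains a word matching `pattern`."""
    return sorted(
        eid
        for eid, e in ENTRIES.items()
        if any(fnmatchcase(word, pattern) for word in e["definition"].split())
    )

print(search_gloss("fat"))      # only entries tagged with the gloss "fat"
print(search_wildcard("fat*"))  # every def word that starts with "fat"
```

The gloss search finds "fatty" and "fattening" through their inferred gloss "fat" and skips "father" and "fatal", while the wildcard search matches on spelling alone and so returns all four.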

plans for printed Eng-Nx wordlist

Posted by on 12 Dec 2012 in Activity log

We reviewed our gloss-tagging policies yesterday, and concluded that yes, we are placing inferred gloss tags correctly for the purposes of generating the English-Nxa'amxcin word list, both in the web display and for the future print dictionary.

I summarize our notes about the Eng-Nx section of the print dictionary here, so we can remind ourselves in the future!

-The Eng-Nx section in the print dictionary should be considered a (fairly detailed) index to the Nx-Eng side, not a full Eng-Nx dictionary. It will be comparable to what MDK did in his Chehalis dictionary.

-Ours will go one step further than the Chehalis dictionary, in that, for example, a Nxa'amxcin word with "fattening" in its def will be found under fat, fatten, and fattening (not just the lemma, fat).

-Our print version will be like our current Eng-Nx wordlist view in the web interface, expanded to the first level of detail - e.g.

fatten

kn sacqʼʷúcnctəxʷ fattening
ʔacqʼʷúcn fattening
ʔacqʼʷúcts fattened

-Inferred glosses will be hidden in both the web view and the print dictionary, although they are important for the "behind the scenes" generation of the Eng-Nx wordlist.

-Our gloss-tagging process should provide at least one English key for each Nxa'amxcin word. It currently almost accomplishes this. There is just the occasional def in which it is impossible to figure out what the gloss-tag should be - e.g.,
<seg>someone who goes fishing or hunting and does not get anything; poor fisherman; poor hunter</seg>
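The notes above can be sketched as a small index builder in Python. Everything about the data layout is an assumption made for illustration: the `form`, `gloss` and `inferred` field names are hypothetical, and the inferred keys assigned to each word are guesses based on the "fatten" example, not values taken from the dictionary files.

```python
from collections import defaultdict

# Hypothetical records: the displayed gloss plus hidden inferred keys.
ENTRIES = [
    {"form": "kn sacqʼʷúcnctəxʷ", "gloss": "fattening", "inferred": ["fat", "fatten"]},
    {"form": "ʔacqʼʷúcn", "gloss": "fattening", "inferred": ["fat", "fatten"]},
    {"form": "ʔacqʼʷúcts", "gloss": "fattened", "inferred": ["fat", "fatten"]},
]

def build_index(entries):
    """Map each English key to the (form, displayed gloss) pairs filed under it."""
    index = defaultdict(list)
    for entry in entries:
        keys = [entry["gloss"]] + entry["inferred"]
        # dict.fromkeys drops duplicate keys while keeping their order.
        for key in dict.fromkeys(keys):
            index[key].append((entry["form"], entry["gloss"]))
    return dict(index)

def format_key(index, key):
    """Render one key at the first level of detail; inferred keys are not shown."""
    lines = [key, ""]
    lines += [f"{form} {gloss}" for form, gloss in index[key]]
    return "\n".join(lines)

index = build_index(ENTRIES)
print(format_key(index, "fatten"))
```

Inferred glosses act only as index keys here and never appear in the rendered lines, which matches the intention that they stay hidden in both the web view and the print dictionary.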

December 6, 2012

Article coming along

Posted by on 06 Dec 2012 in Activity log

Got some good work in today, and it feels like it's coming together. 8 pages done, about another 8 to do, I think, and some diagrams required.

December 5, 2012

Implement Sarah's autophonemicizer2

Posted by on 05 Dec 2012 in Tasks

See autophonemicizer2.doc in moses/trunk/docs. It can be done with XSLT and regular expressions.

More work on draft

Posted by on 05 Dec 2012 in Activity log

For every paragraph I write, I seem to have to find and read two more papers...

This is an XML dictionary project based primarily on the materials compiled by the late M. Dale Kinkade during fifteen years of work in the 1960s and 1970s with more than a dozen native speakers of the language, but it also includes materials compiled by Ewa Czaykowska-Higgins in the early 1990s.
