23/05/17

Permalink 05:14:20 pm, by mholmes, 52 words, 4 views   English (CA)
Categories: Activity log; Mins. worked: 100

Meeting/discussion and followup transformation

Met with SK and ECH to discuss a number of remaining issues that might be amenable to algorithmic approaches. We decided on one (removing stress marks from phonemic segs in inferred roots); I wrote and tested the required transformation, then ran it on the data at the end of the day.
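
The transformation itself is not reproduced in this log. As a rough illustration of the idea only, here is a minimal Python sketch (not the transformation that was actually run). It assumes stress is encoded with the combining acute (U+0301) or grave (U+0300) accent, whether precomposed or not; the function name `strip_stress` is hypothetical, and selecting only the phonemic segs of inferred roots is left out.

```python
import unicodedata

# Assumption: stress is marked with a combining acute or grave accent.
STRESS_MARKS = {"\u0301", "\u0300"}

def strip_stress(seg_text: str) -> str:
    """Return seg_text without stress marks, keeping all other diacritics."""
    # Decompose so that precomposed letters such as U+00E1 expose their accent.
    decomposed = unicodedata.normalize("NFD", seg_text)
    kept = "".join(ch for ch in decomposed if ch not in STRESS_MARKS)
    # Recompose whatever can still be recomposed.
    return unicodedata.normalize("NFC", kept)
```

Working on the decomposed form means other combining marks, such as the caron on x̌, survive untouched.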

27/04/17

Permalink 09:34:25 am, by mholmes, 26 words, 9 views   English (CA)
Categories: Activity log; Mins. worked: 20

Hard-run of cit-duplicating-entry commenting-out xslt

Ran it on these files:

affix_aspectual, affix_glot-ix, affix_k-m, affix_n-t, affix_u-CAPs, lex-pref, lex-suf, particles, pron

and committed the results. SMK now checking.

21/04/17

Permalink 02:35:21 pm, by mholmes, 19 words, 13 views   English (CA)
Categories: Activity log; Mins. worked: 60

cits duplicating entries now working

Finished and tested the XSLT from yesterday; SMK will check results before we hard-run it and change the data.

13/04/17

Permalink 12:13:21 pm, by skell, 203 words, 25 views   English (CA)
Categories: Activity log, Announcements; Mins. worked: 30

feature structures for numbers

Further to our discussions on numbers, I have added the following to feature_system.xml:

1) wordType numberStem. So ECH will add this <fs> to the number stems 1-10.

<fs>
<f name="numberStem">
<binary value="true"/>
</f>
</fs>

2) countingType "ten"

I have also added the following <fs> to lexical suffix "akst-2", so ECH can use this morpheme for marking up the numbers 30, 40 ... 90.

<fs>
<f name="baseType">
<symbol value="affix"/>
</f>
<f name="positionType">
<symbol value="suffix"/>
</f>
<f name="affixType">
<symbol value="derivational"/>
</f>
<f name="derivationalType">
<symbol value="lexical"/>
<symbol value="counting"/>
</f>
<f name="countingType">
<symbol value="ten"/>
</f>
</fs>

MDH will then search for entries with this <fs> to build a test column for the table of numerical expressions. We can subsequently add more countingType values to the feature system, and to the entries for the appropriate lexical suffixes with classifier functions, and generate more columns for the table.
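
For illustration, the search for entries carrying this <fs> could look something like the Python sketch below. It is a sketch under assumptions, not the project's actual harvesting code: the `entry` element, its `id` attribute, and the absence of namespaces are all hypothetical, and only the <fs> markup is taken from above.

```python
import xml.etree.ElementTree as ET

# Invented sample data; only the <fs> structure follows the markup above.
SAMPLE = """<entries>
  <entry id="akst-2">
    <fs>
      <f name="derivationalType">
        <symbol value="lexical"/>
        <symbol value="counting"/>
      </f>
      <f name="countingType">
        <symbol value="ten"/>
      </f>
    </fs>
  </entry>
  <entry id="other">
    <fs>
      <f name="numberStem">
        <binary value="true"/>
      </f>
    </fs>
  </entry>
</entries>"""

def entries_with_counting_type(root, value):
    """Return ids of entries whose <fs> has countingType set to value."""
    path = f".//f[@name='countingType']/symbol[@value='{value}']"
    return [entry.get("id") for entry in root.iter("entry")
            if entry.find(path) is not None]

root = ET.fromstring(SAMPLE)
print(entries_with_counting_type(root, "ten"))
```

Adding a new countingType value later would only require calling the same function with that value to generate another column.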

04/04/17

Permalink 04:56:12 pm, by mholmes, 55 words, 36 views   English (CA)
Categories: Activity log; Mins. worked: 30

Discussions on numbers

Discussions and decisions on how to handle numbers and counters: new wordType of cardinalNumeral, new lexicalSuffix type of numeralClassifier. These will be applied, and then harvesting will be done to generate a table of numerical expressions which will form the basis of decisions on how/whether to create a special section in the print dictionary.

29/03/17

Permalink 11:25:01 am, by mholmes, 130 words, 14 views   English (CA)
Categories: Activity log; Mins. worked: 60

New Schematron rule; fixes to PDF build

SK pointed out that the English-Moses index was sorting Js to the end, and indeed when I looked at the collation that we're using for all sorting (MosesPhonemicCollation, which is designed to handle both English and Moses), J was omitted from the sequence. I added it to the source, installed NetBeans and recompiled the jar, and all seems to be well. I was happy to see that NetBeans was its usual trouble-free self; installed quickly, worked out of the box, and although it complained that a dependency ("hamcrest") was missing from the project, it added it for me, resolving the issue painlessly.
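
To show why a letter missing from a collation sequence can end up at the end of an index, here is a small Python sketch. It is only an analogy: the real MosesPhonemicCollation is Java and its sequence is not shown in this log, so the two alphabets below and the helper `make_key` are hypothetical.

```python
def make_key(sequence):
    """Build a sort key from an ordered string of letters."""
    rank = {ch: i for i, ch in enumerate(sequence)}
    missing = len(sequence)  # anything not in the sequence sorts last

    def key(word):
        return [rank.get(ch, missing) for ch in word.lower()]

    return key

# Hypothetical sequences: the first omits "j", the second includes it.
WITHOUT_J = "abcdefghiklmnopqrstuvwxyz"
WITH_J = "abcdefghijklmnopqrstuvwxyz"

words = ["kettle", "jump", "zebra", "ice"]
print(sorted(words, key=make_key(WITHOUT_J)))  # ['ice', 'kettle', 'zebra', 'jump']
print(sorted(words, key=make_key(WITH_J)))     # ['ice', 'jump', 'kettle', 'zebra']
```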

Also added a new Schematron rule to the set, to catch entries with no pron/seg[@type='p'], at SK's request; that caught 19 additional errors, which she's fixing.
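
The check that rule performs can be approximated in Python as below. This is a sketch of the same test, not the Schematron rule itself; the `entry` element name, the `id` attribute, the sample data, and the seg type "x" are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Invented sample: the second entry has a seg, but not one of type 'p'.
SAMPLE = """<entries>
  <entry id="good">
    <pron><seg type="p">abc</seg></pron>
  </entry>
  <entry id="bad">
    <pron><seg type="x">abc</seg></pron>
  </entry>
</entries>"""

def entries_missing_phonemic_seg(root):
    """Return ids of entries that have no pron/seg[@type='p']."""
    return [entry.get("id") for entry in root.iter("entry")
            if entry.find("pron/seg[@type='p']") is None]

print(entries_missing_phonemic_seg(ET.fromstring(SAMPLE)))
```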

23/03/17

Permalink 04:12:11 pm, by mholmes, 14 words, 50 views   English (CA)
Categories: Activity log; Mins. worked: 45

Added new temporary diagnostics report to the build

Per SK's request, new report on entries ending with a specific sequence of chars.

07/02/17

Permalink 04:41:02 pm, by skell, 139 words, 49 views   English (CA)
Categories: Tasks; Mins. worked: 0

zero morpheme character

Greg pointed out that we are using Ø (Latin Capital Letter O with Stroke, U+00D8) for our zero morpheme marker, rather than ∅ (Empty Set, U+2205). The latter is noted in the Unicode character map as the one used in linguistics to indicate a null morpheme or phonological zero.

We have at least been consistent in our use of the former! We may not have known the Empty Set character existed when we chose the other one in 2010, or it may have been a font-based choice. (I'm using Aboriginal Sans in Oxygen right now, and the Empty Set character doesn't display properly.)

Martin will add this change to his list of global changes to make when improving our current encoding, if we can be assured of fonts that include Empty Set along with all the other special characters we need.
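
If the change does go ahead, the substitution itself is simple. A minimal Python sketch follows, with a hypothetical function name; note that a blanket replacement would also alter any legitimate Ø (for example in a Scandinavian name), so the count is returned to make a manual check easier.

```python
O_WITH_STROKE = "\u00D8"  # Latin Capital Letter O with Stroke
EMPTY_SET = "\u2205"      # Empty Set

def replace_zero_marker(text):
    """Return (new_text, count) with every U+00D8 replaced by U+2205."""
    count = text.count(O_WITH_STROKE)
    return text.replace(O_WITH_STROKE, EMPTY_SET), count
```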

25/01/17

Permalink 05:26:25 pm, by mholmes, 63 words, 52 views   English (CA)
Categories: Activity log; Mins. worked: 120

Bug fixes in XML encoding; set up for XEP; fix in XSLT...

Diagnosed the borkedness of a borked XML file; fixed some XSLT; tried building the dictionary only to discover that of course XEP wasn't set up in Oxygen; reconfigured all the old hard-coded paths in build tasks; built the PDF; and more tweaks. Reminder to self: the diagnostics page is erroneously including an extra include for the personography, minus its file extension; needs fixing.

13/12/16

Permalink 02:12:56 pm, by mholmes, 30 words, 56 views   English (CA)
Categories: Activity log; Mins. worked: 60

Feature structure validation

Beefed up the diagnostic processing of feature structures to add stats tables, revealing that many vals are never used. Food for thought. But no new errors revealed, which is good.
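
The kind of tally behind such a stats table can be sketched in Python as follows. This is an assumption-laden illustration, not the diagnostic code: the input shapes (a dict of declared values per feature, and a list of feature/value pairs found in entries) and the sample data, including the "prefix" value, are invented for the example.

```python
from collections import Counter

def unused_values(declared, used):
    """Return {feature: [values declared but never used]}."""
    counts = Counter(used)
    report = {}
    for feature, values in declared.items():
        never = [v for v in values if counts[(feature, v)] == 0]
        if never:
            report[feature] = never
    return report

# Invented sample data for illustration only.
declared = {"positionType": ["prefix", "suffix"], "countingType": ["ten"]}
used = [("positionType", "suffix"), ("positionType", "suffix")]
print(unused_values(declared, used))
```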


Nxaʔamxcín (Moses) Dictionary Blog

This is an XML dictionary project based primarily on the materials compiled by the late M. Dale Kinkade during fifteen years of work in the 1960s and 1970s with more than a dozen native speakers of the language, but it also includes materials compiled by Ewa Czaykowska-Higgins in the early 1990s.
