17/05/13

Permalink 03:27:02 pm, by mholmes, 38 words, 4 views   English (CA)
Categories: Activity log; Mins. worked: 90

More work to be done on the presentation

Meeting to review the presentation -- my task now is to collapse six slides which begin with the picture of the filecard box into a single stepped diagram illustrating the old encoding process and the horrible binary result.

16/05/13

Permalink 05:27:51 pm, by mholmes, 13 words, 4 views   English (CA)
Categories: Activity log; Mins. worked: 120

Finished reworking and collapsing my part of the presentation

Section 2 is now down to 6 slides, with more detail and more extensive notes.

Permalink 05:27:14 pm, by mholmes, 82 words, 5 views   English (CA)
Categories: Activity log; Mins. worked: 90

Work on names list

Following Sarah's post, I've done the following:

  • Added a language filter so you can view names only in English or Nxaʔamxcín. This is a crude regex, but it works because English names always begin with caps, and Nxaʔamxcín names never do.
  • Turned off the traffic light display in the names page.
  • Added more processing to the path, to handle rendering of e.g. choice elements inside names.
  • Excluded lexical suffix entries.
  • Elaborated the captions and links a bit.
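
For the record, the language filter above boils down to something like the following. This is only a minimal Python sketch of the idea, not the code running on the site; the function names and the sample strings are made up for illustration, and it assumes "caps" means an ASCII capital letter at the start of the name.

```python
import re

# Assumption: English names always start with an ASCII capital letter,
# and Nxaʔamxcín names never do (as described in the post above).
ENGLISH_NAME = re.compile(r"^[A-Z]")


def name_language(name):
    """Classify a name as 'english' or 'nxaamxcin' by its first character."""
    return "english" if ENGLISH_NAME.match(name.strip()) else "nxaamxcin"


def filter_names(names, language):
    """Keep only the names classified as the requested language."""
    return [n for n in names if name_language(n) == language]


# Illustrative strings only; these are not taken from the names list.
sample = ["Sam George", "ʔáyx̣ʷt", "Wenatchi"]
english_only = filter_names(sample, "english")
```

The filter is crude in exactly the way described: any Nxaʔamxcín form that happened to start with a capital would be misfiled as English.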
Permalink 12:17:53 pm, by skell, 229 words, 8 views   English (CA)
Categories: Tasks; Mins. worked: 0

changes for Names pages

Here are a few requests for the Names page on the website:

DONE -exclude Lexical Suffix entries

DONE -fix the display of sic/corr, so that only “Wenatchi” displays, not “WenatcheeWenatchi” (See for example the entry for “Sam George”.)

DONE -put flora (plants) and fauna (animals) in the link text at the top of the page

-separate out the sorting into Nx-Eng and Eng-Nx pages. Ideally, users should be able to view the complete list, or any of the six lists by name type, sorted either by Nxa'amxcin name or by English name. The present setup with Nx and Eng names mixed together in the Name column is somewhat confusing. Continue to sort the Nx-Eng lists based on name tags in prons. For the present, exclude name tags in orths when generating these lists. Sort the Eng-Nx lists based on name tags in defs.

PENDING ECH'S FURTHER DISCUSSION WITH CCT:

Please also generate a printable version of the six lists of names by type. These only need to be sorted alphabetically by Nxa'amxcin name - i.e. only include the name tags within prons when generating these lists. Ideally they would be spreadsheets with the following columns:

Name (pron:seg type= “p”)
Source (following bibl ... if the pron:seg type= “p” is NOT subtype=“i”)
Definition (all defs)
Pronunciation (pron:seg type= “n”)
Source (following bibl)
Word Parts (hyph)

10/05/13

Permalink 02:25:08 pm, by mholmes, 211 words, 9 views   English (CA)
Categories: Activity log; Mins. worked: 90

Security re-established

We've been running the live db with open access since the last time I rebuilt it, so while doing other updates (such as rolling out the Java sorting collations) I've also added back the protection that we had before. In the process, I got bitten by the horrible eXist bug which lets you lock yourself out of the admin account if you edit the admin user and forget to retype the password into the two password boxes (the effect is that you end up with a random admin password that you can never discover). As a result, I had to remove the server version of the app and replace it with a refreshed version of my local copy. This failed the first few times: Tomcat tries to auto-deploy the app before the dbx files have finished uploading, so the uploaded .filepart files cannot be renamed to overwrite the ones created by the live startup. It took two or three shots to get this problem solved. The only way seems to be to let it deploy, but stop it immediately in the Tomcat manager; then delete all the dbx, lock and log files; then upload them again; then restart it in the manager.

Permalink 12:22:00 pm, by skell, 96 words, 7 views   English (CA)
Categories: Activity log; Mins. worked: 5

print dictionary layout and web dictionary sort orders

1) For the linguists' dictionary, we would like to see:

first phonemic representation in bold <orthography in angle brackets> [narrow transcription(s) in square brackets], for both forms and cits - e.g.:

ʔáyx̣ʷt <ʔáyx̌ʷt> [ʔáyəx̣ʷt]
√ʔáyx̣ʷ-t
1. be tired
2. tired, worn out

• √ʔáyx̣ʷ-tl kɬʔámnc
<√ʔáyx̌ʷ-tl kɬʔámnč>
[√ʔáyəx̣ʷ-t ləkɬəʔámənč]
he is tired of waiting (for you / me)

2) On the website, we would ultimately like things sorted by orthography.

Permalink 11:33:31 am, by mholmes, 91 words, 10 views   English (CA)
Categories: Activity log; Mins. worked: 90

Handling of homographic glosses

This morning we decided that a simple and quick way to distinguish between homographs with different meanings is required to make the English lookup part of the dictionary less confusing. This will be achieved by adding a clarificatory word or phrase in the @n attribute of a gloss. Glosses will then be presented in the E-to-M view with this clarification in parentheses. Processing on the website will need to be changed to take account of this, and the print dictionary rendering will also have to be written with this in mind.
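
As a rough illustration of the rendering rule, here is a minimal Python sketch. It assumes a gloss element with no namespace, and the sample glosses and clarifications are invented for the example; the real website and print processing are not written this way.

```python
import xml.etree.ElementTree as ET


def render_gloss(gloss):
    """Return the gloss text, followed by its @n clarification in parentheses if present."""
    text = "".join(gloss.itertext()).strip()
    note = (gloss.get("n") or "").strip()
    return f"{text} ({note})" if note else text


# Invented sample data: two homographic glosses and one unambiguous gloss.
sample = ET.fromstring(
    "<entry>"
    '<gloss n="for fishing">net</gloss>'
    '<gloss n="after deductions">net</gloss>'
    "<gloss>tired</gloss>"
    "</entry>"
)
rendered = [render_gloss(g) for g in sample.iter("gloss")]
```

Glosses without an @n value come out unchanged, so existing entries would not need to be touched until a homograph actually turns up.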

02/05/13

Permalink 03:27:05 pm, by mholmes, 324 words, 18 views   English (CA)
Categories: Activity log; Mins. worked: 120

Duplicate @xml:ids

The problem of duplicate @xml:id attributes on entries has now become a serious issue for building the print dictionary, because I'm unable to process the entire collection properly to produce the book. To build the dictionary I have to use XInclude to create a single XML source file, and when I do that there are over 1600 duplicate ids, which prevent some of the processing steps from succeeding.

I've taken a quick look at where the duplicates tend to be concentrated, by adding the files in alphabetical order and looking to see how many duplicates occur with each addition. These files create no problems (i.e. they have no duplicates among themselves):

affix_glot-ix.xml
affix_k-m.xml
affix_n-t.xml
affix_u-CAPS.xml
c.xml
c-glot.xml
c-rtr.xml
glottal.xml
h.xml
h-phar-part1.xml
h-phar-part2.xml
l-affric.xml
lex-suff.xml
new-data-2013.xml
p-glot.xml
phar-w.xml
qw-glot.xml
s-rtr.xml
t-glot.xml
xw.xml

When I add the remaining files, one by one (and only one at a time), these are the results:

File             Duplicates
k.xml:           100
k-glot.xml:       18
kw.xml:            2
kw-glot.xml:       2
l.xml:             3
l-fric.xml:        6
m.xml:             3
n.xml:            97
p.xml:             7
particles.xml:     4
pron.xml:          2
q.xml:             4
q-glot.xml:        3
qw.xml:            1
rescued.xml:      54
s.xml:             2
t.xml:            20
ww-glot.xml:       4
x.xml:             3
x-uvul.xml:        4
yy-glot.xml:       4

What I'm going to do is develop the dictionary output using only the valid files, and then add the others in as they get fixed. In the meantime, it might be worth having a go at some of the low-hanging fruit (the ones with only two or three duplicates). More will show up as we add those in, of course -- there will be duplicates across the currently-excluded files as well as those that they share with the "good" files. So the dictionary PDFs will shrink in size, but I'll be able to start doing things like generating page-references that depend on xml:ids.

26/04/13

Permalink 01:15:10 pm, by mholmes, 93 words, 14 views   English (CA)
Categories: Activity log; Mins. worked: 120

Another collation, and a fork of the dictionary output

I've created a new MosesPhonemicCollation jar for sorting based on the phonemic representations. I've also forked the dictionary build process based on a parameter called "dictionaryType", which can be "learner" or "linguist". The former produces a dictionary based on the orthography, sorted with the MosesOrthographyCollation, and the latter produces one based on the phonemic transcriptions, with the new collation. The "alphabet" guides that run across the bottoms of pages are also appropriately different. I've abstracted the front matter into a separate file, and I'm auto-including the personography, although I'm not processing it yet.
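
The fork amounts to a lookup on the parameter value. The sketch below is a hypothetical Python rendering of that idea, not the build code itself: the dictionary keys "headword" and "collation" and the function name are my own labels, while the parameter values and collation names are the ones mentioned above.

```python
# Hypothetical profile table; "headword" and "collation" are illustrative keys.
BUILD_PROFILES = {
    "learner": {
        "headword": "orthography",
        "collation": "MosesOrthographyCollation",
    },
    "linguist": {
        "headword": "phonemic",
        "collation": "MosesPhonemicCollation",
    },
}


def build_profile(dictionary_type):
    """Return the settings for a dictionaryType value, rejecting unknown values."""
    try:
        return BUILD_PROFILES[dictionary_type]
    except KeyError:
        raise ValueError(f"dictionaryType must be 'learner' or 'linguist', got {dictionary_type!r}")


profile = build_profile("linguist")
```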

25/04/13

Permalink 11:49:08 am, by mholmes, 110 words, 18 views   English (CA)
Categories: Activity log; Mins. worked: 180

More collation work

The idea of having a single collation to sort everything in our db is now impractical, because the orthographical sorting rules clash with the transcriptional sorting rules, so I've created a new, simpler MosesOrthographyCollation class for sorting the orthography only. It's working well, but there are still some outstanding questions about it. In the meantime, we can't update the website because we don't have orthographies there yet, so this is only going to be used for the print dictionary generation.

This has had to be redone a couple of times due to changes in the list of glyphs, but it's working now and tested with the print dictionary system.
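
To show the general shape of a glyph-based collation, here is a minimal Python sketch. It is not the MosesOrthographyCollation class (which is Java), and the glyph list is an invented placeholder, not the project's real alphabet or ordering; it only illustrates how multi-character glyphs change the sort compared with plain character order.

```python
# Hypothetical glyph order for illustration only; NOT the real alphabet.
ALPHABET = ["a", "k", "kw", "l", "t", "x", "xw", "y"]

RANK = {glyph: i for i, glyph in enumerate(ALPHABET)}
# Try longer glyphs first so that "xw" is matched before "x".
GLYPHS_LONGEST_FIRST = sorted(ALPHABET, key=len, reverse=True)


def collation_key(word):
    """Split a word into glyphs and return their ranks as a sortable tuple."""
    key = []
    i = 0
    while i < len(word):
        for glyph in GLYPHS_LONGEST_FIRST:
            if word.startswith(glyph, i):
                key.append(RANK[glyph])
                i += len(glyph)
                break
        else:
            # Characters outside the alphabet sort after all known glyphs.
            key.append(len(ALPHABET) + ord(word[i]))
            i += 1
    return tuple(key)


words = ["xwa", "xya", "xa"]
by_glyph = sorted(words, key=collation_key)
by_codepoint = sorted(words)
```

With this made-up order, "xya" sorts before "xwa" because "xw" counts as a single glyph that follows every word beginning with plain "x". Any change to the glyph list changes the ranks, which is why a collation has to be rebuilt whenever the list of glyphs changes.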


Nxaʔamxcín (Moses) Dictionary Blog

This is an XML dictionary project based primarily on the materials compiled by the late M. Dale Kinkade during fifteen years of work in the 1960’s and 1970’s with more than a dozen native speakers of the language, but it also includes materials compiled by Ewa Czaykowska-Higgins in the early 1990’s.
