In the tei_xml folder, I have added a file called qw-glot-test.xml.
It contains three entries copied from qw-glot.xml which should be good test cases for all the transformations we hope to be able to do.
Phase 1: Rescuing the old data (basically complete)
The initial task was to retrieve the original data from its DOS/WordPerfect/Lexware form. The bulk of this work was done by Greg, with the assistance of a piece of software (Transformer) written by me to make certain complicated search-and-replace operations easier.
Phase 2: XML Encoding
(ongoing: this ball is currently in Ewa's court)
We chose TEI P5 (http://www.tei-c.org/P5/) as the encoding standard, and we decided to avoid all nesting and do all linking through xml:id attributes. We also decided that each entry would be marked up so as to break it down into individual morphemes, each of which would be linked through xml:id to the entry for that morpheme. In this way, most feature information for most entries need not be encoded at all, because it can be retrieved from the entries of the morphemes that make up each entry. This makes the encoding simpler and cleaner, offloading much of the work onto the XML database that will store and handle the data.
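As a rough illustration of how linked morphemes let feature information be looked up rather than repeated, here is a minimal Python sketch. The sample markup is an assumption made for illustration only: the entry ids, the placeholder forms, and the use of the TEI m element with a corresp pointer are not taken from qw-glot.xml, and the project's actual encoding may differ.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
NS = {"tei": TEI_NS}

# Hypothetical sample: two morpheme entries and one complex entry that
# points at them. Forms and ids are placeholders, not real dictionary data.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <entry xml:id="root-a">
    <form><orth>aaa</orth></form>
    <gramGrp><pos>root</pos></gramGrp>
  </entry>
  <entry xml:id="suffix-b">
    <form><orth>-bbb</orth></form>
    <gramGrp><pos>suffix</pos></gramGrp>
  </entry>
  <entry xml:id="word-ab">
    <form><orth><m corresp="#root-a">aaa</m><m corresp="#suffix-b">bbb</m></orth></form>
  </entry>
</body>"""


def index_by_id(root):
    """Map every xml:id value in the tree to its element."""
    return {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}


def morpheme_features(entry, index):
    """Return (surface form, part of speech) for each morpheme in an entry.

    The part of speech is not stored on the complex entry; it is read from
    the entry that the morpheme's pointer refers to.
    """
    features = []
    for m in entry.iter(f"{{{TEI_NS}}}m"):
        target = index.get(m.get("corresp", "").lstrip("#"))
        pos = None
        if target is not None:
            pos = target.findtext("tei:gramGrp/tei:pos", namespaces=NS)
        features.append((m.text, pos))
    return features


root = ET.fromstring(SAMPLE)
index = index_by_id(root)
result = morpheme_features(index["word-ab"], index)
print(result)
```

In this sketch the complex entry carries no grammatical information of its own; everything is recovered by following the pointers, which is the kind of work the XML database would be expected to take over.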
Phase 3: Storage and Presentation
The search functionality basically works like this:
Phase 4: Media
We plan to integrate audio and visual media into the database in the future.
This is an XML dictionary project based primarily on the materials compiled by the late M. Dale Kinkade during fifteen years of work in the 1960s and 1970s with more than a dozen native speakers of the language, but it also includes materials compiled by Ewa Czaykowska-Higgins in the early 1990s.