On SK's instructions, I've run the autophonemicizer on all files with @status="unedited" and "rescued". Everything validates, and I've committed to SVN and uploaded to the db.
The autophonemicizer now handles prons, hyphs and quotes, and seems to be working well. One final tweak might be to regenerate the @xml:id of the entry from the phonemicized hyph as well, but we may want to leave that until later, when we can more reliably just use the first seg in the pron.
SK will test, and if she's happy, she'll send me a list of files this should be run on, and I'll run it on Monday.
We had a common situation in unprocessed files in which the same transcription would appear in multiple segs in the same pron, with different bibls. I've written XSLT to detect and collapse these instances, and after some testing and tweaking we ran it on the whole collection today. This puts us in a better position to do phonemicization of the segs, because we don't have to allow for duplicates during that process.
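The project's collapsing step is written in XSLT; purely as an illustration, the same idea can be sketched in Python. This sketch assumes un-namespaced elements (real TEI files would need the TEI namespace) and assumes that every <bibl> belongs to the nearest preceding <seg>, so that the surviving <seg> simply inherits the <bibl>s of its duplicates. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

def collapse_duplicate_segs(pron):
    """Merge <seg>s with the same @type and text inside one <pron>.

    Assumption: every <bibl> belongs to the nearest preceding <seg>.
    """
    leading = []   # children that appear before any <seg>
    groups = {}    # (type, text) -> [seg, bibl, bibl, ...]
    current = None
    for child in list(pron):
        pron.remove(child)
        if child.tag == "seg":
            current = (child.get("type"), (child.text or "").strip())
            # A repeated key means a duplicate <seg>: it is dropped,
            # and its <bibl>s are attached to the first occurrence.
            groups.setdefault(current, [child])
        elif current is None:
            leading.append(child)
        else:
            groups[current].append(child)
    # dicts preserve insertion order, so first occurrences keep their position
    for child in leading:
        pron.append(child)
    for members in groups.values():
        pron.extend(members)
    return pron
```

With duplicates gone, a later phonemicization pass only has to look at one <seg> per distinct transcription.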
Phonemicization has also been extended to deal with R, and we've realized that we can add a phonemic seg based on the existing hyph (text content), so the code that already works for phr should be usable. I have to finish that bit, and also rewrite the phr stuff so that it does some collapsing of its own at the end of the process.
Once in a while Moses loses its marbles when ingesting a large number of documents. A couple of suggestions:
When it does fall over, this is what seems to be necessary:
I remembered something today that will make it much easier to autophonemicize pron:segs.
In Lexware, all the entries were hyphed. MDH took the same input to generate both the pron:segs (removing the morpheme delimiters) and the hyphs (adding the m-tags).
So we need to use the contents of the hyphs, minus their m-tags, to generate the pron:<seg type="p" subtype="i">s. Then the morpheme delimiters will be included, and we can use the same autophonemicizer transformation that we used for the <phr>s. Then we simply remove + - = √ from the pron:segs again. Then we can collapse duplicate pron:segs, the same as we're doing for <phr>s.
So we don't have to worry about writing complicated rules using a long list of prefixes to find the "root-initial" context in pron:segs without morpheme delimiters.
We will also autophonemicize the contents of the hyphs, by removing the m-tags, autophonemicizing, and then adding the m-tags back in based on the morpheme delimiters.
Autophonemicizer2.doc contains detailed instructions for all this.
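As a rough illustration of the hyph-to-pron:seg route described above (the real implementation is XSLT, and the actual phonemicization rules live in Autophonemicizer2.doc), here is a minimal Python sketch. The function names are hypothetical, the elements are un-namespaced for brevity, and the phonemicize argument is only a placeholder that defaults to leaving the text unchanged.

```python
import xml.etree.ElementTree as ET

DELIMITERS = "+-=√"  # morpheme delimiters to strip after phonemicizing

def hyph_text(hyph):
    """Text content of a <hyph> with its <m> tags removed."""
    return "".join(hyph.itertext())

def strip_delimiters(text):
    """Remove the morpheme delimiters + - = and the root sign."""
    return text.translate({ord(c): None for c in DELIMITERS})

def make_pron_seg(hyph, phonemicize=lambda s: s):
    """Build <seg type="p" subtype="i"> from a <hyph>.

    phonemicize is a stand-in for the real rule set; it sees the
    text while the morpheme delimiters are still present.
    """
    seg = ET.Element("seg", {"type": "p", "subtype": "i"})
    seg.text = strip_delimiters(phonemicize(hyph_text(hyph)))
    return seg
```

The point of the ordering is that the rules can rely on the delimiters (for example, to find the root-initial context) before the delimiters are thrown away.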
MDH has also implemented a transformation to collapse identical segs within the same pron. This happens before the autophonemicization.
Visions of Autohyphenators dance in my head! Maybe next year ...
We've decided to extend the existing XSLT rather than create a new file and duplicate code, so <seg>s in <pron>s will now be handled by the same transformation that handles <phr>s in <quote>s. The bones are in place, and the existing code is still working well on our test set with the empty branching in place to support <pron>s. The test set has been extended and we've already got a long regex to match the range of prefixes which will serve as a surrogate for the more convenient morpheme delimiters.
One final step: the collapsing of duplicate <phr> elements, which is currently handled by an external post-transformation, needs to be brought back into this single file, so that all the changes required are done at the same time.
Here's a summary of what MDH needs to do next to improve the Autophonemicizer for <phr>s, and implement an Autophonemicizer for pron:segs:
1) Add new step 5c from Autophonemicizer2.doc to the autophonemicizer for <phr>s.
2) Add new step 7 from Autophonemicizer2.doc, to compare generated <phr type="p" subtype="i">s to attested <phr type="n">s and collapse identical ones. This will be a subsequent step to the whole autophonemicization process.
3) Adapt the autophonemicizer for pron:segs
Got our abstract finished and submitted. The deadline has been extended, but it's good to get it out of the way.
Skyped with LR about the possible next steps for TEI as an LMF serialization.
Tweaked the XSLT for actual use (adding utility templates to preserve whitespace between PIs etc.), and SK has now tested this with real data. She has some suggestions for improvements, one of which might be achievable, although it will require a structural rewrite, but results are good so far, and most cases of deviation from desired results are in situations where there is no mechanically-detectable context which could be used to change the outcome. Can this now be extended to pron/segs, where there's even less context because hyphenation is not there?
Still working on this, with a slow but fruitful discussion on the LMF list helping me to confirm that what I thought were limitations in LMF interoperability really are so. I'm coming round to LR's view that TEI would be a better serialization format.
Much frustration involved in my belated (re?)discovery that neither word-boundaries nor lookarounds are supported in the XPath implementation of regular expressions. Grrr. But now working fine, with lots of help and a test set from SK. We can start testing it on whole files tomorrow.
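One common way around the missing word-boundary and lookaround support is to capture the neighbouring character explicitly and put it back in the replacement. The Python sketch below illustrates that pattern shape only; it is not the project's XSLT, the function name is hypothetical, and whether this is the workaround actually used here is an assumption.

```python
import re

def mark_whole_word(text, word):
    """Bracket whole-word occurrences of word without using \\b or lookarounds.

    The context on each side is captured in groups 1 and 2 and written back.
    Limitation: the separator after one match is consumed, so an immediately
    following occurrence that shares that separator is missed in a single pass.
    """
    pattern = r"(^|\W)" + re.escape(word) + r"(\W|$)"
    return re.sub(
        pattern,
        lambda m: m.group(1) + "[" + word + "]" + m.group(2),
        text,
    )
```

The same pattern shape should carry over to an XPath replace() call with $1 and $2 in the replacement string, but that has not been checked against the project's code.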
Note to selves for the future: how will we deal with English homophones for sorting the Eng-Nx word list? If we remember, let's list the ones we come across here. So far we have:
fire - "flames" vs. "dismiss employee"
fast - "quick" vs. "abstain from food"
hide - "skin" vs. "conceal"
cold - adj or noun "coldness" vs. "illness"
saw - "tool" vs. past tense of see (replaced with <gloss subtype="i">see</gloss>, SMK 21May13)
close (near) vs. close (shut)
stern (back of boat)
game (recreation)
watch (observe) vs. watch (wristwatch)
back (body part)
top (toy) vs. top (of something)
fish vs. fish (catch fish)
ECH's goal for the search engine in the web database is that, if a user searches for "fat", s/he will get results including fat, fatten, fattening, fatty.
Our current settings, and our policies for adding inferred glosses, seem to be accomplishing this nicely. An entry which has "fatty" in its def is found by a search for "fat", because it also has an inferred gloss "fat".
Searching for "fat*" also returns defs including fat, fatten, fattening, fatty ... but also fatal, fathom, father.
We reviewed our gloss-tagging policies yesterday, and concluded that yes, we are placing inferred gloss tags correctly for the purposes of generating the English-Nxa'amxcin word list, both in the web display and for the future print dictionary.
I summarize our notes about the Eng-Nx section of the print dictionary here, so we can remind ourselves in the future!
-The Eng-Nx section in the print dictionary should be considered a (fairly detailed) index to the Nx-Eng side, not a full Eng-Nx dictionary. It will be comparable to what MDK did in his Chehalis dictionary.
-Ours will go one step further than the Chehalis dictionary, in that, for example, a Nxa'amxcin word with "fattening" in its def will be found under fat, fatten, and fattening (not just the lemma, fat).
-Our print version will be like our current Eng-Nx wordlist view in the web interface, expanded to the first level of detail - e.g.
fatten
kn sacqʼʷúcnctəxʷ fattening
ʔacqʼʷúcn fattening
ʔacqʼʷúcts fattened
-Inferred glosses will be hidden in both the web view and the print dictionary, although they are important for the "behind the scenes" generation of the Eng-Nx wordlist.
-Our gloss-tagging process should provide at least one English key for each Nxa'amxcin word. It currently almost accomplishes this. There is just the occasional def in which it is impossible to figure out what the gloss-tag should be - e.g.,
<seg>someone who goes fishing or hunting and does not get anything; poor fisherman; poor hunter</seg>
Got some good work in today, and it feels like it's coming together. 8 pages done, about another 8 to do, I think, and some diagrams required.
See autophonemicizer2.doc in moses/trunk/docs. It can be done with XSLT and regular expressions.
For every paragraph I write, I seem to have to find and read two more papers...
Continued building the annotated biblio, and drafted a few paragraphs from the rescue section.
Laid out the outline for the article and divided up sections for drafting. We start writing tomorrow.
Lots of reading, some annotation and other note-taking, quote-garnering and much useful discussion of how our features map onto GOLD (not very well, because we are focused a lot on bound morphemes, and GOLD becomes sparse at that level). I've also built a complete RELISH schema, which gives me access to lots of stuff we'll need that was missing from the core, including e.g. <SenseExample>.
Discussion with ECH to prepare for writing the article next week. We have a stack of stuff to read, but the basic outline is becoming clear. There's still a lot of work to do on GOLD/ISOcat integration.
This is very slow and painful work. Half the problem is the very slow response of the clarin isocat site. I've also built a status page which shows all the mappings so far, and links them to the ISOcat definition pages, so that ECH can check them. There are some I haven't been able to map, and others that I'm very unsure about.
A bit slow and painful, but it's clarifying some of our original decisions on features and we've discarded a couple of unused ones so far.
We've discussed some code structure issues outlined below, and agreed that they're desirable, but should be put off until later because they increase the quantity of code we'll have to edit. One approach is to write XSLT to create these changes, and make the changed versions of our XML files available through the website, while we edit the unchanged versions behind the scenes; then when the time is right, we can convert everything permanently. Here are the details:
I've been reading through LR's jTEI paper with a view to bringing our encoding more into alignment with the recommendations there (which should also make it more amenable to LMF-ication), and I think we should reorganize the way we're doing citations a bit. At the moment, we have this:
<cit>
  <quote>
    <phr type="p" subtype="i">s-√cə́s=lqs kˀʷáʔncás</phr>
    <bibl corresp="psn:ECH">ECH</bibl>
    <phr type="n">s-√cə́s=əlqs kˀʷáʔəncás</phr>
    <bibl corresp="psn:JM psn:AM">Y14.219,220</bibl>
    <seg>a mosquito bit me</seg>
    <bibl corresp="psn:JM psn:AM">Y14.219,220</bibl>
  </quote>
</cit>
In this, we rely on contiguity to associate each <bibl> with its preceding element, and we rely on <phr> and <seg> to distinguish original from translation. What we might do instead would look like this:
<cit>
  <cit type="example">
    <cit>
      <quote xml:lang="col" type="p" subtype="i">
        s-√cə́s=lqs kˀʷáʔnc
      </quote>
      <bibl corresp="psn:ECH">ECH</bibl>
    </cit>
    <cit>
      <quote type="n">s-√cə́s=əlqs kˀʷáʔəncás</quote>
      <bibl corresp="psn:JM psn:AM">Y14.219,220</bibl>
    </cit>
  </cit>
  <cit type="translation">
    <quote xml:lang="en">a mosquito bit me</quote>
    <bibl corresp="psn:JM psn:AM">Y14.219,220</bibl>
  </cit>
</cit>
This is much more detailed, but it makes more things explicit. It uses nested <cit> tags to ensure that each quote is bracketed with its <bibl>, and that each <quote> has the required @xml:lang setting. The second level of <cit> is divided into @type="example" and @type="translation" (following recommendations in the TEI Guidelines), and the @type and @subtype values are realized directly on <quote>, rather than requiring the use of <phr> or <seg>.
The obvious drawback is that there's more code here. Existing <cit>s should be easy to convert to this framework with XSLT, though.
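To give a feel for how mechanical the conversion is, here is a minimal Python sketch of it (the real conversion would be XSLT). It assumes un-namespaced elements, a strict alternation of <phr> or <seg> followed by exactly one <bibl> inside <quote>, that every <phr> is in the language coded "col", and that every <seg> is an English translation. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def nest_cit(old_cit):
    """Build the nested <cit> structure from a flat <cit><quote>...</quote></cit>."""
    new_cit = ET.Element("cit")
    example = ET.SubElement(new_cit, "cit", {"type": "example"})
    children = list(old_cit.find("quote"))
    # Pair each <phr>/<seg> with the <bibl> that immediately follows it.
    for item, bibl in zip(children[0::2], children[1::2]):
        if item.tag == "phr":
            pair = ET.SubElement(example, "cit")
            # @type and @subtype move from <phr> onto <quote>
            quote = ET.SubElement(pair, "quote", dict(item.attrib))
            quote.set(XML_LANG, "col")
        else:
            # <seg> is assumed to hold the English translation
            pair = ET.SubElement(new_cit, "cit", {"type": "translation"})
            quote = ET.SubElement(pair, "quote", {XML_LANG: "en"})
        quote.text = item.text
        new_bibl = ET.SubElement(pair, "bibl", dict(bibl.attrib))
        new_bibl.text = bibl.text
    return new_cit
```

Any <cit> that does not follow the strict alternation would need to be flagged for manual review rather than converted blindly.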
Similarly, we currently have things that look like this:
<pron>
  <seg type="p">hámp</seg>
  <bibl corresp="psn:J psn:MS">J3.72-74,78; MS1.53</bibl>
  <seg type="n">hə́mp</seg>
  <bibl corresp="psn:JM psn:AM">Y24.90; Y29.179; Y6.282</bibl>
</pron>
where the association between <seg> and <bibl> again depends on sequence. I wonder if we might be better off with two <pron>s:
<pron type="p">
  <seg>hámp</seg>
  <bibl corresp="psn:J psn:MS">J3.72-74,78; MS1.53</bibl>
</pron>
<pron type="n">
  <seg>hə́mp</seg>
  <bibl corresp="psn:JM psn:AM">Y24.90; Y29.179; Y6.282</bibl>
</pron>
where the @type attribute is applied to the <pron> element, and the <bibl> is unambiguously associated with the appropriate <pron>?
Again, it's a bit more code, but it seems a bit cleaner, and as I try to map our data onto the sorts of structures allowed by Lexus, it looks like this sort of approach will work better.
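A minimal Python sketch of that split, for illustration only (un-namespaced elements, hypothetical function name, and the assumption that each <bibl> belongs to the <seg> before it):

```python
import copy
import xml.etree.ElementTree as ET

def split_pron(pron):
    """Return one new <pron> per <seg>, moving @type from <seg> up to <pron>."""
    prons = []
    for child in pron:
        if child.tag == "seg":
            seg = copy.deepcopy(child)
            seg.tail = None
            seg_type = seg.attrib.pop("type", None)
            new_pron = ET.Element("pron", {"type": seg_type} if seg_type else {})
            new_pron.append(seg)
            prons.append(new_pron)
        elif prons:
            # <bibl> (or anything else) stays with the preceding <seg>
            prons[-1].append(copy.deepcopy(child))
    return prons
```

Other attributes on <seg>, such as @subtype, are left on the <seg> here; whether they should also move up to <pron> is an open question.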
I've finished working through LR and WW's article in jTEI re TEI and LMF, and made a couple more encoding changes as well as tidying up some rendering; I've also proposed a re-working of our <cit> encoding, using nesting and @type to tighten the specificity and make it clearer which <bibl> is attached to what. This is pending approval from ECH and SMK. I've also generated a list of cross-references which aren't actually pointing at anything yet.
<ref>/@target values have also been converted to use the m: prefix, and the rendering code is now capable of handling multiple space-separated values, and handling both m: and non-m: values.
Changed all @sameAs to @corresp, and tested and bug-fixed all the rendering code, then deployed the changes to the live db. Also updated some documentation (more to be done here).
I've now written some XSLT to convert all @sameAs to @corresp in the <m> element, and also to prefix all such values (along with those already in @corresp by virtue of being multiple) with the m: prefix (our planned private URI scheme for morpheme pointers). I've also written updates to XQuery and XSLT to take account of this, currently commented out, and I'm going to be testing everything locally tomorrow and fixing before I upload to the live db. Everything has to change at once before it will work. I also have to update documentation.
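The attribute change itself is small. The Python sketch below shows the intended effect on a single <m> element; it is an illustration rather than the XSLT actually used, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

def modernize_pointer(m):
    """Turn @sameAs into @corresp and prefix every pointer value with m:."""
    value = m.attrib.pop("sameAs", None)
    if value is None:
        value = m.get("corresp")
    if value is None:
        return m  # no pointer on this element
    # @corresp may hold several space-separated values
    tokens = [t if t.startswith("m:") else "m:" + t for t in value.split()]
    m.set("corresp", " ".join(tokens))
    return m
```

Values that already carry the prefix are left alone, so running the change twice does no harm.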
Made some changes to provide suggested psn: values in some attributes for convenience; also wrote some XSLT to generate said values for the ODD file from the personography, and created a Schematron schema (still to be linked into the XML files) which has two constraints, and will have more, designed to really tighten up our encoding practice.
Did a lot of manual work on unattested glosses in cits after the transform, because the variety of formations turned out to be too complex for a couple of regexes.
Unattested glosses, originally indicated with angle brackets, appear in two places: <def>s and <cit>s. Those in <def>s should be converted into something more suitable; once that's done, those in <cit>s can be deleted. This is the form (a complex instance):
<def>
<!--Generated from: [ *clabber|ed milk <*sour>; *cottage~*cheese <*sour> ]-->
<seg><gloss>clabbered</gloss> milk &lt;<gloss>sour&gt;</gloss>; <gloss>cottage~*cheese</gloss> &lt;<gloss>sour&gt;</gloss> </seg>
<bibl corresp="psn:JM psn:AM">Y14.182</bibl>
</def>
This needs to be converted such that the unattested gloss is lifted out of the context, and turned into a new <seg> with a <bibl> ascribing it to ECH:
<def>
<!--Generated from: [ *clabber|ed milk <*sour>; *cottage~*cheese <*sour> ]-->
<seg>
<gloss>clabbered</gloss> milk; <gloss>cottage~*cheese</gloss>
</seg>
<bibl corresp="psn:JM psn:AM">Y14.182</bibl>
<seg><gloss type="i">sour</gloss></seg><bibl corresp="psn:ECH">ECH</bibl>
</def>
Note that there are two instances of the same unattested gloss in the original, but we should have only one in the output, so I'm using distinct-values in the XSLT. Also note that the opening angle-bracket entity is outside the tag, but also needs to be removed. I've now written the XSLT for this, and I'll run it tomorrow morning.
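For illustration only, here is the same lifting logic as a Python sketch working on the serialized <seg>. It assumes the angle brackets are stored as &lt; and &gt; entities, with the opening one just outside the <gloss> tag and the closing one just inside it. The function name is hypothetical, and the production version is the XSLT mentioned above.

```python
import re

# Opening entity outside the tag, gloss text, closing entity inside the tag.
UNATTESTED = re.compile(r"\s*&lt;<gloss>([^<&]+)&gt;</gloss>")

def lift_unattested(seg_xml):
    """Return (cleaned seg, new seg/bibl pairs ascribing the glosses to ECH)."""
    found = UNATTESTED.findall(seg_xml)
    distinct = list(dict.fromkeys(found))  # distinct values, first-seen order
    cleaned = UNATTESTED.sub("", seg_xml)
    lifted = "".join(
        '<seg><gloss type="i">%s</gloss></seg>'
        '<bibl corresp="psn:ECH">ECH</bibl>' % gloss
        for gloss in distinct
    )
    return cleaned, lifted
```

The second return value is what gets appended after the original <bibl> inside the <def>.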
Once that job is done, the only remaining unattested glosses will be in cits, and they can be commented out. You can find them with this regex:
(<<gloss>[^<]+<</gloss>)
and replace them with:
<!-- $1 -->
There are also instances of these things without gloss tags:
<seg> <gloss>clabber|ed</gloss> milk &lt;*sour&gt;; *cottage~*cheese &lt;*sour&gt; </seg>
Those can be matched with:
(<[^<]+<)
Updated the XQuery and XSLT to account for changes spelled out in recent posts. The online db is now working again with the current XML files.
Accomplished all the planned changes to XML files through XSLT, and also tweaked the schema a bit. More work is to be done on the schema, probably using oddbyexample.
Arising out of today's meeting:
We have decided to always hyphenate compound words into ALL their components, as in the following examples.
√<m sameAs="ḥawˀy">ḥáwˀy</m>-<m sameAs="aɬ">a</m>-<m sameAs="s">s</m>-<m sameAs="n">n</m><m sameAs="DIM">C₁</m>√<m sameAs="cwˀaxaʔ">cwˀáxaʔ</m>
<m sameAs="kas">kas</m>-√<m sameAs="ḥawˀy">ḥáwˀiy</m>-<m sameAs="aɬ">ɬ</m>-√<m sameAs="təmnayˀ">təmnayˀ</m>-<m sameAs="mix">əxʷ</m>
That is, we will NOT just divide compounds into stem-connector-stem. If we keep the structure flat, as in the examples above, it reduces the number of inferred entries we have to create, and means we don't have to interpret potentially ambiguous morphological structures. (The first example above is clearly [√ḥáwˀy]-a-[s-n-c-√cwˀáxaʔ], but the second could be [kas-√ḥáwˀiy]-ɬ-[√təmnayˀ-əxʷ] or kas-[√ḥáwˀiy-ɬ-√təmnayˀ]-əxʷ.)
ALL compound entries will have this feature structure
<fs>
<f name="baseType">
<symbol value="compound"/>
</f>
</fs>
We will create an inferred root entry for the root of the second stem, if it does not already exist in the database, and add a <note type="referToElders"> to the compound entry, asking whether the second stem can stand on its own as a word. If the Elders say yes, we will create a new entry for the stem, and add <xr>s to and from the compound entry.
Password protection was removed when I uploaded my local copy of Moses to the server. I tried copying the SVN version of web.xml up there, but that killed the app and Tomcat couldn't restart it, so I restored the original web.xml and instead merged the changes from the SVN version into the server copy by hand. That seems to be working. I had to restart Tomcat a couple of times, but restarts have been smooth and problem-free since Tomcat-stable was moved to Grape.
I have entered all the lexical suffixes from MDK's cards, but have not entered their dictegs, on the assumption that these words exist elsewhere in the database, organized by their prefix or root.
As I was entering the lexical suffixes, I only checked for their dictegs elsewhere in the data if the dictegs were:
-personal names, or
-examples of lexical suffixes with Meaning Not Determined.
I found that the vast majority of these dictegs do exist elsewhere in the data, BUT:
-sometimes additional info is on the lexical suffix card (e.g., Sam Miller)
-sometimes the word only exists in another dicteg (e.g., shotgun)
-sometimes the morpheme breakdown is different (e.g., Nellie Leo, Canada goose)
-sometimes the entry is not all there due to a bad conversion from Lexware (e.g., Paul Timentwa)
Our approach to this issue will therefore be:
-wait until all the alphabetical files are edited
-check that all dictegs NOT yet entered from the lexical suffix cards exist as entries elsewhere in the data. Enter any missing information, and address differing morpheme breakdown.
-check lexical suffix dictegs that WERE entered at the Lexware stage:
--check cards against Lexware. Pencil any changes onto the Lexware printout.
--check whether the dictegs exist as entries in other files. Refer to phr_to_seg_matches_2.odt.
---if phr to seg is a perfect match in the list, search on the xml:id to view the entry. Check whether the translation is also a match. If the translation adds any new information that's not in the entry already, copy the new information from the lex-suf file into the alphabetical file, with a Comment about where it came from. Then Comment out the dicteg in the lex-suf file.
---if phr to seg is NOT a perfect match in the list, search more carefully on the phr and/or the translation to try to find the entry, and why it didn't match. Check Lexware printout for discrepancies. Inform Martin of any perfect matches NOT found by the search.
----if the entry can be found, copy any relevant information from the lex-suf file into the alphabetical file, with a Comment about where it came from. Then Comment out the dicteg in the lex-suf file.
----if the entry really cannot be found, copy the whole dicteg into the alphabetical file (near its root), with a Comment about where it came from. Build a well-formed entry, changing the <phr> to pron:seg and the <seg> to def:seg. Then Comment out the dicteg in the lex-suf file, noting that it could not be found elsewhere and has been copied into the appropriate alphabetical file.
Tomcat went into a tailspin yesterday while I was in the middle of uploading to the Moses db, and the db (presumably) got corrupted; in any case, Moses would not come back even after two restarts of Tomcat. This morning, I brought down Tomcat and replaced the Moses webapp (in webapps-dev) with a copy of my local version. All seems to be working now.
Here are some editing tasks that Ewa needs to come back to:
1) Review entries with [No meaning specified.] in the following files, and add inferred definitions or change to [Meaning unclear.]
h-phar-part1.xml (3 entries)
h.xml (4 entries)
phar-w.xml (1 entry)
Fruitful discussion on how we might auto-generate orthographic representations, and check these against community-generated versions, based on morphological information, phonemic context, and phonetic context/realization rules.
I have replaced
U207B superscript minus, and
U2010 hyphen
with
U002D hyphen-minus
The hyphen-minus is what occurred the most in the data, and is what you get if you type either hyphen or minus on the keyboard.
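For reference, the replacement amounts to a two-entry character map. A minimal Python sketch (the function name is hypothetical, and this is not necessarily how the replacement was actually carried out):

```python
# U+207B SUPERSCRIPT MINUS and U+2010 HYPHEN both become U+002D HYPHEN-MINUS
HYPHEN_MAP = {0x207B: "\u002D", 0x2010: "\u002D"}

def normalize_hyphens(text):
    """Replace the two variant characters with the plain keyboard hyphen-minus."""
    return text.translate(HYPHEN_MAP)
```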
I've implemented the schema constraints described in the previous post, making @type required on <note>, and providing the documentation to ease usage in Oxygen. This will render many files invalid, since there are a lot of note elements currently lacking @type.
This took longer than it should have because the TEI Debian Roma package appears to be broken at the moment. Reported my symptoms to SR.
We have decided on the following note type values. These will allow us to organize the different kinds of notes, and later to decide which ones to suppress from the print dictionary and/or the online interface.
<note resp="psn:MDK" type="cultural"> cultural notes, from Kinkade's cards
<note resp="psn:MDK" type="comparative"> comparisons to other Interior Salish languages (Coeur d'Alene, Okanagan-Colville), from Kinkade's cards
<note type="editorial"> - should be used:
1. When a root or stem is inferred from a complex word or dicteg. In other words, when we infer an <entry>.
<note type="editorial" resp="psn:ECH">[Root entry added based on attested complex forms.]</note>
<note type="editorial" resp="psn:ECH">[Root entry added based on attested examples.]</note>
2. When the definition of a word or phrase is inferred – e.g.,
<note type="editorial" resp="psn:ECH">[Definition inferred based on complex forms containing this root.]</note>
<note type="editorial" resp="psn:ECH">[Definition inferred based on phrase containing this stem.]</note>
3. Etymological notes – e.g., “from Chinook”, “onomatopoeic”
4. Genealogical notes – e.g., “uncle of Jerome Miller's father”, “Henry Miller's childhood name”
5. Details on morphology – e.g.,
<def>
<seg>diminutive</seg>
<note type="editorial">[The reduplication is usually accompanied by glottalization of resonants in the word.]</note>
</def>
<def>
<seg>relational</seg>
<note type="editorial">[Attaches to motion verbs, psychological event verbs, speech act verbs, and transfer source verbs; cooccurs with transitive, causative, applicative and external possession markers.]</note>
</def>
6. Contextual notes – e.g., “said if the rabbit belongs to the person addressed”
7. Elaborations on definitions – e.g., “mainly for dried salmon that has gotten old.”
<note type="internal"> - should be used for any kind of comments about the database structure, possible analyses, or for comments about notes that MDK made on his filecards, including forms MDK noted as ungrammatical. Some of MDK's notes have consequences for how we structure the entries; some of his notes are simply points that we want to keep track of, but that do not need to be shown to non-editors. So internal notes are essentially keeping track of editors' decisions and unresolved questions, and MDK's comments.
<note type="referToElders"> - questions for the Elders.
Editorial cases 1 and 2 above could also be referred to Elders for confirmation or additional discussion, with a separate <note type="referToElders">.
It turns out that matching against only <seg type="n"> misses most of the matches, because entries that haven't yet been processed don't have type attributes. Thus take 2, which finds 670 matches:
declare default element namespace "http://www.tei-c.org/ns/1.0";
import module namespace util="http://exist-db.org/xquery/util";
(: Take 2: for each attested phrase in a lex-suff dicteg, strip the morpheme
   delimiters and look for any pron/seg with identical text, typed or not. :)
for $p in collection('/db/moses/')//TEI[@xml:id='lex-suff']//phr[@type='n'][parent::quote]
let $target := normalize-space(translate($p/text()[1], '+-=√‐', '')),
    $matches := collection('/db/moses/')//pron/seg[text() = $target]
return
    if (exists($matches)) then
        (: Report the entry of the first match only; reusing $matches avoids a second
           scan of the collection and guarantees concat() gets a single id. :)
        concat('*** ', $p/ancestor::entry/@xml:id, ' (', $target, ') [', $p/text(), '] matches ', $matches[1]/ancestor::entry/@xml:id)
    else
        concat(' ', $p/ancestor::entry/@xml:id, ' (', $target, ') [', $p/text(), '] has no matches. ')
This was the code for detecting dupes where dictegs exist in lex-suff which match full entries elsewhere:
declare default element namespace "http://www.tei-c.org/ns/1.0";
import module namespace util="http://exist-db.org/xquery/util";
(: Take 1: compare each attested lex-suff phrase, minus its morpheme
   delimiters, against <seg type="n"> elements only. :)
for $p in collection('/db/moses/')//TEI[@xml:id='lex-suff']//phr[@type='n'][parent::quote]
let $target := translate($p, '+-=√‐', ''),
    $matches := collection('/db/moses/')//seg[@type='n'][text() = $target]
return
    if (exists($matches)) then
        (: Report the entry of the first match only; reusing $matches avoids a second
           scan of the collection and guarantees concat() gets a single id. :)
        concat('*** ', $p/ancestor::entry/@xml:id, ' (', $target, ') [', $p/text(), '] matches ', $matches[1]/ancestor::entry/@xml:id)
    else
        concat(' ', $p/ancestor::entry/@xml:id, ' (', $target, ') [', $p/text(), '] has no matches. ')
From today's meeting:
Some dictegs inside the lex-suff.xml file are basically duplicates of entries in the main entries files, and should therefore be commented out; and conversely, some are not reproduced in the entries files, and should be copied over to full entry status before being commented out. In order to detect this situation, I need to:
This should help us keep track of what's happened if we get an SVN conflict.
I've begun the process of mapping our feature structures onto the GOLD ontology, using the @dcr:datcat and @dcr:valueDatcat attributes in TEI, based on unique identifiers from the ISOCat.org version of GOLD 2010. Here I'll gather together a list of concepts which are difficult to map or for which no obvious terms appear in GOLD; based on this, we'll contact the GOLD maintainers for advice.
Two new informant values added to the @corresp attribute of bibl based on content.
SK has now updated the project documentation (DictionaryEditingManual.odt, NotesOnDefinitionsAndGloss-Tagging.odt, and EditorialDecisions.odt) to include the corresp="psn: " attribute, as well as our decisions from last December around <persName>, <placeName>, and <orgName> tags. (The latter are also laid out in DecisionsOnMarkupOfNames.odt in the docs directory.)
Read these over and update the XML Markup Documentation.
The last transform threw up a bunch of default attribute values that we didn't want, so I'm going to have to get rid of them. This is the list:
I've done various transforms to get rid of these, and removed them from the schema.
Wrote an identity transform to add psn:SM to the @corresp in <bibl>s which mention SM in their textual content. We will do the same for two other previously-untagged informants in the same way.
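The per-element logic of that transform can be sketched in Python as follows. This is an illustration, not the XSLT itself: the function name is hypothetical, and the test for "mentions SM" (the code not adjacent to another letter, so that SMK is not matched but SM1.53 is) is an assumption about how such mentions appear in the data.

```python
import re
import xml.etree.ElementTree as ET

def add_informant(bibl, code="SM"):
    """Add psn:<code> to @corresp if the <bibl> text mentions the code."""
    text = "".join(bibl.itertext())
    # Assumed test: the code must not be directly adjacent to another letter.
    if not re.search(r"(?<![A-Za-z])" + re.escape(code) + r"(?![A-Za-z])", text):
        return bibl
    tokens = bibl.get("corresp", "").split()
    pointer = "psn:" + code
    if pointer not in tokens:
        tokens.append(pointer)
    bibl.set("corresp", " ".join(tokens))
    return bibl
```

Calling it with a different code value would cover the other two previously untagged informants in the same way.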
Updated the personography and some XSLT based on previous post of plans for the next phase. Then got ECH and SK set up and working on the current repository.
Met to plan the next couple of months of work and preparation. Takeaways:
Met with ECH to plan and draft an abstract on the use of TEI in our project for ICLDC. Did a draft, which is too long, of course, and sent it to ECH.
Went through the responses from the SSHRC last fall, and planned the next application; I have a task to do for this on Wednesday.
Met with ECH to plan work for the Fall, and write application for the HCMC committee. The application is done, workstation time is booked for SK, and we have plans under way for grant applications etc.
This is an XML dictionary project based primarily on the materials compiled by the late M. Dale Kinkade during fifteen years of work in the 1960s and 1970s with more than a dozen native speakers of the language, but it also includes materials compiled by Ewa Czaykowska-Higgins in the early 1990s.