We have identified the following repetitive steps that the editors are doing, and we hope they can be done globally and programmatically. Please make the following global changes in all files with status="unedited", and in Sarah and Caitlin's "editing" files, c.xml and x-uvul.xml.
1. DONE, 9Aug13: Comment out hyphs in monomorphemic entries.
2. DONE, 9Aug13. REVISED with bug fixes, 9Sep13:
For inferred root entries, change structures like this:
<entry xml:id="qəm">
<form>
<pron>
<seg type="p">qə́m</seg>
<bibl><!--[No source]--></bibl>
</pron>
<!--Original hyph element before phonemicization:-->
<!--<hyph xmlns="http://www.tei-c.org/ns/1.0">
√<m corresp="m:UNASSIGNED">qəm</m>-</hyph>-->
<!--<hyph>√<m corresp="m:UNASSIGNED">qəm</m>-</hyph>-->
</form>
<sense> </sense>
<!--[Not yet edited.]-->
</entry>
to this:
<entry xml:id="qəm">
<form>
<pron>
<seg type="p" subtype="i">qəm</seg>
<bibl corresp="psn:ECH">ECH</bibl>
</pron>
</form>
<sense>
<def>
<seg><note type="editorial" resp="psn:ECH">Meaning unclear.</note></seg>
<note type="referToElders" resp="psn:SMK">Is qəm a word by itself? If so, what
does it mean? Click 'other entries containing this morpheme' for examples of
words with the root qəm.</note>
</def>
</sense>
<fs>
<f name="baseType">
<symbol value="root"/>
</f>
</fs>
<note type="editorial" resp="psn:ECH">Root entry added based on attested complex forms.</note>
<!-- [Editor: please double check the following and update this entry accordingly:
Is this a root or a stem? Update <symbol value=>
Was it inferred by ECH or MDK? Update <bibl>
Has it been added based on several attested complex forms, or just one? Update <note> above.]-->
<!--[Not yet edited.]-->
</entry>
This involves the following steps:
-remove stress mark from contents of pron:seg
-add subtype="i" within the pron:seg tag
-replace <bibl><!--[No source]--></bibl> with <bibl corresp="psn:ECH">ECH</bibl>
-replace the empty <sense> </sense> with the following. If possible, embed the contents of the pron:seg (with the stress mark removed) in the referToElders <note>, wherever "contents of pron:seg" appears:
<sense>
<def>
<seg><note type="editorial" resp="psn:ECH">Meaning unclear.</note></seg>
<note type="referToElders" resp="psn:SMK">Is [contents of pron:seg] a word by itself? If so, what does it mean? Click 'other entries containing this morpheme' for examples of words with the root [contents of pron:seg].</note>
</def>
</sense>
<fs>
<f name="baseType">
<symbol value="root"/>
</f>
</fs>
<note type="editorial" resp="psn:ECH">Root entry added based on attested complex forms.</note>
<!-- [Editor: please double check the following and update this entry accordingly:
Is this a root or a stem? Update <symbol value=>
Was it inferred by ECH or MDK? Update <bibl>
Has it been added based on several attested complex forms, or just one? Update <note> above.]-->
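The steps listed above can be sketched in Python roughly as follows. This is only an illustration, not the script that was actually run. It assumes that the stress mark is an acute accent, that the elements carry no namespace prefix (as in the samples here), and that an inferred root entry can be recognised by its "[No source]" bibl plus an empty sense. The function names and the sub() helper are hypothetical, and the sketch covers only the four listed steps.

```python
import unicodedata
import xml.etree.ElementTree as ET

ACUTE = "\u0301"  # assumption: stress is written as an acute accent
REMINDER = (" [Editor: please double check the following and update this "
            "entry accordingly:\nIs this a root or a stem? Update <symbol value=>\n"
            "Was it inferred by ECH or MDK? Update <bibl>\n"
            "Has it been added based on several attested complex forms, "
            "or just one? Update <note> above.]")


def strip_stress(text):
    """Remove acute accents, leaving other diacritics (e.g. dot below) intact."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.replace(ACUTE, ""))


def parse_keeping_comments(xml_text):
    """Parse XML so that comments survive as nodes (needs Python 3.8+)."""
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    return ET.fromstring(xml_text, parser=parser)


def is_unsourced(bibl):
    """True for <bibl><!--[No source]--></bibl>."""
    children = list(bibl)
    return (len(children) == 1 and children[0].tag is ET.Comment
            and (children[0].text or "").strip() == "[No source]")


def sub(parent, tag, text=None, **attrs):
    """Hypothetical helper: append a child element with optional text."""
    child = ET.SubElement(parent, tag, attrs)
    child.text = text
    return child


def rewrite_inferred_root(entry):
    """Apply the listed changes to one <entry>; return True if it matched."""
    seg = entry.find("form/pron/seg")
    bibl = entry.find("form/pron/bibl")
    sense = entry.find("sense")
    if seg is None or bibl is None or sense is None:
        return False
    if not is_unsourced(bibl) or len(sense) or (sense.text or "").strip():
        return False

    # Remove the stress mark and add subtype="i".
    root = strip_stress(seg.text or "")
    seg.text = root
    seg.set("subtype", "i")

    # Replace the "[No source]" placeholder with ECH.
    bibl.remove(bibl[0])
    bibl.set("corresp", "psn:ECH")
    bibl.text = "ECH"

    # Fill the empty <sense>, embedding the root in the referToElders note.
    sense.text = None
    definition = sub(sense, "def")
    sub(sub(definition, "seg"), "note", "Meaning unclear.",
        type="editorial", resp="psn:ECH")
    sub(definition, "note",
        f"Is {root} a word by itself? If so, what does it mean? Click "
        f"'other entries containing this morpheme' for examples of words "
        f"with the root {root}.",
        type="referToElders", resp="psn:SMK")

    # Add <fs>, the editorial <note> and the reminder comment after <sense>.
    fs = ET.Element("fs")
    sub(sub(fs, "f", name="baseType"), "symbol", value="root")
    note = ET.Element("note", {"type": "editorial", "resp": "psn:ECH"})
    note.text = "Root entry added based on attested complex forms."
    position = list(entry).index(sense) + 1
    for offset, node in enumerate((fs, note, ET.Comment(REMINDER))):
        entry.insert(position + offset, node)
    return True
```

An entry parsed with parse_keeping_comments() can be passed to rewrite_inferred_root() and then written back out with ET.tostring(entry, encoding="unicode"). Entries that do not match the pattern are left untouched.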
3. MDH attempted, 13Aug13, but no matching contexts can be found in the data:
Within each <sense>, remove defs which duplicate the <seg> and following <bibl> in a <quote>. (These were mistakenly copied twice in the Lexware-to-XML conversion.)
For example, change this:
<sense>
<def>
<seg>I <gloss>hear</gloss>d you</seg>
<bibl corresp="psn:W">W8.138</bibl>
</def>
<def>
<seg>I did <gloss>not</gloss> <gloss>hear</gloss> you very well</seg>
<bibl corresp="psn:W">W8.139</bibl>
</def>
<cit>
<quote>
<phr type="p" subtype="i">lút x̣ə́st √cq=ánaʔ-mn-c-n</phr>
<bibl corresp="psn:ECH">ECH</bibl>
<phr type="n">lút x̣ə́st √cq=ánaʔ-mən-č-ən</phr>
<bibl corresp="psn:W">W8.139</bibl>
<seg>I did <gloss>not</gloss> <gloss>hear</gloss> you very well</seg>
<bibl corresp="psn:W">W8.139</bibl>
</quote>
</cit>
</sense>
to this:
<sense>
<def>
<seg>I <gloss>hear</gloss>d you</seg>
<bibl corresp="psn:W">W8.138</bibl>
</def>
<cit>
<quote>
<phr type="p" subtype="i">lút x̣ə́st √cq=ánaʔ-mn-c-n</phr>
<bibl corresp="psn:ECH">ECH</bibl>
<phr type="n">lút x̣ə́st √cq=ánaʔ-mən-č-ən</phr>
<bibl corresp="psn:W">W8.139</bibl>
<seg>I did <gloss>not</gloss> <gloss>hear</gloss> you very well</seg>
<bibl corresp="psn:W">W8.139</bibl>
</quote>
</cit>
</sense>
Searching for duplicates such as this one could be more difficult in cases where programmatic gloss tagging has been done differently in the defs than in the cits. Duplicate defs to be deleted should contain:
-a seg identical to a quote:seg in a following cit, ignoring <gloss> tags and the * character, and
-a sister bibl completely identical to a quote:bibl in a following cit.
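The two criteria above could be checked along these lines. This is a minimal Python sketch, not a tested solution for the real data. It assumes un-namespaced elements as in the samples, and it additionally collapses runs of whitespace when comparing segs, which is an assumption beyond what is stated above. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def seg_key(seg):
    """Seg text with <gloss> tags flattened and '*' dropped.

    Whitespace is also collapsed (an assumption, to tolerate reflowed lines).
    """
    text = "".join(seg.itertext()).replace("*", "")
    return " ".join(text.split())


def bibl_key(bibl):
    """Attributes and text of a <bibl>, compared exactly."""
    return (tuple(sorted(bibl.attrib.items())), bibl.text)


def quote_pairs(cit):
    """All (seg, following bibl) pairs found inside the quotes of one <cit>."""
    pairs = set()
    for quote in cit.iter("quote"):
        children = list(quote)
        for first, second in zip(children, children[1:]):
            if first.tag == "seg" and second.tag == "bibl":
                pairs.add((seg_key(first), bibl_key(second)))
    return pairs


def remove_duplicate_defs(sense):
    """Delete defs repeated in a following cit; return how many were removed."""
    children = list(sense)
    removed = 0
    for index, child in enumerate(children):
        if child.tag != "def":
            continue
        seg, bibl = child.find("seg"), child.find("bibl")
        if seg is None or bibl is None:
            continue
        key = (seg_key(seg), bibl_key(bibl))
        following_cits = (c for c in children[index + 1:] if c.tag == "cit")
        if any(key in quote_pairs(cit) for cit in following_cits):
            sense.remove(child)
            removed += 1
    return removed
```

Because seg_key() works on the flattened text, a def whose glosses were tagged differently from the matching quote:seg is still treated as a duplicate, provided the bibl matches exactly.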
4. DONE, 13Aug13: When two or more defs in the same <sense> are identical, collapse them into a single def and concatenate their <bibl>s. For example, change this:
<sense>
<def>
<seg>Nez Perce</seg>
<bibl corresp="psn:EP">EP2.84.8</bibl>
</def>
<def>
<seg>Nez Perce</seg>
<bibl corresp="psn:JM">JM2.141.2</bibl>
</def>
</sense>
to this:
<sense>
<def>
<seg>Nez Perce</seg>
<bibl corresp="psn:EP psn:JM">EP2.84.8; JM2.141.2</bibl>
</def>
</sense>
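For reference, the collapsing shown above might look something like this in Python. It is a sketch only, not the script that was run on 13Aug13. It assumes un-namespaced elements, treats two defs as identical when their segs serialise to the same string, and lists each psn: value only once in the merged corresp (an assumption for the case where both bibls point to the same person). The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def collapse_identical_defs(sense):
    """Merge defs with identical segs, concatenating their bibls in place."""
    first_bibl_by_seg = {}
    for definition in sense.findall("def"):
        seg, bibl = definition.find("seg"), definition.find("bibl")
        if seg is None or bibl is None:
            continue
        # Serialise the seg (markup included); strip() drops trailing whitespace.
        key = ET.tostring(seg, encoding="unicode").strip()
        if key not in first_bibl_by_seg:
            first_bibl_by_seg[key] = bibl
            continue

        target = first_bibl_by_seg[key]
        # Join the corresp values with spaces, skipping repeats (assumption).
        persons = target.get("corresp", "").split()
        for person in bibl.get("corresp", "").split():
            if person not in persons:
                persons.append(person)
        target.set("corresp", " ".join(persons))
        # Join the source references with "; " as in the example above.
        target.text = f"{target.text or ''}; {bibl.text or ''}"
        sense.remove(definition)
```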
5. DONE, 7Aug13. Please add <name type="flora"> and <name type="fauna"> to the schema, so they will appear in the dropdown list of values.