Further to commas and glottals
Response to Martin and Greg regarding glottalization:
Well, I am happy with everything here: in the data, ejectives will be transcribed with a raised comma and glottalized resonants with a superscript glottal. In the output, all of these segments will be transcribed with a raised comma.
The ejective and sonorant/resonant categories are not quite right, however. They should be:
Ejectives: p’, t’, c’, ƛ’, k’, kʷ’, q’, qʷ’
Sonorants: mˀ, nˀ, lˀ, ḷˀ, rˀ, wˀ, yˀ and ʕˀ, ʕʷˀ (the voiced pharyngeal fricative, what you are calling epiglottal, which appears both plain glottalized and rounded glottalized).
Belted l is never glottalized.
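For reference, the two inventories and the data-to-output convention can be written down as a small Python sketch. This is only an illustration, not the project's actual conversion code. It assumes the raised comma is encoded as MODIFIER LETTER APOSTROPHE (U+02BC) and the superscript glottal as MODIFIER LETTER GLOTTAL STOP (U+02C0), and the names EJECTIVES, GLOTTALIZED_RESONANTS and to_output are made up for the example.

```python
# Assumed code points (not confirmed by the project files):
RAISED_COMMA = "\u02bc"         # MODIFIER LETTER APOSTROPHE
SUPERSCRIPT_GLOTTAL = "\u02c0"  # MODIFIER LETTER GLOTTAL STOP
ROUNDING = "\u02b7"             # MODIFIER LETTER SMALL W

# Base letters taken from the two lists above.
# \u019b is barred lambda, \u1e37 is l with dot below, \u0295 is the pharyngeal.
EJECTIVE_BASES = ["p", "t", "c", "\u019b", "k", "k" + ROUNDING, "q", "q" + ROUNDING]
RESONANT_BASES = ["m", "n", "l", "\u1e37", "r", "w", "y", "\u0295", "\u0295" + ROUNDING]

# Data convention: ejectives take the raised comma, resonants the superscript glottal.
EJECTIVES = [base + RAISED_COMMA for base in EJECTIVE_BASES]
GLOTTALIZED_RESONANTS = [base + SUPERSCRIPT_GLOTTAL for base in RESONANT_BASES]


def to_output(form):
    """Output convention: every glottalized segment is shown with the raised comma."""
    return form.replace(SUPERSCRIPT_GLOTTAL, RAISED_COMMA)
```

Belted l is deliberately absent from RESONANT_BASES, since it is never glottalized.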
The question of which raised comma character to use: I agree that modifier letter apostrophe is the best option. (It’s too bad about the handwritten alphabet having the w and y with raised commas above. When we transcribe by hand we are not always as precise as we should be: a raised comma above is often not distinguished from a raised comma just to the right, but this is clearly an important difference when using computer fonts.)
Normalizing the data: I definitely think we should do this with simple search and replace operations. These are easy to do and can be done as I go through each file, although clearly I will need to watch out for apostrophes in the English text, as these do appear.
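The same search and replace could also be scripted. The Python sketch below is an assumption-laden illustration rather than the procedure actually used. It assumes the existing files mark ejectives with a typed apostrophe (either ' or ’) directly after the base consonant, and that the target character is MODIFIER LETTER APOSTROPHE (U+02BC). The function name normalize_ejectives is hypothetical.

```python
import re

RAISED_COMMA = "\u02bc"  # MODIFIER LETTER APOSTROPHE (assumed target character)

# An ejective base (p, t, c, barred lambda, or k/q with optional rounding)
# followed by a typed apostrophe: ASCII ' or right single quotation mark.
_EJECTIVE = re.compile("((?:[ptc\u019b]|[kq]\u02b7?))['\u2019]")


def normalize_ejectives(form):
    """Replace a typed apostrophe after an ejective base with the raised comma."""
    return _EJECTIVE.sub(lambda m: m.group(1) + RAISED_COMMA, form)
```

The pattern cannot tell a language form from English: applied to "it's", it would turn the apostrophe into a raised comma as well. So a replacement like this is only safe when run on the fields that hold language data, which is exactly the caution about English apostrophes noted above.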
I’m glad we are in agreement about all this. I look forward to hearing about the techniques for searching with XPath in oXygen. So far I have been using the Find and Replace function when I have needed to change things (for instance, when making the cross-reference changes), and it has been working well.