I've written some simple XSLT to compile a file called serOgraphies.xml from the three input files KT says are basically ready. The entries look like this:
<item xml:id="Charles">
  <rs n="1">Charles</rs>
  <rs n="3">Charles Gould</rs>
  <rs n="6">Don Carlos</rs>
  <rs n="2">Don Carlos Gould</rs>
  <rs n="1">Gould</rs>
  <rs n="4">Señor Administrador</rs>
  <rs n="2">Señor Administrador of the San Tomé Mine</rs>
  <rs n="1">their Señor Administrador</rs>
</item>
The @n values are the counts of instances of that particular epithet, so "Charles" occurs once, "Charles Gould" occurs three times, and so on.
I found and fixed a few encoding errors and oddities in the transcription files at the same time.
The list is currently generated from the <persName> tags in the transcriptions, but it would be simple to switch the source to <rs> tags, add <event>s, and so on. Tagging in the text is likely to shift from <persName> to <rs>, so that non-human characters such as animals can be accommodated.
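For anyone who wants to see the counting logic outside XSLT, here is a rough Python sketch of the same idea. It is not the stylesheet itself. It assumes each name tag points at its character through a ref attribute such as ref="#Charles", which is a guess about the transcription markup, and the function name build_items and the wrapping <list> element are made up for the illustration. The tag to harvest is a parameter, so moving from <persName> to <rs> would be a one-word change.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_items(sources, tag="persName", key_attr="ref"):
    """Count each distinct name string per character.

    sources:  iterable of XML documents as strings.
    tag:      local name of the element to harvest ("persName" or "rs").
    key_attr: attribute assumed to identify the character (hypothetical).
    Returns a <list> element holding one <item> per character.
    """
    counts = defaultdict(Counter)
    for xml_text in sources:
        root = ET.fromstring(xml_text)
        for el in root.iter():
            # Compare local names so namespaced (e.g. TEI) files also match.
            if not isinstance(el.tag, str) or el.tag.rsplit("}", 1)[-1] != tag:
                continue
            key = el.get(key_attr)
            if key is None:
                continue  # no pointer to a character, so nothing to group on
            # Collapse line breaks and runs of spaces inside the name.
            text = " ".join("".join(el.itertext()).split())
            counts[key.lstrip("#")][text] += 1

    out = ET.Element("list")
    for char_id in sorted(counts):
        item = ET.SubElement(out, "item", {f"{{{XML_NS}}}id": char_id})
        for text in sorted(counts[char_id]):
            rs = ET.SubElement(item, "rs", {"n": str(counts[char_id][text])})
            rs.text = text
    return out


# Tiny made-up sample, not taken from the real transcription files.
sample = (
    '<text><p><persName ref="#Charles">Don Carlos</persName> spoke; '
    '<persName ref="#Charles">Gould</persName> listened to '
    '<persName ref="#Charles">Don Carlos</persName>.</p></text>'
)
print(ET.tostring(build_items([sample]), encoding="unicode"))
```

Calling build_items(files, tag="rs") would harvest <rs> elements instead, on the same assumption about the ref attribute.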