I've written some simple XSLT to compile a file called serOgraphies.xml from the three input files KT says are basically ready. The entries look like this:
<item xml:id="Charles">
  <rs n="1">Charles</rs>
  <rs n="3">Charles Gould</rs>
  <rs n="6">Don Carlos</rs>
  <rs n="2">Don Carlos Gould</rs>
  <rs n="1">Gould</rs>
  <rs n="4">Señor Administrador</rs>
  <rs n="2">Señor Administrador of the San Tomé Mine</rs>
  <rs n="1">their Señor Administrador</rs>
</item>
Each @n value is the number of times that particular epithet occurs, so "Charles" occurs once, "Charles Gould" occurs three times, and so on.
I found and fixed a few encoding errors and oddities in the transcription files at the same time.
This is generated from <persName> tags, but it's simple to change it to read <rs> tags instead, add <event>s, etc. It's likely that tagging in the text will shift to <rs>, so that e.g. non-human characters such as animals can be accommodated.
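For anyone who wants to check the counts independently, the same compilation can be roughed out in a few lines of Python. This is only a sketch, not the XSLT itself, and it rests on assumptions the post does not confirm: that each name tag carries a ref attribute pointing at the character's id (e.g. ref="#Charles"), that the output is wrapped in a <list> element, and that compile_list is just an illustrative name. The tag parameter shows how little changes when the source tagging moves from one element to another.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Attribute name for xml:id as ElementTree spells it.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(tag):
    """Strip any '{namespace}' prefix so the match works with or without TEI namespaces."""
    return tag.rsplit("}", 1)[-1]


def compile_list(sources, tag="persName"):
    """Count each distinct epithet per character across several XML documents.

    sources: iterable of XML strings (hypothetical stand-ins for the input files).
    tag:     local name of the element to harvest, e.g. "persName" or "rs".
    ASSUMPTION: the character id is held in a ref attribute such as ref="#Charles".
    """
    counts = defaultdict(Counter)
    for xml_text in sources:
        root = ET.fromstring(xml_text)
        for el in root.iter():
            if local_name(el.tag) != tag:
                continue
            ref = el.get("ref")
            if not ref:
                continue  # untargeted names cannot be assigned to a character
            # Collapse line breaks and runs of spaces inside the epithet.
            epithet = " ".join("".join(el.itertext()).split())
            counts[ref.lstrip("#")][epithet] += 1

    out = ET.Element("list")  # ASSUMPTION: wrapper element name
    for char_id in sorted(counts):
        item = ET.SubElement(out, "item", {XML_ID: char_id})
        for epithet in sorted(counts[char_id]):
            rs = ET.SubElement(item, "rs", n=str(counts[char_id][epithet]))
            rs.text = epithet
    return out
```

Calling compile_list(texts) on the file contents and passing the result to ET.tostring(..., encoding="unicode") gives <item> entries of the shape shown above; compile_list(texts, tag="rs") harvests <rs> tags instead with no other change.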