MVP: Generating -ographies for Nostromo
Posted by mholmes on 03 Apr 2012 in Activity log
I've written some simple XSLT to compile a file called serOgraphies.xml
from the three input files KT says are basically ready. The entries look like this:
<item xml:id="Charles">
  <rs n="1">Charles</rs>
  <rs n="3">Charles Gould</rs>
  <rs n="6">Don Carlos</rs>
  <rs n="2">Don Carlos Gould</rs>
  <rs n="1">Gould</rs>
  <rs n="4">Señor Administrador</rs>
  <rs n="2">Señor Administrador of the San Tomé Mine</rs>
  <rs n="1">their Señor Administrador</rs>
</item>
The @n values are the counts of instances of each epithet, so "Charles" occurs once, "Charles Gould" occurs three times, and so on.
I found and fixed a few encoding errors and oddities in the transcription files at the same time.
This is generated from <persName> tags, but it's simple to change to <rs> tags, add <event>s, etc. It's likely that tagging in the text will shift from <persName> to <rs>, so that e.g. non-human characters such as animals can be accommodated.
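
For anyone who wants to see the counting logic spelled out, here is a minimal Python sketch of the same idea. It is not the XSLT described above, just an illustration of grouping tagged names by entity and counting each distinct epithet. It assumes that each name element points at its entity with a ref="#id" attribute (the post does not say how the link is made), and the function name build_ographies and the sample text are hypothetical. The tag to harvest is a parameter, so moving from <persName> to <rs> is a one-word change.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

XML_NS = "http://www.w3.org/XML/1998/namespace"


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def build_ographies(sources, tag="persName"):
    """Count distinct epithets per entity across several XML documents.

    ASSUMPTION: each name element links to its entity via ref="#id".
    Elements without such a ref are skipped.
    """
    counts = defaultdict(Counter)
    for source in sources:
        root = ET.fromstring(source)
        for el in root.iter():
            if local_name(el.tag) != tag:
                continue
            ref = el.get("ref", "")
            if not ref.startswith("#"):
                continue
            # Collapse line breaks and runs of spaces inside the name.
            epithet = " ".join("".join(el.itertext()).split())
            counts[ref[1:]][epithet] += 1

    # Build <list><item xml:id="..."><rs n="count">epithet</rs>...</item></list>
    out = ET.Element("list")
    for entity_id in sorted(counts):
        item = ET.SubElement(out, "item", {"{%s}id" % XML_NS: entity_id})
        for epithet in sorted(counts[entity_id]):
            rs = ET.SubElement(item, "rs", {"n": str(counts[entity_id][epithet])})
            rs.text = epithet
    return out


# Hypothetical sample input, not taken from the transcription files.
sample = """<text>
  <p><persName ref="#Charles">Charles Gould</persName> met
     <persName ref="#Charles">Don Carlos</persName>; later
     <persName ref="#Charles">Don
     Carlos</persName> returned.</p>
</text>"""

result = build_ographies([sample])
print(ET.tostring(result, encoding="unicode"))
```

Passing tag="rs" harvests <rs> elements instead, which is the shift anticipated above for non-human characters.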