Modified the proofing and for-mysql rng files, and the 1790s data files, by replacing the names of these two elements:
<trial_publication>some book</trial_publication>
<trial_correspondence>some other thing</trial_correspondence>
with these:
<trial_printed_sources>some book</trial_printed_sources>
<trial_other_documents>some other thing</trial_other_documents>
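For reference, a rename like the one above could be scripted for the data files roughly as follows. This is only a minimal sketch, not the procedure actually used: the wrapping <trial> element and the sample record are assumptions made up for illustration, and it covers data files only (in the rng files the names sit inside name attributes and would need separate handling).

```python
import xml.etree.ElementTree as ET

# Old element name -> new element name, as listed in the log entry.
RENAMES = {
    "trial_publication": "trial_printed_sources",
    "trial_correspondence": "trial_other_documents",
}


def rename_elements(root):
    """Rename matching elements in place and return how many were changed."""
    changed = 0
    for elem in root.iter():
        new_tag = RENAMES.get(elem.tag)
        if new_tag is not None:
            elem.tag = new_tag
            changed += 1
    return changed


# Hypothetical sample record; the real data files have more structure.
sample = (
    "<trial>"
    "<trial_publication>some book</trial_publication>"
    "<trial_correspondence>some other thing</trial_correspondence>"
    "</trial>"
)

root = ET.fromstring(sample)
count = rename_elements(root)
result = ET.tostring(root, encoding="unicode")
print(count, result)
```

Only the tag names change; element text and attributes are left untouched.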
Also created a 1790s folder and put all the dev files into it. Created an 1800s folder and put copies of all the rng files into it. Will update those copies and the instructions for generating XML from the Word file.
Set Marri up with Oxygen on Mac. Gave her the most recent data file (it lacks judge names).
Based on her first couple of hours, I edited the schema file to allow multiple (rather than one) outcomes and made a couple of other minor modifications.
Once the data set is cleaned up, I'll need to revisit whether the db engine should be relational or XML-based.
Confirmed with Simon that a fixed list of judge names has been included in the rng file for the proofer (checked the rng files, added a couple of new values and tested them). Still need to find out from him whether an empty string is an acceptable value.
The existing 1790s data has no judge names in it. Asked SD whether he wants them added by the proofer or, if not, how he wants those names included.
As far as I'm concerned, we're ready to have the proofer go through the records. It looks like about 250 of the 750 records in the 1790s set will need some attention. They have snippets of text that my battery of regexps could not handle - typically extraneous bits of names, second charges or subsequent findings by the court. The XML structure accommodates those features, but the regexps provide for only the simplest cases.
Added a choice element allowing either an empty value or an item from the list of normalized judge names. The list items can't include a space character (I don't know why - I researched for an hour and all the documentation indicates the space character should present no problem). I've put those elements into the file "bailey_proofing.rng", which is the file that the proofer will use.
I still have to modify the regular expressions in the NameJudgeJuryAge4 section of howto_regular081017.txt to look for the text values and replace them with the normalized ones.
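A rough sketch of what that replacement step might look like, written in Python rather than as the search-and-replace expressions in the howto file. Everything specific here is an assumption for illustration: the judge names are placeholders, the patterns are invented, and the underscore-joined normalized values are just one way to avoid the space problem noted above.

```python
import re

# Hypothetical text variants -> normalized values (placeholders, not real data).
# Normalized values contain no spaces, matching the constraint on the rng list.
JUDGE_PATTERNS = {
    r"Mr\.?\s+Justice\s+Example": "justice_example",
    r"Mr\.?\s+Baron\s+Sample": "baron_sample",
}

# Compile once, ignoring case, so each lookup is a simple loop.
COMPILED = [
    (re.compile(pattern, re.IGNORECASE), normalized)
    for pattern, normalized in JUDGE_PATTERNS.items()
]


def normalize_judge(text):
    """Return the normalized name, or None if no pattern matches.

    Unmatched values are left for the proofer rather than guessed at.
    """
    candidate = text.strip()
    for pattern, normalized in COMPILED:
        if pattern.fullmatch(candidate):
            return normalized
    return None


print(normalize_judge("Mr. Justice  Example"))
print(normalize_judge("Mr Baron Sample"))
print(normalize_judge("someone not on the list"))
```

Returning None for anything unmatched keeps the hard cases visible, which fits the plan of having the proofer deal with whatever the regexps cannot.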