SD sent me a spreadsheet with the 1820 to 1824 data in it. Found a little "how-to" file which explained the steps to turn that into an XML data file with a schema for validation. Ran the process and noticed that two of the fields had their values swapped. Checked the XSLT and, sure enough, found this:
<crime_normalized><xsl:value-of select="Crime_Group"/></crime_normalized>
<crime_group><xsl:value-of select="Crime_Normalized"/></crime_group>
which I corrected to this:
<crime_normalized><xsl:value-of select="Crime_Normalized"/></crime_normalized>
<crime_group><xsl:value-of select="Crime_Group"/></crime_group>
Also noticed that the import changed all integer values to floating point (e.g. 16 became 16.0), and only integers are valid in the various fields (age, weeks, months, years etc.). Just did a grep search and replace to fix those.
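For the record, the float-to-integer cleanup can be sketched in Python roughly as follows. This is only an illustration, not the search and replace I actually ran: the tag names in INTEGER_FIELDS are taken from the examples above and may not match the schema exactly, and the function name is made up.

```python
import re

# Assumed list of integer-only fields; the real schema may use other names.
INTEGER_FIELDS = ("age", "weeks", "months", "years")

# Match <tag>16.0</tag> for the listed tags only, so genuine decimals
# in other fields (or values like 16.5) are left untouched.
_PATTERN = re.compile(
    r"<(?P<tag>" + "|".join(INTEGER_FIELDS) + r")>(?P<int>\d+)\.0+</(?P=tag)>"
)

def strip_float_suffix(xml_text: str) -> str:
    """Rewrite <age>16.0</age> as <age>16</age> for the listed fields."""
    return _PATTERN.sub(r"<\g<tag>>\g<int></\g<tag>>", xml_text)
```

Restricting the pattern to named fields is a deliberate choice: a blanket replace of ".0" would also hit any field where a decimal is legitimate.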
The vast majority of the 100+ remaining invalid instances are mercy appeals where Simon has entered something like jury/prosecutor, whereas the XML requires a separate mercy appeal for each proponent.
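If these ever need fixing in bulk rather than by hand, the split could look something like the sketch below. The element names mercy_appeal and proponent are placeholders, since I have not checked them against the schema, and the sketch assumes that every other child of the appeal should simply be copied into each resulting appeal, which may not hold for all records.

```python
import copy
import xml.etree.ElementTree as ET

def split_mercy_appeals(parent, appeal_tag="mercy_appeal", proponent_tag="proponent"):
    """Replace each appeal whose proponent reads 'a/b' with one appeal per name.

    Tag names are hypothetical defaults; pass the real ones from the schema.
    """
    for appeal in list(parent.findall(appeal_tag)):
        proponent = appeal.find(proponent_tag)
        if proponent is None or not proponent.text or "/" not in proponent.text:
            continue  # nothing to split
        names = [n.strip() for n in proponent.text.split("/") if n.strip()]
        index = list(parent).index(appeal)  # keep the original position
        parent.remove(appeal)
        for offset, name in enumerate(names):
            clone = copy.deepcopy(appeal)  # copies all other child elements
            clone.find(proponent_tag).text = name
            parent.insert(index + offset, clone)
```

Usage would be along the lines of parsing the file with ET.parse, calling split_mercy_appeals on each element that contains mercy appeals, and then re-validating against the schema.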
XML file is now with SD, who will make the remaining corrections and then return it to me, at which point I'll follow the rest of the how-to procedure to render it back to relational data and upload it to the db.