update XSLT to generate SQL
Posted by sarneil on 27 Apr 2009 in Activity Log
Martin wrote some XSLT a year ago to export the then-current XML structure for the Old Bailey trials to MySQL statements which we could use to create and populate tables.
I began modifying that XSLT so that it reflects the changes in the XML structure of the data files and produces output based on our current thinking about how the tables should be organized.
Tried extracting the list of current juries from the trial records, but ran into a stone wall when trying to concatenate two strings (e.g. jury type and subtype) and then have the result treated as a single item. In addition, I realized that when we processed subsequent data files, there would be no way of preventing duplicate keys from being generated. So I decided that it would be simpler and safer to include all the fixed lists (e.g. judge names, jury types, outcomes, crime groups etc.) in the data file, and then have the XSLT process them both into the appropriate SQL statements in the output and into XSLT variables. That would also make it easier to write XSLT in the trial processing code that writes in the appropriate id for e.g. the jury type, by looping through the fixed list I'd generated previously and held in an XSLT variable.
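To make the fixed-list idea concrete, here is a minimal sketch of the logic in Python rather than XSLT. It assumes each fixed-list entry carries its own id in the data file, so that reprocessing a later data file reuses the same keys instead of generating new ones. The element names, attribute names, table name (jury_types) and sample values are all made up for illustration and are not taken from for_sql.rng.

```python
import xml.etree.ElementTree as ET

# Made-up sample data: element names, attribute names and values are hypothetical.
SAMPLE = """<fixedLists>
  <juryTypes>
    <juryType id="1" type="typeA" subtype="sub1"/>
    <juryType id="2" type="typeA" subtype="sub2"/>
    <juryType id="3" type="typeB" subtype="sub'3"/>
  </juryTypes>
</fixedLists>"""


def sql_quote(value):
    """Wrap a string in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def jury_type_inserts(xml_text):
    """Return one INSERT statement per juryType element, using the id stored in the file."""
    root = ET.fromstring(xml_text)
    statements = []
    for item in root.findall("./juryTypes/juryType"):
        statements.append(
            "INSERT INTO jury_types (id, type, subtype) VALUES ({}, {}, {});".format(
                int(item.get("id")),  # id comes from the data file, not from a counter
                sql_quote(item.get("type")),
                sql_quote(item.get("subtype")),
            )
        )
    return statements


if __name__ == "__main__":
    for statement in jury_type_inserts(SAMPLE):
        print(statement)
```

Because the ids live in the data file, the same list always yields the same keys, however many times it is processed.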
I've now added those elements to a schema file (for_sql.rng) and an example data file. I have yet to add the code in the trials processing section which will obtain the appropriate id for each of the fields (e.g. match the jury type and subtype against the fields in each record in the lookup table and then return the id field of the matching record).
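The lookup step still to be written could work along these lines. Again this is only a sketch in Python, not the XSLT itself, and the element names, attribute names and values are invented for the example. Matching on the (type, subtype) pair directly means the two strings never have to be concatenated into a single item.

```python
import xml.etree.ElementTree as ET

# Made-up sample data: element names, attribute names and values are hypothetical.
DATA = """<data>
  <juryTypes>
    <juryType id="1" type="typeA" subtype="sub1"/>
    <juryType id="2" type="typeA" subtype="sub2"/>
  </juryTypes>
  <trials>
    <trial ref="t1"><jury type="typeA" subtype="sub2"/></trial>
    <trial ref="t2"><jury type="typeB" subtype="sub1"/></trial>
  </trials>
</data>"""


def build_jury_lookup(root):
    """Map each (type, subtype) pair in the fixed list to its id."""
    return {
        (item.get("type"), item.get("subtype")): int(item.get("id"))
        for item in root.findall("./juryTypes/juryType")
    }


def jury_ids_for_trials(xml_text):
    """Return {trial ref: jury type id}, with None where the fixed list has no match."""
    root = ET.fromstring(xml_text)
    lookup = build_jury_lookup(root)
    result = {}
    for trial in root.findall("./trials/trial"):
        jury = trial.find("jury")
        key = (jury.get("type"), jury.get("subtype"))
        result[trial.get("ref")] = lookup.get(key)
    return result


if __name__ == "__main__":
    print(jury_ids_for_trials(DATA))
```

A trial whose jury type is missing from the fixed list comes back as None here, which would be a useful thing to flag before writing any SQL for that trial.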