Archives for: July 2011

29/07/11

Permalink 09:59:50 am, by jamie, 166 words, 175 views   English (CA)
Categories: Activity Log; Mins. worked: 60

Removing the case element from the schema

While working on the import process to bring the XML data into the website database, it occurred to me that, after all of the changes to the schema, the <case> element doesn't serve much of a function anymore. While it used to group together multiple trial files, all of the useful grouping information (namely the trial dates) has been moved to the <trial_file> elements themselves. So, we don't need the extra level in the hierarchy.

For that matter, we also don't need the <regular_cases> element that wraps up the <case> elements, since it's also an extra level of hierarchy. The schema really just needs to be: data -> one or more trial files

Thus, I've removed <regular_cases> and <case> from the schema and modified the XSLT transformation scripts accordingly. I also updated the data that's already been processed to match.

These changes will make the PHP script that converts the XML to MySQL simpler.
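
To illustrate the flattening, here is a minimal sketch in Python (not the XSLT actually used for the change). Only the <regular_cases>, <case> and <trial_file> element names come from this post; the <data> root name, the id attribute, the <trial_date> child and the sample values are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the old structure. Only <regular_cases>, <case> and
# <trial_file> are taken from the post; everything else here is assumed.
OLD_XML = """
<data>
  <regular_cases>
    <case>
      <trial_file id="t1"><trial_date>1760-01-16</trial_date></trial_file>
      <trial_file id="t2"><trial_date>1760-02-27</trial_date></trial_file>
    </case>
    <case>
      <trial_file id="t3"><trial_date>1761-04-01</trial_date></trial_file>
    </case>
  </regular_cases>
</data>
"""


def flatten(root):
    """Return a new root whose direct children are all the <trial_file> elements."""
    flat = ET.Element(root.tag, root.attrib)
    # iter() walks the tree in document order, so the original order is kept.
    for trial_file in root.iter("trial_file"):
        flat.append(trial_file)
    return flat


flat_root = flatten(ET.fromstring(OLD_XML))
print([child.get("id") for child in flat_root])
```

With the wrappers gone, an import script only has to loop over the children of the root element instead of descending through two extra levels.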

26/07/11

Permalink 09:12:44 am, by jamie, 8 words, 133 views   English (CA)
Categories: Notes; Mins. worked: 0

1760-69 proofed

Received the proofed 1760-69 data from SD, 100% valid.

25/07/11

Permalink 03:48:18 pm, by jamie, 20 words, 55 views   English (CA)
Categories: Activity Log; Mins. worked: 25

1760-69 data received, processed, sent for proofing

Received the 1760-69 data from SD in Excel format. Converted it to XML and returned it to SD for proofing.

Permalink 03:19:16 pm, by jamie, 191 words, 135 views   English (CA)
Categories: Notes; Mins. worked: 0

eXist rebuild cancelled: PHP version reinstated with planned XML import

I've decided to scrap the eXist rebuild for a few reasons:

The XML schema was not meant as a final destination for the data. SA created it as an ad hoc intermediary step between SD's raw data and the MySQL database. So, the schema doesn't lend itself well to being the backbone of an eXist architecture, particularly when it comes to searching and generating human-readable values. The structure of the data is highly relational and is a good fit for SQL, which, of course, was SA's original intention.

That said, my original motivation for the eXist rebuild still stands. The current method of translating the XML to SQL involves using XSLT transformations to generate SQL statements. However, every time the schema is updated with new structure or values - which is happening a lot - the XSLT stylesheets need to be updated as well. So, to get around these difficulties, I'm going to write a PHP script that will convert the XML to MySQL, likely using SimpleXML. It won't be the fastest script I've ever written, but since it will be a one-use-per-dataset kind of thing, speed isn't a big issue.
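
As a rough sketch of the idea, the following Python example reads trial files from XML and inserts them into SQL tables, creating lookup keys for values such as crimes the first time they appear, so a new allowed value in the schema needs no change to the script. It is a stand-in, not the planned script: it uses Python's ElementTree and an in-memory SQLite database in place of PHP/SimpleXML and MySQL, and the table layout, the <data> root, the <trial_date> child and the sample values are all assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data: <trial_file> and crime_normalized are named in the
# posts, but this exact layout and these values are assumed for illustration.
SAMPLE_XML = """
<data>
  <trial_file><trial_date>1760-01-16</trial_date><crime_normalized>Robbery</crime_normalized></trial_file>
  <trial_file><trial_date>1760-02-27</trial_date><crime_normalized>StealinaChurch</crime_normalized></trial_file>
  <trial_file><trial_date>1761-04-01</trial_date><crime_normalized>Robbery</crime_normalized></trial_file>
</data>
"""


def crime_id(conn, name):
    """Look up a crime's key, creating the row the first time a value is seen."""
    row = conn.execute("SELECT id FROM crime WHERE name = ?", (name,)).fetchone()
    if row:
        return row[0]
    return conn.execute("INSERT INTO crime (name) VALUES (?)", (name,)).lastrowid


def import_xml(conn, xml_text):
    """Create the (assumed) tables and load every <trial_file> into them."""
    conn.execute("CREATE TABLE crime (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
    conn.execute(
        "CREATE TABLE trial_file ("
        "id INTEGER PRIMARY KEY, trial_date TEXT, "
        "crime_id INTEGER REFERENCES crime(id))"
    )
    for tf in ET.fromstring(xml_text).iter("trial_file"):
        conn.execute(
            "INSERT INTO trial_file (trial_date, crime_id) VALUES (?, ?)",
            (tf.findtext("trial_date"), crime_id(conn, tf.findtext("crime_normalized"))),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
import_xml(conn, SAMPLE_XML)
print(conn.execute("SELECT COUNT(*) FROM trial_file").fetchone()[0])
print(conn.execute("SELECT name FROM crime ORDER BY id").fetchall())
```

Because the lookup rows are created on demand, the only thing that has to change when a new crime or outcome value is added is the data itself.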

20/07/11

Permalink 12:40:55 pm, by jamie, 110 words, 131 views   English (CA)
Categories: Notes; Mins. worked: 0

eXist rebuild planned

After some consultation with Greg, I've decided to scrap the PHP/MySQL version of Bailey that I built and write the site in eXist. After building FrancoToile and the Lansdowne Lecture site in eXist I'm comfortable working with it, and this way the Bailey XML data can be used directly without being shoehorned into a MySQL database. The allowed values for crimes, outcomes, judges, etc. are constantly changing in the schema, so maintaining those keys in MySQL would be a long-term pain. Using eXist cuts out the XML-to-MySQL middleman. I don't expect the basic functionality of the site to take long - it'll likely be done next week sometime.

Permalink 09:59:40 am, by jamie, 14 words, 38 views   English (CA)
Categories: Activity Log; Mins. worked: 10

New crime_normalized

Added a new crime_normalized "StealinaChurch" to the schema at the request of SD.

13/07/11

Permalink 11:10:47 am, by jamie, 28 words, 1136 views   English (CA)
Categories: Notes; Mins. worked: 0

Beginning restructuring of SQL schema

Because of the laundry list of changes to the XML structure, I've begun a wholesale restructuring of the MySQL database schema. There are quite a few core changes.

Permalink 09:56:32 am, by jamie, 12 words, 1136 views   English (CA)
Categories: Notes; Mins. worked: 0

1770-79 data proofed and received

SD sent back the 1770-79 XML file this morning, which is 100% valid.

12/07/11

Permalink 02:52:50 pm, by jamie, 30 words, 1080 views   English (CA)
Categories: Activity Log; Mins. worked: 60

1770-79 data sent for proofing

Sent the latest data set, 1770-79, to SD for proofing. Quick import this time and only 25 errors, 23 of which are mercy appeal errors, which SD usually saves for post-import anyway.

Permalink 01:19:35 pm, by jamie, 51 words, 1143 views   English (CA)
Categories: Notes; Mins. worked: 0

1770-79 data received; change in spreadsheet format

Received the 1770-79 XLS spreadsheet from SD. Minor change to the format as he's added "Outcome Durn - Yrs" and "Outcome Durn - Other" to handle outcome duration values (see previous blog post: http://hcmc.uvic.ca/blogs/index.php?blog=36&p=8333&more=1&c=1&tb=1&pb=1 ).

Capital Trials at the Old Bailey

Simon Devereaux has approximately 10,000 records of people convicted in potentially capital cases heard at the Old Bailey court in London between 1710 and 1840. This project will create a web-based database that allows interested researchers and members of the public to compose queries on that data (e.g. women charged with robbery 1710-1720). It must support a range of queries and produce output that lets researchers identify trends in judicial practice over that time.
