My work on this site can be put into three different categories:
- Adding the 1911 table and views to the database, and modifying the views for the other census years to take into account any new fields introduced in 1911
- Adding the new 1911 census data
- Modifying the 'loader' application to be more robust, and to accept the CSV data files
- Making mostly minor changes to the actual website, such as textual changes and some additions to the advanced census search
I did all work, including importing data, on my local machine first, on which I had a complete development environment. After importing the data and testing all changes, I then updated the development server, using phppgadmin.uvic.ca for the database changes and Subversion for the PHP changes.
In addition to making the census_1911 database, I also had to make new views for the 1911 census, since the website only interacts with the views for searching and displaying, rather than the tables. I also had to modify the older views and put in some new fields (which are NULL on those older views). The search functions on the site use a UNION query to join all of the views, so they all need to have the same columns for the search to work properly, even if some of those columns are NULL in the various views. This process is all fully documented in David's excellent manual. I didn't change the methods or the way anything works; I just added the new data.
Each sub-district is in its own CSV (and XLS – Patrick usually sent me the data in both formats). Thus, each sub-district was imported separately. This is documented in full on the blog, but a short summary of the steps:
- Add a new row to the location table, which becomes the location_id of the rows to be imported
- Make a new 'mapping' file in the map directory in the loader, following the conventions of the other mapping files
- Add any new rows to the auxiliary tables as necessary (occupations, nationalities, etc.) - Patrick supplied these when sending the sub-districts
- Import the data with the loader, check for errors, delete and re-import as necessary
- Create a dump file and then import into the development database on tapor
Although it had cursory support for CSV files, the loader application was really only built to handle the old Access database format, and then only on a Windows machine. So, I modified it to accept CSV files. I also beefed up the available mapping functions, and changed some of the queries to be more organized. The application isn't 100% complete, but does the job.
The loader.php script itself was a mammoth, so I created a class Loader in inc/loader.php to do some of the heavy lifting and abstract some of the functionality. David had written a short manual for the loader which explains how the mapper works. I filled out the manual with the new mapping functions that I wrote. The basic steps for importing data are:
- On the 'Configure' page, putting in the full path to the CSV data file in the field 'CSV File (for CSV imports) - absolute path'
- Choosing census_1911 for the table name on the main page and ensuring that “empty table before import” is not checked
- For the 'field map' file , choosing the map file made for the sub-district
- Cross fingers and import! There are often incorrect foreign keys which cause SQL errors, so it wasn't uncommon for me to delete the new data a few times and re-import to account for new foreign keys
The loader tends to time out and/or run out of memory when processing larger CSVs. If this happens, then it's fine just to re-import because I extended the loader with the option to skip previously entered records, which can be defined in the mapping files (and is documented in the loader doc file).
The loader does not live on the development website; all loading was done on my machine (parsnip).
Almost all of the changes I made to the website itself (i.e. the .inc pages) were textual changes given to me by PD. I did have to modify some of the search functionality, most significantly re-organizing the boxes and adding some new fields for the 1911 data. These changes were relatively minor and are documented in the blog.