HCMC Journal

More birthplaces; languages; race; and more uncomfortable subjects

Author: Martin Holmes
Minutes: 200

Added handling for literacy, English/French competence, birthdates, schooling, and several other fields, including the unpleasant colour and race data. A few fields are still left to fix, such as immigration and naturalization info, but we’re getting close. Also ran conversions on the full census recordsets and validated them; only the 1911 recordset is still not converting successfully, and the remaining issues are minor. Tomorrow I’ll put a full build process with validation in place on Jenkins, and then I can stop doing my own exhaustive validation locally.

In the process I found and corrected some errors that the database itself should never have allowed: id fields which should have been foreign keys, but which turned out to contain integers with no matching key. There were only a few, but it suggests that the foreign key constraints were not all set up properly. We also decided to exclude the mysterious BC 1901 dataset of 8092 records, because I can’t find any sign that they were ever available or searchable on the original site, and nobody seems to know what they are or where they came from.
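
For reference, the kind of check that turns up those orphaned ids can be sketched roughly as follows. This is only an illustration, not the code used on the project. It uses SQLite because it ships with Python, and the table and column names (`person`, `birthplace`, `birthplace_id`) and the helper `find_orphans` are all made up for the example. The idea is a LEFT JOIN from the child table to the parent table, keeping the rows where no parent matched.

```python
import sqlite3


def find_orphans(conn, child_table, fk_column, parent_table, parent_key):
    """Return (rowid, value) pairs for child rows whose fk_column has no matching parent key."""
    # Identifiers cannot be bound as SQL parameters, so they are interpolated here;
    # only call this with trusted, hard-coded table and column names.
    query = (
        f"SELECT c.rowid, c.{fk_column} FROM {child_table} AS c "
        f"LEFT JOIN {parent_table} AS p ON c.{fk_column} = p.{parent_key} "
        f"WHERE c.{fk_column} IS NOT NULL AND p.{parent_key} IS NULL "
        f"ORDER BY c.rowid"
    )
    return conn.execute(query).fetchall()


def build_demo_db():
    """Build a tiny in-memory database with hypothetical tables and one bad id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE birthplace (id INTEGER PRIMARY KEY, name TEXT);
        -- No FOREIGN KEY constraint is declared, so a bad id is accepted silently.
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, birthplace_id INTEGER);
        INSERT INTO birthplace VALUES (1, 'Ontario'), (2, 'Scotland');
        INSERT INTO person VALUES (10, 'A', 1), (11, 'B', 7), (12, 'C', NULL), (13, 'D', 2);
        """
    )
    return conn


orphans = find_orphans(build_demo_db(), "person", "birthplace_id", "birthplace", "id")
print(orphans)  # person 11 points at birthplace 7, which does not exist
```

Rows with a NULL id are deliberately not reported, since an empty field is a different problem from an id that points nowhere.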