More birthplaces; languages; race; and more uncomfortable subjects
: Martin Holmes
Minutes: 200
Added handling for literacy, English/French competence, birthdates, schooling, and other stuff including the unpleasant colour and race data. There are still a few fields left to fix, such as immigration and naturalization info, but we’re getting close. Also ran conversions on the full census recordsets and validated the output; only the 1911 recordset is still not converting successfully, and the remaining issues there are minor. Tomorrow I’ll put a full build process in place with validation on Jenkins, and then I can stop doing my own exhaustive validation locally.
In the process I found and corrected some errors which should actually never have been allowed by the database itself: ids which should have been foreign keys, but which turned out to contain integers that had no matching key. There were only a few, but it suggests that the foreign key setups were not all done properly. We also decided to exclude the mysterious BC 1901 dataset of 8092 records, because I can’t find any sign that they were ever available/searchable on the original site, and nobody seems to know what they are or where they came from.
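
For the record, the kind of check that turns up those orphaned ids can be sketched roughly as follows. This is only an illustration, not the actual code or schema used for the conversion: the table and column names (person, birthplace, birthplace_id) and the find_orphans helper are made up for the example, and it uses an in-memory SQLite database so that it runs on its own.

```python
import sqlite3


def find_orphans(conn, child_table, child_key, fk_column, parent_table, parent_key):
    """Return (child key, fk value) pairs whose fk value matches no parent row.

    Table and column names are interpolated directly into the SQL, so this
    should only ever be called with trusted, hard-coded identifiers.
    """
    query = (
        f"SELECT c.{child_key}, c.{fk_column} "
        f"FROM {child_table} AS c "
        f"LEFT JOIN {parent_table} AS p ON c.{fk_column} = p.{parent_key} "
        f"WHERE c.{fk_column} IS NOT NULL AND p.{parent_key} IS NULL "
        f"ORDER BY c.{child_key}"
    )
    return conn.execute(query).fetchall()


# Hypothetical miniature schema: no FOREIGN KEY constraint is declared,
# so the database happily accepts a birthplace_id that points nowhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE birthplace (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person (id INTEGER PRIMARY KEY, surname TEXT, birthplace_id INTEGER);
    INSERT INTO birthplace VALUES (1, 'England'), (2, 'Scotland');
    INSERT INTO person VALUES (10, 'Smith', 1), (11, 'Jones', 7), (12, 'Brown', NULL);
""")

orphans = find_orphans(conn, "person", "id", "birthplace_id", "birthplace", "id")
print(orphans)  # [(11, 7)]
```

Empty (NULL) values are deliberately not reported, since a missing value is a different problem from a value that points at a non-existent row.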