Archives for: April 2011

21/04/11

Permalink 11:19:06 am, by jamie, 93 words, 58 views   English (CA)
Categories: Activity log; Mins. worked: 60

1911 "Central Park" data received and imported

Received sub-district 5, "Central Park", from PD yesterday and imported this morning. Had to fix a handful of codes and modify the loader map file for slightly-different column names, but other than that the data was brought in smoothly. Added new occupations:

INSERT INTO occupation (occupation_description, occupation_code) VALUES 
('Pool Romm Attendant', '59990'),
('Setter', '81275'),
('Lamp Fitter', '87330'),
('General Finisher', '99999');

And a new location:

INSERT INTO "location" VALUES (84, 'BC', 'British Columbia', '13', 'Victoria City', '5', 'Central Park', NULL, NULL, NULL, NULL, NULL, '1911', NULL, NULL); 

20/04/11

Permalink 03:10:38 pm, by jamie, 68 words, 63 views   English (CA)
Categories: Activity log; Mins. worked: 45

Updated page copy

Received and added some new and updated page copy from PD. Added to both the development and live sites.

Permalink 11:10:42 am, by jamie, 118 words, 57 views   English (CA)
Categories: Activity log; Mins. worked: 60

1911 Vic West data received by PD; imported

Received the latest set of 1911 census data from PD - subdistrict, #1, Vic West. Imported into the development database smoothly and without incident.

Also inserted six new occupations:


INSERT INTO occupation (occupation_description, occupation_code) VALUES ('Pianist', 17140);
INSERT INTO occupation (occupation_description, occupation_code) VALUES ('Show Woman', 17225);
INSERT INTO occupation (occupation_description, occupation_code) VALUES ('Butter Maker', 77530);
INSERT INTO occupation (occupation_description, occupation_code) VALUES ('Gas Fitter', 87120);
INSERT INTO occupation (occupation_description, occupation_code) VALUES ('Cement Worker', 95200);
INSERT INTO occupation (occupation_description, occupation_code) VALUES ('Road Maker', 97415);

And a new location for the sub district:


INSERT INTO "location" VALUES (83, 'BC', 'British Columbia', '13', 'Victoria City', '1', 'Victoria West', NULL, NULL, NULL, NULL, NULL, '1911', NULL, NULL);

Permalink 10:29:54 am, by jamie, 169 words, 1298 views   English (CA)
Categories: Notes; Mins. worked: 0

How to fix PostgreSQL error "duplicate key violates unique constraint"

If you get this message when trying to insert data into a PostgreSQL database:

ERROR:  duplicate key violates unique constraint

That likely means that the primary key sequence in the table you're working with has somehow become out of sync, likely because of a mass import process (or something along those lines). Call it a "bug by design", but it seems that you have to manually reset the a primary key index after restoring from a dump file. At any rate, to see if your values are out of sync, run these two commands:

SELECT MAX(the_primary_key) FROM the_table;

SELECT nextval('the_primary_key_sequence');

If the first value is higher than the second value, your sequence is out of sync. Back up your PG database (just in case), then run thisL

SELECT setval('the_primary_key_sequence', (SELECT MAX(the_primary_key) FROM the_table)+1);

That will set the sequence to the next available value that's higher than any existing primary key in the sequence.

18/04/11

Permalink 03:29:17 pm, by jamie, 339 words, 47 views   English (CA)
Categories: Activity log; Mins. worked: 60

New views: origins, nationalities, races, bands, mother tongue

Added more views to the PostgreSQL database to generate lists of values that are present in the census data, for the advanced search form.

vOriginsUsed - used for the "Birthplace", "Birthplace of Father", and "Birthplace of Mother" fields:


CREATE VIEW "vOriginsUsed" AS

SELECT origin_id, origin

FROM "origins"  

WHERE origin IS NOT NULL   

AND origins.origin_id IN (
    SELECT "vCensus1881".birthplace_id FROM "vCensus1881"
    UNION
    SELECT "vCensus1881".birthplace_father_id FROM "vCensus1881"
    UNION
    SELECT "vCensus1891".birthplace_id FROM "vCensus1891"
    UNION
    SELECT "vCensus1891".birthplace_father_id FROM "vCensus1891"
    UNION
    SELECT "vCensus1891".birthplace_mother_id FROM "vCensus1891"
    UNION
    SELECT "vCensus1901".birthplace_id FROM "vCensus1901"
    UNION
    SELECT "vCensus1911".birthplace_id FROM "vCensus1911"
    )  
    
ORDER BY upper(origin)

The origins view is pretty slow because of the large number of SELECT subqueries.

vNationalitiesUsed - used for the "Nationality" search field:


CREATE VIEW "vNationalitiesUsed" AS

SELECT nationality_id, nationality

FROM "nationalities"

WHERE nationality IS NOT NULL

AND nationalities.nationality_id IN (
    SELECT "vCensus1901".nationality_id FROM "vCensus1901"
    UNION
    SELECT "vCensus1911".nationality_id FROM "vCensus1911"
)

ORDER BY upper(nationality)

vRacesUsed - used for the "Race/Ethnic Origin" search field:


CREATE VIEW "vRacesUsed" AS

SELECT race_id, race_name

FROM "races"

WHERE race_name IS NOT NULL

AND races.race_id IN (
    SELECT "vCensus1901".race_id FROM "vCensus1901"
    UNION
    SELECT "vCensus1911".race_id FROM "vCensus1911"
)

ORDER BY upper(race_name)

vBandsUsed - used for the "First Nations Band" search field:


CREATE VIEW "vBandsUsed" AS

SELECT band_id, band

FROM "band"

WHERE band IS NOT NULL

AND band.band_id IN (
    SELECT "vCensus1881".band_id FROM "vCensus1881"
    UNION
    SELECT "vCensus1891".band_id FROM "vCensus1891"
)

ORDER BY upper(band)

vMotherTonguesUsed - used for the "Mother Tongue", "First Language Spoken", and "Second Language Spoken" search fields:


CREATE VIEW "vMotherTonguesUsed" AS

SELECT mother_tongue_id, mother_tongue

FROM "mother_tongues"

WHERE mother_tongue IS NOT NULL

AND mother_tongues.mother_tongue_id IN (
    SELECT "vCensus1901".mother_tongue_id FROM "vCensus1901"
    UNION
    SELECT "vCensus1911".first_language_id FROM "vCensus1911"
    UNION
    SELECT "vCensus1911".second_language_id FROM "vCensus1911"
)

ORDER BY upper(mother_tongue)

Permalink 10:06:24 am, by jamie, 142 words, 46 views   English (CA)
Categories: Activity log; Mins. worked: 30

New view for all used occupations

Made a new view, "vOccupationsUsed", to get a list of all occupations that show at least once in any of the census data sets (1881-1911). This will enable the "occupations" pop up list in the advanced census form only to show occupations that are present in the census data. Other views for religion, race, birthplace, etc. will follow.

CREATE VIEW "vOccupationsUsed" AS 

SELECT DISTINCT ON (upper(occupation_description)) occupation_description AS occupation_description, occupation_id, occupation_code

FROM "occupation"

WHERE 
    occupation_description IS NOT NULL 
    
    AND 
    (
        (occupation.occupation_code IN (SELECT "vOccupations1881".code FROM "vOccupations1881" WHERE "vOccupations1881".count1881 > 0))
        OR
        (occupation.occupation_code IN (SELECT "vOccupations1891".code FROM "vOccupations1891" WHERE "vOccupations1891".count1891 > 0))
        OR
        (occupation.occupation_code IN (SELECT "vOccupations1901".code FROM "vOccupations1901" WHERE "vOccupations1901".count1901 > 0))
        OR
        (occupation.occupation_code IN (SELECT "vOccupations1911".code FROM "vOccupations1911" WHERE "vOccupations1911".count1911 > 0))
    )

ORDER BY upper(occupation_description)

15/04/11

Permalink 02:14:36 pm, by jamie, 53 words, 40 views   English (CA)
Categories: Activity log; Mins. worked: 120

Changes made as per last meeting

Made all changes, and committed to the development site, from the last meeting EXCEPT for "When choosing occupations, races, birthplaces, relationships to head, etc. in the advanced search form, only display values that have corresponding census records". Due to the inconsistency between some of the codes, that will take a little while longer.
Permalink 12:28:01 pm, by jamie, 68 words, 46 views   English (CA)
Categories: Activity log; Mins. worked: 30

SQL to update census_1911 location ID

Since location_id isn't part of the 1911 records insert, but sub_district_id is, use this SQL to update the location_id after you've put a row into the location table corresponding to the new sub-district:


UPDATE census_1911 SET location_id = (SELECT location.location_id FROM location WHERE location_02_id = '13' AND location_03_id = CAST(census_1911.sub_district_id AS character varying(50)) AND census_year = '1911')

14/04/11

Permalink 11:36:13 am, by jamie, 298 words, 125 views   English (CA)
Categories: Notes; Mins. worked: 60

Meeting with PD

Met with PD today to discuss the integration of the 1911 data thus far, and to go over any changes that need to be made. My task list from the meeting:

  • Fix the occupation and religions table searches, both of which are broken on the development and live sites
  • Add the 1911 information into the locations table:
    • location_02: id 13/"Victoria City"
    • location_03: sub district ID as given by PD (e.g. 10 for Fernwood, 3 for Rock Bay)
  • Remove the "address" field from the advanced search, which is now obsolete thanks to the "street" field
  • Put insurance data into the detail view for 1911 records
  • Change instances of "color" to "colour"
  • Remove "colour" from 1911 views since that data wasn't collected
  • In the summary view of a record, move "Speaks English/French" above "First Language" and "Second Language"
  • Insert missing first/second language information into 1911 data
  • For census record fields that have no information, display a blank field instead of "Unknown" as this glosses over the nuances of the data (i.e. no data vs. illegible or otherwise unknown data)
  • Fix the earnings field in the advanced search, which doesn't seem to work at all
  • Fix a display error in the "Birthdate" field for 1911 records
  • When choosing occupations, races, birthplaces, relationships to head, etc. in the advanced search form, only display values that have corresponding census records
  • Remove erroneous "E" relationship to head and move all records associated with it to "Employee"
  • Add "Hotel" to the building table with an ID of 9
  • Add the amount paid for education field to the 1911 detail view
I am also going to send PD a list of all the annotations for census records submitted by users over the years. He will compile a list of corrections and then we will go through them together to fix the data.

11/04/11

Permalink 02:58:24 pm, by jamie, 25 words, 53 views   English (CA)
Categories: Activity log; Mins. worked: 20

Updated views with street information

Updated the four census views with street_number and street_name fields. The fields are NULL in 1881 and 1891 and point to real data in 1901 and 1911.
Permalink 12:19:20 pm, by jamie, 126 words, 56 views   English (CA)
Categories: Activity log; Mins. worked: 30

Loader app changes: converted from raw SQL to organized arrays

To prepae for the previous update (adding 'conditionKey' support), I modified the loader so that, instead of building a raw SQL query to insert each row, the app now builds an associative array of data and then passes it to either pg_insert() or pg_update(), depending on whether any conditions exist. Aside from allowing for condition support, this change also keeps the data more organized and takes out the guesswork involved in building raw SQL queries via a for loop. I also had to edit the _fix_field() function in inc/lib.inc to stop the automatic escaping of data values into single quotes (e.g. 'data'). pg_insert() and pg_update() escape automatically, so the code in _fix_field() was resulting in double escaping.
Permalink 12:17:17 pm, by jamie, 199 words, 48 views   English (CA)
Categories: Activity log; Mins. worked: 60

Loader app expansion: conditional updating support

PD recently sent me an updated version of the 1901 data with expanded address information, namely street number and street name separated into two fields. This will allow us to have street searching for both 1901 and 1911. However, the loader application didn't support updating existing rows, only importing new ones. So, I had to build update capabilities.

The loader now supports "conditionKey" lines, which can be used to specify conditions that need to be met in order to perform an update. The format of the line is:


conditionKey=SourceDataColumnName:DatabaseTableColumnName

For example, this is the condition key line that I specified when importing the new address data for 1901:


conditionKey=census_record_id:census_record_id

The field on the left of the colon tells the loader to look at the value of the census_record_id column in the source data CSV, while the field on the right side of the colon tells the loader to check that value against the census_record_id column in the database table.

Internally, adding one or more conditions causes the loader to use the pg_update() function instead of the pg_insert function for the row in question.

Multiple conditionKey lines may be specified.

06/04/11

Permalink 01:49:24 pm, by jamie, 69 words, 52 views   English (CA)
Categories: Activity log; Mins. worked: 45

Dev site updates

Committed all of my changes to the census search/display to the SVN repository and pushed to the dev site. Dev site also has an up-to-date database with the partial 1911 data (Fernwood/Hillside and Rock Bay) as well as any new codes I've added (occupations, nationality, race, etc.). From this point I'll commit my code changes daily and push to the dev site at the end of the day.
Permalink 12:31:00 pm, by jamie, 99 words, 120 views   English (CA)
Categories: Notes; Mins. worked: 0

Development site online

With help from Greg I've put a development site for VIHistory online at: http://tapor.uvic.ca/~vihdev/

Access is currently restricted by Netlink ID to associated parties (myself, GN, MH, SA, PD, JL). The dev site runs its own database. We put this site online so that PD and JL could "beta test" the new 1911 data once it's ready for them. This also allows me to periodically commit my development changes to the SVN repository, rather than having them sit on my machine for months, in case I get hit by a bus (or suffer another similar calamity).

05/04/11

Permalink 03:13:04 pm, by jamie, 43 words, 129 views   English (CA)
Categories: Notes; Mins. worked: 0

1871 data: not to be integrated

Contrary to my previous post about the advanced census search to-do list, PD, MH and I have agreed that integrating the 1871 data isn't prudent at this time. It's not nearly as complete as the other census years and would probably just confuse users.

04/04/11

Permalink 01:34:04 pm, by jamie, 137 words, 116 views   English (CA)
Categories: Notes; Mins. worked: 0

Changes needed to advanced census search form

PD has sent a list of changes to be done to the advanced census search form, in preparation for the 1911 data:

  • Move address from Name/Family/Location to Building
  • Move infirmities to its own group
  • Change "Race" heading to "Race/Ethnic Origin"
  • Add "1st language commonly spoken" and "2nd language commonly spoken" fields to Language
  • Change 'Building' header to 'Habitation'
  • Add "Street" to aforementioned group to search 1901 and 1911 census
  • Include 1871 as an option
  • Add new "Infirmities/Insurance" header with: Infirmities, Has Life Insurance, and Has Accident/Health Insurance
A few small shuffles, but also a few larger jobs, namely integrating the 1871 data, adding the new Infirmities/Insurance section (most of the work there is massaging the pre-1911 data to allow it to be searchable), and being able to search just by street, rather than address, for 1901 and 1911.
Permalink 12:31:53 pm, by jamie, 166 words, 41 views   English (CA)
Categories: Activity log; Mins. worked: 30

Changes made to 1911 table/view: languages

Some clarification from PD about how language fields work in the 1911 data. There are two language fields: 1st language spoken and 2nd language spoken. I had mistakingly thought that 1st language spoken was equivalent to the "mother tongue" of previous years and imported it as such. I had also made a new "secondary_language" field in the 1911 table and in all of the views. PD, however, informs me that, in fact, there is no "mother tongue" in the 1911 census and "1st language spoken" and "2nd language spoken" are new fields. So, I've modified the 1911 census table and all of the census views as follows: - census_1911 table: renamed "mother_tongue_id" to "first_language_id"; renamed "secondary_language_id" to "second_language_id" - view for census 1911: converted mother_tongue and mother_tongue_id to NULL fields - views for all census years: renamed "secondary_language_id" to "second_language_id"; added "first_language_id" and "first_language" as NULL fields to all years previous to 1911
Permalink 09:31:15 am, by jamie, 34 words, 55 views   English (CA)
Categories: Notes; Mins. worked: 0

Fernwood-Hillside data received

Received the spreadsheets for the Fernwood-Hillside sub-district (#10) for the 1911 census from PD. Importing of the data into my development database should be seamless. Two new occupation codes are being inserted: Riveter (87460) and Cement Worker (95200).

viHistory

viHistory is a web site that is a teaching, learning and research tool. It's principally about the history of Vancouver Island in British Columbia, but it is also a vehicle for exploring the larger field of Canadian history during the late 19th and early part of the 20th century. It allows census, directory and tax assessment roll data from the late 19th and early 20th centuries to be searched in many ways. It also incorporates IMaP to display historical maps. The project director is Dr. Patrick A. Dunae.

Reports

Categories

April 2011
Sun Mon Tue Wed Thu Fri Sat
 << < Current> >>
          1 2
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30

XML Feeds