Based on the titles_to_titles linking, and comparing every owner from the first title to every owner on the second, I came up with only 44 potential dupes using the type-1 rules (names and addresses absolutely identical).
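A minimal sketch of the type-1 comparison described above, for reference. The record shape (`id`, `name`, `address`), the function name `findExactDupes`, and the sample rows are all hypothetical; the real data lives in the SQL db and may be structured differently. Skipping pairs that share the same `id` is also an assumption, on the idea that the same owner record on both titles is not a duplicate.

```javascript
// Type-1 rule sketch: two owner records are potential dupes only if
// name AND address are absolutely identical (no normalization at all).
// Field names (id, name, address) are assumptions for illustration.
function findExactDupes(ownersA, ownersB) {
  const pairs = [];
  for (const a of ownersA) {
    for (const b of ownersB) {
      // Assumption: the same record linked to both titles is not a dupe.
      if (a.id === b.id) continue;
      if (a.name === b.name && a.address === b.address) {
        pairs.push([a.id, b.id]);
      }
    }
  }
  return pairs;
}

// Made-up sample rows standing in for the owners of two linked titles.
const firstTitleOwners = [
  { id: 1, name: 'Example Owner', address: '100 Example St' },
  { id: 2, name: 'Another Owner', address: '200 Example St' },
];
const secondTitleOwners = [
  { id: 3, name: 'Example Owner', address: '100 Example St' },
  { id: 4, name: 'Example Owner', address: '101 Example St' },
];

console.log(findExactDupes(firstTitleOwners, secondTitleOwners));
```

Because the match is strict string equality, any difference in spacing, case or punctuation keeps a pair out of the list, which may be part of why the count is so low.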
I think I've fulfilled the requirements for this, although I don't yet have SF's list of columns; there may be some new constructed variables, and there may be columns we can discard, but I think the hard bit is done.
Meeting with people from all UVic clusters, and fruitful discussions about integration of datasets. Summarized as rough minutes with action items for the team.
Beginning to modularize the test code and build actual outputs.
We have a plan to provide customized outputs for SPSS in TSV format with some constructed variables, so I've done proof-of-concept versions of the titles table and the properties table. These can be extended and made much more sophisticated. This is much easier for statistical work than querying the SQL db.
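Roughly how a TSV export with a constructed variable could look. This is only a sketch: the column names (`title_id`, `owner_count`), the record shape and the helper `toTsv` are hypothetical, not the actual proof-of-concept code, and the real column list is still to come.

```javascript
// Build a TSV string from row objects and a list of column definitions.
// Each column has a header name and a function that derives its value,
// so constructed variables are just columns with a computing function.
function toTsv(rows, columns) {
  // Tabs and line breaks inside a value would corrupt the TSV layout,
  // so collapse them to a single space.
  const clean = (v) => String(v ?? '').replace(/[\t\r\n]+/g, ' ');
  const header = columns.map((c) => c.name).join('\t');
  const lines = rows.map((row) =>
    columns.map((c) => clean(c.value(row))).join('\t')
  );
  return [header, ...lines].join('\n') + '\n';
}

// Hypothetical columns for a titles table.
const titleColumns = [
  { name: 'title_id', value: (t) => t.id },
  // Constructed variable: number of owners attached to the title.
  { name: 'owner_count', value: (t) => t.owners.length },
];

// Made-up sample rows.
const sampleTitles = [
  { id: 'T1', owners: ['a', 'b'] },
  { id: 'T2', owners: [] },
];

const tsv = toTsv(sampleTitles, titleColumns);
console.log(tsv);
```

Keeping each column as a name plus a function means new constructed variables can be added without touching the export logic.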
Some conclusions, but we're still articulating the questions about both these issues.
Figured out how to generate and include tiled layers from geo-referenced tiffs. JC had produced one from the 1930 Fire Insurance map.
I used GDAL to generate the tiles; the MapTiler tiles didn't work:
gdal2tiles.py --s_srs=EPSG:3857 --zoom=1-22 --resume -v fim_1930_geo.tif tiles
That generates the tileset in the tiles/ folder (the output directory given as the last argument). Then to create a layer in OpenLayers, you just do this:
var layerFIM1930 = new ol.layer.Tile({
  source: new ol.source.XYZ({
    /*attributions: 'Fire Insurance Maps...',*/
    url: 'fim_1930_geo/{z}/{x}/{-y}.png'
  }),
  opacity: 0.5
});
Works a treat! The zoom levels took a bit of figuring out. I guess I don't really need much below 16, but the low levels don't cost much because they contain so few tiles; 20 didn't seem to go quite high enough. When I tried 25, the script crashed.
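For a feel of what each zoom level means, here is a small sketch of the standard Web Mercator (EPSG:3857) arithmetic for 256-pixel tiles: resolution halves and the tile grid doubles in each direction with every level. The function names are my own, and this says nothing about why the script crashed at 25; it only shows how quickly the numbers grow.

```javascript
// Metres per pixel at zoom 0 on the equator for 256px tiles:
// the Web Mercator world circumference divided by one tile's width.
const INITIAL_RESOLUTION = (2 * Math.PI * 6378137) / 256;

// Approximate ground resolution at a zoom level. The cos(latitude)
// factor accounts for Mercator stretching away from the equator.
function metresPerPixel(zoom, latitudeDeg = 0) {
  const latRad = (latitudeDeg * Math.PI) / 180;
  return (INITIAL_RESOLUTION * Math.cos(latRad)) / 2 ** zoom;
}

// Number of tiles spanning the whole world in one direction.
function tilesAcrossWorld(zoom) {
  return 2 ** zoom;
}

for (const z of [16, 20, 22, 25]) {
  console.log(z, metresPerPixel(z).toFixed(4), tilesAcrossWorld(z));
}
```

Each extra level quadruples the number of tiles covering the same area, so the top few levels account for nearly all of the tileset's size.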
At SF's request ahead of a meeting, created a spreadsheet view of owners designed to allow easy identification of potential duplicates.
After discussions with JC (geo-reffed plans coming along nicely), asked for a project meeting so we can get an idea of what our "base layers" should be.
SF reports gen 4 complete; ran the process to generate gen 5 candidates and sent on to her.