Monument 2023-11-20 to 2023-11-24
to : Martin Holmes
Minutes: 600
On Monday, got three spreadsheets from the team, and processed the one which adds exclusions to the Monument dataset for people who should not be on the monument. However, when I went to look at the other two and ran some tests before implementing them, I found a number of cases where they referenced persons who are not in the Monument dataset. Most of these appear to be people who were not brought into Monument because they had no residence elements. It appears that decisions still need to be made on some of these, so I can’t really go ahead with the processing as planned.
On Tuesday, got a reduced spreadsheet for merges and was able to complete that process successfully. Then got a series of new versions of the spreadsheet for children born after uprooting, each one with new issues, all relatively minor but demonstrating why spreadsheets are not a good data format. Along the way, tweaked my own import/merge code to catch and report problems more comprehensively, so I was able to report the bugs more quickly each time. By the end of the day, all three of the incoming spreadsheets had been processed into the data, with just a handful of special cases remaining to be dealt with manually.
On Wednesday, did a partial rerun of one of the processes from the previous day to update some pointers which had been missed, and then integrated three more dataset spreadsheets coming from AB and SI. Reworked the diagnostics to take account of changes, made updates to the site rendering to handle the new people added who were born after uprooting, and did some testing, checking and bugfixing on a handful of problem cases, which also involved some editing of LOI data.
On Thursday, processed a few more small spreadsheets coming in, and edited some cases by hand. Then worked on the problem of disambiguation: devised an encoding strategy for recording that two individuals have been checked and confirmed as distinct from each other, and then wrote the processing to make that render correctly in the output. Did some more tweaking of the appearance of the output.
On Friday, we merged a further half-dozen spreadsheets, added some new content to the site, reworked the diagnostics, and published the latest version of the public site.