Deep dive on possible missing titles


Did some archaeology on the apparently missing titles from Powell St, and reported as follows:

My backup of the database from September 2014 (when I think we were just creating it) contains a total of 3489 titles, including those for B43 L2. The next backup, from November, has only 1513 titles. That is a net drop of 1976, so nearly 2,000 were deleted.

This was planned. These are my notes from our discussions and our actions at the time:

October 3:
Met with JSR and SF to discuss refining the data in the LTD. First, we create a duplicate of the existing db. SF will generate a list of known-good titles (those that have been fully edited using the final protocols). I'll then generate lists of titles that don't match that set, which will be candidates for deletion; SF will check those lists, and then we delete the titles. After that we generate lists of now-unlinked people (owners and sellers), other documents, and legal descriptions, which are in turn assessed as candidates for deletion.
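For anyone retracing this, the procedure above can be sketched roughly as follows. This is a minimal illustration only, not the queries we actually ran: the table and column names (titles, owners, title_owners) and the sample rows are hypothetical, and it uses an in-memory SQLite database rather than the real db.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory db with a HYPOTHETICAL schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE titles (id INTEGER PRIMARY KEY, label TEXT);
        CREATE TABLE owners (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE title_owners (title_id INTEGER, owner_id INTEGER);
        INSERT INTO titles VALUES (1, 'T1'), (2, 'T2'), (3, 'T3');
        INSERT INTO owners VALUES (10, 'A'), (11, 'B'), (12, 'C');
        INSERT INTO title_owners VALUES (1, 10), (2, 11), (3, 11), (3, 12);
    """)
    return conn


def deletion_candidates(conn, known_good_ids):
    """Titles that are NOT in the known-good set: candidates to be checked by hand."""
    all_ids = {row[0] for row in conn.execute("SELECT id FROM titles")}
    return sorted(all_ids - set(known_good_ids))


def delete_titles(conn, title_ids):
    """Delete the checked titles together with their owner links."""
    params = [(t,) for t in title_ids]
    conn.executemany("DELETE FROM title_owners WHERE title_id = ?", params)
    conn.executemany("DELETE FROM titles WHERE id = ?", params)


def unlinked_owners(conn):
    """Owners no longer linked to any title (an anti-join via LEFT JOIN)."""
    rows = conn.execute("""
        SELECT o.id
        FROM owners AS o
        LEFT JOIN title_owners AS l ON l.owner_id = o.id
        WHERE l.owner_id IS NULL
        ORDER BY o.id
    """)
    return [row[0] for row in rows]


conn = build_demo_db()
candidates = deletion_candidates(conn, known_good_ids={1, 2})
delete_titles(conn, candidates)
orphans = unlinked_owners(conn)
```

In the sample data, owner 11 survives because it is still linked to a known-good title, while owner 12 becomes unlinked once title 3 is removed. The same anti-join pattern would apply to other documents and legal descriptions.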

October 14:
SF and I have been working on generating and checking lists of records which we believe can be deleted from the db. I've made a landscapes_backup db and cloned the current content into it before we start deleting; it looks like we'll be removing over 2,000 title records, but we're still doing some checking; then we'll remove associated unlinked items.

October 17:
Did a number of planned deletions, and then some additional work identifying now-unlinked owners; there are 2713 which could be deleted. Many of these were added by the recent research team, but their titles were then mistakenly linked to earlier identical or similar entries. SF is now analysing this situation, and we will eventually prune all the unwanted owners from the system. Then comes the issue of identifying and linking or merging owners we believe to be the same person.

October 27:
Generated lists of owners who were previously connected to Blocks 43 and 52, so that they can be eliminated from the db if they're no longer connected to anything else. SF is doing this work.
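One way such a list could be produced is by comparing the pre-deletion backup with the current db. The sketch below is an assumption about the approach, not a record of what we ran: the schema (titles with a block column, title_owners) and the sample rows are hypothetical, and both databases are small in-memory SQLite stand-ins.

```python
import sqlite3

# HYPOTHETICAL schema, for illustration only.
SCHEMA = """
CREATE TABLE titles (id INTEGER PRIMARY KEY, block INTEGER);
CREATE TABLE title_owners (title_id INTEGER, owner_id INTEGER);
"""


def make_db(titles, links):
    """Build an in-memory db from (id, block) and (title_id, owner_id) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO titles VALUES (?, ?)", titles)
    conn.executemany("INSERT INTO title_owners VALUES (?, ?)", links)
    return conn


def owners_on_blocks(conn, blocks):
    """Owner ids linked to at least one title on the given blocks."""
    marks = ",".join("?" for _ in blocks)
    rows = conn.execute(
        "SELECT DISTINCT l.owner_id FROM title_owners AS l "
        "JOIN titles AS t ON t.id = l.title_id "
        f"WHERE t.block IN ({marks})",
        tuple(blocks),
    )
    return {row[0] for row in rows}


def linked_owners(conn):
    """Owner ids that are linked to any title at all."""
    rows = conn.execute("SELECT DISTINCT owner_id FROM title_owners")
    return {row[0] for row in rows}


def removable_owners(backup, current, blocks=(43, 52)):
    """Owners tied to the given blocks in the backup and to nothing in the current db."""
    return sorted(owners_on_blocks(backup, blocks) - linked_owners(current))


# Backup: titles 1 and 2 sit on Blocks 43 and 52; title 3 is elsewhere.
backup = make_db(
    titles=[(1, 43), (2, 52), (3, 10)],
    links=[(1, 100), (2, 101), (3, 101), (3, 102)],
)
# Current: the Block 43 and 52 titles have been deleted.
current = make_db(titles=[(3, 10)], links=[(3, 101), (3, 102)])

to_remove = removable_owners(backup, current)
```

Here owner 101 was on Block 52 but is kept because it still has a link to title 3; only owner 100 ends up on the removal list.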

November 7:
SF and I worked through a lot of different approaches to confirming that no useful data was deleted during our cleanup. We have plausible explanations for all but 32 of the orphaned owners, and we have identified about ten titles which were deleted by editors during the summer work period. These must have been purposefully deleted around the time they were created -- they never got saved into a backup -- so they must have been erroneously entered. I think these deleted titles plausibly explain the remaining orphaned owners.

In the process I added a new "titles as seller" field to the owners table, and we confirmed the consistency of the data in lots of other respects, so we're looking good.

So we know these deletions were intentional and carefully checked, and that they seem to have been primarily associated with Blocks 43 and 52. If the intention was that these titles should be re-captured, that apparently never happened.


