Long discussion: next stage
Posted by mholmes on 15 May 2012 in Activity log
JS-R is able to work with the current spreadsheet for the May presentation, but would like some more elaborate output for the next phase. These are the details we've discussed:
- The initial db XML output should go through a transformation which basically takes all the information encoded in relations and makes it explicit on individual records. So, for instance, all owners should have explicit ethnicities realized on the owner record; each title should have complete copies of all its owners; and so on. This will make it much easier, and faster, to generate other views of the data.
- This output also needs to include some new boolean flags on titles:
  - Sale to self (as currently created during the spreadsheet transaction transform).
  - Possible family sale (ditto).
  - Liquidated property: properties sold by a Japanese owner to custodians or the state between the beginning of 1943 and the end of 1946. The custodian category would be set as an institution type by JS-R.
  - Control for liquidated property: any title which is not flagged as liquidated property, is neither a family transaction nor a sale to self, and takes place from 1943-01-01 through 1946-12-12.
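The flag rules above could be sketched in Python roughly as follows. This is an illustration only, not the project's transform: the `Title` fields (`seller_is_japanese`, `buyer_is_custodian_or_state`, and so on) and the `classify` function are hypothetical names, and the date bounds are taken as written above, with "end of 1946" read as 1946-12-31.

```python
from dataclasses import dataclass
from datetime import date

# Date windows as given in the notes above (assumed inclusive).
LIQ_START, LIQ_END = date(1943, 1, 1), date(1946, 12, 31)
CONTROL_START, CONTROL_END = date(1943, 1, 1), date(1946, 12, 12)


@dataclass
class Title:
    # All field names are hypothetical stand-ins for the real record structure.
    title_id: str
    sale_date: date
    seller_is_japanese: bool
    buyer_is_custodian_or_state: bool
    sale_to_self: bool = False
    family_sale: bool = False


def classify(t: Title) -> dict:
    """Return the LIQ and LIQ_CONTROL flags for one title."""
    liq = (
        t.seller_is_japanese
        and t.buyer_is_custodian_or_state
        and LIQ_START <= t.sale_date <= LIQ_END
    )
    # A control title is in the window but is not LIQ, family, or self sale.
    control = (
        not liq
        and not t.sale_to_self
        and not t.family_sale
        and CONTROL_START <= t.sale_date <= CONTROL_END
    )
    return {"LIQ": liq, "LIQ_CONTROL": control}


liquidated = Title("T1", date(1944, 5, 1), True, True)
ordinary = Title("T2", date(1944, 5, 1), False, False)
family = Title("T3", date(1944, 5, 1), False, False, family_sale=True)
late = Title("T4", date(1950, 1, 1), True, True)
```

Keeping the two windows as separate constants means either can be adjusted without touching the rule itself.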
- We need a view which constitutes a chain of transactions, constructed by preceding title. The way to construct the chains is:
  - Order titles by date ascending.
  - Start from the first.
  - Look for another title which has this one as its preceding title. Add that to the chain, and continue.
  - Every time you add a title to a chain, flag the title as having been used.
  - If you find two titles with the current title as preceding, then you have a fork. Annotate the end of the current chain to point to those two titles, and start new chains from each of those titles. Annotate the first link in the new chains to point back to the fork title.
  - If your current chain hits a title which has already been used, then you have a merge. In that case, split the previously constructed chain into two at that title and annotate the break points. Then stop your current chain and annotate its end, so that you end up with two chains which end by pointing to a single chain which continues.
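The chain-building steps above might look something like the following in Python. It is a sketch under assumptions, not the actual implementation: `Title`, `Chain`, `build_chains`, and the `came_from` / `points_to` annotations are hypothetical names, a title is assumed to be able to list more than one preceding title (which is what makes a merge possible), and the data is assumed to contain no cycles.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Title:
    title_id: str
    sale_date: date
    preceding: list = field(default_factory=list)  # ids of preceding titles


@dataclass
class Chain:
    titles: list = field(default_factory=list)
    came_from: list = field(default_factory=list)   # back-pointers (fork/merge)
    points_to: list = field(default_factory=list)   # forward pointers (fork/merge)


def build_chains(titles):
    ordered = sorted(titles, key=lambda t: t.sale_date)
    successors = defaultdict(list)
    for t in ordered:
        for p in t.preceding:
            successors[p].append(t.title_id)

    chains = []
    owner = {}         # the "used" flag: title id -> chain holding it
    pending = deque()  # (start title, fork title) pairs still to be followed

    def split_at(title_id):
        """Make title_id the first link of a chain, splitting if needed."""
        head = owner[title_id]
        idx = head.titles.index(title_id)
        if idx == 0:
            return head
        tail = Chain(head.titles[idx:], [head.titles[idx - 1]], head.points_to)
        head.titles, head.points_to = head.titles[:idx], [title_id]
        for tid in tail.titles:
            owner[tid] = tail
        chains.append(tail)
        return tail

    def run(start: str, back: Optional[str]) -> None:
        chain = Chain(came_from=[back] if back else [])
        current, previous = start, back
        while True:
            if current in owner:  # merge: this title has already been used
                target = split_at(current)
                if previous is not None and previous not in target.came_from:
                    target.came_from.append(previous)
                if chain.titles:
                    chain.points_to = [current]
                return
            if not chain.titles:
                chains.append(chain)
            chain.titles.append(current)
            owner[current] = chain
            nxt = successors.get(current, [])
            if len(nxt) != 1:
                if len(nxt) > 1:  # fork: end here, start one chain per branch
                    chain.points_to = list(nxt)
                    pending.extend((n, current) for n in nxt)
                return
            previous, current = current, nxt[0]

    for t in ordered:
        if t.title_id not in owner:
            pending.append((t.title_id, None))
            while pending:
                run(*pending.popleft())
    return chains


# A forks into C and D after B; C and D then merge again at E.
example = [
    Title("A", date(1940, 1, 1)),
    Title("B", date(1941, 1, 1), ["A"]),
    Title("C", date(1942, 1, 1), ["B"]),
    Title("D", date(1942, 6, 1), ["B"]),
    Title("E", date(1943, 1, 1), ["C", "D"]),
]
chains = build_chains(example)
```

For this small example the result is four chains: `A, B` pointing forward to `C` and `D`; `C` and `D` each pointing back to `B` and forward to `E`; and `E` pointing back to both `C` and `D`.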
- In the current spreadsheet output, the LIQ and LIQ_CONTROL flags would be output, along with a generation number for any title which has one of these flags. The generation number is the count of transactions subsequent to the custodian transaction (in the case of LIQ titles), or subsequent to the first transaction following 1943-01-01 (in the case of LIQ_CONTROL properties).