These are some early results from generating transaction chains through XSLT. Here is an example of what I'm pulling out so far, and of the sorts of oddities that are being revealed:
<transaction-chain>
<title key="206" property-id="101" property-name="B:103 L:003"/>
<transaction-chain>
<title key="249" property-id="101" property-name="B:103 L:003"/>
<title key="204" property-id="101" property-name="B:103 L:003"/>
<title key="157" property-id="101" property-name="B:103 L:003"/>
<title key="25" property-id="71" property-name="B:011 L:026"/>
</transaction-chain>
<transaction-chain>
<title key="157" property-id="101" property-name="B:103 L:003"/>
<title key="25" property-id="71" property-name="B:011 L:026"/>
</transaction-chain>
</transaction-chain>
This shows nested chains. Title 206 starts the initial chain; 249 is then split from it (while 206 presumably continues?). 249 becomes 204, and then the split is rejoined: 157 has both 206 and 204 as preceding titles.
I don't know if this makes sense -- can a title be split into itself and another title, as seems to be the case here with 206? There do seem to be lots of examples of this in the database.
My system currently captures splits like this well, but it doesn't yet unify chains which come back together again (so the two interior chains in the above example both have 157 -> 25). A subsequent transformation could easily detect such merges and represent them somehow, but it's not clear how. If we don't do that, then you would end up with two distinct chains:
- 206 -> 249 -> 204 -> 157 -> 25
- 206 -> 157 -> 25
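As a rough illustration of how the two chains above fall out of the nested output, here is a minimal Python sketch (not the XSLT I'm actually using). It assumes that, inside each transaction-chain element, the title elements come before any nested transaction-chain elements; the function name `flatten` is made up for the example.

```python
import xml.etree.ElementTree as ET

# The first nested example from this post.
SAMPLE = """
<transaction-chain>
  <title key="206" property-id="101" property-name="B:103 L:003"/>
  <transaction-chain>
    <title key="249" property-id="101" property-name="B:103 L:003"/>
    <title key="204" property-id="101" property-name="B:103 L:003"/>
    <title key="157" property-id="101" property-name="B:103 L:003"/>
    <title key="25" property-id="71" property-name="B:011 L:026"/>
  </transaction-chain>
  <transaction-chain>
    <title key="157" property-id="101" property-name="B:103 L:003"/>
    <title key="25" property-id="71" property-name="B:011 L:026"/>
  </transaction-chain>
</transaction-chain>
"""

def flatten(chain, prefix=()):
    """Return every root-to-leaf list of title keys under one chain element."""
    # Direct child titles only; nested chains are handled by the recursion.
    path = list(prefix) + [t.get("key") for t in chain.findall("title")]
    subchains = chain.findall("transaction-chain")
    if not subchains:
        return [path]
    paths = []
    for sub in subchains:
        paths.extend(flatten(sub, path))
    return paths

paths = flatten(ET.fromstring(SAMPLE.strip()))
for p in paths:
    print(" -> ".join(p))
```

Each split simply duplicates the titles that precede it, which is why the shared tail (157 -> 25) turns up in both flattened chains.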
Keeping both of these chains would be problematic for any stats which depend on the number of transactions, because 157 -> 25 appears in both. We could, alternatively, collapse any chain whose titles all appear, in the same order, within another chain, so you would end up with just one here:
- 206 -> 249 -> 204 -> 157 -> 25
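One possible way to do that collapsing, sketched in Python under the assumption that each chain has already been flattened to a plain list of title keys (`is_subsequence` and `collapse` are illustrative names, not part of the existing system):

```python
def is_subsequence(short, long):
    """True if every item of `short` appears in `long`, in the same order."""
    it = iter(long)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in short)

def collapse(chains):
    """Drop any chain that is contained, in order, within another chain."""
    kept = []
    for i, chain in enumerate(chains):
        redundant = any(
            j != i
            and is_subsequence(chain, other)
            # Longer chains absorb shorter ones; of two identical
            # chains, only the first is kept.
            and (len(other) > len(chain) or j < i)
            for j, other in enumerate(chains)
        )
        if not redundant:
            kept.append(chain)
    return kept

rejoined = [[206, 249, 204, 157, 25], [206, 157, 25]]
diverged = [[606, 507, 421], [606, 510, 422]]

print(collapse(rejoined))  # only the longer chain survives
print(collapse(diverged))  # neither contains the other, so both survive
```

Chains that split and never reunite are left alone by this approach, since neither is contained in the other.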
However, collapsing the chains like this would ignore the fact that 157 has 206 as a preceding title. It's also not clear what should happen with chains which diverge but never reunite, such as this:
<transaction-chain>
<title key="606" property-id="211" property-name="B:039 L:005"/>
<transaction-chain>
<title key="507" property-id="211" property-name="B:039 L:005"/>
<title key="421" property-id="211" property-name="B:039 L:005"/>
</transaction-chain>
<transaction-chain>
<title key="510" property-id="214" property-name="B:039 L:008"/>
<title key="422" property-id="214" property-name="B:039 L:008"/>
</transaction-chain>
</transaction-chain>
Here you would conceivably have two distinct chains:
- 606 -> 507 -> 421
- 606 -> 510 -> 422
and any stats based on these would end up counting the sale of 606 twice (which might well be legitimate, because it is split, so there are arguably two transactions).
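To make the double-counting concrete, here is a small Python sketch that compares the two examples in this post. It assumes, purely for illustration, that each adjacent pair of titles in a flattened chain stands for one transaction; the function names are hypothetical.

```python
from collections import Counter

def title_counts(paths):
    """How many flattened chains each title appears in (once per appearance)."""
    return Counter(title for path in paths for title in path)

def step_counts(paths):
    """How often each title-to-title step occurs across all flattened chains."""
    return Counter(step for path in paths for step in zip(path, path[1:]))

rejoined = [[206, 249, 204, 157, 25], [206, 157, 25]]
diverged = [[606, 507, 421], [606, 510, 422]]

for label, paths in (("rejoined", rejoined), ("diverged", diverged)):
    steps = step_counts(paths)
    print(label,
          "| steps summed over chains:", sum(steps.values()),
          "| distinct steps:", len(steps))
```

On this reading, the diverging example has no repeated steps (606 appears twice, but as the start of two different steps), whereas the rejoining example counts 157 -> 25 twice unless steps are de-duplicated.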
It's worth noting that in most of the complex chains I'm seeing, an initial split into two or more titles is then followed by their being re-united very quickly.
Some quick stats:
- 713 primary chains exist (meaning that there are 713 chains which start from a title which has no preceding-title).
- 411 of these primary chains go nowhere (in other words, there is no subsequent title, so no transactions take place other than the primary title purchase).
- Therefore there are 302 instances of actual usable chains involving one or more sales.
- 253 of those chains are simple, in that there are no splits. (There could be unions, though, because I'm not detecting those yet.)
- 53 of the chains split into sub-chains.
- 28 of the chains involve more than one property.
- 186 titles appear in more than one root transaction chain (suggesting there may be up to 100 merges between root chains, something distinct from the examples above where a root chain splits and then merges again).
- 40 root chains feature the same title more than once (meaning that the chain splits, then merges again at some point).
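For anyone wanting to reproduce counts of this kind, the following Python sketch shows one way they might be computed. It is not the method used to get the figures above. It assumes the root chains are gathered under a single wrapper element (the `<chains>` element here is hypothetical), and it uses only the two example chains from this post, with the property-name attributes left out for brevity.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical <chains> wrapper around the two example root chains.
DOC = """
<chains>
  <transaction-chain>
    <title key="206" property-id="101"/>
    <transaction-chain>
      <title key="249" property-id="101"/>
      <title key="204" property-id="101"/>
      <title key="157" property-id="101"/>
      <title key="25" property-id="71"/>
    </transaction-chain>
    <transaction-chain>
      <title key="157" property-id="101"/>
      <title key="25" property-id="71"/>
    </transaction-chain>
  </transaction-chain>
  <transaction-chain>
    <title key="606" property-id="211"/>
    <transaction-chain>
      <title key="507" property-id="211"/>
      <title key="421" property-id="211"/>
    </transaction-chain>
    <transaction-chain>
      <title key="510" property-id="214"/>
      <title key="422" property-id="214"/>
    </transaction-chain>
  </transaction-chain>
</chains>
"""

def summarise(root):
    """Tally the kinds of root chain described in the list above."""
    stats = Counter()
    chains_per_title = Counter()
    for chain in root.findall("transaction-chain"):  # root chains only
        titles = list(chain.iter("title"))           # titles at any depth
        keys = [t.get("key") for t in titles]
        stats["primary"] += 1
        if len(titles) == 1:
            stats["go_nowhere"] += 1
        if chain.find("transaction-chain") is not None:
            stats["split"] += 1
        if len({t.get("property-id") for t in titles}) > 1:
            stats["multi_property"] += 1
        if len(set(keys)) < len(keys):
            stats["repeated_title"] += 1
        chains_per_title.update(set(keys))           # count each chain once
    stats["shared_titles"] = sum(1 for n in chains_per_title.values() if n > 1)
    return stats

stats = summarise(ET.fromstring(DOC.strip()))
for name, value in sorted(stats.items()):
    print(name, value)
```

A repeated title within one root chain is taken here as the sign of a split that later merges, and a title shared between root chains as the sign of a merge between them.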