18/01/19

Permalink 05:24:01 pm, by mholmes, 1 words, 9 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 437 + 1 = 438 hours G&T

ISE.

17/01/19

Permalink 05:42:07 pm, by mholmes, 18 words, 9 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 435 + 2 = 437 hours G&T

Determined to get an article both finished and submitted to a journal call. It needed a concentrated sprint.

Permalink 05:41:17 pm, by mholmes, 56 words, 10 views   English (CA)
Categories: Activity log; Mins. worked: 420

JTEI day

Dealt with two submissions, including a full copyedit of one of them; met with KT and went through the list of todos; made some edits to one which is in proofing; and turned my own original proposal document and subsequent presentation into a full article, which I submitted to the conference call for the TEI 2018 conference.

Permalink 05:05:25 pm, by jtakeda, 241 words, 8 views   English (CA)
Categories: Activity log; Mins. worked: 480

Major rework of annotation matching code

I realized that the annotation matching code was a bit too generous with its selection of nodes, due to the preceding:: and following:: axis. I've now reworked it into something much more efficient and cleaner. The problem is that there are a number of different ways in which an editor might select a bit to annotation; the best way would be to point directly to an element (say an l or an lg), but most often editors want to gloss particular words or phrases using an anchor, so we have to handle both (and both in combination: say something starts at a line but ends halfway through another). So the solution to this is to first generate all the nodes that are between the two boundary points (using the incredibly fast << and >> selectors, which I've just learnt about) and then group them based off of ancestor nodes that were in that original selection of nodes. (That's not a very elegant way to describe it, but it's all documented heavily in the XSLT.)

This new process shored up a number of issues with the annotations. Many of them are aesthetic, some are functional, and some are processual. We'll have to have a set of diagnostics to check whether or not something that was annotated with anchors could be better annotated by pointing to a line or a line group. That won't be trivial, so that'll take some time

16/01/19

Permalink 04:20:18 pm, by mholmes, 172 words, 6 views   English (CA)
Categories: Activity log; Mins. worked: 180

Wrote XSLT to amend faulty rhyme labels

KF asked me today if we could have a fix for rhyme-labelling issues, in particular where you get through an entire labelling sequence then you discover that where you assigned label "m" you failed to notice that it was the same as "c" so you need to change it to "c" and re-label everything after "m" to move it down one slot. This is a slightly more tricky problem than you might think, so rather than try to squash it into a quickfix, which doesn't allow for as much graceful termination with useful messages when you ask it to do something that doesn't make sense, I've created it as an XSLT transform with a framing transformation scenario, which is not the default for XML documents but which you could run on them by using "Transform with". I've also written a fairly extensive XSpec test suite for it. In the process of developing and testing, I fixed some old encoding from back in the day. That's going to be a long steady process.

Permalink 03:50:17 pm, by mholmes, 1972 words, 6 views   English (CA)
Categories: Activity log; Mins. worked: 180

Deep dive into one Maple Ridge property

As instructed by JSR, did a deep dive into the property. Results:


Summary: the lot spreadsheet appears to be constructed (in this test case, at least) according to the rules that we set up. The full process for checking this appears in a short version and a long version below, but first, my take:

MY TAKE ON THIS:

There are two issues here. First, the Maple Ridge properties start to fragment very quickly indeed, giving rise to a huge number of descendant properties and therefore titles; the chances of getting a simple, clean sequence which aligns with the narrative you're chasing are probably pretty slim.

Second, we don't have a rule which says that when we are retrieving T3, it should NOT be a custodian sale IF the T2 is a custodian sale. I think it would make sense to have that rule, to avoid the situation we see in this case where T2, T3 and T4 are all custodian sales. Alternatively, we could adopt a rule which says that when there are multiple custodian sales to choose from at the T2 stage, we always take the latest rather than the earliest. Nevertheless, these measures may not solve your problem; do we need a similar rule that says that if T3 is a VLA sale, then T4 is not allowed to be one? What if T3 is not a VLA sale, but T4 turns out to be one?

Another option would be to adopt a rule which says that at every stage, the subsequent property in the sequence must be a descendant property not of the original ur-property but of the property or properties which were on the preceding retrieved title -- so T3 must be on descendant properties of the properties on T2, not on T1, and T4 must cover descendant properties of T3. This is more likely to result it what looks like a coherent sequence, but it is also much more likely to fail to retrieve a successful sequence at all, resulting in a much diminished number of spreadsheet rows, and therefore less statistically convincing results.

---------------------------------------------------------------------

DEEP DIVE INTO PROPERTY 21178

Base property: T:12 S:20 Q:NE 1/4 PL:3189 L:1 (p_21178)

---------------------------------------------------------
QUERY on overlaps.xml:

declare default element namespace "http://hcmc.uvic.ca/ns";

let $baseProp := '21178',
$overlaps := //overlap[@prop1=$baseProp or @prop2=$baseProp][xs:float(@percentProp1InProp2) ge 50],
$overlapIds := for $o in $overlaps return ($o/@prop1, $o/@prop2),
$distinctOverlapIds := distinct-values($overlapIds)
return string-join($distinctOverlapIds, ', ')
---------------------------------------------------------

Overlapping properties: 21373, 21298, 21297, 21622, 21623, 21616, 21617, 21618, 21621, 21619, 21620, 21574, 21575, 21416, 21682, 21681, 21374, 21614, 21376, 21375, 21615, 21330, 21329, 21255, 21252, 21257, 21256, 21254, 21253

Ids of titles which cover all of these properties:

---------------------------------------------------------
XQuery on landscapes_mapridgelive_current_titleBased.xml:

let $titles := //title[props/prop[@refid=('21373', '21178', '21298', '21297', '21622', '21623', '21616', '21617', '21618', '21621', '21619', '21620', '21574', '21575', '21416', '21682', '21681', '21374', '21614', '21376', '21375', '21615', '21330', '21329', '21255', '21252', '21257', '21256', '21254', '21253')]],

$titleDesc := for $t in $titles order by $t/titleInfo/ttl_date return concat($t/@id, ': ', $t/titleInfo/ttl_title_code, ' (', $t/titleInfo/ttl_date, ')')

return string-join((for $t in $titles order by $t/titleInfo/ttl_date return $t/@id), ', ')
---------------------------------------------------------

26684, 26959, 27763, 26688, 26675, 26723, 26732, 26733, 26725, 26730, 26788, 26796, 26791, 26792, 26799, 26807, 26811, 26809, 26812, 26814, 26816, 26762, 26763, 26883, 26927, 26854, 26884, 26885, 26929, 26974, 26984, 26857, 26849, 26987, 26871, 27564, 26945, 26956, 26944, 26955, 26958, 26994, 27581, 27582, 27929, 27014, 27015, 27016, 27571, 27017, 27020, 27654, 26949, 26970, 27930, 27649, 27936, 27021, 27667, 27668, 27712, 27714, 27673, 27729, 27728, 27730, 27731, 27734, 27733, 27677, 27940, 27727, 27759, 27679, 27766, 27710, 27769, 27724, 27708, 27761

Details of these titles:

XQuery on landscapes_mapridgelive_current_lotBased.xml:
---------------------------------------------------------
let $titleIds := ('26684', '26959', '27763', '26688', '26675', '26723', '26732', '26733', '26725', '26730', '26788', '26796', '26791', '26792', '26799', '26807', '26811', '26809', '26812', '26814', '26816', '26762', '26763', '26883', '26927', '26854', '26884', '26885', '26929', '26974', '26984', '26857', '26849', '26987', '26871', '27564', '26945', '26956', '26944', '26955', '26958', '26994', '27581', '27582', '27929', '27014', '27015', '27016', '27571', '27017', '27020', '27654', '26949', '26970', '27930', '27649', '27936', '27021', '27667', '27668', '27712', '27714', '27673', '27729', '27728', '27730', '27731', '27734', '27733', '27677', '27940', '27727', '27759', '27679', '27766', '27710', '27769', '27724', '27708', '27761'),

$titles := for $tId in $titleIds return //titles/title[@id = $tId][not(preceding::titles/title[@id=$tId])],

$titleInfos := for $t in $titles order by $t/effectiveDate return concat($t/@id, ': ', $t/code, '	',
$t/effectiveDate, ' ',
' nominal: ', if ($t/@isNominal = 'true') then 'yes ' else 'no  ',
' custSale: ', if ($t/hasCustodianSeller = '1') then 'yes ' else 'no  ',
' vlaBuy: ', if ($t/hasVlaOwner = '1') then 'yes ' else 'no  ',
' vlaSell: ', if ($t/hasVlaSeller = '1') then 'yes ' else 'no  '
)

return string-join($titleInfos, '
')
---------------------------------------------------------

26688: 32738E    1920-05-10  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26675: 74159E    1927-10-16  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26684: 106669E    1934-03-03  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26723: 106822E    1934-03-21  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26732: 107183E    1934-04-24  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26733: 107254E    1934-05-02  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26725: 107963E    1934-07-18  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26730: 109668E    1934-12-06  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26788: 113071E    1935-10-29  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26796: 116972E    1936-09-18  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26791: 132983E    1939-09-15  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26792: 149801E    1942-04-02  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26799: 166279E    1943-12-28  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26807: 169926E    1944-04-17  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26811: 169945E    1944-04-17  nominal: no   custSale: yes  vlaBuy: yes vlaSell: no
26809: 169933E    1944-04-18  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26812: 169952E    1944-04-18  nominal: no   custSale: yes  vlaBuy: yes vlaSell: no
26814: 170808E    1944-04-25  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26816: 170828E    1944-04-25  nominal: no   custSale: yes  vlaBuy: yes vlaSell: no
26762: 171885E    1944-05-18  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26763: 171903E    1944-05-18  nominal: no   custSale: yes  vlaBuy: yes vlaSell: no
26883: 175591E    1944-09-02  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26927: 281892E    1951-04-16  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26854: 291940E    1952-01-11  nominal: no   custSale: no   vlaBuy: no vlaSell: yes
26884: 295379E    1952-04-16  nominal: no   custSale: no   vlaBuy: no vlaSell: yes
26885: 295380E    1952-04-16  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26929: 313897E    1953-08-05  nominal: no   custSale: no   vlaBuy: yes vlaSell: no
26974: 322796E    1954-02-23  nominal: no   custSale: no   vlaBuy: no vlaSell: yes
26984: 359159E    1956-01-26  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26857: 412345E    1958-07-28  nominal: yes  custSale: no   vlaBuy: yes vlaSell: no
26849: 423191E    1958-12-05  nominal: yes  custSale: no   vlaBuy: no vlaSell: yes
26987: 448583E    1959-12-14  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26871: 450855E    1960-01-07  nominal: yes  custSale: no   vlaBuy: no vlaSell: yes
27564: 452250E    1960-01-18  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26945: 467021E    1960-09-26  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26956: 469702E    1960-11-17  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26944: 468183E    1960-11-30  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26955: 468184E    1960-11-30  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26958: 484848E    1961-09-12  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
26994: 507210E    1962-11-20  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26959: 533338E    1964-04-09  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27581: 538458E    1964-07-13  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27582: 538459E    1964-07-13  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27929: 608026E    1967-08-14  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27014: 627673E    1968-03-13  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27015: 627674E    1968-03-13  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27016: 627675E    1968-03-13  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27571: 634784E    1968-06-18  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27017: 634747E    1968-07-09  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27020: 640865E    1968-09-25  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27654: 656339E    1969-03-10  nominal: no   custSale: no   vlaBuy: no vlaSell: no
26949: 658596E    1969-04-09  nominal: yes  custSale: no   vlaBuy: no vlaSell: yes
26970: 658597E    1969-04-09  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27930: 660594E    1969-05-02  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27649: 675235E    1969-10-30  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27936: 683572E    1970-01-27  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27021: 728793E    1971-09-21  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27667: G72055E    1971-10-29  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27668: G112249E    1972-02-01  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27712: H10857E    1972-02-24  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27714: H10856E    1972-02-24  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27673: H14697E    1972-02-28  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27729: H48509E    1972-05-30  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27728: H42064E    1972-06-07  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27730: H58317E    1972-06-20  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27731: H60992E    1972-07-10  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27734: H65306E    1972-07-24  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27733: H90242E    1972-09-13  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27677: H92378E    1972-09-15  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27940: H96413E    1972-09-27  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27727: H101791E    1972-10-18  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27759: H124001E    1972-12-06  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27679: J5100E           1973-01-24  nominal: no   custSale: no   vlaBuy: no   vlaSell: no
27766: J55243E    1973-06-06  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27710: J71976E    1973-07-04  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27763: J86568E    1973-07-06  nominal: no   custSale: no   vlaBuy: no vlaSell: no
27769: K20906E    1974-03-14  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27724: K68691E    1974-08-02  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27708: K85245E    1974-08-20  nominal: yes  custSale: no   vlaBuy: no vlaSell: no
27761: L91605E    1975-11-07  nominal: no   custSale: no   vlaBuy: no vlaSell: no


To select T1 for this property:

Get the latest non-nominal property FOR THE ORIGINAL BASE PROPERTY before the starting date (1942-03-04):

26675: 74159E    1927-10-16  nominal: no   custSale: no   vlaBuy: no vlaSell: no

To select T2 for this property and its descendants we first look for any custodian seller
titles in the period at all, nominal or not:

26811: 169945E    1944-04-17  nominal: no   custSale: yes  vlaBuy: yes vlaSell: no
26812: 169952E    1944-04-18  nominal: no   custSale: yes  vlaBuy: yes vlaSell: no
26816: 170828E    1944-04-25  nominal: no   custSale: yes  vlaBuy: yes vlaSell: no
26763: 171903E    1944-05-18  nominal: no   custSale: yes  vlaBuy: yes vlaSell: no

Since we find some, we take the first in the list as our T2:

26811: 169945E    1944-04-17  nominal: no   custSale: yes  vlaBuy: yes vlaSell: no

To get T3, we now get the next qualifying title (i.e. the next qualifying title which follows
the custodian sale. That also happens to be a custodian sale, this time to the VLA:

26812: 169952E    1944-04-18  nominal: no   custSale: yes  vlaBuy: yes vlaSell: no

To get T4, we do the same thing again: get the next qualifying title, which turns out
to be another custodian sale:

26816: 170828E    1944-04-25  nominal: no   custSale: yes  vlaBuy: yes vlaSell: no

To get T5, we get the last qualifying transaction on a descendant property which happens before
the end date (1972-03-04):

27936: 683572E    1970-01-27  nominal: no   custSale: no   vlaBuy: no vlaSell: no

This sequence makes sense according to the rules and the spreadsheet. 

15/01/19

Permalink 06:25:22 pm, by jtakeda, 109 words, 8 views   English (CA)
Categories: Activity log; Mins. worked: 320

H5 conversion

After corresponding with JM, determined list of priorities for the conversion. I decided to do H5 first since it is one of the most densely annotated/collated and the editorial process should be swift.

Converted H5_FM and Q1M and got annotations working; there were a few bugs in the HTML code, which I fixed. Also discovered the issue with part lines; the IML usually flagged the part lines with an empty line preceding the inferred line (i.e. via one of the LNs), so re-wrote bits of the conversion to handle that. Also had to make various fixes to the apparatus linking code to handle supplied tags.

Permalink 03:40:11 pm, by mholmes, 89 words, 7 views   English (CA)
Categories: Activity log; Mins. worked: 120

ISE: Crawled, now fixing

I now have (I hope) full crawls of the ISE site, and I will carry on to get full crawls of the others. I'm now at the stage where I'm trying to fix problems, such as the prominent message touting a user survey which needs to be removed from every page, and the fact that the nbsp entity is referenced everywhere (it should be a numerical entity). A few more hours of this sort of work will be needed, given the scale of the project (19GB+ for ISE itself).

Permalink 01:59:40 pm, by mholmes, 104 words, 11 views   English (CA)
Categories: Activity log; Mins. worked: 120

Old volume tagging: lots of updates needed

As JN proceeds with her tagging of volume 14, we're discovering that there are lots of cases in which the old volumes do things that we don't currently allow (such as putting footnotes in article titles). I've had to add processing for those things. I'm developing separate guidelines for encoding these old articles on the basis that they're never going to get the fully-detailed encoding I usually do, but they need enough to ensure that they're rendered correctly, and at a pinch, this documentation could be provided to a future encoder replacing me on this journal, since it will work just fine for rapid encoding.

14/01/19

Permalink 05:19:42 pm, by mholmes, 2 words, 9 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 434 + 1 = 435 hours G&T

Late duty.

:: Next Page >>

All HCMC Blogs

Actions

Reports

XML Feeds