Steady progress...
...of the Japanese Provisional names.
Perhaps a week or so to finish the JProv names from here.
Moving steadily along. I keep finding and fixing errors in the encoding too. Today I found three people missing from the 1942 person directory.
Slow but steady progress through the Japanese Provisional names.
Prior to my going on vacation at the end of June, we completed the following tasks:
- add legal information to Mizuta spreadsheet of street addresses in Steveston
- scan map pages from Mizuta book on Steveston
- scan map pages from Yamaga book on Haney
- scan map pages from Hashizume book on Mission (to help decide if area should be extended)
- transcribe into XML 1949 BC and Yukon Directory Street listing for Vancouver
This post and the previous one cover what we did (and plan to do) this summer.
In my month's absence and in the week and a half since I got back, Josie and Ariel have completed work on virtually all of the tasks assigned before I left, namely transcribing into valid XML:
- 1949 BC and Yukon Person Directory for Vancouver
- 1930 Wrigley Directory for Haney
- 1930 Wrigley Directory for Steveston
- 1942 BC and Yukon Directory for Haney
- 1942 BC and Yukon Directory for Steveston
- 1943 BC and Yukon Directory for Haney
- 1943 BC and Yukon Directory for Steveston
- 1948 BC and Yukon Directory for Haney (1949 not available)
- 1948 BC and Yukon Directory for Steveston (1949 not available)
- list of trade licences (~800) for Vancouver
- list of incorporated companies (~100) for Vancouver
- file of protest letters and Custodian responses (~480 documents)
In addition, we've:
- transcribed into a spreadsheet 4 lists of unsold properties held by the Custodian
- reviewed assignment of ethnicity for 9000 names and noted errors
- proofed and corrected markup for names
Todo:
- complete the review of ethnicity
- for protest letters, maybe add an attribute to explicitly connect documents into correspondences, though it may be doable programmatically
- for protest letters, modify the schema to allow markup of names and money values within list-item elements, then mark up the file appropriately
- transcribe into XML the biography information from the Mission history book, as a number of those people appear in the protest letters
Josie finishes in the next day or two; Ariel has 4 weeks left, as she has been working part-time while taking summer courses.
The xml_name_eth_diagnostics.xsl file goes through all the XML data files and, for each persName, generates a record with that person's name. The original block of code grabbed the surname and forename and left any other text, but took them in the order they appeared in the data file, so in the output text file those two pieces of information were not ordered consistently.
It took me a few hours to write some code that gets me acceptable results, i.e. the surname followed by the forename (or the expanded version of the forename if there is an abbr and an expan), with no addName or anything else.
<xsl:template match="persName">
  <!-- SLA160803 : original code is first block below - didn't ensure consistent ordering; lower block forces order and includes expan if it exists, otherwise includes text of forename
  <xsl:variable name="content"><xsl:apply-templates/></xsl:variable>
  <xsl:value-of select="normalize-space($content)"/>
  -->
  <xsl:variable name="surname" select="string(./surname)" />
  <xsl:choose>
    <xsl:when test="./forename/choice">
      <xsl:variable name="forename" select="string(./forename/choice/expan)" />
      <xsl:value-of select="normalize-space(concat($surname, ' ', $forename))" />
    </xsl:when>
    <xsl:otherwise>
      <xsl:variable name="forename" select="string(./forename)" />
      <xsl:value-of select="normalize-space(concat($surname, ' ', $forename))" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
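For anyone who wants to check the ordering logic outside the stylesheet, here is a rough Python sketch of the same rule: surname first, then the expan if the forename contains a choice, otherwise the forename text. It is not part of the project code. The helper names (text_of, display_name) and the two sample names are made up for illustration, and it assumes the elements are not in a namespace.

```python
import xml.etree.ElementTree as ET

def text_of(element):
    """All descendant text of an element, or '' if the element is missing."""
    return "".join(element.itertext()) if element is not None else ""

def display_name(pers_name):
    """Return 'surname forename', using forename/choice/expan when a choice exists."""
    surname = text_of(pers_name.find("surname"))
    forename = pers_name.find("forename")
    if forename is not None and forename.find("choice") is not None:
        given = text_of(forename.find("choice/expan"))
    else:
        given = text_of(forename)
    # Collapse runs of whitespace, roughly what normalize-space() does in XPath.
    return " ".join(f"{surname} {given}".split())

# Invented sample data: one abbreviated forename with an addName, one plain name.
SAMPLE = """<list>
  <persName><forename><choice><abbr>Wm</abbr><expan>William</expan></choice></forename> <surname>Tanaka</surname> <addName>Jr</addName></persName>
  <persName><surname>Sato</surname>, <forename>Kenji</forename></persName>
</list>"""

names = [display_name(p) for p in ET.fromstring(SAMPLE).iter("persName")]
print(names)  # ['Tanaka William', 'Sato Kenji']
```

Both sample records come out surname first regardless of the order in the source, and the addName is dropped.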
In sorting this out, I discovered that our schema and schematron allow a persName element to contain more than one surname or forename element, but this processing code breaks in that case. I found and edited the approximately 15 instances, all of which were human error of one kind or another. Since there are no legitimate cases of a persName having more than one surname or more than one forename element, I've left this code assuming a maximum of one surname and one forename per persName.
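A check along these lines could be used to catch new cases of repeated surname or forename elements before they reach the diagnostics stylesheet. This is only a sketch, not the method actually used to find the 15 instances: the function name repeated_name_parts and the sample data are invented, and it assumes the elements are not in a namespace.

```python
import xml.etree.ElementTree as ET

def repeated_name_parts(root):
    """Yield (name text, surname count, forename count) for each persName
    that has more than one surname or more than one forename child."""
    for pers_name in root.iter("persName"):
        n_surnames = len(pers_name.findall("surname"))    # direct children only
        n_forenames = len(pers_name.findall("forename"))  # direct children only
        if n_surnames > 1 or n_forenames > 1:
            text = " ".join("".join(pers_name.itertext()).split())
            yield text, n_surnames, n_forenames

# Invented sample data: the second record has a duplicated surname element.
SAMPLE = """<list>
  <persName><surname>Sato</surname> <forename>Kenji</forename></persName>
  <persName><surname>Ito</surname> <surname>Ito</surname> <forename>Hana</forename></persName>
</list>"""

problems = list(repeated_name_parts(ET.fromstring(SAMPLE)))
print(problems)  # [('Ito Ito Hana', 2, 1)]
```

The equivalent XPath test would be something like persName[count(surname) > 1 or count(forename) > 1], which could also be turned into a schematron rule if we decide the schema should forbid these cases outright.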
convert *.jpg -interlace Plane -gaussian-blur 0.05 -compress jpeg -quality 85 ${FNAME}
convert -quality 100 -units PixelsPerInch -density 96 picture1.png picture2.jpg picture3.tif output.pdf