Archives for: 2010

23/12/10

Permalink 04:30:41 pm, by kim, 81 words, 95 views   English (CA)
Categories: Activity log; Mins. worked: 5

Kim's year-end note

As the year wraps up, I thought I would jot a quick note as we head into holidays.

I continue to add pb tags to the 1859 files. Along the way, I have discovered that we need to process a few more image batches, as follows:

CO 60/5

CO 60/6

CO 398/1

Together, the above contain roughly 120 despatches from 1859.

I will ask Theo to work on this in the new year. Between the two of us, we should make short work of it!

17/12/10

Permalink 02:07:21 pm, by mholmes, 299 words, 70 views   English (CA)
Categories: Activity log; Mins. worked: 240

Progress with OAI code

I've re-focused on the task of generating and storing the OAI records in the database, in such a way that they can be updated easily whenever the db contents change. I've written a library called oai_update.xq, which has the original record-generating code from my first attempt, but massaged a bit so that it uses explicit namespace prefixes for TEI; this is necessary because we need to generate the record fragments in no namespace, so it's easier if we don't have a default one. I also fixed a couple of bugs which emerged when I tested my code on the whole 7000+ documents. This is what it does:

  • For each record in the correspondence collection, it checks whether there's an OAI record.
  • If there isn't, it generates one.
  • If there is, it compares the modified date on the OAI record against that of the original correspondence record, and if the former is older, it deletes it and generates a new one.

This is what it's not yet doing:

  • Removing OAI documents for any correspondence documents that no longer exist (occasionally we remove a document when we find a duplicate). This will be fairly easy to do.
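The decision logic in the two lists above can be sketched outside the database. This is a minimal Python sketch, not the XQuery in oai_update.xq: the function name plan_oai_update and the idea of passing in two id-to-timestamp mappings are assumptions made for illustration. The third list it returns corresponds to the step that is not yet implemented.

```python
from datetime import datetime


def plan_oai_update(source_modified, record_modified):
    """Decide which OAI records to create, regenerate, or remove.

    Both arguments map a document id to its last-modified datetime:
    source_modified for the correspondence documents, record_modified
    for the stored OAI records.
    """
    # Correspondence documents that have no OAI record yet.
    create = sorted(
        doc_id for doc_id in source_modified if doc_id not in record_modified
    )
    # OAI records that are older than the document they describe.
    regenerate = sorted(
        doc_id
        for doc_id, modified in source_modified.items()
        if doc_id in record_modified and record_modified[doc_id] < modified
    )
    # OAI records whose source document no longer exists (e.g. a removed duplicate).
    remove = sorted(
        doc_id for doc_id in record_modified if doc_id not in source_modified
    )
    return create, regenerate, remove
```

Keeping the planning step separate from the store and remove calls would make it possible to check what an update run is going to do before it touches the collection.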

As I write this, I'm generating a set of OAI records for the whole up-to-date collection on my local copy of the machine. In the new year, I should be able to dump those and upload them into the live db to pre-populate it. Then I can add the feature above, and then write sitemap pipelines for the operations and add them to my set of periodic update operation tasks. Finally, I can then finish the OAI interface, which should be much simpler, since it's using existing records instead of querying source data and constructing records.

Reminder to self: the OAI docs are here.

Permalink 08:46:31 am, by mholmes, 156 words, 75 views   English (CA)
Categories: Activity log; Mins. worked: 15

Storing and removing a doc in the db

Note to self: this is a simple, tested way of storing a document in the db:

declare default element namespace "http://www.tei-c.org/ns/1.0";
declare namespace  xdb="http://exist-db.org/xquery/xmldb";
declare namespace util="http://exist-db.org/xquery/util";

let $doc := <doc><test>My test doc</test></doc>,
$coll := collection('/db/coldesp/oai/records/')
return xdb:store($coll, 'test.xml', $doc)

This will delete a document:

return xdb:remove('/db/coldesp/oai/records/', 'test.xml')

This snippet will delete a document if it exists, then replace it:

let $remove := 
	if (fn:doc('/db/coldesp/oai/records/test.xml')) then
           xdb:remove('/db/coldesp/oai/records/', 'test.xml')
	else (),
$doc := <doc><test>My test doc 2</test></doc>,
$coll := collection('/db/coldesp/oai/records/')
return
xdb:store($coll, 'test.xml', $doc)

16/12/10

Permalink 02:06:24 pm, by mholmes, 430 words, 69 views   English (CA)
Categories: Activity log; Mins. worked: 300

OAI-PMH interface: a change of plan

Today I finished the implementation of the GetRecord response, which is very substantial indeed. I then started working on ListIdentifiers, and got to the point where I was able to start testing the execution time of some queries. The results demonstrate that it's going to be entirely impractical to generate this data on-the-fly. We're going to have to generate it in advance and store it, in the OAI record format, and then run the OAI queries against that collection. So this is what I'm now planning to do:

  • Create a collection called oai in the database.
  • Create two collections inside it, one called meta and one called records.
  • Inside meta, store a document called sets.xml, which contains the entire ListSets response.
  • Also inside meta, store a document called identify.xml, which contains the entire Identify response.
  • Finally, inside meta, store a document called metadataFormats.xml, which contains the entire ListMetadataFormats response.
  • Inside records, store a generated full record for every correspondence document, possibly using the same xml:id attribute as on the original document.
  • Take code from my existing oai.xq file, and create a new library which does the following:
    1. For each document in the correspondence collection:
    2. If there's no corresponding record document in the oai/records collection, create one;
    3. Otherwise, if there is one already, check the last-modified date of the document against that in the record file, and if it's later, delete the record file and create a new one.
  • The initial task of creating records will be time and processor-heavy, so it might be done in the admin client in batches (by year, for instance). However, once it's done, an update pipeline can be created so I can call it on the server on a regular basis to update the metadata records.
  • Rewrite my oai.xq library so that it handles all requests using the data in the oai collection.

I've already implemented the three documents inside the meta collection, and simplified my oai.xq accordingly. Now I have to generate the records, before I can start working on the query interface. It's pretty certain I'll have to use the resumptionToken functionality -- I'll perhaps feed out records in sets of (max) 100. I'm going to encode the entire request in the resumptionToken so that I don't have to cache the query or results; that'll be simpler, and will obviate the need to periodically clear out the data from the cache.
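Encoding the entire request in the resumptionToken, as described above, might look something like this. It is a minimal Python sketch rather than the XQuery destined for oai.xq: the function names, the JSON-plus-base64 encoding and the cursor field are all assumptions made for illustration. Only the page size of 100 comes from the plan above.

```python
import base64
import json

PAGE_SIZE = 100  # maximum records per response, as proposed above


def encode_token(request_args, cursor):
    """Pack the original request arguments and the next offset into a token."""
    payload = dict(request_args, cursor=cursor)
    raw = json.dumps(payload, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token):
    """Recover (request_args, cursor) from a token made by encode_token."""
    payload = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    cursor = payload.pop("cursor")
    return payload, cursor


def page(identifiers, request_args, cursor=0):
    """Return one page of results and the token for the next page (or None)."""
    chunk = identifiers[cursor:cursor + PAGE_SIZE]
    next_cursor = cursor + len(chunk)
    if next_cursor < len(identifiers):
        return chunk, encode_token(request_args, next_cursor)
    return chunk, None
```

Because the token carries everything needed to re-run the query, nothing has to be cached between requests. The token may end in = padding, so it would need URL-encoding when a harvester sends it back.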

I've started on the update script, and I'm going to use code like this example to store the documents.

This won't be finished until next year.

15/12/10

Permalink 04:56:57 pm, by mholmes, 169 words, 74 views   English (CA)
Categories: Activity log; Mins. worked: 240

OAI-PMH interface coming along

Progress so far:

  • Implemented default responses for all six verbs, mostly with default info and placeholders.
  • Fully implemented the ListSets response, which has quite a range of sets.
  • Implemented the core of GetRecord, which calls out to a getDocRecord() function that does the real work.
  • Implemented about half of the getDocRecord() function; it can return a header (fulfilling the needs of ListRecords), and I'm just getting started on the Dublin Core metadata output in the <metadata> tag.
  • Fully implemented ListMetadataFormats (easy, because we're only supporting oai_dc).
  • Fully implemented Identify.

One thing I'm currently undecided on is whether I should bother with the resumptionToken functionality. If I do, that means I'll have to cache the parameters of the request in the db somewhere and retrieve them in response to the token, which is a bit of a pain; I'm more inclined to let the whole thing run, and only worry about the resumption token if it seems likely that the results will be too large to handle.

Permalink 03:45:28 pm, by mholmes, 33 words, 345 views   English (CA)
Categories: Activity log, Announcements; Mins. worked: 30

CO Series 6 vol 31 page images now posted to the site

A further 983 page-images have been added to the manuscript image browser, covering British North America correspondence from 1859 (Individuals). Transcriptions are now being linked into these images. CO 6 vol 30 is close to completion too.

14/12/10

Permalink 03:24:45 pm, by mholmes, 225 words, 82 views   English (CA)
Categories: Activity log; Mins. worked: 120

OAI-PMH API implementation

I've started implementing some back-end XQuery to respond to requests from OAI-PMH harvesters, according to the specifications and guidelines here. I'm intending to implement the baseURL as bcgenesis.uvic.ca/oai.xq, and handle all requests through a single XQuery library, which I've begun writing. So far I've implemented verb checking, return of passed arguments in the request element, and the UTC response date-time. Most of the time so far has been spent wading through the spec, which is predictably meticulously unilluminating, but there are examples, and it looks straightforward. My projected implementation of identifiers is going to look like this: oai:bcgenesis.uvic.ca:B63030SP.scx.xml, where the last component has the @xml:id attribute of any XML file in the database, and the final suffix dictates the format (XML or XHTML) of the resource, so we can treat XHTML and XML versions of the data as separate items. I think this makes sense, although my plans may change through the process of implementation.

I'm proposing to use sets for document type, year, and possibly others, with a hierarchy of type:year; this also may change. I'm hoping this won't take too long to implement, given that we're already spitting out pretty comprehensive Dublin Core for all the transcription documents, but handling the personography and other modern data may be more problematic.
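The identifier and set schemes proposed in this post can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the document-type string used for sets is left to the caller, since the post does not fix the set names. The example identifier and the type:year hierarchy come from the text above.

```python
REPOSITORY = "bcgenesis.uvic.ca"


def make_identifier(xml_id, fmt):
    """Build an identifier such as oai:bcgenesis.uvic.ca:B63030SP.scx.xml."""
    return f"oai:{REPOSITORY}:{xml_id}.{fmt}"


def parse_identifier(identifier):
    """Split an identifier into (xml_id, format suffix)."""
    scheme, repository, local = identifier.split(":", 2)
    if scheme != "oai" or repository != REPOSITORY:
        raise ValueError(f"not one of our identifiers: {identifier}")
    # The final suffix is the format; everything before it is the @xml:id.
    xml_id, fmt = local.rsplit(".", 1)
    return xml_id, fmt


def set_spec(doc_type, year=None):
    """Build a setSpec with the proposed type:year hierarchy."""
    return doc_type if year is None else f"{doc_type}:{year}"
```

Splitting on the last full stop keeps any full stops inside the @xml:id itself intact, so B63030SP.scx.xml yields the id B63030SP.scx and the format xml.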

Permalink 09:42:48 am, by mholmes, 316 words, 67 views   English (CA)
Categories: Activity log; Mins. worked: 90

Microfilm madness

CP and I spent some time this morning trying to figure out the history of incoming despatches and their attachments, in an effort to figure out what microfilms we need to order and digitize next year. We have pieced together a likely scenario that explains some of what we're seeing:

  • RG7 GC8 contains images of original despatches received from London, but stripped of all their enclosures and attachments. It seems likely that the encs and atts were stripped off in the BC Archives (where some of them still reside -- e.g. the commission for Douglas); then the microfilming was done on the bare despatches.
  • These are parallel to the 410 documents, which are the London letterbook copies of their outgoing despatches. Sometimes we have two copies of the same document, one from each series (e.g. this RG7 and this 410 document).
  • JH apparently transcribed from both sources (at least, transcriptions are labelled as having been from both, although it's possible he only actually used 410).
  • KSW digitized the RG7 series from 16mm microfilm earlier this year; it was ordered in specially, and was not part of the large orders to LAC which we're still processing.
  • Where we are attempting to reunite a despatch with its original attachments or enclosures, we should prefer the RG7 source, since this is the "real" document which was with the attachment.
  • However, it seems that differences other than pagination are minimal, since incoming despatches were not annotated; therefore there's no overriding need to switch any existing transcriptions from 410 to RG7 unless we are seeking out and reattaching important documents (such as the Blanshard and Douglas commissions).
  • There is also a series of transcriptions marked "PABC", which JH transcribed from original documents in the BC archives, and marked as "not on microfilm"; however, CP believes these are now on microfilm, and we should order that in and digitize it to support these transcriptions.

07/12/10

Permalink 12:43:16 pm, by mholmes, 27 words, 352 views   English (CA)
Categories: Activity log, Announcements; Mins. worked: 20

CO Series 6 vol 29 page images now posted to the site

A further 817 page-images have been added to the manuscript image browser, covering British North America correspondence from 1858 (Individuals, N-Z). Transcriptions are now being linked into these images.

02/12/10

Permalink 08:52:56 am, by mholmes, 70 words, 86 views   English (CA)
Categories: Activity log; Mins. worked: 30

Retrieved latest stats

The stats on Megapode seem to have been backfilled with stats from previous times, going back to February 2009, so I've now retrieved the six stat blocks I'm tracking for both 2009 and 2010 up to Nov 30. The aim is to have one file for each year, but just in case stats end up getting lost, as has happened in the past, I download the year-so-far stats at the end of every month.

30/11/10

Permalink 08:19:03 am, by mholmes, 27 words, 276 views   English (CA)
Categories: Activity log, Announcements; Mins. worked: 30

CO Series 6 vol 28 page images now posted to the site

A further 814 page-images have been added to the manuscript image browser, covering British North America correspondence from 1858 (Individuals, F-M). Transcriptions are now being linked into these images.

26/11/10

Permalink 02:16:13 pm, by mholmes, 19 words, 57 views   English (CA)
Categories: Activity log; Mins. worked: 5

7342 abbreviations tagged so far

Long way to go yet, but that's already more than one abbreviation per document in the collection, on average.

Permalink 02:03:00 pm, by mholmes, 38 words, 60 views   English (CA)
Categories: Activity log; Mins. worked: 20

Abbreviation markup: recd for received

Another relatively straightforward one. Regexp:

(?<!abbr>)([Rr])(ec?<hi rend="[^"]+super+[^"]+">d\.?</hi>\.?)
<choice><abbr>$1$2</abbr><expan>$1eceived</expan></choice>

Permalink 01:44:32 pm, by mholmes, 62 words, 65 views   English (CA)
Categories: Activity log; Mins. worked: 20

Abbreviation markup: C[o]y for Company

Very common, especially in the name of the HBC. Preceding components of that name display remarkable variation, so most of those will probably end up being done manually. Regexps:

(?<!abbr>)([C])([o]?<hi rend="[^"]+super+[^"]+">y\.?</hi>\.?)
<choice><abbr>$1$2</abbr><expan>$1ompany</expan></choice>

Permalink 01:19:53 pm, by mholmes, 36 words, 66 views   English (CA)
Categories: Activity log; Mins. worked: 15

Abbreviation markup: desph for despatch

Done with regexps:

(?<!abbr>)([Dd])(esp<hi rend="[^"]+super+[^"]+">h</hi>\.?)

<choice><abbr>$1$2</abbr><expan>$1espatch</expan></choice>

Permalink 11:27:27 am, by mholmes, 51 words, 73 views   English (CA)
Categories: Activity log; Mins. worked: 10

Abbreviation markup: wh for which

This is common, and I suspect there may be other variants I'll catch later on. This is the regexp:

(?<!abbr>)([Ww])(<hi rend="[^"]+super+[^"]+">h\.?</hi>\.?)

<choice><abbr>$1$2</abbr><expan>$1hich</expan></choice>
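Most of the regexps in this batch of posts share one template: an optional initial-letter class, a stem, and a superscripted ending inside a hi element. A minimal Python sketch of that template follows. The helper name abbreviation_rule and the RULES table are assumptions made for illustration, and Python writes the back-references as \g<1> and \g<2> where the oXygen replacements use $1 and $2.

```python
import re


def abbreviation_rule(first, stem, sup, expansion_tail):
    """Build (pattern, replacement) following the template used in these posts.

    first: character class for the initial letter, e.g. "[Ww]"
    stem: regex for the letters between the initial and the superscript
    sup: the superscripted letters
    expansion_tail: the expansion minus its initial letter
    """
    pattern = re.compile(
        r'(?<!abbr>)(' + first + r')(' + stem
        + r'<hi rend="[^"]+super+[^"]+">' + sup + r'\.?</hi>\.?)'
    )
    replacement = (
        r'<choice><abbr>\g<1>\g<2></abbr><expan>\g<1>'
        + expansion_tail + r'</expan></choice>'
    )
    return pattern, replacement


# Three of the rules from these posts, expressed through the template.
RULES = [
    abbreviation_rule("[Ww]", "", "h", "hich"),
    abbreviation_rule("[Gg]", "en", "l", "eneral"),
    abbreviation_rule("[Ss]", "ec[t]?", "y", "ecretary"),
]


def apply_rules(text):
    """Apply every rule in turn and return the marked-up text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

Not every regexp fits the template exactly (the desph one, for instance, has no optional full stop inside the hi element), so a table like this would only cover the regular cases.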

Permalink 11:16:23 am, by mholmes, 74 words, 60 views   English (CA)
Categories: Activity log; Mins. worked: 15

Abbreviation markup: Atty and Att. for Attorney

Two strategies for this: for the superscript one, regexp, and for the simple one just a case-sensitive search and replace:

(?<!abbr>)([A])(tt<hi rend="[^"]+super+[^"]+">y\.?</hi>\.?)

<choice><abbr>$1$2</abbr><expan>$1ttorney</expan></choice>
Att.

<choice><abbr>Att.</abbr><expan>Attorney</expan></choice>

Permalink 11:02:23 am, by mholmes, 36 words, 68 views   English (CA)
Categories: Activity log; Mins. worked: 20

Abbreviation markup: Genl to General

Using this regexp:

(?<!abbr>)([Gg])(en<hi rend="[^"]+super+[^"]+">l\.?</hi>\.?)

<choice><abbr>$1$2</abbr><expan>$1eneral</expan></choice>

Permalink 10:47:51 am, by mholmes, 47 words, 78 views   English (CA)
Categories: Activity log; Mins. worked: 30

More abbreviation markup: Secretary

Used these regexps to mark up instances of "Secy" and "Secty" for "Secretary":

(?<!abbr>)([Ss])(ec[t]?<hi rend="[^"]+super+[^"]+">y\.?</hi>\.?)

<choice><abbr>$1$2</abbr><expan>$1ecretary</expan></choice>

Permalink 09:58:17 am, by mholmes, 83 words, 56 views   English (CA)
Categories: Activity log; Mins. worked: 60

New documents from JH

A handful of new documents has come in from JH, based on transcriptions from MF B-3006 through B-3008 (CO 6/19 through CO 6/23). These MFs have not been processed yet, so I'd like to wait on doing the markup until they have, but I took a look through the documents and found a couple of anomalies, one being mislabelled 6/25, and a name misspelled. The name issue caused me to find a similar error in one of our existing transcriptions, B585HB06, which I've now corrected.

25/11/10

Permalink 10:01:10 am, by mholmes, 121 words, 52 views   English (CA)
Categories: Activity log; Mins. worked: 30

Backing up processed images to Rutabaga

Just for the record: I'm backing up all the processed images from /home1t/coldesp/www/jpg_scans/ to Rutabaga, by ssh-ing into nfs.tapor.uvic.ca and running:

rsync --stats --recursive --times --delete --verbose -e ssh jpg_scans/ "mholmes@rutabaga.hcmc.uvic.ca:/home/mholmes/backups/Martin/Colonial\ Despatches/www/jpg_scans/"

This needs to be run periodically to keep the backup up to date. home1t is backed up routinely anyway, but these images represent such an investment of time that I'd like to keep multiple copies. I'm also going to look at making a Rutabaga backup of /home1t/coldesp/archive/, which contains all the original image sets from which these were generated.

24/11/10

Permalink 11:07:15 am, by mholmes, 13 words, 46 views   English (CA)
Categories: Activity log; Mins. worked: 60

Planning/reporting meeting

Meeting with CP and KSW about grant report due now, and future plans.

23/11/10

Permalink 01:26:44 pm, by mholmes, 56 words, 64 views   English (CA)
Categories: Activity log; Mins. worked: 60

First steps towards a grant application

Spent an hour putting together some background and some ideas regarding the grant application, in the form of an email to KSW, NB and RP. I won't put it up here since it'll need a lot of work yet, but it's a reasonable start on the part of the grant application that will fall to us.

18/11/10

Permalink 02:36:58 pm, by kim, 13 words, 70 views   English (CA)
Categories: Activity log; Mins. worked: 120

Added PB tags to CO 6 / 27 files

Added PB tags to 17 files from 1858 that could not be tagged until the CO 6/27 images were available.

Permalink 09:43:59 am, by mholmes, 19 words, 286 views   English (CA)
Categories: Activity log, Announcements; Mins. worked: 20

CO 6 vol 27 images now available on the Colonial Despatches site

This is volume 3 of the 1858 material, covering British North America 1858 offices: General and Individuals A-E. There are 1421 new page-images.

10/11/10

Permalink 03:03:02 pm, by mholmes, 24 words, 66 views   English (CA)
Categories: Activity log; Mins. worked: 45

Interview with potential workstudy

Met with MP, who may be interested in doing some work on the Despatches maps, if he's not snapped up by the MoM project.

04/11/10

Permalink 11:47:07 am, by kim, 131 words, 161 views   English (CA)
Categories: Documentation; Mins. worked: 0

Maps of Interest

The following is a list of maps that we wish to add to the despatches-site map collection:

  • co_6_27_00299r.jpg -- "Frazer" River mineral-resources.
  • co_6_27_00328r.jpg -- "Railway Communication" and "Extensive System of Colonization."
  • co_6_28_00162r.jpg -- "The Grand Trunk Railway and it Connections, 1857" (left of fold-out).
  • co_6_28_00162v.jpg -- "The Grand Trunk Railway and it Connections, 1857" (right of fold-out).
  • co_6_28_00186r.jpg -- "Tracing A," various east-coast rail line routes.
  • co_6_28_00187r.jpg -- Untitled. Appears to be another tracing. Depicts various ship routes and distances across the Pacific and Atlantic oceans.
  • co_6_28_00283r.jpg -- by John Arrowsmith. Depicts various ship routes and distances across the Pacific and Atlantic oceans.
  • co_305_12_00007r.jpg -- Plan of Land reserved for Naval purposes at Esquimalt Vancouvers Island, 25 October 1858.

03/11/10

Permalink 08:06:23 am, by mholmes, 7 words, 65 views   English (CA)
Categories: Activity log; Mins. worked: 10

Updated CO 6 / 25 files in db

Eleven files updated (listed in KSW's post).

02/11/10

Permalink 04:49:43 pm, by kim, 65 words, 75 views   English (CA)
Categories: Activity log; Mins. worked: 5

Added PB tags to CO 6 / 25

This is part of the back-tracking work required for the files missing their corresponding images at the time of their original upload and proof. The following files were updated:

V585AD12.xml

V585FO02_A.xml

V585AD13.xml

V585AD14.xml

V585AD04_A.xml

V585AD08.xml

V585AD01_A.xml

V585AD09.xml

V585AD10.xml

V585AD11.xml

V475HB02.xml

Permalink 01:30:04 pm, by mholmes, 29 words, 334 views   English (CA)
Categories: Activity log, Announcements; Mins. worked: 30

CO Series 6 vol 26 page images now posted to the site

A further 1,123 page-images have been added to the manuscript image browser, covering British North America correspondence from 1858. Transcriptions are now being linked into these images and the CO 6 25 set.

Permalink 10:02:43 am, by mholmes, 97 words, 58 views   English (CA)
Categories: Activity log; Mins. worked: 30

Collected the latest stats

Took the latest stats from megapode (Urchin 6). They're still distorted by the Intermapper hits, so GN and I discussed using another "canary" -- possibly a web app just for that purpose -- to monitor our Tomcats, rather than bcgenesis. But the stats are rather a mess anyway because of the gap caused by moving servers (Sept 1 - Oct 4) and the fact that the old stats were Urchin 5 and these are Urchin 6. Very annoying. I don't think it's worth putting the work into integrating them and cleaning out the Intermapper hits unless somebody specifically asks for the stats.

28/10/10

Permalink 04:39:06 pm, by kim, 50 words, 58 views   English (CA)
Categories: Activity log; Mins. worked: 5

Image-processing update

Just a quick note to say that I am more than halfway through CO 6/26; the remainder should be completed early next week. Once this batch is complete, I will go back and connect the earlier XML files that were waiting on this batch, or on the Volume 25 collection, to their images.

25/10/10

Permalink 12:43:33 pm, by mholmes, 17 words, 264 views   English (CA)
Categories: Activity log, Announcements; Mins. worked: 30

CO 6 25 manuscript images added

1230 new page-images have been added to the site from Colonial Office Series 6 Vol 25 (British North America 1858 correspondence).

21/10/10

Permalink 10:35:54 am, by mholmes, 140 words, 210 views   English (CA)
Categories: Activity log; Mins. worked: 60

IE 9 bug with AJAX

Running a beta of IE9, I noticed that the AJAX functions on the site (retrieving bios, places etc.) weren't working. It turned out that there's something badly wrong with IE's DOM2 support. If you ask it whether it supports document.importNode, it says yes; but if you actually call document.importNode, it responds with "No such interface supported". I was using document.importNode to insert XHTML retrieved with AJAX into the page.

The conventional wisdom is that, rather than trying to detect browser versions, you should test for function support; but when the browser decides to lie about its support for a function, you have to fall back on detecting the browser itself. So I've now added a test for MSIE in the userAgent string to find IE, and fall back to using innerHTML. This also works OK for IE8.

Permalink 08:17:57 am, by mholmes, 653 words, 93 views   English (CA)
Categories: Activity log; Mins. worked: 120

TEI mime type serializer

Following instructions yesterday from CT on the TEI list, I've implemented a system for serializing TEI XML using the new TEI mime type, with a selector which determines whether the browser can handle it or not before using it, or using text/xml instead. These are the sitemap details:

In <map:serializers>:
          <!-- MDH: added this serializer to allow for UTF-8 XML output with TEI mime type. -->
          <map:serializer mime-type="application/tei+xml" name="utf8tei" src="org.apache.cocoon.serialization.XMLSerializer">
            <encoding>UTF-8</encoding>
          </map:serializer>

In <map:selectors>:
          <map:selector name="accept-content-type"
              src="org.apache.cocoon.selection.RegexpHeaderSelector">
            <pattern name="tei">application/tei\+xml</pattern>
            <header-name>accept</header-name>
          </map:selector>

After <map:components>:

  <map:resources>
    <map:resource name="serialize-tei">
      <map:select type="accept-content-type">
        <map:when test="tei">
          <map:serialize type="utf8tei"/>
        </map:when>
        <map:otherwise>
          <map:serialize type="utf8xml"/>
        </map:otherwise>
      </map:select>
    </map:resource>
  </map:resources>

And in actual pipelines, use <map:call resource="serialize-tei"/> instead of <map:serialize type="utf8xml"/>. For example:

            <map:match pattern="getDoc.xml">
                <map:generate src="xq/doc.xq" type="xquery"/>
                <map:transform type="saxon" src="xsl/highlight_matches.xsl">
                    <map:parameter name="browserURI" value="{request:requestURI}?{request:queryString}"/>
                    <map:parameter name="queryString" value="{request:queryString}"/>
                </map:transform>
                <!--<map:transform type="saxon" src="xsl/add_xml_stylesheet.xsl" />-->
                <map:transform type="xinclude"/>
                <map:transform type="session"/>
                <map:transform type="encodeURL"/>
              <map:call resource="serialize-tei"/>
            </map:match>

For Firefox, you can set the browser to handle the TEI mime type in preference to the text/xml alternative by changing network.http.accept.default in about:config. This is my setting:

text/html;q=0.7,application/xhtml+xml;q=0.7,application/xml;q=0.7,text/xml;q=0.8,application/tei+xml;q=0.2,*/*;q=0.1

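"q" settings are between 0 and 1, with higher priority for higher values. As written above, application/tei+xml (q=0.2) therefore ranks below text/xml (q=0.8); to make it the preferred type, it would need the higher of the two values. As far as I can tell, the selector above only checks whether application/tei+xml appears in the Accept header at all, so the TEI serializer is chosen either way.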
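A quick way to check how a header like this ranks the types is to parse it and sort by q value. This is a minimal Python sketch: the function names are hypothetical, the ranking ignores the finer points of HTTP content negotiation (such as more specific types beating wildcards), and accepts_tei assumes that the Cocoon selector simply searches the header for its pattern.

```python
import re

ACCEPT = (
    "text/html;q=0.7,application/xhtml+xml;q=0.7,application/xml;q=0.7,"
    "text/xml;q=0.8,application/tei+xml;q=0.2,*/*;q=0.1"
)


def rank_accept(header):
    """Return (media_type, q) pairs, highest q first, ties in header order."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        parts = [part.strip() for part in item.split(";")]
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        ranked.append((parts[0], q, position))
    ranked.sort(key=lambda entry: (-entry[1], entry[2]))
    return [(media_type, q) for media_type, q, _ in ranked]


def accepts_tei(header):
    """Mimic a selector that only tests for the TEI type's presence."""
    return re.search(r"application/tei\+xml", header) is not None
```

With the header quoted in this post, rank_accept puts text/xml first and application/tei+xml second from last, while accepts_tei is still true.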

To decide what happens to the mime type when the browser encounters it, you have to let the browser encounter it (unless you install the MIME Edit extension, which gives you actual control over mime type handling).

Other browsers are more problematic. This system works on the basis that the browser provides a prioritized list of acceptable mime types to the server (Cocoon), which can then serve application/tei+xml if the browser handles it. However, Chrome does not allow you to configure acceptable mime types, so it doesn't seem possible to make it announce to Cocoon that application/tei+xml is acceptable; therefore Cocoon will never deliver application/tei+xml to Chrome. If it were possible to deliver the correct mime type to Chrome, it would just hand it off to the OS or desktop (Gnome etc.) to deal with. Opera does allow you to configure what it will do with mime types, so you can set (for instance) application/tei+xml to open with oXygen instead of being displayed in the browser, but this does not appear to affect the list of mime types Opera sends to the server, so Cocoon is still delivering text/xml (as far as I know).

IE8 appears to do all file handling based on file extensions, which is not really helpful; there are hacks you can do in the registry, but it's not clear to me that they would succeed in achieving anything unless the file were delivered with a specific unique extension. IE9 beta is no different in this respect.

13/10/10

Permalink 01:58:48 pm, by kim, 35 words, 60 views   English (CA)
Categories: Activity log; Mins. worked: 5

Page Break (PB) Tag Clean-up

Martin and I identified over 200 files, for the years 1846-1858, that lack PB tags. Please see the document below for a detailed breakdown of the concerns and solutions for each: files_with_no_pb_tags
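A check of this kind can be repeated with a short script. The following is a minimal Python sketch, not the method actually used to build the list: the function name and the folder layout are assumptions, and it presumes the files are well-formed TEI in the standard TEI namespace.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

TEI_NS = "http://www.tei-c.org/ns/1.0"


def files_without_pb(folder):
    """Return the names of XML files under folder that contain no TEI pb element."""
    missing = []
    for path in sorted(Path(folder).rglob("*.xml")):
        root = ET.parse(path).getroot()
        # Look for a pb element anywhere below the root element.
        if root.find(f".//{{{TEI_NS}}}pb") is None:
            missing.append(path.name)
    return missing
```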

12/10/10

Permalink 04:12:56 pm, by mholmes, 91 words, 57 views   English (CA)
Categories: Activity log; Mins. worked: 120

Found some old copies of files that had not found their way into SVN

KSW found that some edited files from 1851 were further along in their editing process (addition of markup and pagebreak tags) than the ones in SVN, so we had to replace the SVN copies with the older ones; I then had to re-run the automated abbreviation markup regexps I'd (thankfully) documented carefully on the blog. We also found one file which had the wrong reference info (it was marked as CO 305 03 when it's actually in the War Office records), and another file which was a duplicate of the following file. All fixed.

Permalink 01:44:13 pm, by mholmes, 22 words, 252 views   English (CA)
Categories: Activity log, Announcements; Mins. worked: 30

Colonial Despatches: CO 60 04 page-images now available

Colonial Office volume 60/4 page images have now been processed and are available on the website. These cover BC despatches to London from 1859.

06/10/10

Permalink 02:43:04 pm, by mholmes, 24 words, 66 views   English (CA)
Categories: Activity log; Mins. worked: 90

Meeting with Eng dept folks

Had a discussion with four faculty members from the English dept, out of which we hope some projects relating to the Despatches will emerge.

Permalink 08:53:06 am, by mholmes, 21 words, 65 views   English (CA)
Categories: Activity log; Mins. worked: 45

Tweaked and fixed up presentation

Couldn't size SVG images reliably, so turned them into PNGs. Tested on laptop connected to projector via HDMI-to-DVI, which works great.

05/10/10

Permalink 03:38:14 pm, by mholmes, 60 words, 64 views   English (CA)
Categories: Activity log; Mins. worked: 60

Copying page-images for JH

JH wants copies of all the page-images on a 1TB hard drive he's written, so I'm copying them over. I had to reformat the drive, which came with some Windows software and an odd partition structure which made it unusable. I'll have to leave some of the copy operations going overnight, but I'm hoping they can be done by tomorrow.

Permalink 10:34:54 am, by mholmes, 56 words, 64 views   English (CA)
Categories: Activity log; Mins. worked: 120

Finished presentation and set up projector

Finished the presentation, then GN set up the projector, which now has cables attached and ready, and we tested it. It'll probably need some reconfiguration tomorrow morning when I bring in my laptop, because the SVG graphics aren't easy to control without using pixel sizes, but other than that we should be good to go tomorrow.

04/10/10

Permalink 03:40:44 pm, by mholmes, 49 words, 59 views   English (CA)
Categories: Activity log; Mins. worked: 180

Planning a presentation for researchers

KSW and I have been working on a presentation for Wednesday, when we have some researchers coming who might be interested in projects relating to the Despatches. I've been working on it for most of the day. Still not sure exactly what to pitch and how to pitch it...

30/09/10

Permalink 03:30:22 pm, by mholmes, 59 words, 63 views   English (CA)
Categories: Activity log; Mins. worked: 30

"R" and "Rt" to "Right" preceding "Honourable"

Added the markup for these abbreviations manually, because there seems to be a bug in oXygen's handling of the replace operation when referring to a captured group in the case when a lookahead positive assertion is made. There were only 64; I was able to find them with a regex search in oXygen, but had to mark them up manually.

Permalink 09:58:33 am, by mholmes, 50 words, 78 views   English (CA)
Categories: Activity log; Mins. worked: 20

"Honble" to "Honourable" abbr/expan markup added

This was the regex:

(?<!abbr>)([Hh])(on<hi rend="[^"]+super+[^"]+">ble\.?</hi>\.?)
<choice><abbr>$1$2</abbr><expan>$1onourable</expan></choice>

Now looking at the variety of preceding "R", "Rt" etc. for "Right Honourable".

Permalink 08:44:13 am, by mholmes, 149 words, 61 views   English (CA)
Categories: Activity log; Mins. worked: 30

"Would" and "Should" abbreviations done

I've implemented the following regex replaces:

(?<!abbr>)([Ss])(h<hi rend="[^"]+super+[^"]+">d\.?</hi>\.?)
<choice><abbr>$1$2</abbr><expan>$1hould</expan></choice>
(?<!abbr>)([Ww])(<hi rend="[^"]+super+[^"]+">d\.?</hi>\.?)
<choice><abbr>$1$2</abbr><expan>$1ould</expan></choice>

For the record, the second of these would find the following:

w<hi rend="vertical-align: super; font-size: 80%;">d</hi>

and mark it up like this:

<choice><abbr>w<hi rend="vertical-align: super; font-size: 80%;">d</hi></abbr><expan>would</expan></choice>

Changes are being uploaded to the database now. Committed a fresh revision to SVN after each operation.

29/09/10

Permalink 05:01:03 pm, by mholmes, 140 words, 60 views   English (CA)
Categories: Activity log; Mins. worked: 30

Regex and replacement for "shd" (= should) and "wd" (= would) (also a match for "wh")

The following are good to go:

(?<!abbr>)([Ss])(h<hi rend="[^"]+super+[^"]+">d\.?</hi>\.?)
<choice><abbr>$1$2</abbr><expan>$1hould</expan></choice>
(?<!abbr>)([Ww])(<hi rend="[^"]+super+[^"]+">d\.?</hi>\.?)
<choice><abbr>$1$2</abbr><expan>$1ould</expan></choice>

This is an expression for matching "wh", which as far as I can tell is always expanded to "which", but there are 1128 instances of it in the documents, so we should look at as many instances as possible to conclude that this is always consistent (and that it doesn't sometimes stand for "who", "whom", "what" etc.):

(?<!abbr>)([Ww])(<hi rend="[^"]+super+[^"]+">h\.?</hi>\.?)
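One way to review the "wh" instances before committing to a single expansion would be to list each match with a little surrounding text. The following is only a minimal Python sketch of that idea, not the procedure used in oXygen; the function name, the context width and the sample string are all invented for illustration.

```python
import re

# Same expression as above; the lookbehind skips anything already inside <abbr>.
WH = re.compile(r'(?<!abbr>)([Ww])(<hi rend="[^"]+super+[^"]+">h\.?</hi>\.?)')


def wh_contexts(text, width=30):
    """Return each unprocessed "wh" abbreviation with `width` characters either side."""
    contexts = []
    for match in WH.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        contexts.append(text[start:end])
    return contexts


# Invented sample: one raw instance and one that has already been marked up.
sample = (
    'the letter w<hi rend="vertical-align: super; font-size: 80%;">h</hi>'
    ' I received, <abbr>w<hi rend="vertical-align: super; font-size: 80%;">h</hi></abbr> is done'
)
contexts = wh_contexts(sample, width=12)
for context in contexts:
    print(context)
```

Reading the contexts in bulk should make it easier to spot any case where the abbreviation stands for something other than "which".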

Permalink 04:57:56 pm, by mholmes, 60 words, 63 views   English (CA)
Categories: Activity log; Mins. worked: 60

Automated abbreviation markup started

I've marked up instances of abbreviations of "government" and "governor", and I'm uploading them into the database. I've also set up a system whereby a script makes a copy of the core correspondence files (currently sorted into folders by their respective years) into a single correspondence folder on my hd, so I can upload more conveniently into the db collection.

Permalink 04:12:19 pm, by mholmes, 38 words, 49 views   English (CA)
Categories: Activity log; Mins. worked: 90

Team meeting on editorial policy

Met with the team to discuss some of JMH's concerns about the way our markup is being done. Made some modifications to the print stylesheet, which had got out of date and was showing images and irrelevant underlining.

Permalink 01:01:40 pm, by mholmes, 288 words, 46 views   English (CA)
Categories: Activity log; Mins. worked: 75

Automating markup of common abbreviations

JMH is concerned that we have accurately transcribed the abbreviations of common words such as "Govt" with superscripts, that appear all over the place. It has occurred to me that we could enable expansion of these quite easily using a search-and-replace, like this:

[Gg]ov<hi rend="[^"]+super+[^"]+">[^<]*t</hi>

which finds all the various abbreviations for "government" without finding those for "governor". We can do this for a range of common abbreviations. I've already written some XSLT to make them into mouseovers as we do with the <choice>/<sic>/<corr> sets.

The above could be captured and used as a backreference with <abbr> wrapped around it, with the following replacement:

<choice><abbr>$0</abbr>
<expan>Government</expan></choice>

Initially I thought it might be simpler to replace instances beginning with a capital and those beginning with a lower-case letter separately, so that we can provide the accurate expansion in each case, but see below.

EDIT: I've refined the regex so that it won't operate on an instance that has already been processed, since we'll probably have to run it on files multiple times. This seems to be the best way to do it, using a negative lookbehind assertion, and capturing the first letter separately so we can reproduce capital or lower-case:

(?<!abbr>)([Gg])(ov<hi rend="[^"]+super+[^"]+">[^<]*t</hi>)
<choice><abbr>$1$2</abbr><expan>Government</expan></choice>

This seems to be working, but I'll need to do some more careful testing before setting it loose.
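As a rough way to test the refined expression outside oXygen, here is a minimal Python sketch. It assumes Python's re module, where the replacement string uses \1 and \2 rather than $1 and $2; the function name and the sample text are made up for illustration. It checks two things: that a second run leaves already-processed text alone, and that a "governor" abbreviation is not touched.

```python
import re

# The refined search expression, unchanged apart from Python string quoting.
GOVT = re.compile(r'(?<!abbr>)([Gg])(ov<hi rend="[^"]+super+[^"]+">[^<]*t</hi>)')

# Python writes backreferences as \1 and \2 where oXygen uses $1 and $2.
GOVT_REPLACEMENT = r'<choice><abbr>\1\2</abbr><expan>Government</expan></choice>'


def mark_up_govt(text):
    """Wrap unprocessed "Govt" abbreviations in <choice>/<abbr>/<expan>."""
    return GOVT.sub(GOVT_REPLACEMENT, text)


# Invented sample: one "Govt" and one "Govr" (governor), both with superscripts.
sample = (
    'the Gov<hi rend="vertical-align: super; font-size: 80%;">t</hi>'
    ' and the Gov<hi rend="vertical-align: super; font-size: 80%;">r</hi>'
)
once = mark_up_govt(sample)
twice = mark_up_govt(once)  # simulates running the replace on the same file again

print(once)
print("second run changed nothing:", once == twice)
```

The second pass finds nothing because every processed instance is now immediately preceded by "abbr>", which is exactly what the negative lookbehind rejects.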

Permalink 10:23:02 am, by mholmes, 27 words, 354 views   English (CA)
Categories: Activity log, Announcements; Mins. worked: 30

Colonial Despatches: CO 305 vol 13 manuscript images now available

Colonial Office class 305 volume 13 page-images, covering Public Offices correspondence from 1859, are now available on the Colonial Despatches site. They can be accessed through the Manuscript images page.

Permalink 09:20:33 am, by kim, 62 words, 140 views   English (CA)
Categories: Activity log, Documentation; Mins. worked: 5

Archive Film Reels, Strange Order in CO 60/4

The images in CO 60 / 4 [found in folder B-080], and possibly more, are photographed in reverse order. For example, page 1 actual is image number B-080-00788.jpg, page 2 actual is B-080-00787.jpg, page 3 is B-080-00786.jpg, and so on.

For the purposes of image processing, I downloaded the Volume 4 files and used KRename to rename them in a more sensible fashion.
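For anyone without KRename to hand, the same renumbering could be planned with a few lines of Python. This is a sketch of an alternative approach, not what was actually done here: the output naming pattern is hypothetical, and it assumes the list passed in covers exactly one volume, so that the highest image number is page 1. It only builds a mapping and does not rename anything on disk.

```python
def reversed_rename_plan(filenames, prefix="CO60-4-page"):
    """Map reverse-ordered image files to names numbered in reading order.

    The zero-padded image numbers sort correctly as plain strings, so a
    descending sort puts the actual first page at the front of the list.
    """
    ordered = sorted(filenames, reverse=True)
    return {
        old: "%s-%04d.jpg" % (prefix, page)
        for page, old in enumerate(ordered, start=1)
    }


# The three files named in the note above, deliberately out of order.
reel_images = ["B-080-00786.jpg", "B-080-00788.jpg", "B-080-00787.jpg"]
plan = reversed_rename_plan(reel_images)
for old in sorted(plan, key=plan.get):
    print(old, "->", plan[old])
```

Keeping the plan as a dictionary makes it possible to check the mapping by eye before any files are touched.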

27/09/10

Permalink 08:54:30 am, by mholmes, 124 words, 64 views   English (CA)
Categories: Activity log; Mins. worked: 20

Setting up SVN for Coldesp

Setting up a new Subversion repository so that multiple users can work on the same files at the same time. I didn't document this carefully enough last time, so I'll go through this in detail:

  • On the local machine, establish a file structure which contains only the files and directories you want to manage through SVN.
  • At the command line, in that location, build the initial repo structure. In this case, I did this:
    svn checkout https://[path-to-repo]/trunk/xml .
  • Add all your files and folders to the tree, like this:
    svn stat | grep "^?" | awk '{print $2}' | xargs svn add
    
    This command is suggested and nicely explained by this article.
  • svn commit
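
One caveat about the add step in the list above: awk '{print $2}' prints only the second whitespace-separated field, so a path containing spaces would be cut short. The following Python sketch shows one way to pull the unversioned paths out of svn status output while keeping such paths whole. It does not call svn itself; the function name and the sample status text are invented for illustration.

```python
def unversioned_paths(status_output):
    """Return the paths that `svn status` flags with '?' (not under version control)."""
    paths = []
    for line in status_output.splitlines():
        if line.startswith("?"):
            # Drop the '?' flag and the padding columns, keeping the rest of
            # the line intact so that paths containing spaces survive.
            paths.append(line[1:].strip())
    return paths


# Invented sample of `svn status` output.
sample_status = (
    "?       1859/new despatch.xml\n"
    "M       1858/changed.xml\n"
    "?       notes.txt\n"
)
to_add = unversioned_paths(sample_status)
print(to_add)
```

Each returned path could then be handed to svn add as a single argument.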

23/09/10

Permalink 03:59:53 pm, by kim, 47 words, 221 views   English (CA)
Categories: Announcements; Mins. worked: 0

CO 305, Vol 12 Images Posted to Manuscript Images Gallery

CO 305, Volume 12 has been uploaded to the Manuscript Images page. This collection contains 1859 correspondence relevant to Vancouver Island, under the headings of Admiralty, Board of Trade, Council Office, Emigration Office, and Foreign Office. In the coming months, these images will be linked to their respective digital transcriptions.

Permalink 11:14:08 am, by kim, 38 words, 236 views   English (CA)
Categories: Announcements; Mins. worked: 0

CO 305, Vol 11 Images Posted to Manuscript Images Gallery

CO 305, Volume 11 contains correspondence for the year 1859. You can access these images, along with all others posted to date, from our Manuscript Images page. In the coming months we will link these images to their respective digital transcriptions.

22/09/10

Permalink 04:40:05 pm, by mholmes, 17 words, 44 views   English (CA)
Categories: Activity log; Mins. worked: 15

Corrected the FO925/1383 map metadata

The title was wrong. Created my own, since the NAC metadata was the source of the error.

Permalink 04:08:03 pm, by mholmes, 55 words, 190 views   English (CA)
Categories: Activity log, Announcements; Mins. worked: 30

RSS feeds available from the Despatches site

We have made available two separate RSS feeds for the Colonial Despatches site. One is for announcements only; it can be accessed from an icon in the footer of every page. The other is for all the project blog postings, and this is available on the Development page (click on About on the main menu).

Permalink 03:03:55 pm, by mholmes, 144 words, 49 views   English (CA)
Categories: Activity log; Mins. worked: 45

Fixes for maps

CP wrote to point out that we had erroneously included a map from NSW in our collection, because the NAC had erroneously sent it to us. I've now removed it. Also, we had a wrong page ref in one of the "press coverage" items (now fixed), and one of our maps, FO925-1383, appears to have the wrong title information. I got the info from the spreadsheets supplied by the library, so the error originates there; I've asked CP if he can find out the correct information.

He also requested that we link to the library's map collection, but before I can do that, I need a list of mappings between the metadata I have (such as the <idno type="libFileName">FO925-1383</idno>) and the collection and item numbers in the library's CONTENTDM system. That hasn't been available so far.

20/09/10

Permalink 01:14:59 pm, by mholmes, 10 words, 200 views   English (CA)
Categories: Activity log, Announcements; Mins. worked: 30

CO 305 11 images posted on the site

Scans from CO 305/11 are now accessible in the image browser.

Permalink 11:05:07 am, by kim, 39 words, 54 views   English (CA)
Categories: Activity log; Mins. worked: 5

CO 305/11 images complete

I completed all the 1859 images in the CO 305/11 collection. I have processed the 800px and 60px images and informed Martin to this effect, so that he can upload them to the site. I will now move on to CO 305/12.

16/09/10

Permalink 11:58:35 am, by mholmes, 31 words, 47 views   English (CA)
Categories: Activity log; Mins. worked: 90

Progress meeting

Met with KSW, JL, and CP to discuss the next phase. Discussed possible involvement of other faculty, possible grant sources, editorial policy, and the priority sequence for work in this phase.

15/09/10

Permalink 11:25:42 am, by mholmes, 13 words, 55 views   English (CA)
Categories: Activity log; Mins. worked: 60

Preparing for meeting tomorrow

Went through the IB grant application with KSW in preparation for tomorrow's meeting.

14/09/10

Permalink 04:16:50 pm, by kim, 30 words, 55 views   English (CA)
Categories: Activity log; Mins. worked: 5

CO 305/11 images

Processed hundreds of images for the Volume 11 collection. I predict that I will complete the processing of the remainder by next Monday afternoon. I am at 383 of 433 original images, presently.

13/09/10

Permalink 03:04:43 pm, by mholmes, 41 words, 64 views   English (CA)
Categories: Activity log; Mins. worked: 30

CO 305 10 updated

KSW has re-processed the scans to get better quality images, and re-linked any 1859 files that were linking to them (only a few dozen). I've uploaded the new image files, amended the scan list in the db, and uploaded the changed documents.

Permalink 02:04:36 pm, by kim, 22 words, 56 views   English (CA)
Categories: Activity log; Mins. worked: 5

CO 305/10 images (re)processed

I have completed the reprocessing of the images for CO 305/10. Next, I will update the image URLs in the necessary XML files.

02/09/10

Permalink 09:12:18 am, by mholmes, 15 words, 61 views   English (CA)
Categories: Activity log; Mins. worked: 15

Retrieved stats for Jan-Aug

Once a month I grab the stats for the preceding months, for grant reporting purposes.

17/08/10

Permalink 08:05:45 am, by mholmes, 112 words, 61 views   English (CA)
Categories: Activity log; Mins. worked: 30

Redirects and DNS update for ancestor govlet site

There are two versions of the govlet site on hist66, one (the current one) in /BCCOR/, and the other (well obsolete) in /govLetter/. Google has indexed the latter, so I've added a redirect to each of the index pages in /en/ and /fr/, changing them from .html to .php. I initially tried to use a .htaccess redirect for the whole /govLetter/ directory, but this doesn't work -- perhaps .htaccess redirects are not allowed on unix.uvic.ca, or perhaps there was something wrong with my syntax.

Also wrote to JS and sysadmin to get the virtual host set up on Lettuce, and the domain name pointed at it in the UVic DNS.

16/08/10

Permalink 03:51:13 pm, by mholmes, 18 words, 65 views   English (CA)
Categories: Activity log; Mins. worked: 30

Finished backup of archive data to Rutabaga...

Just in time for one of the Rutabaga drives to fail, so it'll be coming down this evening...

Permalink 11:14:09 am, by mholmes, 120 words, 56 views   English (CA)
Categories: Activity log; Mins. worked: 60

Moved govlet.ca site from hist66 to TAPoR

During development, the govlet.ca site was hosted on web.uvic.ca, in the hist66 account, but now that development is complete, I've moved it over to home1t/govlet, which is its long-term home. The domain is currently pointing at the hist66 account; once we're all happy it's working properly in the new location, I'll get sysadmin to re-point it to home1t.

Found and fixed two errors, one an unescaped ampersand in a @title value, and the other a French menu included in one of the English pages. I haven't tested laboriously -- there are a lot of pages -- but I've fixed everything that I've seen so far. Waiting for JL's approval to ask for the domain change.

13/08/10

Permalink 02:32:49 pm, by mholmes, 100 words, 86 views   English (CA)
Categories: Activity log; Mins. worked: 60

First Nations page progress

I've been working on sorting and grouping the First Nations names in a logical manner, and I've basically got the code working (on my local machine), except that the leading "the" is problematic -- sometimes it's "The", sometimes "the", and I'm basically convinced that it shouldn't be there at all. The tags in the documents need to be changed from:

<name type="fn">The [whatever]</name>

to

The <name type="fn">[whatever]</name>

for both upper- and lower-case versions of the article. I think I can do this with a simple search-and-replace.
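
A minimal Python sketch of that search-and-replace follows, as a way of trying the pattern before running it on the real files. It assumes the opening tag is written exactly as <name type="fn"> with no other attributes; the function name and the "Theta" test string are invented for illustration.

```python
import re

# Capture the article so that its original capitalization is preserved.
LEADING_ARTICLE = re.compile(r'<name type="fn">([Tt]he)\s+')


def move_article_out(xml_text):
    """Move a leading "The"/"the" from inside a fn name tag to just before it."""
    return LEADING_ARTICLE.sub(r'\1 <name type="fn">', xml_text)


before = (
    'by <name type="fn">The [whatever]</name>'
    ' and <name type="fn">the [whatever]</name>'
)
after = move_article_out(before)
print(after)

# A name that merely begins with the letters "The" is left alone, because
# the pattern requires whitespace after the article.
untouched = move_article_out('<name type="fn">Theta</name>')
print(untouched)
```

Tags that carry additional attributes would need a looser pattern than this one.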

Permalink 10:17:35 am, by mholmes, 85 words, 63 views   English (CA)
Categories: Activity log; Mins. worked: 90

Updated the press coverage page, published First Nations page

One new article was reported to me, but searching myself, I was able to find two other new ones, and I was also able to find web links for many of the older articles, which had previously been unlinked. Also I was intrigued to find that we're cited in the Wikipedia article for Pelly. Updated the About page.

Also published the First Nations groups index page. I still need to do two things with this:

  • Group identical forms of names.
  • Remove leading "the" before sorting.
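
The two remaining steps can be sketched in a few lines of Python. This is an illustration of the logic only, not the site's own code; the function names and the placeholder names in the sample list are invented. Identical forms are grouped by counting them, and a leading "the" is ignored when working out the sort order but kept in the displayed form.

```python
import re
from collections import Counter

ARTICLE = re.compile(r"^the\s+", re.IGNORECASE)


def sort_key(name):
    """Sort key that ignores a leading article and letter case."""
    return ARTICLE.sub("", name).casefold()


def group_and_sort(names):
    """Group identical name strings with a count, then sort ignoring a leading "the"."""
    # Collapse runs of whitespace so that line-wrapped forms group together.
    counts = Counter(" ".join(name.split()) for name in names)
    return sorted(counts.items(), key=lambda item: sort_key(item[0]))


# Placeholder names standing in for tagged strings.
tagged = ["the Beta", "Alpha", "The Beta", "Alpha", "the  Beta"]
grouped = group_and_sort(tagged)
for name, count in grouped:
    print(name, count)
```

Forms that differ only in the capitalization of the article stay as separate groups here, which keeps the sketch faithful to "identical forms"; merging them would be a further decision.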

12/08/10

Permalink 04:04:39 pm, by kim, 51 words, 65 views   English (CA)
Categories: Activity log; Mins. worked: 5

Kim's Pre-Holiday Report

Here's a brief summary of where we are at:

The places file is complete and stands at 184 entries, so too is the vessels file at 88 entries.

As for the other XML files, I have published this sheet that details our progress to date.

I shall return in early September to work on 1859.

Permalink 11:25:49 am, by mholmes, 123 words, 65 views   English (CA)
Categories: Activity log; Mins. worked: 120

New reel data arrived

We've just received the following new scans requested by CP (nearly 100GB in total). I've copied them locally, and to home1t/coldesp/archive, and will copy them also to Rutabaga tomorrow. In the process, I moved a few other bits and pieces into home1t/coldesp/archive, to tidy up the coldesp directory a little.

CO 60, 1859-1871, Vol. 4-44 and an ancillary reel

  • B-80
  • B-81
  • B-82
  • B-83
  • B-84
  • B-85
  • B-86
  • B-87
  • B-88
  • B-89
  • B-90
  • B-91
  • B-92
  • B-93
  • B-94
  • B-95
  • B-96
  • B-97
  • B-98
  • B-99
  • B-100
  • B-101
  • B-102
  • B-103
  • B-104
  • B-105
  • B-106
  • B-107
  • B-108
  • B-109
  • B-206 (31 reels)

CO 305 1859-1866 from vol. 11 - 30

  • B-237
  • B-238
  • B-241
  • B-242
  • B-243
  • B-244
  • B-245
  • B-246
  • B-247
  • B-248
  • B-249
  • B-250
  • B-251 (13 reels)

CO398 1858-1866 from Vol. 1 – 7

  • B-890
  • B-891
  • B-892 (3 reels).

10/08/10

Permalink 04:57:22 pm, by mholmes, 15 words, 76 views   English (CA)
Categories: Activity log; Mins. worked: 150

Grant application: discussion meeting

Long meeting at the library for discussion of a possible application to Canada Interactive Fund.

09/08/10

Permalink 08:34:29 am, by mholmes, 14 words, 68 views   English (CA)
Categories: Activity log; Mins. worked: 20

Updated the db

Downloaded all new material from the working account and updated the files in eXist.

Permalink 08:13:14 am, by mholmes, 47 words, 59 views   English (CA)
Categories: Activity log; Mins. worked: 20

Grabbed Coldesp stats from Urchin 5

Got the latest set of Coldesp stats from Urchin 5 (webstats.uvic.ca). Urchin 6 is available on http://megapode.comp.uvic.ca:9999, but it doesn't provide the same range of stats in the same formats as I've been downloading, so I'm sticking with 5 until I'm forced to change.

03/08/10

Permalink 03:54:01 pm, by kim, 52 words, 69 views   English (CA)
Categories: Activity log; Mins. worked: 5

Vessels file update

I continue to puzzle out the inconsistencies in the vessels files. Only a few more entries to complete for those that had no content. From there, I will do a quick copyedit of the remaining files, then wait to do a final proof until I can see them live on the site.

29/07/10

Permalink 04:59:55 pm, by kim, 172 words, 100 views   English (CA)
Categories: Activity log; Mins. worked: 5

Week-end update for July 26 - 29

This week saw me complete the following tasks:

  • Finished adding content to the placename file; once it is refreshed, I will do a final proof
  • Worked through the vessels file to see what needed doing, and created a checklist to keep things on track
  • Proofed the vessels file as it stands now, and found a few silly errors in the process
  • Added content to a few vessel entries that had no content
  • Updated the Guidelines document. There were several sections that I had put on hold, so it felt great to finally get them into the document

Looking past the long weekend, I will continue with the vessels file. I will focus on adding content to the entries that lack it, even if it is drawn exclusively from the despatches. Each vessel write-up has the potential to become a project unto itself! So, I won't spend too much time on the file afterwards. Give it a proof and, hopefully, turn it over to a research student in the fall for tweaking.

22/07/10

Permalink 04:55:46 pm, by kim, 106 words, 90 views   English (CA)
Categories: Activity log; Mins. worked: 3

Place Names Completed, Sort of...

I have completed content for all the placenames to date. There are some loose strings to snip, but the bulk of the labour is done, which puts us at 180 entries. As the placenames have taken on a life of their own, in terms of scale, I am concerned about errors.

So, I have created a checklist to confirm that each entry has (1) content, (2) sources, formatted correctly, (3) correct geo-coordinates, and (4) a final proofing. I figure that it will pay to do this work now, rather than have our dear readers point out our errors later! It shouldn't take too long, as much of it is done already.

19/07/10

Permalink 04:01:46 pm, by kim, 49 words, 81 views   English (CA)
Categories: Activity log; Mins. worked: 3

Some Technical Improvements

Greg has been a fine chap and set me up on a new machine. I have spent some time, as needed, getting it set up. I will be using GRsync to do nightly backups to my server space on Home1t, rather than always dragging and dropping. Good times!

15/07/10

Permalink 04:53:26 pm, by kim, 80 words, 84 views   English (CA)
Categories: Activity log; Mins. worked: 3

Week-end update for July 12 - 15

I have whittled the remaining placenames down to a dozen to go! I hope to complete these by next Tuesday at the latest. I was distracted, positively, by the task of popping out to pick up UVic Ceremonies' pictures from the Colonial Despatches launch at Government House. I have posted the pictures to the coldesp server in a folder called "coldesp_launch_pics." I presume that we are to review the files and purchase any of the ones that appeal.

08/07/10

Permalink 04:36:52 pm, by kim, 51 words, 85 views   English (CA)
Categories: Activity log; Mins. worked: 3

Week-end update for July 5 - 8

I continued to work the placenames file this week, and have whittled the list down to 33 to go: a daunting number, but I should be through them by the end of next week. From there, I will move on to tune up the vessels file, as it needs a little attention.

07/07/10

Permalink 10:16:57 am, by kim, 58 words, 96 views   English (CA)
Categories: Activity log; Mins. worked: 3

Duwamish Tribe

Just to have a place to have this recorded...

In the places.xml file, the entry for Duwamish River calls for special characters, based on the Indigenous spelling for Duwamish, seen here: http://www.duwamishtribe.org/culture.html

I used the # unicode characters as follows: DḵẖʷʼDuwʼAbsh

06/07/10

Permalink 04:20:53 pm, by kim, 40 words, 57 views   English (CA)
Categories: Activity log; Mins. worked: 3

Places updated

Spent the day working on the place-name entries. Some of them are turning out to be fairly tricky to pin down! So far, so good. I have tracked down some great new sources that I will use in the future.

02/07/10

Permalink 10:04:08 am, by mholmes, 35 words, 59 views   English (CA)
Categories: Activity log; Mins. worked: 20

Saved stats for Coldesp

I'm now saving the stats for Coldesp once a month, from UVic's Urchin; we need to do this to comply with the grant requirements. Also wrote to JS to set up monitoring for govlet.ca.

29/06/10

Permalink 04:23:22 pm, by kim, 33 words, 54 views   English (CA)
Categories: Activity log; Mins. worked: 3

1846 - 1857 Update: FN tags

I have edited the instances of "fn" tags for 1846 - 1857 XML files. I caught a few more groups in the process, and created a list of spelling variants that I can use for 1858.

28/06/10

Permalink 08:29:55 am, by mholmes, 127 words, 50 views   English (CA)
Categories: Activity log; Mins. worked: 30

Fixed display bugs in bios page

KSW reported a failure to show some dates, and issues with bolding of names on the bios page. The date problem was caused by uncertain dates without text content -- I had been assuming that where dates were uncertain, a useful piece of text would be supplied in the tag content, but now where that's not the case, I'm using the @when attribute instead. Uncertain dates are also now being supplied with a following question-mark through CSS.

Bolding problems, along with some ordering issues I noticed myself, were caused by my failure to realize that there would be some people having no <surname> or <forename>, but merely a <roleName>. That's now fixed, and names are all bolded appropriately and sequenced correctly.

25/06/10

Permalink 04:23:34 pm, by kim, 31 words, 79 views   English (CA)
Categories: Activity log; Mins. worked: 3

Update on place, people, and date tags

As hoped, I managed to complete the people, places, and date tags for the remaining years, up to and including 1857.

Along the way, I cleaned up a number of biography entries.

24/06/10

Permalink 08:14:26 am, by mholmes, 421 words, 80 views   English (CA)
Categories: Activity log; Mins. worked: 120

Stats and content for the final report, and how they were generated

Spent some time updating the website a bit, and writing some content for the final report. The content has gone to CP, who is compiling it.

These are the stats I'm submitting for the final report, along with the method I used to get the info, so we can easily repeat the process at any time.

  • 629 despatch documents linked and annotated.
    Use the website search with no parameters except "To: 1857".
  • 782 individual people identified and tagged.
    In the eXist admin client:
    
    declare default element namespace "http://www.tei-c.org/ns/1.0";
    count(collection('/db/coldesp/bios/')//person)
    
  • 248 biographies fully completed.
    In the eXist admin client:
    
    declare default element namespace "http://www.tei-c.org/ns/1.0";
    count(collection('/db/coldesp/bios/')//person[not(contains(./note/p, 'not yet available'))])
    
  • 165 BC placenames identified and tagged.
    In the eXist admin client:
    
    declare default element namespace "http://www.tei-c.org/ns/1.0";
    count(collection('/db/coldesp/places/')//place)
    
  • 134 BC place "biographies" completed.
    In the eXist admin client:
    
    declare default element namespace "http://www.tei-c.org/ns/1.0";
    count(collection('/db/coldesp/places/')//place[not(contains(./desc, 'not yet available'))])
    
  • 99 seagoing vessels identified and tagged.
    In the eXist admin client:
    
    declare default element namespace "http://www.tei-c.org/ns/1.0";
    count(collection('/db/coldesp/vessels/')//list[@type='vessels']/item)
    
  • 88 vessel "biographies" completed.
    In the eXist admin client:
    
    declare default element namespace "http://www.tei-c.org/ns/1.0";
    count(collection('/db/coldesp/vessels/')//list[@type='vessels']/item[not(contains(./p, 'not yet available'))])
    
  • 13358 individual page-images created, at three different sizes.
    SSH into nfs.tapor.uvic.ca, then:
    cd /home1t/coldesp/www/jpg_scans
    ls -R jpg_800 | grep '.jpg' | wc -l
    
  • 6025 page images linked into despatch transcriptions
    In the eXist admin client:
    
    declare default element namespace "http://www.tei-c.org/ns/1.0";
    let $startPages := count(collection('/db/coldesp/correspondence/')//biblScope[@type='startPageImage'][contains(@facs, '.jpg')]),
    $pageBreaks := count(collection('/db/coldesp/correspondence/')//pb[contains(@n, '.jpg')])
    return $startPages + $pageBreaks
    
  • 218 individual maps digitized at three different sizes
    Go to the map gallery on the site; the total is shown at the top of the page.
    
  • 1472 HTML pages created or enhanced.
    1247 despatch docs incl. 1858
       1 browse by date page
       5 index pages (Index + sub-pages)
     219 map pages (gallery + 218 maps)
    ____
    1472 
    
  • Over 1400 XML source documents made available through the site.
    In addition to all the transcription and annotation documents, there are lots of XML feeds also used by the site and available, some linked, some not, many created as dynamic views of the data.
    

23/06/10

Permalink 01:17:09 pm, by mholmes, 104 words, 47 views   English (CA)
Categories: Activity log; Mins. worked: 120

Team meeting and some tweaks to site

Team meeting to discuss the launch and the way forward. Main takeaway for me is the requirement to complete bits of the final report that are my bailiwick asap.

Also added a page to the site, listing all the First Nations names and referring strings that have been tagged so far. This is not linked from anywhere yet; it's more so that we can get an idea of what's been tagged, and how easy it might be to impose an ontology on the names mentioned. We're still discussing whether to tag all referring strings, no matter how vague, or only names and name-like strings.

21/06/10

Permalink 06:06:33 pm, by mholmes, 28 words, 49 views   English (CA)
Categories: Activity log; Mins. worked: 180

Preparation for the launch...

Lots of final tweaking, and a rebuild of the portable ColDesp, with the latest revisions to documents included. We've installed and tested it on GN's laptop as well.

Permalink 04:27:41 pm, by kim, 33 words, 48 views   English (CA)
Categories: Activity log; Mins. worked: 3

Update in Advance of the Colonial Letters' Launch

I have cleaned things up as much as possible in the time allowed. I will prepare a brief report of where things stand, and discuss this at the wrap meeting on the 23rd.

Permalink 11:22:43 am, by mholmes, 141 words, 49 views   English (CA)
Categories: Activity log; Mins. worked: 30

Fix for vessel name formatting, and a bug to be fixed

KSW reported that vessel names are not italicized inside vessel bios, which turned out to be caused by the fact that we don't process vessel name tags unless they have @keys; @keys will be added. I also tweaked the CSS so that such names are italicized automatically when they're in the vessel bio context, but not in the context of the main documents, where we shouldn't alter the text style.

There's one oddity in the vessel bio page, which is that when you click on the name of a vessel from inside another vessel's bio, the clicked-on vessel's bio appears as a popup; but in that process, it's extracted from the main list, and is never replaced in it when you close the popup, causing it to disappear from the list. That's obviously a bad idea. I'll be working on it.

18/06/10

Permalink 10:38:49 am, by mholmes, 34 words, 45 views   English (CA)
Categories: Activity log; Mins. worked: 15

Tweaked timings for presentation

After watching the presentation through on the TV at GH, I figured the final bit (snippets) needed longer display times for each of the items, so I've doubled those. I think we're ready now.

17/06/10

Permalink 03:40:02 pm, by kim, 13 words, 39 views   English (CA)
Categories: Activity log; Mins. worked: 3

1854 Update

Place names and people names, as well as dates, have been completed for 1854.

16/06/10

Permalink 05:16:18 pm, by mholmes, 318 words, 43 views   English (CA)
Categories: Activity log; Mins. worked: 360

Tested presentation at Govt House, and built portable ColDesp

Went down to GH with GN and tested our presentation played from the laptop through their large TV. Looks pretty good, although it can't do widescreen and it's CRT TV res. Might not do for showing the website, but that's not a problem really. We can find out about that on the day.

Spent most of the afternoon building a portable ColDesp, and learned that:

  • Paths to images in the following files need to be changed:
    • docTemplates.xsl
    • strings_en.xsl
    • imageBrowser.xq
  • I put the page-images in a subfolder /site/jpg_scans, and the maps in /site/maps.
  • I also had to patch the sitemap.xmap file to add this:
    <!--Start hacks for portable version. -->
              <map:match pattern="jpg_scans/**.jpg">
                <map:read mime-type="image/jpeg" src="jpg_scans/{1}.jpg"/>
              </map:match>
              <map:match pattern="maps/**.jpg">
                <map:read mime-type="image/jpeg" src="maps/{1}.jpg"/>
              </map:match>
    <!-- End hacks for portable version. -->
    
  • Testing this on the laptop, I discovered that I had to set JAVA_HOME in my .bashrc file to get Tomcat to start up properly:
    export JAVA_HOME=/usr/lib/jvm/java-6-sun
    export PATH=$PATH:/usr/lib/jvm/java-6-sun
    
    I've also used java-6-openjdk on my main machine, and both seem to work fine for Tomcat and Cocoon.
  • I also discovered that Chromium refuses to access an http://localhost:8080 address unless there is an internet connection! Firefox also balks initially, but if you uncheck "Work offline" on the File menu, it works. That was unexpected.

Otherwise, everything is working fine. I also made a small change to the schedules page at JL's request.

Note to self: having paths distributed across three files makes no sense. Could they be centralized? The problem really is the XQuery file, which can't get info from an XSLT file.

15/06/10

Permalink 03:39:05 pm, by mholmes, 30 words, 38 views   English (CA)
Categories: Activity log; Mins. worked: 30

Retrieved foam boards from Ceremonies...

... and checked that they would fit in the car on the way back. We still need to find out if they need us to take the easels down or not.

14/06/10

Permalink 10:18:53 am, by mholmes, 29 words, 39 views   English (CA)
Categories: Activity log; Mins. worked: 30

Calculating the scope of the remaining work

Did some rough calculations of the scale of the remaining work and possible costs, based on work accomplished so far, for JL, who needs this for a meeting tomorrow.

11/06/10

Permalink 03:58:46 pm, by kim, 30 words, 44 views   English (CA)
Categories: Activity log; Mins. worked: 3

XML file update

I continue to work through the years of 1854 - 1857. I estimate that I should complete the tagging of places, people, and dates in these files by the 18th of June.

08/06/10

Permalink 05:09:00 pm, by mholmes, 16 words, 61 views   English (CA)
Categories: Activity log; Mins. worked: 50

Proofing the launch program

Lots of fixes required. I think the text got mangled or re-typed badly at some point.

02/06/10

Permalink 03:31:33 pm, by kim, 39 words, 40 views   English (CA)
Categories: Activity log; Mins. worked: 3

1852 and 1853 update

Both 1852 and 1853 have had people and places tagged. It took roughly 3 days to complete 1853. If we can maintain this pace, we should get through the remaining years by June 22, but this is just a guess at this point.

01/06/10

Permalink 04:24:25 pm, by kim, 41 words, 49 views   English (CA)
Categories: Activity log; Mins. worked: 3

Coldesp launch booklet

Carmen Koning, External Marketing Officer, has all the materials required to begin the launch booklet. I asked that she liaise with me as needed on anything technical. I passed on the email that Kerra StJames sent me regarding payment details.

Permalink 04:21:02 pm, by kim, 75 words, 41 views   English (CA)
Categories: Activity log; Mins. worked: 3

Posters have arrived

All seven posters have arrived, and they look great, although for some reason the "anatomy of a despatch" poster had the URL at the top left of the page cut in half. I will print off a sticker to cover up the mistake.

All account and payment details will be handled through Kerra StJames of the Ceremonies and Events office. I provided Island Blue with the FAST code Kerra had relayed to me by phone.

31/05/10

Permalink 02:38:10 pm, by mholmes, 28 words, 49 views   English (CA)
Categories: Activity log; Mins. worked: 60

Checked out the Coldesp launch boards

KSW and I went downtown to Island Blueprint to check the proofs of the foam boards. They look excellent. Should be finished by the end of the day.

28/05/10

Permalink 04:00:17 pm, by kim, 44 words, 59 views   English (CA)
Categories: Activity log; Mins. worked: 3

Coldesp Launch Materials

I managed to get the last host of changes in the booklet incorporated into the latest draft, which I have sent as a PDF to Kerra and John for review. Kerra mentioned that she wanted to add the names of the recently hired actors.

Permalink 02:21:59 pm, by mholmes, 85 words, 57 views   English (CA)
Categories: Activity log; Mins. worked: 240

Presentation: first draft finished, new enhancements

Added two switches that can be set from the URL search component that can control the speed of the presentation, and the starting point, so it's no longer necessary to comment out the earlier parts while you're working on the later ones.
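
A minimal sketch of how two such switches might be read from the URL search component. The parameter names `speed` and `start`, and their defaults, are assumptions for illustration; the real names are not recorded in this entry. It also uses the modern `URLSearchParams` API, which postdates this post.

```javascript
// Hypothetical parameter names ("speed", "start"); the real ones are not recorded here.
function readPresentationSwitches(search) {
  const params = new URLSearchParams(search);
  // A missing parameter yields null, which parses to NaN and falls back to the default.
  const speed = Number.parseFloat(params.get('speed'));
  const start = Number.parseInt(params.get('start'), 10);
  return {
    speed: Number.isFinite(speed) && speed > 0 ? speed : 1,
    start: Number.isInteger(start) && start >= 0 ? start : 0,
  };
}
```

In a page this would typically be called as `readPresentationSwitches(window.location.search)`.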

Reworked a lot of the existing presentation, and wrote the two final sections, Statistics and Snippets. I think the whole runs to about 5 minutes now, but I haven't timed it to see exactly. All tested and working in Firefox, Opera, Chrome and Epiphany.

25/05/10

Permalink 04:19:50 pm, by kim, 52 words, 48 views   English (CA)
Categories: Activity log; Mins. worked: 3

CO 410 Update

I have finished tagging the CO 410 files with "biblscope" and "pb" tags up to and including 1858. Since 1858 is off the radar for this launch, I will add the "pb" tags only, but they still require proofing and tagging of places, and people, in some cases. I have made a note of this.

Permalink 03:23:34 pm, by mholmes, 77 words, 61 views   English (CA)
Categories: Activity log; Mins. worked: 180

Finished the map section

I now have five different maps in the map section, showing a fair cross-section of the different types we have. That's probably enough for that section. Next I'm going to see if I can find any snippets that are short and snappy enough to work. I have many of KSW's, and a few of my own harvested today, but most are too long to work in this kind of presentation. 19th-century bureaucratic prose rarely contains sound bites.

Permalink 11:23:23 am, by mholmes, 101 words, 49 views   English (CA)
Categories: Activity log; Mins. worked: 45

Webkit bug in presentation code fixed

It appears to be the case that if you call JQuery's animate() passing any set of options, in Webkit browsers, where the starting value and ending value for e.g. top or left are the same, the browser will elect to animate them anyway, setting them to zero before starting the animation. This was throwing out all my custom animations in Chrome and Epiphany. So I've rewritten the code so that where a value is not going to be altered, it's never actually passed in to the animate() call. The presentation is now functioning correctly on FF, Webkit and Opera engines.
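
The workaround can be sketched roughly like this. It is an illustration of the idea only, not the actual presentation code: the helper name `changedProperties` is hypothetical, and it assumes current and target values are available as comparable strings.

```javascript
// Return only the properties whose target differs from the current value,
// so unchanged ones are never handed to animate().
function changedProperties(current, target) {
  const changed = {};
  for (const name of Object.keys(target)) {
    if (current[name] !== target[name]) {
      changed[name] = target[name];
    }
  }
  return changed;
}

// Usage with jQuery (assumed to be loaded; not run here):
// $(slide).animate(
//   changedProperties({ top: '10px', left: '0px' }, { top: '10px', left: '200px' }),
//   1000
// );
```

With the values in the usage comment, only `left` would be passed to `animate()`, so `top` is never touched.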

22/05/10

Permalink 06:03:29 pm, by mholmes, 539 words, 46 views   English (CA)
Categories: Activity log; Mins. worked: 30

Ideas for making the kiosk presentation into a manual one

Greg raised the possibility of using the kiosk presentation code I've been working on as the basis for a manually-controlled presentation, as a sort of replacement for S5, and I've been thinking hard about how that could be done. These are my ideas so far:

  • Switching between kiosk (automated/timed) and manual should be accomplished by a parameter in the URL search component (display=kiosk, display=manual).
  • If "kiosk", continue as currently with timings.
  • If "manual", then create the HcmcSlideList object as currently, but then call a new method on it: setupManual(). Don't call next().
  • setupManual works through the complete set of slides, and figures out the sequence of hide and show events. It adds them all to an array, in a form in which they can be called using eval().
  • A call to HcmcSlideList::manualNext() passes the next item in that array to the eval() command, causing the next slide to be shown or hidden.
  • Each HcmcSlide object keeps a reference to its original parentNode as one of its properties, stored when it's created.
  • The HcmcSlideList object maintains two other arrays: hiddenSlides and shownSlides. Whenever an item is shown, its index in the HcmcSlideList.list array is pushed to shownSlides.
  • When the call results in a hide action, this is what happens (pseudo-code):
    slides.list[i].hide();
    slides.hiddenSlides.push(slides.list[i].parentNode.removeChild(slides.list[i]));
    
    This removes the hidden slide from the page, but keeps it in an array so it can be restored if necessary.
  • When a slide is shown, its index is pushed to the shownSlides array.
  • The purpose of all this is to enable us to step backwards through the sequence. This is what we can do, when HcmcSlideList::back() is called:
    • Decrement HcmcSlideList.counter.
    • Check whether the new "current" command was a show or a hide.
    • If it was a hide, then pop the last item off the hiddenSlides array, and append it again to its parentNode.
    • If it was a show, then pop the index of the slide that was shown from the shownSlides array, and set it to "display: none".
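
The bookkeeping in the list above can be sketched without any DOM at all. This is a rough model under stated assumptions, not the planned implementation: the class name `ManualSlideList` is hypothetical, slides are plain objects with `attached` and `displayed` flags standing in for `parentNode` membership and the display style, and the commands are stored as objects rather than as strings for `eval()`.

```javascript
// Plain-object stand-in for the slide list; no DOM involved.
class ManualSlideList {
  constructor(slideCount, commands) {
    // Each slide starts attached to the page but not yet displayed.
    this.list = Array.from({ length: slideCount }, () => ({ attached: true, displayed: false }));
    this.commands = commands;   // e.g. [{ action: 'show', index: 0 }, ...]
    this.counter = 0;           // index of the next command to run
    this.hiddenSlides = [];
    this.shownSlides = [];
  }

  manualNext() {
    if (this.counter >= this.commands.length) return false;
    const { action, index } = this.commands[this.counter++];
    const slide = this.list[index];
    if (action === 'show') {
      slide.displayed = true;
      this.shownSlides.push(index);
    } else {
      // Hide, then detach, keeping the slide so it can be restored later.
      slide.displayed = false;
      slide.attached = false;
      this.hiddenSlides.push(slide);
    }
    return true;
  }

  back() {
    if (this.counter === 0) return false;
    const { action } = this.commands[--this.counter];
    if (action === 'hide') {
      // Undo a hide: re-attach the most recently removed slide.
      // Assumption: it should also become visible again at this point.
      const slide = this.hiddenSlides.pop();
      slide.attached = true;
      slide.displayed = true;
    } else {
      // Undo a show: the slide goes back to "display: none".
      this.list[this.shownSlides.pop()].displayed = false;
    }
    return true;
  }
}
```

Because `back()` only decrements the counter and reverses one command, calling `manualNext()` afterwards replays the same command again, which matches the resume behaviour described in these notes.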

This should enable backtracking through all the slides, without any transitions -- which is probably what you want when going back through the slides. Because we're decrementing the counter, moving forward again will run the hide and show transitions as expected, so you can resume moving through the array. There are only two issues:

  1. If you backtrack through (say) five slides, to show something again to the audience, you're going to have to move forward through five transitions to get back to where you were. These transitions will take several seconds (depending on your settings), so you could end up taking (say) ten seconds to get back to where you were. That said, if you call HcmcSlideList::manualNext() several times in rapid succession, although each individual transition will take as long as it takes, they'll all run simultaneously, so that might not be a big problem.
  2. There's no easy way to "jump" to a particular spot in the presentation. We could get around this by adding an extra "jumpTo" parameter in the URI, which would cause the setup code to run all transitions prior to the jumpTo point simultaneously.

21/05/10

Permalink 03:04:34 pm, by kim, 87 words, 56 views   English (CA)
Categories: Activity log; Mins. worked: 3

Despatch Launch: Booklet Update

I met with John, who gave me some instructions for the booklet. He then sent on the first draft of the booklet copy. I used both to build a working draft, which will be used as a guide by the UVic communications team, on Kerra's side.

I sent Kerra a copy, in PDF form, for her feedback. I will hear back from her early next week, and incorporate her changes. Still to be finalized from our team: some image placement, copytext, and final proofing of the same.

Permalink 02:58:42 pm, by kim, 60 words, 52 views   English (CA)
Categories: Activity log; Mins. worked: 3

Despatch Launch: Poster Update

In consultation with the team, I have completed 7 poster boards for display at the June 22nd launch. I have shipped JPG versions to Kerra, who has responded that she will review them early next week.

As for proofs, once Kerra gets back to me with her changes, if any, I will drop off PDFs at Island Blueprint, hopefully, by next Wednesday.

Permalink 02:21:00 pm, by mholmes, 278 words, 54 views   English (CA)
Categories: Activity log; Mins. worked: 300

More work on presentation

Finished off the section on the digital despatch, with lots of screenshots, and began the section on maps.

This required me to tackle the remaining features I hadn't added yet, for a variety of different transitions used to show content, because I need to have content arriving from different directions in this section. I've now refactored all the showing and hiding code, and I have a total of six different transitions for showing and six for hiding a slide. I've cleaned up a lot of that code, so it's now less than 200 lines, and I've solved one issue with object encapsulation that was puzzling me. When you make a setTimeout call, you need to pass a string containing the code to be executed when the timeout is up. My problem was that I wanted to set the timeout from inside an object method, calling another method of the same object; however, the "this" keyword won't work, because the timeout is executed outside the object scope. The solution was to add a parameter to the object constructor in which the name of the global pointer that refers to the object is passed in; this way, the object "knows its own name", and can set a timeout calling its own method using that variable name.
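
For comparison, `setTimeout` also accepts a function rather than a string, so a closure is another way round the same scoping problem. The sketch below shows that alternative, not the "object knows its own name" approach described above; the `Slide` constructor and its `schedule` parameter are hypothetical and exist only to keep the example self-contained.

```javascript
// Hypothetical slide object; "schedule" defaults to setTimeout but can be swapped out.
function Slide(name, schedule) {
  this.name = name;
  this.shown = false;
  this.schedule = schedule || setTimeout;
}

Slide.prototype.show = function () {
  this.shown = true;
};

Slide.prototype.showAfter = function (delayMs) {
  // Capture the object in a local variable; inside the callback, "this"
  // would no longer refer to the slide.
  var self = this;
  // Call the scheduler as a plain function rather than as a method of the slide.
  var schedule = this.schedule;
  schedule(function () {
    self.show();
  }, delayMs);
};
```

A call such as `new Slide('intro').showAfter(2000)` would then show the slide two seconds later without any global variable name being passed in.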

There's one remaining issue with webkit browsers (Epiphany and Chrome) whereby moving show transitions always seem to start from top left, even when they're asked not to. I'll work on that next week. I may have to explicitly set the slide's left and top to what they already are (left = offsetLeft + 'px') so that they can retrieve a working starting value for the motion calculations.

20/05/10

Permalink 05:11:12 pm, by mholmes, 168 words, 42 views   English (CA)
Categories: Activity log; Mins. worked: 180

More work on the presentation code

I realized that the presentation code really needed to be able to handle variable triggers for the following item, so that items can appear in quick succession where necessary, so I've revised the back-end code. It's now quite flexible and getting rather sophisticated; it'll definitely be usable for future kiosk-style presentations, and GN also suggests we make it switchable into a manual presentation. I think that could be done by setting a flag in the JavaScript whereby, instead of triggering the presentation on startup, the code instead parses through the list of slides and figures out what order each of the appearance and disappearance actions would occur in, and then places them in an array, where each can be triggered manually through an eval() call. The only difficulty there is that it's going to be hard to make it possible to go back. But I'll keep thinking about that.

I also tweaked the existing content, and added some more bits. I think the content is about 50% complete now.

17/05/10

Permalink 03:40:06 pm, by mholmes, 117 words, 36 views   English (CA)
Categories: Activity log; Mins. worked: 240

Putting content into the presentation

JL has provided a lengthy description of the project, from which I'm trying to harvest the key points and turn them into the meat of the automatic presentation, illustrating it with images wherever possible. I'm probably about a third into it by this point, still working on the best ways to handle the requirement to display on a variety of screen and font sizes. I've settled on positioning and sizing in percents, on the basis that I can then adjust font sizes through the browser to get the best use of screen space (= largest display text that will fit).

It's still pretty spartan, but once the content is done I can start tarting it up a bit.

13/05/10

Permalink 03:50:21 pm, by mholmes, 141 words, 45 views   English (CA)
Categories: Activity log; Mins. worked: 240

More on the presentation

The presentation seems to be coming together. JQuery animation has some problems, so as happened before when I tried to use JQuery, I've ended up using less and less of it, and writing more of my own code. I have a slide object and a slide list object, and the list manages slide timing and transitions, and I've set it up so that the original slide <div>s are placed on the page where you want them to end up, using @style attributes. The display time for a particular slide is placed in its title attribute (a hack, but not actually breaking any rules), and the slides are shown in the order of the <div>s in the page. That means you can create the slide show by editing HTML, and not worry much about the script.
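
As a rough illustration of the title-attribute idea, the sketch below turns a list of slides in page order into a simple timetable. It assumes the title holds a number of seconds and that slides run strictly one after another; neither detail is stated above, and `buildSchedule` is a hypothetical name.

```javascript
// Each entry stands in for a slide <div>; "title" carries the display time.
// Assumption: the value is a number of seconds.
function buildSchedule(slides, defaultSeconds) {
  let elapsed = 0;
  return slides.map(function (slide) {
    const parsed = Number.parseFloat(slide.title);
    // Fall back to the default when the title is empty or not a positive number.
    const seconds = Number.isFinite(parsed) && parsed > 0 ? parsed : defaultSeconds;
    const entry = { id: slide.id, startAt: elapsed, seconds: seconds };
    elapsed += seconds;
    return entry;
  });
}
```

In a page, the input could be gathered from the slide elements themselves by reading each one's `id` and `title`.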

12/05/10

Permalink 04:29:45 pm, by mholmes, 17 words, 49 views   English (CA)
Categories: Activity log; Mins. worked: 120

Working on the presentation

Working on basic code for an animated presentation using JQuery. Found two bugs in JQuery already. Grrr.

Permalink 03:00:15 pm, by kim, 16 words, 40 views   English (CA)
Categories: Activity log; Mins. worked: 3

CO 410/1 update

The three sizes of images for the co410/1 group have been uploaded to the coldesp server.

Permalink 12:47:28 pm, by mholmes, 7 words, 38 views   English (CA)
Categories: Activity log; Mins. worked: 60

Meeting re launch

Discussed the foam boards and the program.

11/05/10

Permalink 03:57:27 pm, by mholmes, 27 words, 38 views   English (CA)
Categories: Activity log; Mins. worked: 60

Reworked original poster for new launch

Dug out my original launch poster, and added a faded-out map in the background, then reworked it for the new launch at 36 x 48. Seems to work OK...

10/05/10

Permalink 05:10:19 pm, by mholmes, 44 words, 34 views   English (CA)
Categories: Activity log; Mins. worked: 60

Meeting about graphics for launch

Met with JL and KSW to talk about graphics for the launch. Foam boards are more or less organized; we've selected a map (co_700-bc_2_van_isl_1854), and we'll work on the "splash" poster tomorrow. Also made a tweak to logos on CP's instructions.

Permalink 05:08:46 pm, by mholmes, 21 words, 44 views   English (CA)
Categories: Activity log; Mins. worked: 180

Marked up one more map

Time-consuming markup of one of the Spanish maps of Juan de Fuca Strait. I'm slowly getting familiar with the placenames, though.

Permalink 10:53:40 am, by mholmes, 30 words, 42 views   English (CA)
Categories: Activity log; Mins. worked: 60

Marked up another map

Marked up a 1789 map of Ahousat. It doesn't look much like modern-day Clayoquot Sound, though, so I've found it almost impossible to set the Google Map coordinates with any precision.

07/05/10

Permalink 04:15:24 pm, by kim, 247 words, 53 views   English (CA)
Categories: Activity log; Mins. worked: 5

Poster Boards for the ColDesp Launch in June 2010

I have produced 3 poster boards this week, with a fourth on the way. Here is a quick rundown on each.

1. Splash Poster: this is, more or less, a logo-only poster, that showcases the project name, intention, and years of coverage [incomplete].

2. Despatches by Numbers: lists some key numbers associated with the project thus far, such as how many images produced, instances of various things, years covered in the collection, and so on [draft complete].

3. What's in a Despatch: anatomy of a typical letter, which points out--with pictures and connector-lines--details such as the despatch number, salute, marginalia, and more. The intention here is for the viewer to get a quick sense of the intricacy and minutiae of the handwritten letters [draft complete].

4. What's in a Digital Despatch: as above, but the digital version. This will give viewers a sense of how we "transform" the handwritten letter into something you can "interact" with on the web. It shows some of the basic features of the site, such as the navigation bar [draft complete].

5. The Despatch Maps: this page is yet to be designed, but it will function as above (see 3 and 4) to show viewers the things they can expect to do and find with respect to our online maps [incomplete].

6. The Douglas Collage: this is a bit of fun, and a way to show the viewer the sheer volume of the letters available. It will feature a collage of images built from the images in the collections [in process].

Permalink 03:58:51 pm, by kim, 24 words, 40 views   English (CA)
Categories: Activity log; Mins. worked: 3

CO 410/1 update

Turns out that our scans of the microfilm for this volume were correct, after all. The pages in the original volumes are stamped incorrectly.

Permalink 01:22:48 pm, by mholmes, 163 words, 39 views   English (CA)
Categories: Activity log; Mins. worked: 180

Marked up one more map, and learned a lot

Working on the second of the Spanish maps, from 1779, I had only a few items to mark up, but a huge amount of quite interesting research to do. I eventually discovered, from some web research and parsing through the lengthy Spanish caption, that the longitude readings on the map are relative not to Greenwich but to San Blas (see this ref, which helped to figure out the range of the map). I also identified a few places that were previously mysterious, including Punta de los Mártires (Point Grenville, Washington State), Entrada de Hezeta (apparently the Columbia River delta), and Las tres Marias (now Islas Marías, Nayarit, Mexico). RS from Hispanital has helped me a lot with the transcriptions.

Did a bit of research for the next map, which I haven't started yet: it seems that Friendly Bay is now Yuquot, and Scott's Bay may be Eagle Bay (or may not), and is probably in Barkley Sound. More work to do there.

Permalink 01:17:34 pm, by mholmes, 19 words, 47 views   English (CA)
Categories: Activity log; Mins. worked: 60

Multifarious logos added

Added several more logos to the credits page, some of which had to be pieced together from component parts.

06/05/10

Permalink 02:24:03 pm, by mholmes, 149 words, 43 views   English (CA)
Categories: Activity log; Mins. worked: 300

Maps now working

Work completed today:

  • Pushed the updated map code up to the main site, and tested with IE. No problems.
  • Added <ref type="map" cRef="[mapId]#[placeId]"> handling so that you can link directly to a spot on the map from any TEI document.
  • Updated the Guidelines document to include this info.
  • Added more detailed explanations to the map gallery page.
  • Updated the Development page to include details of the maps, and other recent changes.
  • Added a Maps link to the main menu.
  • Added info and a link to the Indexes page.
  • Marked up the first two maps (by date) in the collection, so that there are some obvious examples of annotated maps to look at. We now have six marked-up maps.
  • Updated V52105.xml to include two footnotes pointing to the map which was extracted from this despatch, which is marked up.
  • Various testing and minor tweaks.
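
To illustrate the `cRef="[mapId]#[placeId]"` convention from the list above, here is a minimal sketch of splitting such a value and turning it into a link. Treating the place id as optional, the `basePath` argument, and the example place id are assumptions; the site's real URL scheme is not given here.

```javascript
// Split a cRef of the form "mapId#placeId" into its two parts.
// The placeId part is treated as optional here (an assumption).
function parseMapRef(cRef) {
  const hashAt = cRef.indexOf('#');
  if (hashAt === -1) {
    return { mapId: cRef, placeId: null };
  }
  return {
    mapId: cRef.slice(0, hashAt),
    placeId: cRef.slice(hashAt + 1) || null,
  };
}

// Hypothetical URL scheme: the place id travels as the hash on the map page's URI.
function mapHref(cRef, basePath) {
  const { mapId, placeId } = parseMapRef(cRef);
  const page = basePath + encodeURIComponent(mapId);
  return placeId ? page + '#' + encodeURIComponent(placeId) : page;
}
```

For example, `mapHref('co_700-bc_2_van_isl_1854#victoria', '/maps/')` gives `/maps/co_700-bc_2_van_isl_1854#victoria`, where `victoria` is a made-up place id.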

05/05/10

Permalink 04:51:25 pm, by mholmes, 107 words, 34 views   English (CA)
Categories: Activity log; Mins. worked: 360

Basic map work finished

These features are now complete:

  • Placename features on marked-up maps now link to a popup showing the placename information.
  • Place popups now retrieve reference lists which include links to specific maps where these maps include links to the places.
  • Navigating to a map with the xml:id of a place as the hash on the URI results in that place being outlined and centred, and its popup displaying.
  • Licence info is displayed underneath all the maps, as well as the map gallery.
  • Various minor tweaks to UI and layout.

I think this is basically ready to go now. I'll push it up to the site tomorrow morning.

04/05/10

Permalink 03:17:34 pm, by mholmes, 97 words, 36 views   English (CA)
Categories: Activity log; Mins. worked: 60

Added KML output from geo coordinates in maps

The maps will eventually mostly have geo coordinates relating them to the area they cover, and I've now modified the code which creates KML files from place information in the places collection so that it can also create KML based on a map document id. This enables us to create a link to Google Maps, passing the KML URI to Google, to see the area laid out over a modern-day Google map.

Can't test this till we have it up on the main site, because it works from a hard-coded site URI which is supplied to Google.
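
For reference, the kind of KML involved can be sketched as below. The real code is XQuery working from the map documents; this JavaScript version is only an illustration, and the input shape (a bounding box with `west`, `south`, `east` and `north` in decimal degrees) is an assumption about how a map's coverage might be expressed.

```javascript
// Escape the characters that are special in XML text content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Build a KML document outlining a bounding box (hypothetical input shape).
function boundingBoxKml(name, box) {
  // KML coordinates are "longitude,latitude", and a LinearRing must end where it starts.
  const ring = [
    [box.west, box.south],
    [box.east, box.south],
    [box.east, box.north],
    [box.west, box.north],
    [box.west, box.south],
  ].map(function (pair) { return pair.join(','); }).join(' ');
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<kml xmlns="http://www.opengis.net/kml/2.2">',
    '  <Placemark>',
    '    <name>' + escapeXml(name) + '</name>',
    '    <Polygon><outerBoundaryIs><LinearRing>',
    '      <coordinates>' + ring + '</coordinates>',
    '    </LinearRing></outerBoundaryIs></Polygon>',
    '  </Placemark>',
    '</kml>',
  ].join('\n');
}
```

The resulting string is what would be served at the KML URI that gets passed to Google.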

03/05/10

Permalink 03:34:41 pm, by mholmes, 106 words, 60 views   English (CA)
Categories: Activity log; Mins. worked: 360

Much progress with map display

Working most of the day on the map display, and getting close to completion. I've now got the menus functioning as required -- looking pretty much like the main menu on the rest of the pages, but right-aligned; and I've got the access keys working for all the menus except for the items in the drop-downs. The site banner graphic is now included, as is the metadata (both displayed for the reader and included in Dublin Core in the header). There's a lot more tidying up to do, and I need to test on IE (it works on the other browsers), but we're pretty close now.

Permalink 12:09:20 pm, by kim, 82 words, 48 views   English (CA)
Categories: Activity log; Mins. worked: 3

CO 6/26 in 1858

Nine files in 1858 need to be proofed against their respective images, which are in CO 6/26 [on the coldesp server already in LAC scans folder: B-3009 and B-3010]:

B585HB06.scx, B585HB07.scx, B585HB09.scx, B585HB10.scx, B585HB11.scx, B585HB14.scx, B585HB35.scx, B585MI01_A.scx, B585MI02.scx.

For now, since we are scheduled to deliver up to 1857 (for our June 22nd, 2010 launch), we will put the processing of the images required for these files on hold.

30/04/10

Permalink 03:45:19 pm, by kim, 35 words, 72 views   English (CA)
Categories: Activity log; Mins. worked: 3

Placenames caught up

A nagging hangover from Before the Great Image Processing was several placenames without write-ups, largely from 1852. I have finished these off, and added a few others. We are now at a total of 133 placename write-ups!

28/04/10

Permalink 03:58:32 pm, by mholmes, 161 words, 58 views   English (CA)
Categories: Activity log; Mins. worked: 180

More work on integrating the map display code

I've managed to make the map sizing code relative -- MJ's page layout was done in pixels originally, but the underlying map handling is within a box that can be sized easily, so that wasn't too hard. I've also started integrating the site style a little bit, starting with the menu, which now looks a bit like the main site menu. I have to decide what components of the main menu I want to include on this page -- there isn't room for them all. Probably just Home and Map Gallery. I'll need to enable keystroke navigation somehow, to comply with our accessibility policy, and that will be quite hard. Then I have to find a good place to put the copyright/disclaimer info on the page, and find space for the metadata about the map. It's going to get a bit crowded. No room for the header graphic, unfortunately, although I'll have to try it out just to be sure.

27/04/10

Permalink 05:25:55 pm, by mholmes, 232 words, 49 views   English (CA)
Categories: Activity log; Mins. worked: 360

Map display code coming along

Working all day on integrating MJ's map display code into the site. Much of it was positioned and sized in pixels, which is natural because the main component is an image sized in pixels, but it is possible if you're careful to size all the text components in ems and percentages to make for flexibility. I laid out the menus using a different approach (inline-display list items, with the right menu floated, rather than floating list-items). Then I started work on the display of the annotations. I've arrived at a solution that I think works pretty well, but there are still some oddities with regard to positioning that I need to work on, because I'm placing the annotations in a static location. I'm half-way there.

This all precedes somehow adding the main site styling to the page, which is going to be a challenge because of space constraints. I also have an issue with some zones not showing up; these appear to be the ones which have customized id attributes because they're going to be pointing at places in the places database.

While working on this, I also re-organized the map gallery display, to make it less confusing. The mouseover popup now appears in the right margin rather than over the top of the moused-over image, which means that the original is not obscured, so you can more easily click on it.

23/04/10

Permalink 04:19:26 pm, by kim, 52 words, 57 views   English (CA)
Categories: Activity log; Mins. worked: 3

CO 410 images processed, for Volume 1

I have processed the images for CO410, Volume 1. But, I am concerned that there is a page missing, so I have asked Chris to re-order the reel. Better safe than sorry. Once the reel arrives, it should take only minutes to find and incorporate the file, if it is indeed our mistake.

22/04/10

Permalink 01:16:29 pm, by mholmes, 35 words, 60 views   English (CA)
Categories: Activity log; Mins. worked: 120

Meeting at Govt House

Meeting with KSJ and KSW at Govt House to discuss arrangements for the launch in June. Took notes and video of the location, and agreed on some basic plans for AV, foam board visuals etc.

21/04/10

Permalink 05:23:22 pm, by mholmes, 96 words, 49 views   English (CA)
Categories: Activity log; Mins. worked: 20

Latest stats

CP was doing a report and needed some stats, so I generated them. There are currently:

  • 11,770 processed page-images, in three different sizes (thumbnail, 800px, and full size), for a total of 35,310 individual images
  • 4,607 individual <pb> tags with image links, along with 997 start page image links in file headers, for a total of 5,604 page-images linked into documents
  • 41,630 tagged people's names
  • 2,209 tagged placenames
  • 1,039 tagged vessel names
  • 218 maps with basic metadata in the database

As yet, we have no processed page-images from CO 6, but we have 10,652 images waiting to be processed from CO6/18 through CO6/36.

Permalink 11:59:50 am, by kim, 31 words, 63 views   English (CA)
Categories: Activity log; Mins. worked: 3

Renamed RG7 images

As we added two new images to the RG7, volume 1, collection, I had to rename the 60px, 800px, and full size images, respectively. I completed this and uploaded the renamed images.

20/04/10

Permalink 04:51:30 pm, by mholmes, 31 words, 50 views   English (CA)
Categories: Activity log; Mins. worked: 30

Updates and rearrangements

Brought down all the latest updates from KSW's changes, and deployed them to the db. Rewrote the backup-to-local scripts to find stuff in the places it's now located on the server.

Permalink 03:53:25 pm, by kim, 41 words, 65 views   English (CA)
Categories: Activity log; Mins. worked: 3

RG7 G8C: missing images restored

It turned out that we missed only two images in RG7-G8C, volumes 1, 2, and 3. I have processed them and added them to the server in the appropriate folders. I then returned the film reel to Chris' desk, in his office.

16/04/10

Permalink 04:23:25 pm, by kim, 55 words, 60 views   English (CA)
Categories: Activity log; Mins. worked: 3

Update on PB tags

I have roughly 200 xml files left in 1858 for PB-tagging. So, over 400 completed already! I forecast roughly 4 days to finish the rest.

Caitlin will use up the rest of her hours on time, and this should leave us with a much-improved vessels database. I will work with her next Friday to tune it up for publication.

15/04/10

Permalink 03:18:45 pm, by mholmes, 73 words, 53 views   English (CA)
Categories: Activity log; Mins. worked: 120

Launch meeting with Ceremonies

Meeting with UVic Ceremonies to plan the launch. Planning for AV will have to wait until May, when we can get down to Gov't House with KS to find out the lay of the land. In the meantime, following the meeting, KSW and I did some research on poster and foam board printing, since it's likely that some maps and pictures on foam boards might be a good alternative to projectors and screens.

09/04/10

Permalink 04:17:13 pm, by kim, 66 words, 56 views   English (CA)
Categories: Activity log; Mins. worked: 5

Leanna leaves, we soldier on

Thanks to Leanna's relentless capacity for toil, bless her, we are in better shape. I should be able to complete the PB tags for the remaining 1858 XML files by, roughly, April 20th, possibly sooner.

From there, I will process the images for 410/1, which should take a few days. Then, I will move on to tagging people and places in years 1852-1857 inclusive--as 1858 is, presumably, done already.

06/04/10

Permalink 01:35:30 pm, by mholmes, 60 words, 49 views   English (CA)
Categories: Activity log; Mins. worked: 120

Second pass at the map gallery

I've enhanced the map gallery so that something slightly more attractive than a tooltip shows the details of the maps when you mouse over them. I've also added some drop-shadows, so the maps look more like other documents on the site. I think there may be some more cosmetic changes in future, but I think this will do for now.

01/04/10

Permalink 02:12:04 pm, by mholmes, 25 words, 56 views   English (CA)
Categories: Activity log; Mins. worked: 30

1855 images posted

Images for 1855 (CO 305 06) are now on the site, indexed and working, and I'm running the rsync operation from yesterday to back them up to Rutabaga.

31/03/10

Permalink 04:34:47 pm, by mholmes, 114 words, 98 views   English (CA)
Categories: Activity log; Mins. worked: 90

Backups of jpegs in www folder

Finally did something I've been meaning to do for ages: created a full backup of the processed page-images from Lettuce to Rutabaga. This took literally all day (I throttled rsync a bit to make sure Lettuce didn't struggle). I realized I could also have logged on directly to nfs.tapor.uvic.ca, taking Lettuce out of the equation; might do that in future. This is the process: ssh into lettuce, navigate to the coldesp www folder, and run rsync --verbose --progress --stats --compress --recursive --times --bwlimit=10000 jpg_scans/ -e ssh mholmes@rutabaga.hcmc.uvic.ca:/"home/mholmes/backups/Martin/Colonial\ Despatches/www/jpg_scans/" . I did the equivalent for the maps directory as well.

30/03/10

Permalink 04:11:37 pm, by mholmes, 197 words, 92 views   English (CA)
Categories: Activity log; Mins. worked: 300

Finished the map gallery page

After more hassle than I expected getting forms to submit with correct values, I now have a flexible and responsive map gallery, with paging features, and the ability to change the sort sequence and the number of items displayed on a page. Next is getting the individual map display code working, which means figuring out how MJ's code works, and trying to fit it into the context of the site.

Note to self: remember, in future, that rather than trying to get a form to submit itself in the old-fashioned manner, it's much simpler to write a bit of script that grabs all the values you want and constructs a GET URL, then sets the location to that. Even simpler might be AJAX, but in this case it seemed overkill so I went the route of a traditional form page. And gave myself loads of trouble as a result (e.g.: if you try to trigger submission from the onchange event of a <select> element, you'll have trouble getting the value of the selected option, and the onsubmit event of the form will never fire).
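
A minimal sketch of the "grab the values and construct a GET URL" approach from the note above. The parameter names (`sort`, `perPage`, `page`) and the page name are hypothetical, and `URLSearchParams` is a modern convenience that was not available when this was written.

```javascript
// Build a GET URL from a plain object of form values.
function buildGalleryUrl(basePath, values) {
  const params = new URLSearchParams();
  for (const [name, value] of Object.entries(values)) {
    params.set(name, String(value));
  }
  const query = params.toString();
  return query ? basePath + '?' + query : basePath;
}

// In a page (not run here), an onchange handler might then do:
// window.location.href = buildGalleryUrl('maps.html', { sort: select.value, perPage: 20 });
```

This keeps the form out of the submission path entirely: the script decides which values go into the URL, and setting the location does the rest.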

Tested and working in FF, WebKit (Epiphany), Opera, and IE8.

29/03/10

Permalink 03:15:37 pm, by mholmes, 96 words, 77 views   English (CA)
Categories: Activity log; Mins. worked: 300

Writing the map browser

Spent most of the day writing the map browser for our collection of 219 maps. I have it basically working, in a way that's similar to the Mariage site (but much simpler from a CSS point of view). I haven't yet got the Previous and Next buttons working -- in fact, it'll probably be a bit more sophisticated than that, allowing the user to decide how many maps to view on one page, etc. But the basics are all done, including some complicated bits to lay out the variously-sized thumbnails in an even manner on the page.

26/03/10

Permalink 01:55:45 pm, by mholmes, 53 words, 81 views   English (CA)
Categories: Activity log; Mins. worked: 30

Uploaded RG7 8 images, and fixed image browser

The identifiers for the RG7 images differ slightly from the format of the CO ones, in that the central component (G8C in this case) cannot be reduced to an integer, so when I uploaded these images I had to tweak a bit of XQuery to make the image browser display them properly.
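The actual XQuery tweak is not recorded here. As a rough illustration of the problem only, this Python sketch shows one way to order a central component that may or may not be an integer. The function name and the rule that non-numeric components sort after numeric ones are assumptions.

```python
def component_key(component):
    """Sort key for the central component of an image identifier.

    Purely numeric components sort as integers; anything else
    (such as "G8C") sorts as text, after all the numeric ones.
    """
    if component.isdigit():
        return (0, int(component), "")
    return (1, 0, component)
```

Used as the key to a sort, this keeps "2" ahead of "10" and still accepts components like "G8C" without trying to convert them.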

25/03/10

Permalink 02:59:07 pm, by kim, 54 words, 79 views   English (CA)
Categories: Activity log; Mins. worked: 3

Processed NAC/RG7 Volume 1 images

NAC, or RG7, G8, Volume 1 covers the years 1849-58, inclusive. So, even though we have RG7 volumes 2 and 3, I will ignore those for now, until we need them.

Also, Chris has reordered the RG7 reel, as we think we might have missed an image. If so, it will take only minutes to find it.

22/03/10

Permalink 10:49:31 am, by mholmes, 73 words, 144 views   English (CA)
Categories: Activity log; Mins. worked: 30

Uploaded 1856 images

Small changes to the 1856 image file names required a complete re-upload, which is slightly complicated by the need to delete some files whose names are no longer in use. Lots of cautious rsync operations did the job; then I had to update the XML file which lists all the scans. That file's getting a bit big, so I'm wondering whether we might start breaking it down for ease of maintenance at some point.

16/03/10

Permalink 04:32:59 pm, by mholmes, 68 words, 92 views   English (CA)
Categories: Activity log; Mins. worked: 60

Laboriously reconstructed four markup files

The four trial markup files had to be integrated back into the map setup, because the format of the files and the metadata content is now more detailed, and the "large" size maps are now slightly differently sized from the ones I was working with when I did the markup. This involved some juggling and moving all the markup zones in some of the files. That's now done.

Permalink 03:14:26 pm, by mholmes, 42 words, 86 views   English (CA)
Categories: Activity log; Mins. worked: 180

Completed 218 maps

Completed the creation of XML files for the maps I've identified as relevant (now down to 218). The images are on the server, and I can now start looking at the creation of a gallery and integrating the rendering code MJ has written.

15/03/10

Permalink 05:13:15 pm, by mholmes, 85 words, 79 views   English (CA)
Categories: Activity log; Mins. worked: 360

Progress on map markup

Today I finished my little QT app to handle generating the IMT TEI files for the maps, and began using it to churn through the maps. So far, I've produced 160 XML files and associated map resizings. Found a couple of oddities in the process, and there are many maps where the metadata from the Excel spreadsheet is generic to a group, rather than specific, and these will need to be enhanced. But basically the process is working. I'm done up to 1861; I'll start from 1862 tomorrow.

Permalink 09:56:26 am, by mholmes, 30 words, 73 views   English (CA)
Categories: Activity log; Mins. worked: 60

1854 and 1856 images integrated into site

I've uploaded the images from CO 305 05 and 07, and added entries to the db for all of them. This leaves CO 305 06 (1855) to be done, along with the CO 410 and RG images.

12/03/10

Permalink 04:42:55 pm, by kim, 37 words, 90 views   English (CA)
Categories: Activity log; Mins. worked: 3

1854 and 1856 images processed!

Leanna finished 1854 and I finished 1856. We will now move on to adding the page-break tags for these years. Firstly, however, I will add the two or three new place-name entries gathered from 1852 (completed earlier in the week).

Permalink 01:59:02 pm, by mholmes, 102 words, 82 views   English (CA)
Categories: Activity log; Mins. worked: 240

QT app to process images coming along

This is what the app now does:

  • Creates a quarter-size image.
  • Creates a 1000-px image.
  • Creates a thumbnail.
  • Fills in all the metadata it can derive from the image.
  • Pops up a dialog box for some other data which can only be got from the Early British Columbia Maps.xls file (from JF). I'm currently entering that manually, but I've mapped out a routine which will enter all the data from the clipboard after a row in the spreadsheet is copied to it; that will make the process way quicker. Then I'll be ready to go ahead with generating all the data.
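The three derivative sizes in the list above can be sketched as a small calculation. This is Python, not the Qt code itself, and it rests on assumptions: "quarter-size" is taken to mean each dimension divided by four, and the 1000-px and thumbnail sizes are taken as target widths (the thumbnail width of 100 matches the earlier mogrify command).

```python
def scaled_size(width, height, target_width):
    """Return (w, h) scaled to target_width, preserving the aspect ratio."""
    # Integer arithmetic with rounding to the nearest pixel.
    new_height = (height * target_width + width // 2) // width
    return target_width, max(1, new_height)


def derivative_sizes(width, height):
    """Sizes of the three derivatives for one source image (assumed rules)."""
    return {
        "quarter": (max(1, width // 4), max(1, height // 4)),
        "large": scaled_size(width, height, 1000),
        "thumb": scaled_size(width, height, 100),
    }
```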

11/03/10

Permalink 05:21:07 pm, by mholmes, 118 words, 79 views   English (CA)
Categories: Activity log; Mins. worked: 120

A cunning plan for auto-generating IMT files

I need to auto-generate IMT files for all my maps, and populate them with as much metadata as possible. To that end -- and to continue my learning process with QT -- I'm writing a little QT app to do it. So far, it can:

  • Parse a directory full of images.
  • Get the filename, width, and height of each image.
  • Load a template file with placeholders.
  • Insert the filename, width and height into the template.
  • Report its progress using a progress bar and status bar.

Much of the rest of the metadata will be simple, but some will not -- I'll have to parse the Excel spreadsheet to get descriptive information, for instance, which will take some work.
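The template step described above can be sketched like this, in Python rather than Qt. The one-line template is a hypothetical stand-in for the real IMT TEI template, which is much larger, and the image dimensions are assumed to have been read already.

```python
from string import Template

# Hypothetical placeholder template; the real file has many more fields.
TEMPLATE = Template(
    '<graphic url="$filename" width="${width}px" height="${height}px"/>'
)


def fill_template(filename, width, height):
    """Insert the filename, width and height into the template."""
    return TEMPLATE.substitute(filename=filename, width=width, height=height)
```

Named placeholders keep the template readable, and substitute() raises an error if a value is missing, which is useful when churning through a whole directory.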

10/03/10

Permalink 02:10:35 pm, by mholmes, 63 words, 98 views   English (CA)
Categories: Activity log; Mins. worked: 30

Thumbnails generated

Copied all the JPGs into a thumbs directory and created all the thumbnails thus:

mogrify -resize 100 *.jpg

Ready to start looking at how to generate all the XML files. That's going to be a bit of a hard one, but I think it might be possible to do it with a little QT app, and it might save an awful lot of time.

Permalink 10:27:03 am, by mholmes, 71 words, 87 views   English (CA)
Categories: Activity log; Mins. worked: 120

Files renamed, some replaced, errors fixed

After a process of careful renaming, as well as checking against the spreadsheet to correct original naming errors and replacing some more unstitched components with complete versions, I now have 220 maps with relatively useful filenames which are designed for the web. Many of the filenames do not yet reflect the contents or include the year, but that can be done later. We have enough to be working with for the moment.

09/03/10

Permalink 01:48:32 pm, by mholmes, 26 words, 79 views   English (CA)
Categories: Activity log; Mins. worked: 60

Stitched fragments of maps now replaced

I now have a collection of usable maps (226 in all), sorted by year. Next steps:

  • Filename-normalization
  • Auto-generation of 1000px variants and thumbnails
  • Auto-generation of XML files.

Permalink 11:47:43 am, by mholmes, 51 words, 75 views   English (CA)
Categories: Activity log; Mins. worked: 180

New maps now sorted

Worked through the 30 or so new maps from DVDs (and created a local copy of the DVDs). These are now sorted by year, and their filenames have some useful info in them. Next I have to find all the stitched items that JF did, and replace my unstitched fragments with them.

05/03/10

Permalink 03:58:50 pm, by kim, 90 words, 66 views   English (CA)
Categories: Activity log; Mins. worked: 5

Kim's Week-end Update

Some good news. Leanna has moved on to parsing the 1854 images, and is well under way. I have completed all but two (curses!) 1852 files (places, people, vessels, and page breaks). I should be ready to post 1852 on Monday afternoon. Lastly, Frank has submitted 13 biographies. I will proof and post these once 1852 is complete, just for a brief change of pace.

Looking ahead, I will move on to prep/parse the 1855 images, then see where Leanna is at. Image-prep is a priority now, as we can't do much without the images!

Permalink 01:48:58 pm, by mholmes, 31 words, 79 views   English (CA)
Categories: Activity log; Mins. worked: 60

Images uploaded for 1853

LSPW has finished processing and linking the 1853 images, so all those files have been added into the system (CO 305: 4). Also trimmed out one file which was a dupe (V525HB00, duplicates 01).

Permalink 01:47:41 pm, by mholmes, 80 words, 76 views   English (CA)
Categories: Activity log; Mins. worked: 45

Fixed Coldesp bug

Fixed the bug which was causing abstracts to disappear on a page accessed through a search. It was a trivial bug, but took a while to find because I'd confused myself by leaving an old version of an XQuery file in the tree, alongside a new one with a slightly different name. Doh.

In the process, though, I finally set up a local copy of the site running in my local Tomcat, which makes for easier debugging and dev work.

02/03/10

Permalink 05:46:33 pm, by mholmes, 214 words, 61 views   English (CA)
Categories: Activity log; Mins. worked: 60

Trying but failing to solve a mystery

KSW discovered a very odd piece of behaviour on the site. When you go directly to a document with an abstract, the abstract appears as normal. If you go to the same document with a search string in the URL (as you would if it were the result of a search), then the abstract does not appear. I've been working on this for an hour, and I can't figure it out. So far, I know:

  • The problem appears to be in the XQuery which generates the XML.
  • That code branches depending on whether there's a search string or not. If there's a search string, it constructs a new document based on the old document, but including the search results and search data.
  • The function which constructs the document looks as though it should be correctly inserting the <notesStmt> component of the <teiHeader>, but it's failing to do so. It's successfully inserting other components of <fileDesc>, which is very odd indeed.
  • One possibility is the Java library I wrote which creates the XQuery search code based on the input search string. I'll have to look at that. However, it seems unlikely that the search string could modify the actual components of the file which are returned. This is a really puzzling one.

26/02/10

Permalink 04:38:00 pm, by kim, 49 words, 62 views   English (CA)
Categories: Activity log; Mins. worked: 3

1852 under way, and more

The placenames were given a "final" edit. Then, I moved on to the various tagging elements in 1852. I am mostly through the more common people and places. I expect that it should take roughly 5 days to complete all tagging and page-break tags. Meanwhile, Leanna is roughly 1/3 into the page-break tags for 1853!

23/02/10

Permalink 04:12:21 pm, by kim, 80 words, 66 views   English (CA)
Categories: Activity log; Mins. worked: 3

Placename and Image Processing Update

Well, some good news: Leanna has finished the image processing for all of 1853. Next up, she will add the page-break tags for the same year. And, I have completed all the placename entries to date. There will, of course, be many more to come, but the backlog is over. So, I will now be better able to keep the placename database up to date. Finally, Caitlin has finished tagging the 1578 and 1858 vessels. The extra help is really starting to pay off!

19/02/10

Permalink 04:30:39 pm, by kim, 18 words, 50 views   English (CA)
Categories: Activity log; Mins. worked: 3

Microfilm scan update

Leanna and I finished scanning CO410 (V1), and CO410 (V2). We will tackle the RG series next week!

18/02/10

Permalink 04:46:34 pm, by kim, 44 words, 78 views   English (CA)
Categories: Activity log; Mins. worked: 3

Google Earth problems and (possible) cure

On Dandelion, which runs Karmic, Google Earth was a cludge-monkey. Then I tried installing version 5.0 from here <http://earth.google.com/intl/en/download-earth-advanced.html> -- be sure to select the radio button for 5.0! Now G Earth is running like hot butter.

Permalink 08:34:46 am, by mholmes, 22 words, 54 views   English (CA)
Categories: Activity log; Mins. worked: 20

Updates moved onto the site

Took a detailed look at newly-edited files from 1851 and 1858, validated them all, merged them into the regular tree, and updated the site.

16/02/10

Permalink 01:54:50 pm, by kim, 103 words, 51 views   English (CA)
Categories: Activity log; Mins. worked: 3

1851 complete

Took a final pass at 1851 and it looks ready to ship, barring any niggling errors we catch in the coming weeks.

I will now catch up on some placename write-ups, and then dig into 1852 on Thursday. From here on out, we are in "just-the-basics" mode. This means that we will tag people, places, vessels, First Nations groups (when we catch them), and add page-break tags. All matters of fine-tuning of format, or transcription, and the addition of abstracts, will be put on hold until time allows.

Meanwhile, Caitlin will continue with tagging vessels, and Leanna will continue with her work on the images.

12/02/10

Permalink 04:31:34 pm, by kim, 62 words, 62 views   English (CA)
Categories: Activity log; Mins. worked: 3

1851 nearly complete

As of Friday, Feb 12th, there are 6 abstracts to complete, which puts me on track for the predicted end of 1851. It might be good to have a quick check of the 1851 files once we "complete" the year. This should wrap up '51 by Monday afternoon!

Leanna is moving efficiently through the '53 images, and Caitlin has added several new vessels to the vessels file.

10/02/10

Permalink 04:59:27 pm, by kim, 44 words, 51 views   English (CA)
Categories: Activity log; Mins. worked: 3

Kim's update

Caitlin is moving ahead nicely on the vessel tagging and placeholder entries. And, Leanna is underway with the image processing for 1853. Meanwhile, I am completing the abstracts for the remainder of 1851, a process that, barring interruption, I should complete by the end of Friday.

08/02/10

Permalink 04:42:24 pm, by kim, 65 words, 65 views   English (CA)
Categories: Activity log; Mins. worked: 3

Training Leanna

I walked Leanna through the processes associated with image processing, this afternoon. She will use Photoshop to start, but I showed her Picasa as well, in order to let her decide which looked easier to use. I suggested she work in Photoshop for half the day, and Picasa thereafter, before she decides.

I will ask Greg if it's alright to install Picasa on her machine.

Permalink 03:54:09 pm, by mholmes, 24 words, 75 views   English (CA)
Categories: Activity log; Mins. worked: 60

Meeting with LSPW, KSW and CP

Went over to CP's office to plan the microfilm ordering and digitization, and introduce LSPW. Some films should arrive within a week or so.

Permalink 09:40:42 am, by kim, 116 words, 51 views   English (CA)
Categories: Activity log; Mins. worked: 3

1851 update, and more

Managed to get through all of 1851 last week, save the abstracts, which I will likely complete by mid-to-late Tuesday. I also moved ahead in the time remaining last Friday, to tag all instances of Vancouver Island in the 1852 files.

Looking ahead, we have Leanna on board for 17 hours per week. For the first little while, she will focus on processing and optimizing the images for the remaining years, and she will add some extant images wherever possible. Like Caitlin, as she becomes more familiar with things, we will sort out what she wants to do most.

Caitlin will push ahead with the vessels database, and I will ask that she tackle a few vessel write-ups soon.

04/02/10

Permalink 08:54:31 am, by mholmes, 84 words, 53 views   English (CA)
Categories: Activity log; Mins. worked: 45

MJ work on map rendering

MJ and I have spent some time working out an approach to the map rendering which will keep the page fairly clean and free of clutter, and he's devised a CSS-based drop-down menu system for it. Today I've sent him an outline of the transformation that needs to be written to create the basic output; he'll work on that, and when it's ready, I'll integrate it into the main site (adding calls to headers etc.), and get familiar with the code in the process.

29/01/10

Permalink 02:13:12 pm, by mholmes, 29 words, 61 views   English (CA)
Categories: Activity log; Mins. worked: 90

Finished fourth map markup

This was the map from despatch 9099 (1852). Uploaded my places file and checked my coordinate outlines on Google Maps. I'm now ready to start writing the code for map rendering.

Permalink 02:11:43 pm, by mholmes, 7 words, 61 views   English (CA)
Categories: Activity log; Mins. worked: 90

Project meeting

Nothing in particular to report from this.

28/01/10

Permalink 03:36:16 pm, by mholmes, 57 words, 54 views   English (CA)
Categories: Activity log; Mins. worked: 180

On to my fourth map to mark up

The second map contained a detail which was an inverted plan of the fort, so I've extracted that into a separate image and marked it up separately. I've now completed markup of three images (= two maps), and I'm onto the fourth. Once that's done, I'll stop and focus on the backend code for handling the maps.

27/01/10

Permalink 05:42:50 pm, by mholmes, 90 words, 278 views   English (CA)
Categories: Activity log; Mins. worked: 40

File and directory permissions issues

Our problems with group settings for files and directories have continued to plague us, and we discovered today that some of it was my fault; my permissions script was setting permissions to 2755 instead of 2775, due to a typo. Having fixed that, and had all three of us run the script, the only problems that remain are files and directories created by LR, on which she never ran the script. Those can't be deleted by us, unfortunately, since she's now left, so eventually we'll have to get sysadmin to do it.
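The difference between the two modes is a single bit: group write. A small Python sketch (not the permissions script itself, which is not shown here) makes the typo's effect visible. The helper name is made up for illustration.

```python
import stat


def describe(mode):
    """Report the bits that matter for shared project directories."""
    return {
        "setgid": bool(mode & stat.S_ISGID),       # new files inherit the group
        "group_write": bool(mode & stat.S_IWGRP),  # group members can modify
        "other_write": bool(mode & stat.S_IWOTH),
    }


wrong = describe(0o2755)  # the typo: group members cannot write
right = describe(0o2775)  # intended: group members can write
```

Both modes set the setgid bit, so the group is inherited either way; only 2775 lets other members of the group change or delete what one person created.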

26/01/10

Permalink 04:13:51 pm, by mholmes, 146 words, 68 views   English (CA)
Categories: Activity log; Mins. worked: 180

Updates to places; beginning work on the second map

Went through the handful of places I had created for my first map, and used the new Google Earth/XSLT process to add more detailed location information. The first trial failed in Google Maps, and I discovered it was because the Google Earth data comes out with 13 decimal places, which won't work; I tweaked the XSLT to produce 6 places only, and that works fine. So we now have some nicer outlines of places. The match between outlines in Google Earth and Google Maps is not precise, but it's close enough to live with.

Selected a second map to work with, and began marking it up. I've chosen a simple sketchmap by Vavasour of Victoria Harbour and the Gorge. That shouldn't take long, and will perhaps generate one or two more places. After that, I'll have to pick a more substantial map for my third pilot document.

Permalink 11:41:56 am, by mholmes, 86 words, 80 views   English (CA)
Categories: Activity log; Mins. worked: 30

Wrote XSLT for converting Google Earth KML to TEI

Google Earth gives us the option to designate a polygon on the map and export it as KML. I'm already producing KML from TEI <location>/<geo> tags to map our places onto Google Maps, but I've now written the reverse process so we can easily create our place markup in TEI using Google Earth. This will improve accuracy and save time over what we were doing before. KSW has a copy and has a transformation scenario set up in his oXygen environment.
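The core of the conversion is reordering and rounding the coordinates. This Python sketch shows that step only; it is not the XSLT. It assumes the standard KML coordinate order (longitude, latitude, optional altitude) and emits latitude-then-longitude pairs rounded to six decimal places, the precision that turned out to work in Google Maps. Wrapping the results in <geo> tags is left out.

```python
def kml_to_geo(coordinates, places=6):
    """Convert a KML <coordinates> string to 'lat lon' strings.

    KML lists each point as lon,lat[,alt], with points separated
    by whitespace; the output puts latitude first.
    """
    points = []
    for tuple_text in coordinates.split():
        lon, lat = tuple_text.split(",")[:2]
        points.append(f"{float(lat):.{places}f} {float(lon):.{places}f}")
    return points
```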

Permalink 10:02:00 am, by mholmes, 797 words, 129 views   English (CA)
Categories: Activity log, Documentation; Mins. worked: 130

Instructions for marking up maps

This is a set of basic instructions for marking up maps for the Despatches project, using the Image Markup Tool. These instructions will be extended and refined as the process shakes down.

Preparation

  • Use the Image Markup Tool version 1.8.
  • Load an existing IMT file from a map already marked up, and save it with a new name.
  • Delete all the existing annotations.
  • Replace the image (File / Load an image).
  • Go to the <teiHeader> editing area (TEI / Edit teiHeader), and change all the metadata you see to match the image you're working on. Note: For the date of the map, you want to provide (a) a date range using @notBefore and @notAfter, covering all the dates mentioned on the map (from surveys, revisions etc.); and (b) a single date as the text content of the date tag, which should be the date most prominently shown in the map title area. Where there is no useful date at all, you can use "n.d." in the date tag, and make an educated guess for the date range.
  • If possible, use Google Earth to discover the coordinates of the four corners of the map (or, if it's not rectangular, all of its corners), and encode these in decimal format (positive number followed by space followed by negative number) in the <geo> tags in the <sourceDesc>/<bibl> element. If that's not possible (because for instance the map is hopelessly inaccurate), then delete the <geo> tags.

Annotation

You'll see that there are three categories for annotations: transcription (areas in green), places (areas in blue), and notes (areas in red).

Marking up places in the place database

The places category is used to define places which are included in our database of places (in the XML files included in xml/places). Of course, you may decide as you're marking up the map that a place on the map deserves to be in our places database, and add it to one of the XML files to make this happen. Encoding place information is documented in our main guidelines.

To mark up one of these places:

  • Select the places category.
  • Add an annotation, and size and position it to outline the place.
  • In the Annotation Title box, type the "normal" name of the place (the way it is referred to in the places XML file).
  • Click on the ID button, and insert the @xml:id attribute which is associated with the place in the places XML file. (You can easily discover this by going to the Places index and hovering your mouse over the name of the place).
  • In the Annotation Text box, inside the <p> tag, transcribe any text label which identifies the place on the map, and mark it up based on its appearance on the map.

Here's an example. Imagine that you're marking up Parry Bay on a map.

  • You draw a box around Parry Bay on the map.
  • You type Parry Bay in the Annotation Title box.
  • You set the ID to parry_bay.
  • You type PARRY BAY in the Annotation Text box (because on the map, it's labelled in capitals).

Transcribing other placenames on the map

All other text on the map, including placenames which are not in our database, is transcribed using the transcription category. This is how to transcribe placenames appearing on the map:

  • Select the transcription category.
  • Add an annotation, and position it around the text.
  • In the Annotation Title box, type a normalized, unabbreviated form of the text.
  • In the Annotation Text box, inside the <p> tag, transcribe and mark up the text as it actually appears on the map.

For example:

  • Annotation Title: Ned Point
  • Annotation Text: <p>Ned P<hi rend="text-decoration: underline; vertical-align: super; font-size: 80%;">t</hi></p>

Marking up titles, legends etc.

All other text on the map -- titles, publication info, explanatory text etc. -- is also marked up using the transcription category:

  • Select the transcription category.
  • Add an annotation, and position it around the text.
  • In the Annotation Title box, type a descriptive heading which identifies what the text is (Title, Publication information, etc.).
  • In the Annotation Text box, transcribe and mark up the text as it appears on the map.

Notes

Sometimes it's necessary to add an editorial note or explanation to something on the map which wouldn't otherwise be marked up. Use the notes category to do this.

Linking to and from maps

A link to a map document on the site looks like this:

<ref type="map" cRef="MPK1-59_10_vancouver_island_1846_detail">[Linked text]</ref>

The @cRef attribute contains the @xml:id (filename without .xml) of the map annotation document.

25/01/10

Permalink 03:15:27 pm, by mholmes, 231 words, 60 views   English (CA)
Categories: Activity log; Mins. worked: 240

Marked up the first map

I've now done a trial markup of a map (an 1846-48 map of the Vancouver Island coast east of Sooke). This has thrown up the following points for consideration:

  • I had to generate a lot of new placename entries. This will happen a lot initially, but the more we mark up maps covering the same areas, the less of this there will be. Therefore we should tend to mark maps of the same areas one after another, if possible, to keep the markup person familiar with the places and their keys.
  • Many places are not worth marking up as places; for these, I elected simply to transcribe their text. The question here is whether to put that text in the head or the body ("text") of the annotation. I've done both, and will have to revise. Probably, it should go in the body; the head should only be used if there is a descriptive label that could be attached to the text being transcribed (such as "title" or "legend").
  • We should presumably follow our normal rules and transcribe text with superscripts and dots below etc. I haven't done that yet in this case; I'll have to revise it.

Other than that, the process of marking up with Google Earth to find coordinates works pretty well, and IMT is working well under WINE now I've fixed a couple of issues with it.

Permalink 11:26:35 am, by mholmes, 103 words, 66 views   English (CA)
Categories: Activity log; Mins. worked: 30

Created a new Coldesp schema

The use of the IMT requires the facsimile element, which wasn't available at all when the original ColDesp schema was generated, so I had to generate another one. Working from the original ODD file, Roma produced a broken schema for some reason, so I started from scratch and added all the modules from the original ODD, and ended up with a working schema. It's a bit smaller than the original -- although it could be substantially reduced by running oddbyexample.xsl. I'll do that at some stage, but not now because the map markup will introduce lots of requirements we haven't had before.

22/01/10

Permalink 01:07:12 pm, by mholmes, 72 words, 68 views   English (CA)
Categories: Activity log; Mins. worked: 60

Began work on the first map markup

It looks as though the IMT has some small issues on Linux that I didn't know about (its own ident information is not available to it, so attributes remain unfilled in the saved file). But I have the first file started, and some transcription done; I still need to figure out exactly what kinds of copyright info and disclaimers need to be in the document. That should be addressed at our meeting.

Permalink 11:03:26 am, by mholmes, 372 words, 73 views   English (CA)
Categories: Activity log; Mins. worked: 60

Planning map metadata

The main repository for metadata about the maps will be the library, which is the formal custodian of the files and is best set up for storing and serving good metadata. Nevertheless, we'll need to store a significant amount of information in our <teiHeader>s, and before I figure out how best to store it, I want to enumerate it, along with some notes on potential issues:

  • Filenames of images should incorporate all original reference id info, as well as some extra helpful text. Example: FO925-1650_pt1_23_becher_pedder_bays_1846-48.jpg. @xml:id should be filename without extension.
  • Short title (taken from the map itself, preferably, and short enough to fit in small popup boxes). This belongs in the <titleStmt>.
  • Full title (the primary components of the title on the map, with as much information as possible, but not including legend info etc.). Also in the <titleStmt>, but both titles should also be included in the <sourceDesc>/<bibl>.
  • Who is responsible for the markup of the map (<titleStmt>/<respStmt>).
  • Area covered. This should be in two parts: one a textual description, and the other a rough outline in the form of GIS points for the corners of the map.
  • Date. This is very problematic; most maps have multiple dates, or none at all. I think if the full title contains a date, that should be the main date; other dates (publication, revisions etc.) should be incorporated by means of @notBefore and @notAfter. This allows us to sort them by date, but still show the full range of associated dates.
  • Publisher and publication place -- straightforward <publisher> and <pubPlace> tags in the <sourceDesc>/<bibl>.
  • Identifier information (FO925-1650 pt1 (23), for instance). It's not clear how generally useful this information is at this time; for the moment, rather than breaking it down into components, it should probably all be included with an <idno> tag.
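The filename rule in the first item above (the @xml:id is the filename without its extension) is simple enough to sketch in a couple of lines of Python. The function name is made up for illustration.

```python
from pathlib import PurePosixPath


def xml_id_for(image_filename):
    """Derive the @xml:id from an image filename by dropping the extension."""
    return PurePosixPath(image_filename).stem
```

For the example filename given above, this yields FO925-1650_pt1_23_becher_pedder_bays_1846-48.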

That's probably all that we need for the moment. Other text on the map which does not identify locations should probably be transcribed, so we would have two categories of annotation: places and transcriptions.

21/01/10

Permalink 04:34:54 pm, by kim, 69 words, 52 views   English (CA)
Categories: Activity log; Mins. worked: 5

Frank visits lab to discuss bios

Frank and I covered a few things related to the bios.

I asked that he produce roughly a dozen before shipping them to me for copyediting. Also, I asked that he consult Chicago Style for issues related to citation, which we discussed. This is an ongoing process, and, as we reach a point of consistency with all of the above, I will add our "rules" to the guidelines document.

20/01/10

Permalink 04:36:49 pm, by kim, 35 words, 62 views   English (CA)
Categories: Activity log; Mins. worked: 5

1851 update

Finished tagging most common placenames and people names in all 1851 files, as well as all dates. Today was spent on the rest. Caitlin continues to work ahead on the vessels, and all is going smoothly.

Permalink 10:26:00 am, by mholmes, 35 words, 65 views   English (CA)
Categories: Activity log; Mins. worked: 60

Finished organizing maps

Went through the library's map set and substituted their stitched versions for all the fragments in my set. I now have a complete set of the maps I want to include, organized (roughly) by year.

18/01/10

Permalink 04:42:19 pm, by kim, 181 words, 61 views   English (CA)
Categories: Activity log; Mins. worked: 5

Update and Welcome to Caitlin

Caitlin is on board now, and working away on the vessel names. It is our hope that she will have time to tag all vessels with a unique ID for all years up to and including 1858; she will write placeholder information-entries for each new vessel (a quick process). Once this first round is complete, and should time allow, Caitlin will begin to draft vessel entries, or write the odd one for a bit of a brain-break from coding. We are happy to have her aboard the Good Ship Despatches!

I have completed place-name write-ups for 50 places, and have added placeholder entries for several since, which puts us at 71 place-name records, and growing. As I turn attention back to working through the years ahead, I will diverge on Fridays, when time allows, to catch up on lagging place-names. Incidentally, I would have begun the work sooner, but I wanted to see how best to work with Caitlin's skills and leanings. Now that we see she is keen on vessels, I can dig into tagging names, places, and so on, in 1851, and beyond.

Permalink 10:30:06 am, by mholmes, 18 words, 68 views   English (CA)
Categories: Activity log; Mins. worked: 30

MS images for CO 305 vol 3 uploaded

Incorporated the latest images (1851 and 1852) into the db and uploaded the three sets of jpegs to the server.

15/01/10

Permalink 10:10:37 am, by mholmes, 98 words, 62 views   English (CA)
Categories: Activity log; Mins. worked: 120

Finished working through the map CDs

Next I have to replace the fragmented ones with the stitched copies created by the library.

Dating is going to be a problem. Some maps have no dates at all, or can only be dated by inference, but others have multiple dates -- dates of the original surveys, dates of adjustments added later, and date of publication. I think the only solution is to list each <date> with a @type attribute specifying what they mean, and also include one <date> element with no @when, but with @notBefore and @notAfter, to specify the complete range.
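A minimal sketch of what that encoding might look like, with a small Python check that the overall range really does span every typed date. The `<bibl>` wrapper, the @type values (survey, revision, publication) and the years are all invented for illustration; they are assumptions, not the project's actual markup or data.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI fragment for one map. The wrapper element, the @type
# values and the years are illustrative assumptions only.
MAP_DATES = """
<bibl xmlns="http://www.tei-c.org/ns/1.0">
  <date type="survey" when="1858">1858</date>
  <date type="revision" when="1861">1861</date>
  <date type="publication" when="1862">1862</date>
  <date notBefore="1858" notAfter="1862"/>
</bibl>
"""

TEI = "{http://www.tei-c.org/ns/1.0}"


def typed_years(bibl):
    """Return {type: year} for every <date> that carries @when."""
    return {
        d.get("type"): int(d.get("when")[:4])
        for d in bibl.iter(TEI + "date")
        if d.get("when") is not None
    }


def overall_range(bibl):
    """Return (notBefore, notAfter) years from the one <date> with no @when."""
    ranges = [d for d in bibl.iter(TEI + "date") if d.get("when") is None]
    if len(ranges) != 1:
        raise ValueError("expected exactly one range <date> without @when")
    d = ranges[0]
    return int(d.get("notBefore")[:4]), int(d.get("notAfter")[:4])


def range_covers_all(bibl):
    """True if the overall range includes every typed date."""
    years = list(typed_years(bibl).values())
    lo, hi = overall_range(bibl)
    return lo <= min(years) and max(years) <= hi


bibl = ET.fromstring(MAP_DATES.strip())
print(typed_years(bibl))
print(overall_range(bibl))
print(range_covers_all(bibl))
```

A map that can only be dated by inference would simply carry the range `<date>` and no typed ones; the check above assumes at least one typed date is present.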

13/01/10

Permalink 04:28:54 pm, by mholmes, 22 words, 62 views   English (CA)
Categories: Activity log; Mins. worked: 180

Still working through the maps

I've got as far as CD #15, item FO925-1650 pt1 (10). Two and a half CDs to go. Many are mislabelled or misdated.

Permalink 02:26:50 pm, by mholmes, 162 words, 50 views   English (CA)
Categories: Activity log; Mins. worked: 60

Fixed a bug in display of popup info

I noticed that the "Mentions of this x in the documents" link was not showing up in the popup version of the person, place or vessel info, although it was still there in the index pages. This proved to be a by-product of my move to make all those lookups into AJAX requests; in the process, I had taken the individual items out of the context of the <list> or <listPerson> element that normally contains them, and this had the side effect of failing to trigger some of the required rendering code. When I put them back into a list, the "Mentions" link came back, along with some good display features (colour-coded titles depending on the type of item). This, though, meant that they ended up displayed as list items, so they were offset to the left and had a bullet point. I've now added some CSS to eliminate that problem in the context of the popup.

08/01/10

Permalink 04:10:29 pm, by kim, 115 words, 59 views   English (CA)
Categories: Activity log; Mins. worked: 5

First week back after holidays (Jan. 2010)

A productive week. I managed to optimize the remaining images in the 1851-1852 folder, which are bundled together in the microfilm images. So, that makes roughly 1,200 separate images for the years stated. As for Friday, I have been catching up on editing and writing placename entries with speedy results. The aim is to keep them short, but as well-researched as possible, and cited fully along the way. Thank the gods for Andrew Scott's recent book, The Encyclopedia of Raincoast Placenames.

I am told that we should have a new RA's labour to add to the fray in the next week or so. This will help speed up the mechanics of the machine to great effect!

Colonial Despatches

The Colonial Despatches is an XML database project which is creating a digital archive containing the original correspondence between the British Colonial Office and the colonies of Vancouver Island and British Columbia. The project lives at http://bcgenesis.uvic.ca, and the web application runs on the Pear dev Tomcat. The XML data is managed in SVN at http://revision.tapor.uvic.ca/svn/coldesp/.
