Archives for: March 2012

30/03/12

Permalink 02:15:46 pm, by mholmes, 20 words, 87 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 169 - 2 = 167 hours G&T

Long lunch, and leaving early to watch the last of my NLP lectures for this week. Getting those hours down...

Permalink 02:14:44 pm, by mholmes, 132 words, 67 views   English (CA)
Categories: Activity log; Mins. worked: 15

Calculating a network graph of locations

Following our recent presentation, I've been thinking a lot about the idea that the true "map" of the project is the XML db, and it occurs to me that a good illustration of this would be a network map of locations in the database. This could be done by measuring, for every two locations that share a parent document, the proximity between them, and then by calculating the average proximity between each pair of locations across the whole db. Then you could use that to create a network graph.

I've been trying to think of ways to calculate the proximity of two XML tags, and I think it could be done with XPath:

string-length(concat($tagOne//following::text()[following::???$tagTwo], ''))

although I'm not quite sure how to phrase the last bit...
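
One way to pin down the measure is to count the characters of text that fall between the end of the first tag and the start of the second, in document order. In XPath 2.0 the missing predicate could plausibly be expressed with the node-order operator (something like [. << $tagTwo]), but that is untested here. The following is a minimal Python sketch of the same idea using only the standard library; the function names and the sample sentence are invented for illustration and are not part of the project code.

```python
import xml.etree.ElementTree as ET

def flatten(elem, out):
    """Append start/end markers and text chunks for elem, in document order."""
    out.append(("start", elem))
    if elem.text:
        out.append(("text", elem.text))
    for child in elem:
        flatten(child, out)
        if child.tail:
            out.append(("text", child.tail))
    out.append(("end", elem))
    return out

def text_distance(root, first, second):
    """Count characters of text between the end of `first` and the start of `second`."""
    events = flatten(root, [])
    start = events.index(("end", first))
    stop = events.index(("start", second))
    if stop < start:
        raise ValueError("second must follow first in document order")
    return sum(len(value) for kind, value in events[start:stop] if kind == "text")

# Invented sample: two tagged names in one sentence.
doc = ET.fromstring("<p>At <name>Cheapside</name>, near <name>Paul's Cross</name>.</p>")
first, second = doc.findall("name")
print(text_distance(doc, first, second))  # 7: the length of ", near "
```

Averaging this distance over every document in which a pair of locations co-occurs would give the per-pair weight for the network graph.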

Permalink 02:06:00 pm, by mholmes, 163 words, 112 views   English (CA)
Categories: Activity log; Mins. worked: 30

map_lookup.xml done

Simple XQuery to pull out the data:

xquery version "1.0";

declare default element namespace "http://www.tei-c.org/ns/1.0";
declare namespace tei = "http://www.tei-c.org/ns/1.0";

<maps xmlns="http://hcmc.uvic.ca">
{
for $t in //tei:TEI
return 
<map xml:id="{$t/@xml:id}">
{
if ($t//tei:title) then
<title>{($t//tei:title)[1]/text()}</title>
else
()
}
{
if ($t//tei:idno[@type="penfoldNum"]) then
<penfold>{$t//tei:idno[@type="penfoldNum"]/text()}</penfold>
else
()
}
</map>

}
</maps>

I might have to add more data points to the output; in fact it might be worth just pulling out the whole of the sourceDesc. I'm currently looking at the possibility of enhancing my UniSymMetric Java class so it could be called as an extension function from XSLT in Saxon; that would give me a fallback when there's no Penfold number, and it might be handy in all sorts of other ways too.

Permalink 10:56:36 am, by mholmes, 306 words, 106 views   English (CA)
Categories: Activity log; Mins. worked: 60

Importing metadata from ContentDM

JD pointed me at an OAI feed from ContentDM, which is exactly what I need for my metadata harvesting. This is my plan:

I've started work on an XSLT stylesheet to do the job. The purpose of the stylesheet is to process detailed OAI metadata records which use Dublin Core identifiers into teiHeader elements suitable for adding to the TEI documents in the Despatches project.

The OAI metadata is in the file oai_from_contentdm.xml, and originates in the UVic Library's ContentDM system. It contains 261 records relating to Early BC Maps, and most of these are maps also in the Colonial Despatches project collection. The ContentDM metadata is well-organized and has been considerably enhanced, so we're going to take that data and generate new teiHeader elements for our TEI files from it.

The first stage is to create a mapping between each of the fields in the OAI data and the location in the teiHeader where we propose to store it.

Input documents:

  • oai_from_contentdm.xml (OAI record set).
  • ../xml/maps/*.xml (TEI documents for each of the maps)
  • map_lookup.xml (simple XML document which hopefully provides enough data to allow this transformation process to retrieve the correct TEI document for each record in the OAI data. This lookup will be based on a number of factors, including Penfold number, title, and descriptive information. Creating this file is the next stage in the process.)

Output documents:

  • ../xml/maps_enhanced/*.xml (from each TEI document we have, create an enhanced version which incorporates the original @xml:id and metadata, as well as the facsimile element with data about the image file, but also builds in the metadata gleaned from the OAI file. These files will eventually replace the original TEI files in the Despatches site, once the Map Gallery code has been rewritten to work with them.)
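
The field-mapping stage described above can be sketched in a few lines. The real job is planned as an XSLT stylesheet; this is only a Python illustration of the idea, assuming the records use the standard Dublin Core element namespace. The FIELD_MAP targets and the sample record are hypothetical, since the actual mapping has not been decided yet.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical mapping from Dublin Core fields to teiHeader locations.
FIELD_MAP = {
    "title": "fileDesc/titleStmt/title",
    "creator": "fileDesc/sourceDesc/bibl/author",
    "date": "fileDesc/sourceDesc/bibl/date",
    "description": "fileDesc/sourceDesc/bibl/note",
}

def map_record(record):
    """Return {teiHeader path: [values]} for the mapped DC fields in one record."""
    mapped = {}
    for child in record.iter():
        if not child.tag.startswith(DC):
            continue
        field = child.tag[len(DC):]
        target = FIELD_MAP.get(field)
        value = (child.text or "").strip()
        if target and value:
            mapped.setdefault(target, []).append(value)
    return mapped

# Invented sample record; dc:format has no mapping here and is skipped.
SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Sample map title</dc:title>
  <dc:date>1848</dc:date>
  <dc:format>image/jpeg</dc:format>
</record>"""
print(map_record(ET.fromstring(SAMPLE)))
```

Keeping the mapping in a single table makes it easy to review field by field before anything is written into the TEI files.
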
Permalink 08:25:26 am, by mholmes, 178 words, 87 views   English (CA)
Categories: Activity log; Mins. worked: 30

Map confusion and metadata

Adding this as a task for me, long-term, because it needs to be part of the plan for the next phase of the project.

I had pointed JT at fo_925-1650_pt_1_24_vic_harbour_1847, which is Penfold 576, for the Kellett map of Victoria Harbour, but it turns out he wanted Penfold 577, which is fo_925-1807_vic_1848. I've slightly enriched the metadata for 577 using data from ContentDM, manually, but there should be a way to do this mechanically because the ContentDM metadata is organized into clear fields. Ultimately, it would be a good idea to find some way to get at this metadata and pull it into our headers, so we'll have to write a mapping between the two. Here's an example of the ContentDM data in HTML:

http://contentdm.library.uvic.ca/cdm/singleitem/collection/collection5/id/130/rec/2

It claims to be XHTML, but it's not even well-formed, never mind valid, so it couldn't be parsed with e.g. XSLT unless it was tidied first. Hopefully there's a more helpful feed from it. I'm contacting JD about that.

29/03/12

Permalink 05:10:57 pm, by mholmes, 27 words, 86 views   English (CA)
Categories: Activity log; Mins. worked: 30

Map dates need tweaking

Dating of maps is inconsistent for maps which have a notBefore and/or notAfter. Check them in the sorted gallery, find oddities, and normalize. Did some today.

Permalink 04:56:13 pm, by mholmes, 2 words, 78 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 168 + 1 = 169 hours G&T

Late duty.

Permalink 04:49:27 pm, by mholmes, 58 words, 126 views   English (CA)
Categories: Activity log; Mins. worked: 90

Re-worked cover

The bookstore is using lighter paper than we're used to, so my calculations for the spine width were off. I'd also omitted one of the editors from the cover, so a rework was necessary anyway. Got that done, and sent off the new PDF of the cover, and a new PDF for the document, incorporating Rudling's last-minute changes.

Permalink 04:47:56 pm, by mholmes, 47 words, 187 views   English (CA)
Categories: Activity log; Mins. worked: 60

Meeting with P & A: site nav finalized

With SA, synthesized all the various changes and suggestions into a single document, then met with the folks from P & A and finalized them all. SA has merged them back into the final spreadsheet, and we're ready to get to work. I created the primary folders.

Permalink 04:46:35 pm, by mholmes, 48 words, 89 views   English (CA)
Categories: Activity log; Mins. worked: 60

Housekeeping and bugfixing

Did some auditing of the "Marion's transcriptions" spreadsheet that we're using to keep track of the transcriptions awaiting markup, since PCA has been working on these; checked filenames and made updates and notes where appropriate. Also fixed file naming issue reported by PCA, and did some other housekeeping.

Permalink 12:04:59 pm, by sarneil, 79 words, 220 views   English (CA)
Categories: Activity log; Mins. worked: 15

add new tag or branch to svn repo

note to self on nuts and bolts
on local file system:
create the folder structure you want (if you're copying an existing local instance of an svn project, you have to delete the .svn folder from each folder in that project)
on command line,
cd to parent folder of the one you want to add (that parent folder has to already be in svn)
svn add FOLDER_YOU_WANT_TO_ADD
svn commit -m "message about adding new folder"

Permalink 11:05:05 am, by mholmes, 194 words, 169 views   English (CA)
Categories: Activity log, Documentation; Mins. worked: 60

Adding maps to the site

JT provided two new maps for the gallery, so I've added those. I had to refresh myself on the procedure for doing this, so I'll detail it here:

  • Extract the bitmaps from the PDFs (if that's the format they come in) using pdfimages -j [pdffile] [outputprefix].
  • Create meaningful filenames based on repo, id numbers, and year.
  • Copy the full-sized originals into the correct year in [coldesp]/maps on local drive. These will just be backed up locally.
  • Create a quarter-sized "large" image (max width 5000) in maps_lg.
  • Create a 1000px-wide version in maps_1000.
  • Create a 200px-wide version in maps_200.
  • Create a 100px-wide version in maps_thumb.
  • Create an XML file with the same name as the image file, and a matching @xml:id. It's simplest to model this on an existing file. Save it in xml/maps.
  • Fill out the metadata, and point the facsimile graphic at the right file name, with the right dimensions.
  • Add the XML file to SVN and commit it.
  • Upload the images to home1t, and the XML file into the db.
  • Test to make sure the map shows up in the gallery, and works properly on the site.
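
The pixel dimensions for the derivative images, which are also needed for the facsimile graphic, can be worked out ahead of time. A minimal sketch, assuming "quarter-sized" means a quarter of the original width and that every derivative keeps the original aspect ratio; the function names are invented for illustration.

```python
def scaled(width, height, target_width):
    """Scale (width, height) to target_width, preserving the aspect ratio."""
    return target_width, max(1, round(height * target_width / width))

def derivative_sizes(width, height):
    """Return {folder: (width, height)} for the derivative images listed above."""
    large_width = min(width // 4, 5000)  # quarter-sized, capped at 5000px wide
    return {
        "maps_lg": scaled(width, height, large_width),
        "maps_1000": scaled(width, height, 1000),
        "maps_200": scaled(width, height, 200),
        "maps_thumb": scaled(width, height, 100),
    }

# Example with an invented 8000 x 6000 original.
for folder, size in derivative_sizes(8000, 6000).items():
    print(folder, size)
```
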
Permalink 10:52:47 am, by sarneil, 39 words, 493 views   English (CA)
Categories: Activity log; Mins. worked: 15

ignore .DS_Store files in svn

To tell your svn repository to ignore the evil and ubiquitous .DS_Store files automatically (and quickly) created by the Mac OS, issue this command: svn propset svn:ignore .DS_Store . as found at http://soledadpenades.com/2009/07/02/keeping-ds_store-files-at-bay/
Permalink 10:46:15 am, by sarneil, 118 words, 102 views   English (CA)
Categories: Activity log; Mins. worked: 15

editing config files in new svn branches

There are three files in the site which contain database connection strings:
inc/config_EDIT_ME.inc
content/maps/include/conf_EDIT_ME.inc
content/maps/include/config_EDIT_ME.xml

In each of these three files, the values for the database connection string have been replaced with placeholders. You have to make a copy of each of those files with the following names:
inc/config.inc
content/maps/include/conf.inc
content/maps/include/config.xml
In the copies, substitute the correct values for your connection string.

If the folder is in svn (which it probably is), you'll need to use svn add to add each of the files to the repo, then do your svn commit.
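
The copying step can be scripted so that no file is missed in a fresh branch. A minimal Python sketch using the three file pairs listed above; the function name is invented, and filling in the real connection values remains a manual edit afterwards.

```python
import shutil
from pathlib import Path

# Template -> working copy, as listed above (paths relative to the site root).
CONFIG_FILES = {
    "inc/config_EDIT_ME.inc": "inc/config.inc",
    "content/maps/include/conf_EDIT_ME.inc": "content/maps/include/conf.inc",
    "content/maps/include/config_EDIT_ME.xml": "content/maps/include/config.xml",
}

def copy_config_templates(site_root):
    """Copy each *_EDIT_ME template to its working name, skipping existing copies."""
    created = []
    root = Path(site_root)
    for template, target in CONFIG_FILES.items():
        destination = root / target
        if destination.exists():
            continue  # never overwrite a config that already holds real values
        shutil.copyfile(root / template, destination)
        created.append(target)
    return created
```

Calling copy_config_templates(".") from the root of the branch returns the list of files it created, which is also the list that then needs editing.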

Permalink 08:58:12 am, by mholmes, 7 words, 113 views   English (CA)
Categories: Activity log; Mins. worked: 15

Entered some corrections for Rudling article

... at the author's request, approved by JT.

Permalink 08:57:36 am, by mholmes, 43 words, 53 views   English (CA)
Categories: Activity log; Mins. worked: 15

Five more documents assigned to PCA

I've assigned the first five 1859 documents transcribed by MM to PCA; the 1858 documents are rather complicated, and the existing 1858 documents need some editing, so it's simpler to work on the 1859 documents for the moment. The Google spreadsheet records the status of each document.

28/03/12

Permalink 11:33:04 am, by Erin, 90 words, 47 views   English (CA)
Categories: Activity log; Mins. worked: 170

a little bit of everything

This past weekend I completed a T.E.I. workshop offered by the DHSI ... Today I organised a short list of little questions to ask Greg on Friday. I also looked into the Niceron Latin text (need to get a translation, or maybe ask Helene to look at it for markup) ... marked up Lancisi and Terilli, went through all XML texts to insert and toggle bibliographical comments and to replace [...] with <gap reason="sampling"/>, and inserted <pb/> and <cb/> ... found the Zwinger and Levinus PDFs; must locate the accounts within these texts.
Permalink 11:16:41 am, by sarneil, 63 words, 93 views   English (CA)
Categories: Activity log; Mins. worked: 30

relocate pointer to svn repo, update data files

used svn switch --relocate oldURL newURL to point local copy to new URL for svn repo.
example:
svn switch --relocate https://revision.tapor.uvic.ca/svn/reponame https://revision.hcmc.uvic.ca/svn/reponame

updated my local files, then used the exist admin client to upload 4 modified data files to the database.
Root of svn tree is at https://revision.hcmc.uvic.ca/svn/hcmc/

Permalink 11:13:37 am, by sarneil, 143 words, 130 views   English (CA)
Categories: Activity log; Mins. worked: 90

Two instances of maps folder, one working, one not

On the verigin site there is a maps folder in the explosion folder which contains just empty index pages. http://www.canadianmysteries.ca/sites/verigin/explosion/maps/indexen.html There is also a maps folder in the context folder which contains index pages and maps pages: http://www.canadianmysteries.ca/sites/verigin/context/maps/indexen.html Rather than duplicate the data, I've changed all the links I could find to the first (non-functional) instance so they point to the second (working) instance. As some of those links are in the navigation bar for the explosion section, that violates the UI convention, so I've asked Merna if she wants me to leave things as is, or remove the links from the nav bar. Also, I've put a link to the working instance on the non-working instance in case anyone stumbles into the non-working instance.

27/03/12

Permalink 02:40:32 pm, by mholmes, 34 words, 42 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 176 - 1 = 175 hours G&T; 175 - 7 = 168 hours

Leaving early today for an appointment; taking tomorrow off with SA's agreement to work in peace on NLP coursework, and to burn up some of the G&T hours that have stacked up.

Permalink 02:38:30 pm, by mholmes, 36 words, 52 views   English (CA)
Categories: Activity log; Mins. worked: 180

NLP course: programming assignment #2

I'm halfway through this, and I'll have to finish it at home. Deadline tonight, which I won't meet, but it's hard to get more than a few minutes of uninterrupted time during the work day.

Permalink 02:05:11 pm, by mholmes, 32 words, 177 views   English (CA)
Categories: Activity log, Tasks; Mins. worked: 10

Task: renaming of file in SVN and in db

DONE: The transcription of the document 58-01-21_HBC748.rtf is marked up as the file V585MI30, when it should be V585MI02_A. It is already up on the site.

Permalink 02:03:35 pm, by mholmes, 18 words, 184 views   English (CA)
Categories: Activity log; Mins. worked: 30

Volume 20 contents published on site

All page numbers entered and proofing attribute removed at JT's request. Vol 19 remains unpublished for a few months.

26/03/12

Permalink 05:14:09 pm, by mholmes, 2 words, 50 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 175 + 1 = 176 hours G&T

Planning discussions.

Permalink 04:24:32 pm, by mholmes, 87 words, 83 views   English (CA)
Categories: Activity log; Mins. worked: 60

Linked 26 vessels from Schedules

All vessels referred to in the Schedules which have obvious existing vessel bios have now been linked (including one correction to a typo, "Fartar" instead of "Tartar"). The remaining vessels, for which new vessel bios will be required, are:

Alexandra
Cameleon
Devastation
East Lotherian
John Bright
John Stephenson
John Stevenson
Kingfisher
Nanaimo Packet
Ossifree
Prince of the Seas
Random
Royal Charlie
Scout
Scylla
Severn
Shenandoah
Sutlej

It's likely that the John Stephenson and John Stevenson are the same vessel, and possible that they're actually the John Stevens.

Permalink 03:54:23 pm, by mholmes, 44 words, 84 views   English (CA)
Categories: Activity log; Mins. worked: 30

Changed William Allen xml:id

The William Allen was tagged as "william", which made it confusable with the Brig William ("william_brig"). I've now changed the vessel bio and all references to it to show "william_allen". Also fixed an encoding issue in an 1854 document that I stumbled across.

Permalink 03:28:03 pm, by mholmes, 25 words, 364 views   English (CA)
Categories: Announcements; Mins. worked: 0

Abstracts now added for 1854

Thanks to some excellent work from Petria Arienzale, abstracts have now been added for all 1854 documents. We now have abstracts for all years between 1846 and 1854.

Permalink 12:07:54 pm, by mholmes, 27 words, 46 views   English (CA)
Categories: Activity log; Mins. worked: 60

NLP course: last lecture of week 2 and second problem set

Finished off the last lecture (started at home), then worked through the problem set for the week. Got full marks first time, which was astonishing to me.

Permalink 10:06:43 am, by mholmes, 7 words, 128 views   English (CA)
Categories: Activity log; Mins. worked: 30

Entered keywords from JT

List of keywords for all articles entered.

Permalink 09:26:13 am, by mholmes, 24 words, 63 views   English (CA)
Categories: Activity log; Mins. worked: 75

Corrections to ellipsis punctuation and other issues

More proofing corrections from JT. In the process, found another misplaced footnote tag. Also added superscript handling to XHTML rendering (it was oddly missing).

23/03/12

Permalink 02:01:42 pm, by mholmes, 6 words, 76 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 177 - 2 = 175 hours G&T

Leaving early to watch NLP lectures...

Permalink 11:29:32 am, by Erin, 28 words, 84 views   English (CA)
Categories: Activity log; Mins. worked: 160

mar23

Server switched over. Transcribed, marked up, and researched Sigaud de La Fond; searched Epitaphia Joco-Seria for cited epitaph (not there); noticed typo in places.xml (isle/ilse); researched Swertius.
Permalink 10:44:20 am, by mholmes, 20 words, 69 views   English (CA)
Categories: Activity log; Mins. worked: 60

Latest review for PCA

Reviewed PCA's latest work (excellent) and sent comments. Also noticed a couple of issues in other documents and fixed them.

Permalink 09:18:58 am, by mholmes, 24 words, 58 views   English (CA)
Categories: Activity log; Mins. worked: 60

Corrections to Iglesias

Received and entered corrections from the author; found many other issues which I corrected and reported to JT. This needs another proofing, I think.

22/03/12

Permalink 05:46:04 pm, by mholmes, 28 words, 58 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 175 + 2 = 177 hours G&T

Working with very long, slow transformations -- have to do other tasks while transformations run, then examine the results, tweak, and set it off again. Very slow process.

Permalink 05:44:53 pm, by mholmes, 172 words, 115 views   English (CA)
Categories: Activity log; Mins. worked: 180

More work on normalization

Met with CC and examined some of the outcomes from our rulesets. There's obviously a huge amount of tuning still to do, but it's also clear that before each rule is run, the word needs to be checked against the dictionary in case it's already OK; if it is, then we don't need to keep working on it. I've now implemented that by turning the spell-check dictionary into an XML file which is then indexed with xsl:key (I tried other string-finding methods but they were much slower). The transformation now takes substantially longer than it used to, but it's clearer what's happening. One issue might be archaic forms in the spell-check dictionary, of course.

Another issue is u/v variation. When we change one to the other, we often end up changing it back in a later rule. It seems likely that a better approach would be to change all u/v to another unused symbol, and then write rules based on context for changing that symbol to the appropriate output.
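
The placeholder idea can be illustrated with a small sketch. Every u and v is first collapsed to one neutral symbol, so no later rule can undo an earlier one, and a single contextual rule then decides the output. The rule used here (v at the start of a word or after a vowel, when a vowel follows; u everywhere else) is a toy assumption for lower-case words only, not one of the project's rules, and the placeholder character is likewise an arbitrary choice.

```python
import re

PLACEHOLDER = "\uE000"  # private-use character, assumed absent from the texts
VOWELS = "aeioyàâéèêëîïô"  # u is left out because it has been neutralized

# Toy rule: the symbol is a consonant (v) when it stands at the start of the
# word or after a vowel, and a vowel follows it.
CONSONANT_CONTEXT = re.compile(
    f"(?:^|(?<=[{VOWELS}])){PLACEHOLDER}(?=[{VOWELS}])"
)

def normalize_uv(word):
    """Neutralize u/v, then resolve each placeholder once, from its context."""
    neutral = re.sub("[uv]", PLACEHOLDER, word)
    with_v = CONSONANT_CONTEXT.sub("v", neutral)
    return with_v.replace(PLACEHOLDER, "u")

for word in ["auoir", "vn", "vous", "deuant", "que"]:
    print(word, "->", normalize_uv(word))
```

Because each occurrence is decided exactly once, the order of the other rules no longer affects u/v at all.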

Permalink 05:39:38 pm, by mholmes, 19 words, 172 views   English (CA)
Categories: Activity log; Mins. worked: 60

Tech support for French

Posting time spent with LSPW figuring out how to port the old colloquium materials over to the Cascade site.

Permalink 05:38:53 pm, by mholmes, 27 words, 118 views   English (CA)
Categories: Activity log; Mins. worked: 30

Meeting re Beck site

Met with PAB and JT to discuss moving Beck to Cascade. The decision is to wait until PAB's PhD is finished, at the end of the year.

Permalink 05:37:59 pm, by mholmes, 7 words, 105 views   English (CA)
Categories: Activity log; Mins. worked: 90

NLP Course: fifth and half of sixth video

The equations are taking time to understand...

Permalink 09:26:19 am, by mholmes, 77 words, 125 views   English (CA)
Categories: Activity log; Mins. worked: 90

Fixing broken links on Hist and Wost

Broken link reports came in with many links on Hist still broken from before, so I went through them and fixed any that are genuine (many are not -- Xenu seems to report lots of links which are perfectly OK). Reported reasons and changes to TG. Also checked out some odd items on the otherwise-empty WOST report which seem to be for the CFUV site, not from WOST at all. Reported that to CB, who fixed it.

21/03/12

Permalink 03:30:53 pm, by mholmes, 7 words, 37 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 176 - 1 = 175 hours G&T

Long lunch, and leaving a mite early.

Permalink 02:44:07 pm, by mholmes, 275 words, 135 views   English (CA)
Categories: Activity log; Mins. worked: 180

Work on rulesets for normalization

I've been doing preliminary work on the text-extraction and normalization problem. I've completed the initial rather difficult task of extracting the text and linking it back to the original locations in the source document, and I've started playing around with normalization rules. I took the long series of substitutions CC sent me in an earlier email and encoded them as search/replace operations; I'm using a spreadsheet to store them, and generating the required XML block automatically from it. I've since tweaked a few of the rules. I'm working with duchesse_de_milan.xml as a test document initially, and I've hooked in a CSS stylesheet which makes it almost readable in original and "normalized".

More often than not, our current rules take a good word and turn it into something incorrect. That's partly because many of the rules are, as yet, underspecified; for instance, some rules should only act at the beginning of a word, and others only in very specific contexts. Working through the rules to improve them, based on the errors, will help a lot, and I think we'll also be able to improve the output by putting them in a particular order.

The other thing that's missing, at the moment, is a check on the word before it's normalized; I should be checking each word against a modern dictionary before anything is done to it, and only making changes if it turns out not to be a good dictionary word.
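
A minimal sketch of that check, in Python rather than XSLT: the word is looked up before each rule runs, and processing stops as soon as it is a dictionary word. The word list and both rules are invented stand-ins; the second rule is deliberately underspecified to show how an unchecked rule damages a good word.

```python
import re

# Invented sample entries standing in for the modern dictionary.
DICTIONARY = {"roi", "étude", "avait"}

# Ordered rules as (pattern, replacement) pairs; both are illustrative only.
RULES = [
    (re.compile(r"^es(?=[bcdfglmnpqrt])"), "é"),  # acts at the start of a word only
    (re.compile(r"oi"), "ai"),                    # underspecified: fires anywhere
]

def normalize(word, rules=RULES, dictionary=DICTIONARY):
    """Apply rules in order, stopping as soon as the word is in the dictionary."""
    for pattern, replacement in rules:
        if word in dictionary:
            break
        word = pattern.sub(replacement, word)
    return word

print(normalize("estude"))                 # étude
print(normalize("avoit"))                  # avait
print(normalize("roi"))                    # roi (already a good word, left alone)
print(normalize("roi", dictionary=set()))  # rai (what happens without the check)
```
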

But I think we can see the scale of the task ahead of us. It'll take some months to refine our ruleset to the point where we're getting consistently good results.

Permalink 10:09:30 am, by mholmes, 134 words, 84 views   English (CA)
Categories: Activity log; Mins. worked: 30

Database connections versus physical connections

As part of our investigation of how the database becomes a new mapping tool, I ran the following code on the db:

xquery version "1.0";

declare default element namespace "http://www.tei-c.org/ns/1.0";

declare variable $placeId := "CHEA2";

let $containingDocs := 
distinct-values(for $r in //ref[@target = concat("mol:", $placeId)]
return $r/ancestor::TEI/@xml:id),

$linksInDocs := distinct-values(//TEI[@xml:id = $containingDocs]//ref/@target[starts-with(., "mol:")]/substring-after(., "mol:")),

$locationDocs := for $l in $linksInDocs where //TEI[@xml:id = $l]/facsimile order by $l return $l

for $d in $locationDocs
return concat($d, ': ', //TEI[@xml:id = $d]/descendant::title[1]/text()[1])

This reveals that 248 other locations are connected to Cheapside through documents in the database. On the map, my quick estimate is that around 50 items are connected (although we'll need to do a proper count of that).
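
The core of the query, finding every place linked from a document that also links to the chosen place, can be sketched outside the database as well. This Python version is a simplification: it works on already-parsed TEI documents, leaves out the facsimile test that restricts results to location documents, and uses invented ids (PAUL1, BRID1) in its sample data.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

def connected_places(documents, place_id):
    """Return ids of other places linked from any document that links to place_id."""
    target = "mol:" + place_id
    connected = set()
    for doc in documents:
        targets = {ref.get("target", "") for ref in doc.iter(TEI + "ref")}
        if target in targets:
            connected.update(t[len("mol:"):] for t in targets if t.startswith("mol:"))
    connected.discard(place_id)
    return sorted(connected)

# Invented sample documents.
DOC_A = ET.fromstring(
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text>'
    '<ref target="mol:CHEA2"/><ref target="mol:PAUL1"/>'
    '<ref target="http://example.org/"/></text></TEI>'
)
DOC_B = ET.fromstring(
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text>'
    '<ref target="mol:BRID1"/></text></TEI>'
)
print(connected_places([DOC_A, DOC_B], "CHEA2"))  # ['PAUL1']
```
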

Permalink 09:34:44 am, by mholmes, 2 words, 93 views   English (CA)
Categories: Activity log; Mins. worked: 30

NLP course: finished third and fourth videos

Pushing on...

20/03/12

Permalink 03:37:01 pm, by mholmes, 17 words, 75 views   English (CA)
Categories: Activity log; Mins. worked: 45

NLP course: second and some of third video

Slogging through these things. I'm watching them once at home, and then note-taking them again at work...

Permalink 02:38:39 pm, by mholmes, 58 words, 53 views   English (CA)
Categories: Activity log; Mins. worked: 60

MVP: reconfiguration of repo and addition of AT

AT will be starting work on Tarr, so I've added him to the SVN users, and reorganized the repo so Nostromo and Tarr get different folders. I've written a more elaborate set of SVN instructions, and when AT has Oxygen set up on his laptop, he'll spend some time working alongside KT to get familiar with the workflow.

Permalink 01:14:21 pm, by mholmes, 42 words, 64 views   English (CA)
Categories: Activity log; Mins. worked: 120

Finished volume 20 cover

Basically followed these steps as I've done before. This time there are too many reviews to fit on the cover TOC, so I've replaced the list of reviews with a single Reviews entry pointing at the first page of the review section.

Permalink 09:56:42 am, by mholmes, 29 words, 64 views   English (CA)
Categories: Activity log; Mins. worked: 30

Started on Vol 20 cover

Made a start on the cover for Volume 20, but I can't proceed very far until I know what the exact year specification is going to be for the volume.

19/03/12

Permalink 04:25:26 pm, by mholmes, 18 words, 41 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 175 + 1 = 176 hours G&T

On late duty (but posting a bit early because I'm shutting down to clean my desk before leaving...).

Permalink 04:15:16 pm, by mholmes, 18 words, 88 views   English (CA)
Categories: Activity log; Mins. worked: 30

SVN location change: update to documentation

The repos changed location today, so I updated the documentation on the site, and sent details to JJ.

Permalink 04:09:28 pm, by mholmes, 365 words, 238 views   English (CA)
Categories: Activity log, Academic; Mins. worked: 120

Basic tokenizing now working

I now have my XSLT module successfully reconstituting a line-broken word on both sides of the break, like this:

<ab corresp="mar:textnode#xpath(/*[1]/*[2]/*[2]/*[1]/*[2]/text()[10])"><seg>
                    </seg><w corresp="mar:offset#xpath(substring(., 22, 3))"><choice><orig>ant</orig><reg type="joined-2">imagiant</reg></choice></w><seg> </seg><w corresp="mar:offset#xpath(substring(., 26, 3))"><choice><orig>que</orig></choice></w><seg> </seg><w corresp="mar:offset#xpath(substring(., 30, 4))"><choice><orig>Vous</orig></choice></w><seg> </seg><w corresp="mar:offset#xpath(substring(., 35, 5))"><choice><orig>pren-</orig><reg type="joined-1">prendrez</reg></choice></w></ab><ab corresp="mar:textnode#xpath(/*[1]/*[2]/*[2]/*[1]/*[2]/text()[11])"><seg>
                    </seg><w corresp="mar:offset#xpath(substring(., 22, 4))"><choice><orig>drez</orig><reg type="joined-2">prendrez</reg></choice></w><seg> </seg><w corresp="mar:offset#xpath(substring(., 27, 7))"><choice><orig>quelque</orig></choice></w><seg> </seg><w corresp="mar:offset#xpath(substring(., 35, 8))"><choice><orig>intereſt</orig></choice></w><seg> </seg><w corresp="mar:offset#xpath(substring(., 44, 1))"><choice><orig>à</orig></choice></w></ab>

It's nasty-ugly but it's only intended for machines to read. Having the full form of the word on both sides of the linebreak means we'll be able to do n-grams properly, and having the two joined forms labelled differently (joined-1 and joined-2) means we'll be able to ignore one of them if we're reconstituting a continuous string.
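
The joining logic can be shown in isolation with a much-simplified Python sketch. It tokenizes on whitespace only and ignores the offsets and corresp pointers that the XSLT module tracks; the function names are invented, and the type labels follow the joined-1 and joined-2 values in the output above.

```python
def tokenize_lines(lines):
    """Tokenize each line on whitespace and rejoin words hyphenated across line breaks."""
    tokenized = [[{"orig": t} for t in line.split()] for line in lines]
    for current, following in zip(tokenized, tokenized[1:]):
        if not current or not following:
            continue
        head, tail = current[-1], following[0]
        if head["orig"].endswith("-") and len(head["orig"]) > 1:
            joined = head["orig"][:-1] + tail["orig"]
            # Both halves carry the full form, labelled differently.
            head["reg"], head["type"] = joined, "joined-1"
            tail["reg"], tail["type"] = joined, "joined-2"
    return tokenized

def continuous_text(tokenized):
    """Rebuild running text, using each joined form once by skipping joined-2."""
    words = []
    for line in tokenized:
        for token in line:
            if token.get("type") == "joined-2":
                continue
            words.append(token.get("reg", token["orig"]))
    return " ".join(words)

lines = ["que Vous pren-", "drez quelque intereſt à"]
print(continuous_text(tokenize_lines(lines)))
# que Vous prendrez quelque intereſt à
```

For n-grams, each line can be read with the reg value of both halves; for a continuous string, dropping joined-2 avoids counting the word twice.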

Permalink 02:06:30 pm, by mholmes, 4 words, 82 views   English (CA)
Categories: Activity log; Mins. worked: 60

NLP course: first week 2 video

On to probability theory...

Permalink 02:05:58 pm, by mholmes, 76 words, 141 views   English (CA)
Categories: Activity log; Mins. worked: 60

Streetcar map fixed; other issues o/s

I've now fixed the streetcar map problem, by changing all filenames and references so that they're consistently referring to 1939 instead of 1936. The fixes have been committed to trunk, and put up on the website.

The other two issues remain outstanding; PD will get back to me with the correct firemap URL at Malaspina, and we'll wait until the DNS has been changed before addressing the problem with Firefox and captcha, since it seems to be cookie-related.

Permalink 10:03:23 am, by mholmes, 3 words, 54 views   English (CA)
Categories: Activity log; Mins. worked: 5

Update to PacificAsia site

...on MK's instructions.

Permalink 09:50:33 am, by mholmes, 4 words, 92 views   English (CA)
Categories: Activity log; Mins. worked: 15

jTEI review

Review of resubmitted article.

Permalink 09:11:14 am, by mholmes, 157 words, 139 views   English (CA)
Categories: Hit by a bus; Mins. worked: 15

How to include a pointer for a pseudonymous writer

If you look at Rudling 2, 20, you'll see an author with a pseudonym. In the biblio, the pseudonym is handled like this:

<biblStruct>
                      <monogr>
                        <author><name reg="Gunnarson, Karl: see Schulze, Karl Gunnar">Karl Gunnarson see Schulze, Karl Gunnar</name></author>
                        <title></title>
                        <imprint></imprint>
                      </monogr>
                    </biblStruct>
And the main entry looks like this:

                  <biblStruct>
                    <monogr>
                      <author><name reg="Schultze, Karl Gunnar (pseud. Karl Gunnarson)">Karl Gunnar Schultze (pseud. Karl Gunnarson)</name></author>
                      <title level="m">På Kanadas prärier</title>
                      <imprint>
                        <pubPlace>Stockholm</pubPlace>
                        <publisher>Folket i Bilds Förlag</publisher>
                        <date value="1939">1939</date>
                      </imprint>
                    </monogr>
                  </biblStruct> 

Permalink 09:05:41 am, by mholmes, 43 words, 59 views   English (CA)
Categories: Activity log; Mins. worked: 20

More corrections to volume 20

Tweaks to Lange (which was missing para breaks for some reason) and Rudling arising out of discussions with JT on Friday. Still a couple of questions outstanding. New markup structure for handling pseudonym in biblio will be documented under "Hit by a bus".

Permalink 08:43:47 am, by mholmes, 292 words, 108 views   English (CA)
Categories: Activity log; Mins. worked: 45

PD has checked the site: a couple of remaining issues

PD has checked the new VIH site on taprlans/www, and reports only these issues:

  • Captcha reports errors for him on FF. I can't reproduce this -- it works fine for me -- so waiting for more details.
  • There's a link on the maps.php page pointing to the "1936 streetcar map" which fails, defaulting to the panorama, because it should point to 1939. The confusion arises thus:
    The map itself has the date "1939" on it.
    
    The .map file (the basic definition file) is called
    "vicstreetcar1939.map".
    
    However, it contains pointers to images called:
    
    victoria-streetcar-1936.png
    victoria-streetcar-1936-key.png
    
    and this page:
    http://hcmc.uvic.ca/~taprhist/content/maps/maps.php
    
    has 1936 in its caption. However, if you go to the map viewer,
    click on the Maps menu and drill down to it, you see the caption "1939
    - Streetcar routes".
    
    I think that:
    	- The actual date is 1939.
    	- The images are wrongly named, as are the pointers to them in
    the .map file.
    	- The caption on the maps.php page is wrong, whereas the Maps 
    menu in the map viewer is right.
    
    Waiting for PD to confirm my analysis before changing the image file names, the .map file, and the caption/link on maps.php.
  • PD reports this:
    The link on the page describing the 1885 Fire Insurance Plans of Victoria 
    needs to be changed. The page in question is located at:
    
    http://hcmc.uvic.ca/~taprhist/content/maps/firemap.php
    
    The hyperlink should be redirected to:
    
    http://www.mediastudies.viu.ca/steeple/index.htm
    
    [The link currently points to an obsolete server - 
    http://cdi.mala.bc.ca/firemaps/]
    
    However, the new URL is about 1891 panorama images; it has nothing about the 1885 Fire Insurance Plans. Waiting for the correct URL from PD.

16/03/12

Permalink 01:57:46 pm, by mholmes, 2 words, 65 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 177 - 2 = 175 hours G&T

Leaving early.

Permalink 01:43:51 pm, by mholmes, 35 words, 261 views   English (CA)
Categories: Activity log, Tasks; Mins. worked: 5

Change William Allen id to "william_allen"

DONE 2012-03-26: The xml:id for the William Allen is currently "william", which is very confusing; change it to "william_allen", and change refs to it, so it's not confused with the Brig William.

Permalink 01:33:21 pm, by mholmes, 3 words, 58 views   English (CA)
Categories: Activity log; Mins. worked: 10

Minor update to PacificAsia site

On MK's instructions.

Permalink 01:32:46 pm, by mholmes, 87 words, 153 views   English (CA)
Categories: Activity log; Mins. worked: 120

More progress on tokenizing/parsing etc.

I've had to resort to a second pass through the data to count offsets, and that's now working reliably. I've also got the reconstitution of hyphenated words at linebreaks working, but only most of the time; for some reason, when the linebreak precedes a <fw> element, the reconstitution fails. I'm still working on that, but it's very mysterious. I'll probably have to create some test data rather than working on real files until I get it sorted out.

All in all, though, very promising progress.
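
A minimal sketch of the reconstitution step, in Python rather than the XSLT actually in use, and working on plain strings (one per line of text) instead of real text nodes. The function name and the two regexes are assumptions for illustration; it does not model the `<fw>` case at all, only the basic rule that the whole word ends up on the earlier line and the fragment is dropped from the later one.

```python
import re

TAIL = re.compile(r"(\w+)-\s*$")   # word fragment plus hyphen at the end of a line
HEAD = re.compile(r"^\s*(\w+)")    # word fragment at the start of the next line

def rejoin_hyphenated(lines):
    """Join words split by a line-end hyphen.

    The reconstituted word is kept on the earlier line, and the
    fragment is removed from the later one.
    """
    out = list(lines)
    for i in range(len(out) - 1):
        tail = TAIL.search(out[i])
        head = HEAD.match(out[i + 1])
        if tail and head:
            # Replace "frag-" with the whole word on the earlier line.
            out[i] = out[i][:tail.start()] + tail.group(1) + head.group(1)
            # Drop the second fragment from the later line.
            out[i + 1] = out[i + 1][head.end():]
    return out
```

Something this small might also serve as a source of test data: feed it a handful of invented lines and compare the result with what the XSLT produces for the same input.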

Permalink 01:30:21 pm, by mholmes, 46 words, 47 views   English (CA)
Categories: Activity log; Mins. worked: 20

Tweaks for vol 20

Met with JT -- some issues discussed, leaving corrections to be made tomorrow, and a decision re the cover, where there will not be enough space for reviews in the TOC: we will have a single entry for reviews on the cover (but not inside, obviously).

Permalink 09:45:04 am, by mholmes, 233 words, 50 views   English (CA)
Categories: Activity log; Mins. worked: 60

French site: figuring out history of Boréal/Colloquium

LSPW and I have been trying to track the history of Boréal and how it relates to the Grad colloquium. This is what we learned:

  • On the old site, there were two locations where Grad Colloquium data was stored: www/grad_colloquium, and www/web_pages/grad_colloquium. The former is a partial copy of the latter; only the latter is complete.
  • The website had information for colloquia starting in 2003; as the years went on, more and more data tended to be stored, culminating in full PDFs for all the presentations.
  • In 2007, the colloquium introduced itself as Boréal No. 1, which makes it look as though a journal issue was intended, but there is no sign that a single PDF was produced; all the individual article PDFs seem to number pages from 1. The UVic library has no catalogue entries for a journal called Boréal.
  • 2008 introduces itself as Boréal No. 2, and also has lots of article PDFs.
  • 2009 does not seem to mention Boréal at all, although it has lots of PDFs for articles.
  • After 2009, the material is not organized into folders, but there are PDFs for colloquia in 2010 and 2011.

LSPW will create a new page in current_students/graduate which has an accordion with one section for each year; the introductory material for each year can be copy/pasted from the index.php files, and links to the articles listed.

15/03/12

Permalink 05:15:05 pm, by mholmes, 163 words, 141 views   English (CA)
Categories: Activity log; Mins. worked: 120

More progress on tokenizing/parsing etc.

I now have the XSLT breaking down each text node into a series of components: either whitespace (passed through as plain text), punctuation sequences (tagged with <pc>) or word[-fragment]s (tagged with <w>, with much more tagging due in subsequent phases).

My current problem is the requirement to record the offset and length of each word in the original text node, so that a search engine can find its way from the modernized source back to the original text. Length is easy, but offset is proving difficult. I have a question posted on the XSLT list in the hope of some help, but it may be that we have to go in two stages: pre-process to create the <ab> element, which is stored in a variable, and then post-process, where the <ab> element and its contents are re-analyzed and additional tagging is added based on that analysis, before the resulting enriched ab is output.
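
For comparison, here is a rough Python sketch of the same breakdown, where offsets come for free from the regex match positions. It is not the XSLT approach described above; the token pattern (what counts as a word, whether internal apostrophes and hyphens stay inside a word) is an assumption, and the names `TOKEN_RE` and `tokenize` are invented. Offsets are 1-based so that they line up with XPath `substring()`.

```python
import re

# Alternatives: whitespace, word (allowing internal apostrophe or hyphen),
# or a run of anything else, treated as punctuation.
TOKEN_RE = re.compile(
    r"(?P<ws>\s+)|(?P<w>\w+(?:['’\-]\w+)*)|(?P<pc>[^\w\s]+)"
)

def tokenize(text):
    """Return (kind, string, offset, length) tuples covering the whole text."""
    tokens = []
    for m in TOKEN_RE.finditer(text):
        kind = m.lastgroup          # "ws", "w" or "pc"
        offset = m.start() + 1      # 1-based, as in XPath substring()
        tokens.append((kind, m.group(), offset, len(m.group())))
    return tokens
```

Because the three alternatives between them cover every character, joining the token strings back together should reproduce the original text node exactly, which is a cheap sanity check on any tokenizer of this kind.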

Permalink 10:39:32 am, by mholmes, 104 words, 114 views   English (CA)
Categories: Activity log; Mins. worked: 30

Removed a browser restriction on the VicFire 1891 map

There's one special map that uses SVG for its interface, and which was originally working only on FF; DB had put a complete block on other browsers, but now most of them support SVG so I've removed that block. Browsers that don't support SVG should get with the program. Committed that change to SVN, but I was unable to commit the bulk of my additions (code which was not originally in SVN, but should have been) because we ran out of disk space on revision.tapor.uvic.ca. We'll be switching to the new SVN on Monday, so hopefully this problem will be solved.

Permalink 09:55:28 am, by mholmes, 8 words, 91 views   English (CA)
Categories: Activity log; Mins. worked: 30

NLP lecture video: final week 1 video

Worked through the final video lecture in week 1.

Permalink 09:46:45 am, by mholmes, 99 words, 173 views   English (CA)
Categories: Activity log, Tasks; Mins. worked: 175

Need to check linking of vessels

NOTE: Completed 2012-04-23. Many new vessel entries have resulted from this work, and they will need to be completed when time permits.

Try this, first in /db/coldesp/correspondence, and then in /db/coldesp/:

xquery version "1.0";

declare default element namespace "http://www.tei-c.org/ns/1.0";

for $r in //name[@type='vessel'][not(@key)]
return $r

The vessel tags inside the correspondence seem mainly to be for vessels which HAVE write-ups; these should simply be correctly linked with @key. The broader set include vessels which may not have bios yet; bios need to be created, and those vessels linked.
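
If it is useful to run the same check on files exported from the db, a rough Python equivalent of the query might look like this. It is only a sketch: the function name is invented, and counting by name text is an addition of mine, intended to show which unlinked vessels turn up most often.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

def unlinked_vessels(xml_text):
    """Count <name type="vessel"> elements that have no @key, by their text."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for name in root.iterfind(".//tei:name[@type='vessel']", TEI_NS):
        if name.get("key") is None:
            # Collapse internal whitespace so variants of spacing count together.
            label = " ".join("".join(name.itertext()).split())
            counts[label] += 1
    return counts
```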

Permalink 09:09:52 am, by mholmes, 191 words, 87 views   English (CA)
Categories: Activity log; Mins. worked: 30

TNB's report at end of workstudy

This is the state of play on TNB's work as of today:

  • Peripheral bios will all be finished except for one:
    • gordon_t, Captain George T Gordon is the entry.
    • He was captain of the Cormorant, on station in Nisqually in 1846.
    • Gordon Lake was named after him.
    • More research is required to complete his bio.
  • B58 bios: references all switched to Chicago style, and minor edits done up to storks_hk. Old references have just been commented out. Sometimes better references have been added, from a more recent source.
  • A lot of citations for the revised bios still need to be checked in hard copies in the library; sometimes the library will have a different edition, and page numbers may have to be changed.
  • Many, many bios remain to be completed (more than two thirds).
  • Many bios refer to BCDES and could be linked to page-images we have (e.g. the bio for shepherd_j), but we currently lack a system to link from editorial text to a page-image. This needs to be implemented, and BCDES references linked and clarified.
  • Vessels and placenames are up to date to the end of 1861.

14/03/12

Permalink 03:47:42 pm, by mholmes, 64 words, 148 views   English (CA)
Categories: Activity log; Mins. worked: 45

More work on text search prep

I've written the bones of an XSLT file to convert an original file to a framework for modernization and regularization. So far the code can create <ab> elements with full working xpath references back to the source text nodes. Now I need to start on tokenization, which I think I'll do with a regex initially, but it's going to be quite complicated.

Permalink 03:33:08 pm, by mholmes, 187 words, 113 views   English (CA)
Categories: Activity log; Mins. worked: 360

Got the maps working

Got the site basically working by doing this:

  • Moved everything from taprhist/vihdev/www to taprhist/www (it really doesn't like living in that odd location, and there are hard-coded paths in several text files).
  • Removed "vihdev" from paths in config files.
  • Cleaned out the /home1t/taprhist/www/content/maps/cache folder so it had to start rebuilding.
  • Now we found that most maps were working, but a handful were failing. The failing maps had one line in their .map file:
     METADATA
          "queryable" "true"
          "tile_source" "cache"     <-- This line has to be removed.
     END
    
    Our surmise is that this problem line causes the server to construct a broken path to a cache folder that doesn't exist or isn't writable, and therefore it cannot read or construct tiles.
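
A small sketch of how the offending line could be stripped from the text of a .map file in bulk, rather than by hand. This is an assumption about how one might automate it, not what was actually done; the function name is invented, and it only removes lines that consist of exactly the "tile_source" "cache" pair.

```python
import re

# Matches a line containing only the tile_source/cache pair,
# with any amount of surrounding whitespace.
TILE_SOURCE = re.compile(r'^\s*"tile_source"\s+"cache"\s*$')

def strip_tile_source(mapfile_text):
    """Return the .map file text without any "tile_source" "cache" line."""
    kept = [line for line in mapfile_text.splitlines()
            if not TILE_SOURCE.match(line)]
    return "\n".join(kept) + "\n"
```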

We now propose to have the DNS repointed so that vihistory.uvic.ca points at taprhist/www, and keep the live site there.

Greg also noticed that vihistory.ca is broken; outside of the ring it's pointing at mala.bc.ca DNS servers, so he's emailed PD to get him to fix that on the domain host.

Permalink 11:29:56 am, by Erin, 6 words, 69 views   English (CA)
Categories: Activity log; Mins. worked: 160

latin texts

Finished Adam + formatted, marked up Castellanus
Permalink 09:08:21 am, by mholmes, 13 words, 47 views   English (CA)
Categories: Activity log; Mins. worked: 60

TEI: work on text standardization

Working on standardizing some spelling variants across the P5 source. Mundane but necessary.

13/03/12

Permalink 02:06:30 pm, by mholmes, 409 words, 143 views   English (CA)
Categories: Activity log; Mins. worked: 30

Addressing addressees

There are issues with the search engine relating to both authors and addressees of correspondence. The drop-down lists are generated from distinct values of tags in the header. These tags, inherited from the Waterloo Script, contain plain text, and so the same individual is identified in a variety of different ways. It would be helpful if we could tag these names with ids from the personography, and then build our search engine drop-downs in a more intuitive fashion.

It seems best to start with the addressees, since they constitute a much smaller number (only 89 distinct values, listed below). The simplest approach would be this:

  • Create an XML file listing the referents (or just use the search_lists.xml file).
  • Identify each referent and tag it with the appropriate id from the personography.
  • Create a default personography entry for completely unknown people, uncertain people and missing people.
  • Fix any known oddities (like the square brackets around Carnarvon in one document).
  • Write an identity transform that adds the appropriate id to all files.
  • Update the search form generator so that it pulls appropriate info from the personography based on the distinct values of the name/@key attributes.
  • Update the search form and the search to use the new feature.

Addressees:

  • [Carnarvon]
  • [None]
  • [Unknown; Eliot?]
  • [Unknown]
  • [Various]
  • Adderley (Parliamentary Under-Secretary)
  • Assistant Secretary of State
  • Assistant Under-Secretary
  • Ball (Parliamentary Under-Secretary)
  • Banister
  • Barclay
  • Begbie, Thomas
  • Birch
  • Birch (Assistant Clerk)
  • Blackwood
  • Blackwood (Chief Clerk)
  • Blackwood (Senior Clerk)
  • Blanshard
  • Buckingham
  • Cardwell
  • Carnarvon
  • Carnarvon (Parliamentary Under-Secretary)
  • Chief Clerk
  • Clerk
  • Colonial Office
  • Colonial Secretary
  • Desart (Parliamentary Under-Secretary)
  • Douglas
  • Duke of Argyle
  • Earl Grey
  • Elliot (Assistant Under-Secretary)
  • Elliot (Permanent Under-Secretary)
  • Fortescue (Parliamentary Under-Secretary)
  • Gairdner (Chief Clerk)
  • General Public
  • Gladstone, R.
  • Granville
  • Graves, S.R.
  • Grey
  • Grey, Sir George
  • Hankin
  • Hawes
  • Hawes (Parliamentary Under-Secretary)
  • Head Clerk
  • Herbert (Assistant Under-Secretary)
  • Herbert (Permanent Under-Secretary)
  • Herman Merivale
  • Herman Merivale, Esq.
  • Herman Merivale, Esq. Under Secretary of State for Colonial Affairs
  • Higgins (Private Secretary)
  • Holland (Assistant Under-Secretary)
  • House of Commons
  • Irving (Junior Clerk)
  • Kennedy
  • Kimberley
  • Labouchere
  • Lytton
  • Merivale
  • Merivale (Permanent Under-Secretary)
  • Molesworth
  • Monsell (Parliamentary Under-Secretary)
  • Musgrave
  • Newcastle
  • Officer Administering
  • Pakington
  • Palliser
  • Palmerston, Viscount (Secretary of State, Treasury)
  • Palmerston (Treasury)
  • Parker (Private Secretary)
  • Peel (Parliamentary Under-Secretary)
  • Peel (Under-Secretary)
  • Pelly
  • Prince of Wales
  • Queen Victoria
  • Robinson (Senior Assistant Clerk)
  • Rogers
  • Rogers (Permanent Under-Secretary)
  • Russell
  • Sandford (Assistant Under-Secretary)
  • Secretary of State
  • Secretary of State for Foreign Affairs
  • Seymour
  • Smith
  • Stanley
  • Stanley (Foreign Office)
  • Under-Secretary for the Colonies
  • Under-Secretary of State
  • Under-Secretary of State Foreign Office [sic]
  • Young
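
As a first pass at identifying referents, the distinct values could be grouped mechanically before anyone assigns personography ids by hand. The sketch below is Python, with an invented function name and a deliberately crude key (the text before any comma, with a trailing parenthesized role removed). It only produces candidate groups: "Grey", "Earl Grey" and "Grey, Sir George" may or may not be the same person, and "Herman Merivale, Esq." will not land in the same group as "Merivale", so a human still has to check every group.

```python
import re
from collections import defaultdict

ROLE = re.compile(r"\s*\([^)]*\)\s*$")   # trailing "(Chief Clerk)" and the like

def group_addressees(values):
    """Group header strings by a crude surname-like key, for manual review."""
    groups = defaultdict(list)
    for value in values:
        bare = ROLE.sub("", value)
        key = bare.split(",")[0].strip().lower()
        groups[key].append(value)
    return dict(groups)
```
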
Permalink 01:51:28 pm, by mholmes, 55 words, 237 views   English (CA)
Categories: Activity log; Mins. worked: 20

Consistency edits to XML files

Following one of KSW's notes in this post, removed date tags from specific location in 17 files. This is presumably for consistency -- only 17 files had them -- and because I suspect some useful parsing can be done/is being done based on the first date in the text being the date the document was penned.

Permalink 01:19:12 pm, by mholmes, 74 words, 83 views   English (CA)
Categories: Activity log; Mins. worked: 30

Added helpful message for when mentions not found

Items in the indexes have a link under their info popup which enables you to retrieve references to them in the correspondence, but sometimes there are no references (as in the case of peripheral bios, which are referred to in other bios but not in the actual correspondence). Previously, clicking on the "Mentions..." link simply did nothing in these cases, but I've now added a trap for this condition and an appropriate error message.

Permalink 11:36:33 am, by mholmes, 463 words, 228 views   English (CA)
Categories: Activity log, Academic; Mins. worked: 120

More work on modernization/searching etc.

I've been working out my ideas a little more clearly, and beginning to evolve the idea of a working pipeline and a target format for my documents. It would look something like this:

  • Original document is processed into a sort of generic structure where each text node is expressed as an <ab> element. At this stage,
    • The root text element in the new file points back to the source document using a private URI system based on the source document's @xml:id, like this xml:base="mar:maladies_des_femmes"
    • The <ab> element points back to the location of the original text node which gave rise to it, using a TEI pointer structure, something like this: <ab corresp="xpath1(*[20]/*[4]/*[3]/text()[2])">.
    • The contents of the text node are tokenized. It's not clear to me yet whether we need to tag punctuation, but we definitely need to tag words, so we'll need a good tokenizer that can handle this.
    • Words broken across linebreaks are reconstituted in the context of the text node preceding the linebreak, and ignored in the one following it. The reconstituted word is linked (see below) back to the original character strings in both locations, though.
    • Each word is marked up with a <w> tag, and that tag is linked back to the original source using XPath again: <w corresp="xpath1(substring(., 36, 10))">.
    • The original form of the word (reconstituted in the case of a broken word) is included as the text content of the <w> tag. It is also stored in an attribute (possibly @n, or more likely a custom attribute), so that when the text content is normalized and modernized, the original form is still available.
  • The resulting file is then processed again, and the text contents of <w> tags are run through a series of normalization rules which do things such as replacing long s.
  • Further processing attempts to modernize the contents of the <w> tags. This is going to require some serious processing, and will include algorithmic spelling modernization, dictionary lookups, etc.
  • The now-hopefully-modernized form is lemmatized, and the lemma is stored in an @lemma attribute on the <w> tag.
  • These documents can now be stored in the db and indexed for searching and analysis; search hits will have available to them the original spelling of the form, and will also be able to get back to the exact place in the original document where the form is located.
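
The normalization stage in the list above could be sketched like this in Python. It assumes the original form is kept in @n (one of the two options mentioned above; a custom attribute would work the same way), and the rule list holds only the long-s replacement. The names `RULES` and `normalize_w` are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Each rule is (old, new); only the long-s rule is shown here.
RULES = [("\u017f", "s")]

def normalize_w(w, rules=RULES):
    """Store the original form in @n, then normalize the text of a <w> element."""
    original = w.text or ""
    w.set("n", original)             # original spelling stays available
    text = original
    for old, new in rules:
        text = text.replace(old, new)
    w.text = text
    return w
```

Keeping the rules as data rather than code means later stages (modernization, dictionary lookup) could reuse the same pattern with different tables.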

For this, we'll need a range of tools, some of which exist and some of which appear not to exist yet (or, as in the case of the lemmatizer, not in an open-source form we can adapt for a Java web application).

Permalink 11:29:52 am, by Erin, 67 words, 93 views   English (CA)
Categories: Activity log; Mins. worked: 540

3 posts in one

haven't posted for the 2 previous sessions...
2 sessions ago I finished mark-up of french accounts, and learned how to back up to the server through the terminal.
1 session ago I added in geographical locations to the places.xml file, and searched through the Leminus text for another account.
Today I marked up latin text by Melchior Adam, and searched through Leminus and Sigaud de la Fond for more accounts...
Permalink 08:03:45 am, by mholmes, 43 words, 45 views   English (CA)
Categories: Activity log; Mins. worked: 15

Pic added to CityTalks site (and odd cacheing issue)

Got a pic of the speaker from KE, and added it to the site; worked fine on the www-dev location, but the www location kept showing the old page until I deleted the file, loaded the page for a 404, then uploaded again. Strange.

12/03/12

Permalink 05:40:25 pm, by mholmes, 2 words, 42 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 175 + 2 = 177 hours G&T

Late duty...

Permalink 05:39:39 pm, by mholmes, 3 words, 147 views   English (CA)
Categories: Academic; Mins. worked: 15

Update to City Talks site

... on KE's instructions.

Permalink 05:31:10 pm, by mholmes, 102 words, 162 views   English (CA)
Categories: Activity log; Mins. worked: 20

Invisible link issue on French site

ST solved a long-standing issue in the Xenu broken link report for the French site. Apparently when a number of pages were originally created, they were made by copy/pasting from the existing site. That site had links to a Contact page, which were then deleted -- except that what was deleted was only the text, not the anchor tag, so the links were still there, invisible. Deleted them all, except for one reported by ST on a page which no longer seems to exist (french/current-students/graduate/colloquium/index.php). Wrote to LSPW to find out what happened to that page.

Permalink 04:46:00 pm, by mholmes, 21 words, 38 views   English (CA)
Categories: Activity log; Mins. worked: 90

Meeting of the Versioning group

Interesting discussions of personographies, referencing with private URI schemes, and centralized authority databases at the first meeting of the versioning group.

Permalink 02:50:08 pm, by mholmes, 13 words, 65 views   English (CA)
Categories: Activity log; Mins. worked: 45

Timesheet admin

Posting time spent on timesheets (SA is away, so did TNB's timesheet too).

Permalink 02:49:20 pm, by mholmes, 9 words, 46 views   English (CA)
Categories: Activity log; Mins. worked: 30

Two more NLP video lectures

Getting towards the end of the week one materials.

Permalink 02:48:49 pm, by mholmes, 108 words, 68 views   English (CA)
Categories: Activity log; Mins. worked: 120

Work on documentation and crediting MM

Added appropriate credit to MM for her transcription work, and began the process of pulling documents from Google Docs into the actual repo, which is a bit easier to keep track of. Found one suitable document to get PCA started with full-doc transcription, and created a simple guide to the file/id/naming convention for our collection. Wrote a detailed assignment for PCA and sent it. This process will include a check that our Guidelines document in fact provides enough guidance for encoding a complete new document. Most likely we will be expanding it in the next week or two as PCA starts to add new transcriptions.

Permalink 02:33:23 pm, by mholmes, 12 words, 57 views   English (CA)
Categories: Activity log; Mins. worked: 30

Update to Beck site

At PAB's request. And some preliminary discussion of moving it to Cascade.

Permalink 02:20:09 pm, by jnazar, 14 words, 40 views   English (CA)
Categories: Activity log; Mins. worked: 30

"Interviews" - project

Received email from PL with project summary.
Printed off waiver release forms as requested.

Permalink 02:18:05 pm, by jnazar, 8 words, 38 views   English (CA)
Categories: Activity log; Mins. worked: 60

HCMC accounting

Purchased supplies for special project ("Interviews").
Completed reimbursement.

09/03/12

Permalink 05:02:30 pm, by sarneil, 7 words, 134 views   English (CA)
Categories: Vacation; Mins. worked: 0

SA: Vac 65 - 10 = 55 days + 10 days long service

Two weeks vacation for CSG spring break.
Permalink 05:01:09 pm, by sarneil, 37 words, 147 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

SA G&T 13.0 - 1.0 = 12.0 hours

week of Mar 5 - Mar 9 M -3.0 CSG, T +1.0 beanstream, W -3.0 CSG, R +1.0 admin before vac, F +1.0 francotoile update
next week I'm coming in Tuesday for some kind of focus group which will take about 2 hours

Permalink 04:30:24 pm, by sarneil, 342 words, 120 views   English (CA)
Categories: Activity log; Mins. worked: 60

xslt bug processing ref type="note" containing 'mentioned' element

This structure in the xml data file:

<ref type="info">pépés<note> : <mentioned>Pépé</mentioned> est généralement utilisé par les enfants.</note></ref>

Was originally processed by this xsl:

<xsl:template match="tei:ref[@type='info']">
<xhtml:a href="#" class="tooltip">
<xsl:value-of select="./child::text()"/>
<xhtml:span class="hover_off">
<xsl:value-of select="tei:note"/>
</xhtml:span>
</xhtml:a>
</xsl:template>

Generating this output (note the "Pépé" is passed through as plain text, whereas user wants it italicized)

<a class="tooltip" href="#">pépés<span class="hover_off">Pépé est généralement utilisé par les enfants.</span></a>

I modified the xsl to this:

<xsl:template match="tei:ref[@type='info']">
<xhtml:a href="#" class="tooltip">
<xsl:value-of select="./child::text()"/>
<xhtml:span class="hover_off">
<!--<xsl:value-of select="tei:note"/>-->
<xsl:apply-templates/>
</xhtml:span>
</xhtml:a>
</xsl:template>

Which generates this output (note the "pépés" appears in the span as well as outside it):

<a class="tooltip" href="#">pépés<span class="hover_off">pépés : <em>Pépé</em> est généralement utilisé par les enfants.</span></a>

I've got to come up with some xsl that gives me this output from the given input, but ran out of time today:

<a class="tooltip" href="#">pépés<span class="hover_off"> : <em>Pépé</em> est généralement utilisé par les enfants.</span></a>

When I do, I can delete the leading " : " which is only there as a kludge around this problem.
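
The logic needed is: take the link label from the ref's own text only, and build the span from the contents of the note only. In XSLT that probably means narrowing the bare apply-templates to something like select="tei:note/node()", though I have not tested that against the real stylesheet. The Python sketch below models the same selection on a namespace-free copy of the input, purely to pin down the expected output; the function name is invented and it only handles one level of children inside the note.

```python
import xml.etree.ElementTree as ET

def render_ref(ref):
    """Render <ref type="info"> as a tooltip link, as a string of HTML."""
    a = ET.Element("a", {"class": "tooltip", "href": "#"})
    a.text = ref.text                          # the ref's own text only
    span = ET.SubElement(a, "span", {"class": "hover_off"})
    note = ref.find("note")
    if note is not None:
        span.text = note.text                  # text before the first child
        for child in note:
            # <mentioned> becomes <em>; anything else is copied by name.
            tag = "em" if child.tag == "mentioned" else child.tag
            out = ET.SubElement(span, tag)
            out.text = child.text
            out.tail = child.tail              # text following the child
    return ET.tostring(a, encoding="unicode")
```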

Permalink 02:15:39 pm, by sarneil, 148 words, 112 views   English (CA)
Categories: Activity log; Mins. worked: 120

add mention element

Needed a way to identify words used as examples in the annotations, so added the "mentioned" element. Entailed:

  • adding the element to a data file
  • invoking oddbyexample to create an odd file incorporating the new element (see previous post on that)
  • using Roma to generate a new rng file from the odd file
  • putting that rng file back into my tree
  • updating the xslt files to find the new element correctly (special case was mentioned embedded in note) and then output appropriate html
  • committing the changes to the repository and uploading the modified files into the exist db.

For the time being, we're using it for any string that we want italicized as the vast majority (if not all) are in fact mentions. If we find a significant number of other cases of strings that need italics, we'll make the necessary modifications.
Permalink 01:58:57 pm, by mholmes, 20 words, 80 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 176.5 - 1.5 = 175 hours G&T

Leaving early -- need to keep these hours under control. Going home to read about NLP and historical spelling variance.

Permalink 01:55:15 pm, by mholmes, 180 words, 189 views   English (CA)
Categories: Activity log, Academic; Mins. worked: 60

More research on historical spelling variance

I now have a collection of a dozen or so papers I'm reading and annotating, and some ideas are getting clearer. At the moment (although I still have a lot of reading and consulting to do), this kind of approach looks promising:

  • Run XSLT on collection to create parallel collection in which each significant block (not clear what a block is yet) is converted to a modernized textual representation with an XPath pointer that points back to the original block in the original doc. In this process, linebreaks would be dealt with.
  • Each modernized block includes the original variants as attributes or elements (if the latter, the modern indexer can be instructed to ignore them).
  • Modern blocks may also be stemmed.
  • Search is done on modern blocks.
  • KWIC hits from search can be shown EITHER as modern OR as original sequence (reconstructed from original variants stored in modern block).
  • Clicking on the hit takes you to the original text, with hits highlighted based on a new search done using the original tokens stored in the modern block as search terms.
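
To make the modern/original switch concrete, here is a toy Python sketch. It assumes each block has already been reduced to a list of (modern, original) token pairs, which is a big assumption given that the block format is not settled yet; the function name and the exact-match search are invented for the illustration.

```python
def kwic(tokens, query, width=2, form="modern"):
    """Search the modern forms; show each hit in context in either form.

    tokens is a list of (modern, original) pairs.
    """
    idx = 0 if form == "modern" else 1
    hits = []
    for i, (modern, _original) in enumerate(tokens):
        if modern == query:
            window = tokens[max(0, i - width): i + width + 1]
            hits.append(" ".join(pair[idx] for pair in window))
    return hits
```

The search always runs against the modern forms; only the display changes, which is the point of storing the original variants alongside.
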
Permalink 01:27:23 pm, by mholmes, 11 words, 63 views   English (CA)
Categories: Activity log; Mins. worked: 45

Pacific and Asian Studies site plan and proposal submitted

Final tweaks received from department, and nav plan submitted to JS.

Permalink 11:02:14 am, by sarneil, 472 words, 230 views   English (CA)
Categories: Activity log; Mins. worked: 120

create new rng using oddbyexample

With critical input from Martin on the syntax of the java command, I managed to create a new rng file derived from the existing data files using the oddbyexample utility from TEI.

Here are my notes.

minimal instructions here: http://tei-l.970651.n3.nabble.com/ODD-by-example-utility-td2344937.html

download for saxon jar files : http://saxon.sourceforge.net/#F9.4HE

download for oddbyexample.xsl and getfiles.xsl : http://tei.svn.sourceforge.net/viewvc/tei/trunk/Stylesheets/tools/

my setup:
in folder: /System/Library/Java/Extensions (which is in the java classpath)
- saxon9he.jar (working jar file in System)
- saxon9-unpack.jar (working jar file in System)

all other files in folder: /Users/sarneil/Documents/Projects/french/FrancoToile/oddbyexample/
- data folder containing all the data files to use in creating the odd file (I removed child values folder)
- oddbyexample.xsl
- getfiles.xsl
- saxon9he.jar (backup of jar file in System, not used otherwise)
- saxon9-unpack.jar (backup of jar file in System, not used otherwise)
- ftodd (file created by running the java command below)
- francotoile.rng (file created by running ftodd file through Roma as detailed below)
- this readme file.

command I issued:
java -jar /System/Library/Java/Extensions/saxon9he.jar -it:main -o:/Users/sarneil/Documents/Projects/french/FrancoToile/oddbyexample/ftodd /Users/sarneil/Documents/Projects/french/FrancoToile/oddbyexample/oddbyexample.xsl corpus=/Users/sarneil/Documents/Projects/french/FrancoToile/oddbyexample/data

Everything (i.e. paths) is spelled out explicitly as otherwise there's just too much voodoo magic for me.
The -jar switch tells java to run the jar file specified in the following argument (i.e. saxon9he.jar)
The -it switch tells Saxon which named template in the stylesheet to start with (here, main), so no source document is needed.
The -o switch provides the path and file name for the output file (e.g. /root/path/path/path/nameOfODDfile)
The next argument provides the path and file name of the oddbyexample.xsl file to run
The corpus= argument provides the path to the folder containing the tei data files to run the oddbyexample.xsl against to generate the ftodd file

Once you've got the odd file
Go to http://www.tei-c.org/Roma/
Click the Open existing customization button and browse to the odd file you've just created
Click the start button
In the Customize tab, change the filename to what you want your schema's filename to be (e.g. francotoile) without any extension
Click the save button
In the Schema tab, select RELAX NG schema (XML syntax) not compact syntax
Click the generate button
Roma will generate the file francotoile.rng (using the name you provided and the extension based on the schema format you selected)
Save that file and move it wherever you want it to go.

Where the data files are expecting that rng file to be for francotoile:

Will test shortly.

Permalink 10:46:52 am, by mholmes, 47 words, 96 views   English (CA)
Categories: Activity log; Mins. worked: 90

PCA's directed reading report

Reviewed the extensive (and excellent) work completed by PCA, who is now nearly at the end of the 1854 abstracts. Wrote a number of notes for tweaks and fixes, as well as a couple of requests for further research and the transcription of a mysteriously-untranscribed despatch (V547102A).

Permalink 08:55:38 am, by mholmes, 4 words, 82 views   English (CA)
Categories: Activity log; Mins. worked: 30

NLP lecture video

Did another lecture video.

08/03/12

Permalink 05:42:39 pm, by mholmes, 29 words, 67 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 174.5 + 2 = 176.5 hours G&T

Folks from three different projects needed attention, and with the morning taken up by the presentation it was difficult to get through emails before the end of the day...

Permalink 05:41:18 pm, by mholmes, 168 words, 90 views   English (CA)
Categories: Activity log; Mins. worked: 120

Mallarmé: encoding rhythm

Long discussion with EDR about possible approaches to encoding rhythm. I think we should use something like this:

<metDecl xml:id="fr_ip" type="met" pattern="AAT:AAT\|AAT:AATA">
<metSym value="T">syllabe tonique</metSym>
<metSym value="A">syllabe atone</metSym>
<metSym value="|">césure</metSym>
<metSym value=":">pause métrique</metSym>
</metDecl>

and then tie <l> tags to the specific <metDecl> using the @met they match. This would make for nice stand-off markup. We should probably actually replace the pipe with some other character that doesn't need escaping, for convenience. But looking at the Guidelines, you can't actually point at a metDecl; you have to reiterate the pattern in the @met attribute. I've already found and reported one bug in the source for the French example of <metDecl>, but I think this bit of the GL needs a more serious look.
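
Since @pattern is a regular expression, the escaping issue is easy to demonstrate outside TEI. The Python sketch below checks a scansion string against the pattern above, and shows what goes wrong if the pipe is left unescaped (it turns into alternation). The function name is invented, and Python's regex dialect is standing in for whatever a TEI processor would actually use.

```python
import re

# The pattern from the metDecl above, with the pipe escaped.
PATTERN = r"AAT:AAT\|AAT:AATA"

def matches_pattern(scansion, pattern=PATTERN):
    """True if the whole scansion string matches the metrical pattern."""
    return re.fullmatch(pattern, scansion) is not None
```

With the escaped pattern, only the full line "AAT:AAT|AAT:AATA" matches. With an unescaped pipe, the two hemistichs each match on their own and the full line does not, which is a good argument for choosing a different caesura character.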

Permalink 05:30:46 pm, by mholmes, 21 words, 51 views   English (CA)
Categories: Activity log; Mins. worked: 60

Started work on assignment

Started work on first NLP course assignment. It's helping me polish up my regexes and dip a toe into Java again.

Permalink 05:16:41 pm, by mholmes, 7 words, 42 views   English (CA)
Categories: Activity log; Mins. worked: 180

Prepping and delivering presentation

JJ + me gave keynote at CS IdeaFest.

Permalink 01:52:13 pm, by sarneil, 88 words, 593 views   English (CA)
Categories: Activity log; Mins. worked: 30

malahat : beanstream hashcode from cart purchase

Ok.

To get this bit to work your user has to have access to the administration / account settings / order settings area.

You must
1) provide the URLs for working pages in each of the Approval Redirect: and Decline Redirect: text boxes. I think those URLs can be to either pages hosted within the beanstream account or on your server - I've only tested the former so far.
2) uncheck the Require hash validation checkbox
3) check the Include hash validation checkbox
4) click the update button at the bottom of the page

Permalink 12:54:40 pm, by sarneil, 153 words, 57 views   English (CA)
Categories: Activity log; Mins. worked: 120

malahat : hashcode problem with shopping cart

I've created a simple shopping cart, but when I try to use it to buy, I get a "hashvalue missing" error. I had a similar problem with a form hosted on their service but created by me, and solved it by including a hashcode in the submission string as the documentation suggested.
I don't see how I'm going to be able to inject a hashcode into the submission string produced by one of the forms on their shopping cart, and there's no mention of handling hashcodes in the shopping cart documentation, so I've written to RE to find out what to do.
I also created a simple page on my account on the UVic server which invokes the shopping cart and passes in the item I want to buy. That works fine, but if I then go on to try and actually buy the item, I get the same error as detailed above.
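
For reference, the hashcode fix that worked for the self-built form looks roughly like the Python sketch below. Everything specific in it is an assumption to be checked against the gateway documentation and the account's order settings: that the hash is computed over the query string with the account's hash key appended, that the algorithm is SHA-1 (or MD5), and that the result goes in a parameter called `hashValue`. The function name is made up for illustration.

```python
import hashlib
from urllib.parse import urlencode


def sign_query(params: dict, hash_key: str, algorithm: str = "sha1") -> str:
    """Return the query string with a hash of (query + key) appended.

    Assumptions: the gateway hashes the exact query string followed by the
    secret key, and expects the hex digest in a `hashValue` parameter.
    """
    query = urlencode(params)
    digest = hashlib.new(algorithm, (query + hash_key).encode("utf-8")).hexdigest()
    return f"{query}&hashValue={digest}"
```

This only helps where I control the submission string; it does nothing for the forms the cart generates itself, which is the actual problem here.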

07/03/12

Permalink 03:12:59 pm, by sarneil, 245 words, 149 views   English (CA)
Categories: Activity log; Mins. worked: 150

Malahat : create beanstream cart on their server

The issue with the simple form is that you'll have to write a lot of code to deal with various kinds of situations (errors, user changing their mind about items or quantities, etc.). I'm hoping the cart takes care of some of those hassles. First test is a cart using pages hosted on their server, then I'll try a cart with as many of the pages as possible (likely the product pages) on the Malahat site.

First, you've got to get the finance people to create a test account for you with the beanstream service, if you haven't got that set up already as described in the post on how to get going with a simple form.
Robert Elves has been my contact.
That account's permissions have to be set so that it has full access to the configuration / shopping cart and configuration/inventory areas.
I suspect strongly that you'll also want access to the configuration / shipping area too, but I haven't got far enough to know that for sure yet.
The procedure for the Simple Shopping Cart is described here: https://beanstream-manuals.pbworks.com/f/BEAN_Starter_Cart.pdf

I found it pretty straightforward to implement, other than I'm not sure how to rearrange the order in which the categories appear, and I'm not sure how to handle the various shipping charges the Malahat charges (e.g. the first item in any category attracts a higher price than subsequent items in the same category).

Permalink 02:20:51 pm, by mholmes, 8 words, 58 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 176 - 1.5 = 174.5 hours G&T

Leaving early to drive CA to an appointment.

Permalink 01:59:11 pm, by jnazar, 684 words, 121 views   English (CA)
Categories: Activity log; Mins. worked: 150

"Interviews" - Instructions

- 4 recorders have already been set to: volume to 25; setting to "Meeting"
(don't reset to any other settings); recording room location: CLE B046

Procedure during Recording:
- 2 machines are recording at once
- 1 machine only is designated for student to control
(Students: only touch 1 button: Record - Pause - Record (all the same button))
- 2nd machine is the back-up machine and will run the entire time Non-Stop (students don't touch)

When ready to start interview:

Judy:
- turns on both recorders' power (power switch on side of recorder)
- places "backup recorder" on table - press record button - leaving machine in record
mode throughout the interview till end (students don't use this machine)
- places student recorder on table (screen facing student)

Student:
- press RED RECORD button to start recording
- press STOP button when interview is finished. Bring student recorder to Judy.

Judy:
- stops "backup recorder" when interview is finished.
Both recorders are kept in HCMC.

TO COPY INTERVIEW FILE FROM STUDENT RECORDER TO JUDY'S COMPUTER:

Connect Student Recorder to Judy's Computer:
- each recorder has a cable; plug cable into back of Judy's computer
- plug other end of cable into USB port located on side of student recorder
- recorder will automatically power up itself

On Judy's Computer Desktop:
1. See "VN8100PC" (name of digital recorder) icon on Judy's desktop.
2. - 2x click on VN8100PC icon on desktop to open
- will see VN8100PC screen with 3 folders list - see "Recorder" folder
3. Open "Recorder" folder and open "Folder A" (click once on arrow - opens - then click on
arrow to open Folder A) - see interview files listed e.g. VN810001.MP3, VN810002.MP3,
etc.
4. Select and drag interview file over from VN8100PC Screen to Judy's desktop
5. Rename that interview file now on Judy's desktop (click once, then click again in
field to rename file) with the naming convention of: student's surname_interviewee's surname_date_interview file#.
6. Open on Judy's desktop the "INTERVIEWS" folder.
7. Drag renamed interview file over to Judy's "INTERVIEWS" folder.
8. Click on "INTERVIEWS" folder to see the renamed interview file has been moved there.
9. To disconnect recorder from Judy's computer: Right click on VN8100PC icon on Judy's
desktop to disconnect.
10.Disconnect cables (from computer and recorder); put recorder and cable back in plastic
bag and lock up.
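
If the renaming in step 5 ever needs to be done for a batch of files, the naming convention could be produced by something like this Python sketch. The ISO date format and keeping the recorder's MP3 extension are assumptions; the convention above doesn't specify either, and the function name is just for illustration.

```python
from datetime import date


def interview_filename(student: str, interviewee: str, when: date,
                       file_number: int, ext: str = "MP3") -> str:
    """Build student_interviewee_date_file# with the original extension.

    Assumption: the date is written as YYYY-MM-DD.
    """
    parts = [student.strip(), interviewee.strip(), when.isoformat(), str(file_number)]
    return "_".join(parts) + "." + ext
```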

TO GIVE A COPY OF INTERVIEW FILE TO STUDENT ON THEIR USB MEMORY STICK:
1. Plug student's USB device into back of Judy's computer.
2. Student's USB icon then shows up on Judy's desktop.
3. Under "Device" (on left of screen) see the USB memory stick listed.
4. Open on Judy's desktop the "INTERVIEWS" folder.
5. Drag specified interview file from "INTERVIEWS" folder to their USB device icon.
6. To disconnect USB memory stick right click on USB icon to "Eject"
7. Unplug USB device from back of computer, return USB memory stick to student.

Student CD-R Disk copy:
1. Insert CD-R disk into Judy's computer (with disk printed-side facing me)
2. Wait
3. "Untitled CD" icon shows up on Judy's desktop
4. 2x click on "Untitled CD" icon (rename student's CD e.g. "Interview" by clicking 1x slowly in the field then click again in the same field of the icon and type "Interview")
5. Screen window "Untitled CD" shows up
6. 2x click on Judy's desktop the "Interviews" folder to open
7. Drag interview file from Judy's "Interviews" folder to Untitled CD (now renamed "Interview")
8. Hit "Burn" button on right of screen
9. Will ask "Are you sure....." Leave burn speed as is
10.Hit "Burn"; will go through two times on its own (once, then again verifying); takes a bit of time
11.When done, screen disappears
12.Click on student's CD-R icon to see file has been dragged over (this is a copy, original interview file stays in Judy's desktop folder)
13.Right click on CD-R to "Eject"

Students to provide USB memory stick however CD-R used instead if USB memory stick not provided. Students keep CD-R disk.

Paperwork to be turned in:
At end of each interview student will turn in paperwork (Signed Release form) to Judy to put in desk file folder - for PL to pick up.

Permalink 01:19:03 pm, by mholmes, 130 words, 61 views   English (CA)
Categories: Activity log; Mins. worked: 60

First build of the book for vol 20

I've created the first version of the book and sent it to JT.

This XQuery outputs the XInclude statements for reviews, ordered by the author of the book being reviewed:

xquery version "1.0";

declare namespace xi="http://www.w3.org/2001/XInclude";
declare option exist:serialize "expand-xincludes=no";

for $d in //TEI.2[descendant::sourceDesc/descendant::biblScope[@type='vol'][contains(., '20')]][descendant::classCode[contains(., 'review')]]
order by $d/teiHeader/fileDesc/titleStmt/title/name/@reg

return 
(
xs:string(concat('<!-- ', $d/@id, ': ', normalize-space($d/teiHeader/fileDesc/titleStmt/title), ' -->')),

  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="{$d/@id}.xml">
   <xi:fallback>{concat('MISSING XINCLUDE CONTENT: ', $d/@id, '.xml')}</xi:fallback>
  </xi:include>
)
Permalink 12:51:29 pm, by mholmes, 60 words, 74 views   English (CA)
Categories: Activity log; Mins. worked: 120

Finished Iglesias

In the process, I've modified the XSLT so that it uses the &#x202f; character when inserting guillemets, as well as for "double" punctuation marks; that results in better-looking output. I haven't done that for the XHTML output though; I'm still using &#160; there, because various reports cast doubt on the reliability of &#x202f; on various browsers.

06/03/12

Permalink 05:47:32 pm, by mholmes, 15 words, 51 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 174 + 2 = 176 hours G&T

Presentation prep and keeping everything else ticking over, along with ScanCan work that's rather urgent...

Permalink 05:40:53 pm, by mholmes, 51 words, 69 views   English (CA)
Categories: Activity log; Mins. worked: 45

Error with vessel info

PCA reported that mentions of the Brig William, wrecked in 1854, are linked to the vessel info for the William Allen, which is not the same ship at all. We dug around to find some references from which to construct a new vessel entry, and she's now going ahead with writing it.

Permalink 05:31:34 pm, by mholmes, 71 words, 117 views   English (CA)
Categories: Activity log; Mins. worked: 30

French punctuation and spacing

With regard to spaces, French punctuation behaves like English, except in the case of the so-called "double" punctuation marks (;:!?). These should be preceded by U+202F, the "narrow no-break space". In the case of the Iglesias text, regular spaces were used, which meant that punctuation marks sometimes wrapped to the next line. I've now fixed that, and confirmed that XEP handles it OK.
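
The actual fix was made in the XSLT, but the rule itself is simple enough to sketch in a few lines of Python. This version assumes plain text (not markup, where colons and semicolons also turn up in attribute values and entities) and only replaces spaces that are already there; it doesn't insert one where the source has none. The function name is hypothetical.

```python
import re

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE


def fix_double_punctuation(text: str) -> str:
    """Replace ordinary or no-break spaces before ; : ! ? with U+202F."""
    return re.sub(r"[ \u00a0]+(?=[;:!?])", NNBSP, text)
```

The lookahead leaves the punctuation mark untouched, so only the preceding space is swapped.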

Lots more to do on the Iglesias, though...

Permalink 05:28:50 pm, by mholmes, 17 words, 69 views   English (CA)
Categories: Activity log; Mins. worked: 60

Entered proofing corrections for 5 more documents

Mussari, Urberg, Rudling, Gudmundsson and Blackwell done (with a few red-circled questions to talk to JT about).

Permalink 03:52:25 pm, by sarneil, 417 words, 131 views   English (CA)
Categories: Activity log; Mins. worked: 300

missing maps resist debugging

I've been working for the better part of the last couple of days trying to figure out why the imap maps in ViHistory are not appearing.
Problem appears with the production front end and a test front end connected to either the old db server or the new db server.
Problem appears with a test front end which is the production front end minus the captcha code (which is the only code that has changed since Jamie left us a working site).
We weren't getting errors when we trolled the server logs on lettuce.
We did get the following errors from the sysadmin:
Apache error log:

[Tue Mar 06 09:16:21 2012] [error] [client 96.54.151.99] [Tue Mar 6
09:16:21 2012].616424 loadSymbolSet(): Unable to access file.
(/home1t/taprhist/www/content/maps/user/symbol/generic.sym), referer:
http://vihistory.ca/content/maps/htdocs/index.php?map=vicbird1889

AND syslog:

2012-03-06T09:16:21-08:00 local@mustard.hcmc.uvic.ca user.notice
php-cgi: PHP Warning: [MapServer Error]: loadSymbolSet(): (/home1t/taprhist/www/content/maps/user/symbol/generic.sym)
2012-03-06T09:16:21-08:00 local@mustard.hcmc.uvic.ca user.notice in: /home1t/taprhist/vihdev/www/content/maps/htdocs/init.php on line 125
2012-03-06T09:16:21-08:00 local@mustard.hcmc.uvic.ca user.notice
php-cgi: PHP Warning: Failed to open map file /home1t/taprhist/vihdev/www/content/maps/user/map/vi1798.map in /home1t/taprhist/vihdev/www/content/maps/htdocs/init.php on line 125
2012-03-06T09:16:21-08:00 local@mustard.hcmc.uvic.ca user.notice
php-cgi: PHP Fatal error: Call to a member function getMetaData() on a non-object in /home1t/taprhist/vihdev/www/content/maps/htdocs/init.php on line 131
2012-03-06T09:16:21-08:00 local@mustard.hcmc.uvic.ca local0.debug
suphp_wrapper: 0 PHP5
/home1t/taprhist/vihdev/www/content/maps/htdocs/init.php

That init file is as provided by the imap people, so I really, really doubt it is causing the problem, though it is where the problem occurs. It looks like that file is trying to create objects based on what it reads from some kind of config file, and somehow that process is breaking down, so the object doesn't get created, so invoking a method on the (non-existent or empty) object throws the error. More precisely, it looks like the config.php file is supposed to create an array in the variable aszMapFiles and then those values are used in the init file, but for some reason something is failing with the way that array and associated variables are being populated.
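
Since both warnings in the log are about files that couldn't be opened, one quick check is whether those paths exist and are readable at all. A minimal Python sketch follows; it is only meaningful if run as the same user that php-cgi runs under, which is an assumption about how we'd get to run it. The two paths are the ones from the log above.

```python
import os


def unreadable_files(paths):
    """Return the paths that are missing or not readable by the current user."""
    return [p for p in paths if not (os.path.isfile(p) and os.access(p, os.R_OK))]


if __name__ == "__main__":
    # Paths copied from the Apache error log and syslog entries.
    for path in unreadable_files([
        "/home1t/taprhist/www/content/maps/user/symbol/generic.sym",
        "/home1t/taprhist/vihdev/www/content/maps/user/map/vi1798.map",
    ]):
        print("cannot read:", path)
```

If both files turn out to be readable, that would point back at how config.php populates the array rather than at the filesystem.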

Permalink 10:13:02 am, by mholmes, 15 words, 128 views   English (CA)
Categories: Activity log; Mins. worked: 30

Stanford NLP course: first two lectures

Worked through the first two lectures in the Stanford NLP course I'm taking this semester.

Permalink 09:23:07 am, by mholmes, 44 words, 75 views   English (CA)
Categories: Activity log; Mins. worked: 60

Entered proofing corrections for 4 reviews

Entered JT's corrections for Stenberg, Higgins, Norrman and Sheffield. One outstanding issue on Stenberg and one on Norrman, waiting for JT to come by. Also looked up Chicago on ellipses, and suggested policy doesn't align with it, so referred back to JT for clarification.

05/03/12

Permalink 05:51:05 pm, by mholmes, 7 words, 102 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 172 + 2 = 174 hours G&T

Late duty and a presentation to create...

Permalink 05:39:18 pm, by mholmes, 37 words, 83 views   English (CA)
Categories: Activity log; Mins. worked: 240

Working on the presentation

Collected images, made some details and diagrams, and wrote a first draft of my bit of the presentation in the form of a WP document with text and images. Tomorrow I'll turn it into an actual presentation.

Permalink 11:34:49 am, by Erin, 15 words, 474 views   English (CA)
Categories: Activity log; Mins. worked: 170

mar5

Added in new themes to documents, updated citations chart, searched for a couple more sources.
Permalink 10:37:33 am, by mholmes, 22 words, 45 views   English (CA)
Categories: Activity log; Mins. worked: 120

PAS nav plan completed

Spent the morning turning the navigation plan for PAS into a spreadsheet; sent it to the team for comments before submitting it.

Permalink 10:17:10 am, by Greg, 23 words, 509 views   English (CA)
Categories: Documentation, Announcements; Mins. worked: 0

CLI tricks - recently changed files

Produce a sorted list of recently changed files by running this:
find . -type f -printf '%TY-%Tm-%Td %TT %p\n' | sort
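
Note that -printf is a GNU find extension, so on systems without it a rough equivalent in Python might look like the sketch below. The function names are made up for the example; like the command above, it lists regular files only and sorts oldest first, so the most recent changes end up at the bottom.

```python
import os
import stat
from datetime import datetime


def files_by_mtime(root="."):
    """Return (mtime, path) pairs for regular files under root, oldest first."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat so symlinks are not followed
            except OSError:
                continue  # file vanished or is inaccessible
            if stat.S_ISREG(info.st_mode):
                entries.append((info.st_mtime, path))
    return sorted(entries)


def format_entry(mtime, path):
    """Format one entry as 'YYYY-MM-DD HH:MM:SS path' in local time."""
    return f"{datetime.fromtimestamp(mtime):%Y-%m-%d %H:%M:%S} {path}"


if __name__ == "__main__":
    for mtime, path in files_by_mtime("."):
        print(format_entry(mtime, path))
```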

02/03/12

Permalink 05:06:24 pm, by sarneil, 39 words, 39 views   English (CA)
Categories: Activity log; Mins. worked: 60

paas : meet to talk about new dept website

Met with Martin and 3 people from PAAS to review site plan. Couple of issues sorted out at meeting, leaving at most a couple of open issues. Martin writing up spreadsheet and will distribute for review before submitting to communications.
Permalink 05:05:05 pm, by sarneil, 37 words, 30 views   English (CA)
Categories: Activity log; Mins. worked: 90

grs : review navigation plan

Met with Judy and BB to review spreadsheet of nav plan for new site. A few modifications made and a few open questions to resolve, but we're very close to submitting to communications. Judy to do follow-up.
Permalink 05:03:12 pm, by sarneil, 75 words, 44 views   English (CA)
Categories: Activity log; Mins. worked: 30

sent out two invoices

Sent out "invoices" to two French researchers (HC and EdR) for HCMC resources. Following the system I worked out with AS in dean's office:
I get research account number from researcher
I write up invoice with reference to dh cttee approval of terms and specified costs and research and hcmc account number
I send it to them and ask them to forward it to AS with authorization for Journal Entry.
AS does the journal entry.

Permalink 05:00:04 pm, by sarneil, 33 words, 36 views   English (CA)
Categories: Activity log; Mins. worked: 30

admin meeting with dean

Half hour with dean on:
allocation of space or other resources dedicated to one project for more than a year,
flexibility of terms for use of resources
participation in etcl workshop coming up

Permalink 04:57:58 pm, by sarneil, 51 words, 94 views   English (CA)
Categories: Activity log; Mins. worked: 180

post-gres based maps not showing up

User reported that none of the post-gres based maps (insurance etc) for ViHistory are appearing. The imap infrastructure appears in the browser window, but no image tiles. PD forwarded this and I confirmed it. Spent a number of hours with Greg and Drew trying to solve the problem. Still not resolved.
Permalink 04:54:48 pm, by sarneil, 55 words, 46 views   English (CA)
Categories: Activity log; Mins. worked: 90

dh ctte chair : various admin conversations

Had email and in-person conversations with MC and SB on various issues: OK to put in for an IRG for a proposal that's still before the committee, pros and cons of HCMC charging researchers for work done for IRG-funded projects, etc. Most will find their way to the agenda of the next DH ctte meeting.
Permalink 04:49:08 pm, by sarneil, 18 words, 27 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

SA G&T 12.0 + 1.0 = 13.0 hours

week of Feb 27 - Mar 2 M -1.0 CSG, T +0.5 beanstream, W +0.5 admin, R +0.5 hold fort, F +0.5 phil php
Permalink 04:37:25 pm, by sarneil, 153 words, 33 views   English (CA)
Categories: Activity log; Mins. worked: 240

phil : file to write courseArray to page rather than file

I've created a file called coursesWriteToPage.php which calls the code that scrapes the UVic calendar and then, rather than trying to write the result out to a file on the server, embeds it in an html comment. You then have to copy the php array from the source of that page and paste it into the PHILcourses.php file. I updated the courses offered page to reflect summer 2012 and winter 2012-2013 offerings. The department has created a number of new courses and renamed a number of existing ones for 2012-2013. As the calendar site offers only the 2011-2012 information, those changes are not yet apparent in the calendar or on the dept site that scrapes data from the calendar. When the UVic calendar presents the 2012-2013 info, I'll need to go to the coursesWriteToPage.php file, get the revised php array and copy and paste it into the PHILcourses.php file.
Permalink 02:28:51 pm, by mholmes, 2 words, 42 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 173 - 1 = 172 hours G&T

Leaving early.

Permalink 01:53:36 pm, by mholmes, 14 words, 47 views   English (CA)
Categories: Activity log; Mins. worked: 30

More minor updates and fixes...

Various edits. New decision to remove punctuation around ellipses still needs to be implemented.

Permalink 01:52:10 pm, by mholmes, 24 words, 84 views   English (CA)
Categories: Activity log; Mins. worked: 120

Working on presentation

Met with JJ and outlined the presentation for next week. I've started collecting materials, and I'll write a draft of my bit on Monday.

Permalink 01:51:20 pm, by mholmes, 11 words, 80 views   English (CA)
Categories: Activity log; Mins. worked: 20

Retrieved stats

Note: Francotoile and the Mysteries projects are not showing any stats.

Permalink 09:58:31 am, by mholmes, 10 words, 71 views   English (CA)
Categories: Activity log; Mins. worked: 60

Marked up Jochens review

Marked up the review, and sent some queries to JT.

01/03/12

Permalink 04:41:31 pm, by mholmes, 13 words, 35 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 172 + 1 = 173 hours G&T

Meeting went on to the end of the day and then backups etc....

Permalink 03:57:28 pm, by mholmes, 26 words, 30 views   English (CA)
Categories: Activity log; Mins. worked: 75

PAAS site project: meeting and site plan

Met with folks from the dept, and worked through their spreadsheet. I'll turn the results into another spreadsheet on Monday, and we'll get it submitted ASAP.

Permalink 03:56:25 pm, by mholmes, 68 words, 202 views   English (CA)
Categories: Activity log, Academic; Mins. worked: 120

Research on handling historical spelling variants

Started some detailed reading on this topic, with some pointers from friends and people on TEI-L. It looks like a flurry of activity happened around 2005-2007, and there are some working examples such as EEBO with fully implemented systems, as well as lots of surveys of approaches, and some tools. It looks useful and interesting. Haven't found anything resembling a dictionary of variants for Early Modern French, though.

Permalink 02:28:39 pm, by jnazar, 29 words, 36 views   English (CA)
Categories: Activity log; Mins. worked: 60

HCMC Admin.

End of Year (eoy):
Met with SA today to review eoy issues. Forwarded to SA last year's
eoy information for perusal.

Subscription:
Renewed/mailed MAC subscription (2 yr./24 issues).

Permalink 02:25:14 pm, by jnazar, 48 words, 68 views   English (CA)
Categories: Activity log; Mins. worked: 90

Cascade (HCMC site)

Attended Cascade drop-in session:

Issues:
- banner color (color will be implemented later today by communications staff)

- home-home: issue has already been reported by others; will be addressed by communications later; (HCMC -correct as is for now).

Reviewed:
- "page mirroring"
- content details

Permalink 02:16:26 pm, by mholmes, 34 words, 37 views   English (CA)
Categories: Activity log; Mins. worked: 60

Various minor updates and fixes

Responses to questions from yesterday prompted another round of edits to existing documents, including the normalization of the use of ellipsis (in vol 20 documents only) to have a space before and a space after.

Permalink 02:15:25 pm, by mholmes, 35 words, 34 views   English (CA)
Categories: Activity log; Mins. worked: 120

Mallarmé: processed a stack more poems

Hérodiade Scène took a long time because of its speaker headings and line indents, but most of the rest were quite rapid. 47 out of 77 poems have now been processed into XML and HTML.

Permalink 10:20:59 am, by mholmes, 56 words, 127 views   English (CA)
Categories: Activity log; Mins. worked: 15

Changed connection string in imap experimental map

With dbs moving to Mango, I've now changed the connection string in the imap code (on home1t/london) to point to pgsql.hcmc.uvic.ca. This is working through Lettuce, but not yet on hcmc.uvic.ca/~london, presumably because the machines behind the load balancer don't yet have permission to talk to Mango's pgsql.
