Archives for: March 2012, 09


Permalink 05:02:30 pm, by sarneil, 7 words, 134 views   English (CA)
Categories: Vacation; Mins. worked: 0

SA: Vac 65 - 10 = 55 days + 10 days long service

Two weeks vacation for CSG spring break.

Permalink 05:01:09 pm, by sarneil, 37 words, 147 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

SA G&T 13.0 - 1.0 = 12.0 hours

week of Mar 5 - Mar 9 M -3.0 CSG, T +1.0 beanstream, W -3.0 CSG, R +1.0 admin before vac, F +1.0 francotoile update
next week I'm coming in Tuesday for some kind of focus group which will take about 2 hours

Permalink 04:30:24 pm, by sarneil, 342 words, 120 views   English (CA)
Categories: Activity log; Mins. worked: 60

xslt bug processing ref type="info" whose note contains a 'mentioned' element

This structure in the xml data file:

<ref type="info">pépés<note> : <mentioned>Pépé</mentioned> est généralement utilisé par les enfants.</note></ref>

Was originally processed by this xsl:

<xsl:template match="tei:ref[@type='info']">
<xhtml:a href="#" class="tooltip">
<xsl:value-of select="./child::text()"/>
<xhtml:span class="hover_off">
<xsl:value-of select="tei:note"/>

Generating this output (note that "Pépé" is passed through as plain text, whereas the user wants it italicized):

<a class="tooltip" href="#">pépés<span class="hover_off">Pépé est généralement utilisé par les enfants.</span></a>

I modified the xsl to this:

<xsl:template match="tei:ref[@type='info']">
<xhtml:a href="#" class="tooltip">
<xsl:value-of select="./child::text()"/>
<xhtml:span class="hover_off">
<!--<xsl:value-of select="tei:note"/>-->

Which generates this output (note the "pépés" appears in the span as well as outside it):

<a class="tooltip" href="#">pépés<span class="hover_off">pépés : <em>Pépé</em> est généralement utilisé par les enfants.</span></a>

I've got to come up with some xsl that gives me this output from the given input, but ran out of time today:

<a class="tooltip" href="#">pépés<span class="hover_off"> : <em>Pépé</em> est généralement utilisé par les enfants.</span></a>

When I do, I can delete the leading " : " which is only there as a kludge around this problem.
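
To pin down the target behaviour, here is a minimal Python sketch (not the project's XSLT) that emulates the wanted transformation: the text directly inside ref goes outside the span, and only the contents of note go inside it, with mentioned turned into em. The function names render_children and render_ref are made up for illustration, and the TEI namespace is dropped for brevity. In XSLT terms, the likely equivalent would be to apply templates only to the note's child nodes (something like select="tei:note/node()") rather than to all children of ref, but that is an untested assumption.

```python
import xml.etree.ElementTree as ET
from html import escape


def render_children(elem):
    """Serialize the content of elem, turning <mentioned> into <em>."""
    parts = [escape(elem.text or "")]
    for child in elem:
        inner = render_children(child)
        if child.tag == "mentioned":
            parts.append("<em>" + inner + "</em>")
        else:
            parts.append(inner)
        # Text that follows the child element inside its parent.
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def render_ref(ref):
    """Build the tooltip link: label outside the span, note content inside."""
    label = ref.text or ""  # only the text before the first child element
    note = ref.find("note")
    tooltip = render_children(note) if note is not None else ""
    return ('<a class="tooltip" href="#">' + escape(label)
            + '<span class="hover_off">' + tooltip + "</span></a>")


# Sample input from the post (namespace omitted, closing tag corrected).
source = ('<ref type="info">pépés<note> : <mentioned>Pépé</mentioned>'
          ' est généralement utilisé par les enfants.</note></ref>')
print(render_ref(ET.fromstring(source)))
```

Run on the sample input, this prints the target markup shown above, with "pépés" appearing once, outside the span.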

Permalink 02:15:39 pm, by sarneil, 148 words, 112 views   English (CA)
Categories: Activity log; Mins. worked: 120

add mentioned element

Needed a way to identify words used as examples in the annotations, so added the "mentioned" element. This entailed:
- adding the element to a data file
- invoking oddbyexample to create an odd file incorporating the new element (see previous post on that)
- using Roma to generate a new rng file from the odd file
- putting that rng file back into my tree
- updating the xslt files to find the new element correctly (special case was mentioned embedded in note) and then output appropriate html
- committing the changes to the repository and uploading the modified files into the exist db.

For the time being, we're using it for any string that we want italicized as the vast majority (if not all) are in fact mentions. If we find a significant number of other cases of strings that need italics, we'll make the necessary modifications.

Permalink 01:58:57 pm, by mholmes, 20 words, 80 views   English (CA)
Categories: G&T Hours; Mins. worked: 0

MDH: 176.5 - 1.5 = 175 hours G&T

Leaving early -- need to keep these hours under control. Going home to read about NLP and historical spelling variance.

Permalink 01:55:15 pm, by mholmes, 180 words, 189 views   English (CA)
Categories: Activity log, Academic; Mins. worked: 60

More research on historical spelling variance

I now have a collection of a dozen or so papers I'm reading and annotating, and some ideas are getting clearer. At the moment (although I still have a lot of reading and consulting to do), this kind of approach looks promising:

  • Run XSLT on collection to create parallel collection in which each significant block (not clear what a block is yet) is converted to a modernized textual representation with an XPath pointer that points back to the original block in the original doc. In this process, linebreaks would be dealt with.
  • Each modernized block includes the original variants as attributes or elements (if the latter, the modern indexer can be instructed to ignore them).
  • Modern blocks may also be stemmed.
  • Search is done on modern blocks.
  • KWIC hits from search can be shown EITHER as modern OR as original sequence (reconstructed from original variants stored in modern block).
  • Clicking on the hit takes you to the original text, with hits highlighted based on a new search done using the original tokens stored in the modern block as search terms.
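
The approach above could be prototyped roughly as follows. This is only a sketch under several assumptions: the MODERN_FORMS table, the ModernBlock class, and the sample sentences and pointers are all invented for illustration, a plain lookup table stands in for real modernization, and stemming and linebreak handling are left out. It is written in Python rather than XSLT just to show the data flow.

```python
from dataclasses import dataclass

# Hypothetical variant table; a real one would come from the research.
MODERN_FORMS = {"vertue": "virtue", "loue": "love", "haue": "have"}


@dataclass
class ModernBlock:
    pointer: str    # XPath-style pointer back to the original block
    modern: list    # modernized tokens, the ones that get searched
    original: list  # original tokens, stored at the same positions


def modernize(pointer, text):
    """Create the modern parallel of one original block."""
    original = text.split()
    modern = [MODERN_FORMS.get(tok.lower(), tok.lower()) for tok in original]
    return ModernBlock(pointer, modern, original)


def search(blocks, term, show_original=False, width=2):
    """Search modern tokens; return (pointer, KWIC line, original token)."""
    hits = []
    for block in blocks:
        for i, tok in enumerate(block.modern):
            if tok == term:
                # KWIC can be shown as modern or as the original sequence.
                tokens = block.original if show_original else block.modern
                lo = max(0, i - width)
                kwic = " ".join(tokens[lo:i + width + 1])
                hits.append((block.pointer, kwic, block.original[i]))
    return hits


blocks = [
    modernize("/TEI/text/body/p[1]", "I haue great loue of vertue"),
    modernize("/TEI/text/body/p[2]", "Virtue is its own reward"),
]
for hit in search(blocks, "virtue", show_original=True):
    print(hit)
```

Each hit carries the pointer to the original block plus the original token, which is what a follow-up search on the original text would use as its search term for highlighting.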
Permalink 01:27:23 pm, by mholmes, 11 words, 63 views   English (CA)
Categories: Activity log; Mins. worked: 45

Pacific and Asian Studies site plan and proposal submitted

Final tweaks received from department, and nav plan submitted to JS.

Permalink 11:02:14 am, by sarneil, 472 words, 229 views   English (CA)
Categories: Activity log; Mins. worked: 120

create new rng using oddbyexample

With critical input from Martin on the syntax of the java command, I managed to create a new rng file derived from the existing data files using the oddbyexample utility from TEI.

Here are my notes.

minimal instructions here:

download for saxon jar files :

download for oddbyexample.xsl and getfiles.xsl :

my setup:
in folder: /System/Library/Java/Extensions (which is in the java classpath)
- saxon9he.jar (working jar file in System)
- saxon9-unpack.jar (working jar file in System)

all other files in folder: /Users/sarneil/Documents/Projects/french/FrancoToile/oddbyexample/
- data folder containing all the data files to use in creating the odd file (I removed child values folder)
- oddbyexample.xsl
- getfiles.xsl
- saxon9he.jar (backup of jar file in System, not used otherwise)
- saxon9-unpack.jar (backup of jar file in System, not used otherwise)
- ftodd (file created by running the java command below)
- francotoile.rng (file created by running ftodd file through Roma as detailed below)
- this readme file.

command I issued:
java -jar /System/Library/Java/Extensions/saxon9he.jar -it:main -o:/Users/sarneil/Documents/Projects/french/FrancoToile/oddbyexample/ftodd /Users/sarneil/Documents/Projects/french/FrancoToile/oddbyexample/oddbyexample.xsl corpus=/Users/sarneil/Documents/Projects/french/FrancoToile/oddbyexample/data

Everything (i.e. paths) is spelled out explicitly as otherwise there's just too much voodoo magic for me.
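The -jar switch tells java to run the jar file specified in the following argument (i.e. saxon9he.jar)
The -it:main switch is a Saxon option, not a java one: it names the initial template ("main") at which the transformation starts, since no source document is passed on the command line.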
The -o switch provides the path and file name for the output file (e.g. /root/path/path/path/nameOfODDfile)
The next argument provides the path and file name of the oddbyexample.xsl file to run
The corpus= argument provides the path to the folder containing the tei data files to run the oddbyexample.xsl against to generate the ftodd file
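
Since every path is spelled out by hand, the command can also be assembled from two values. A minimal Python sketch, assuming the folder layout described above (oddbyexample.xsl and the data folder both sit in the working folder); the function name oddbyexample_command is made up for illustration, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath


def oddbyexample_command(saxon_jar, work_dir, odd_name="ftodd"):
    """Return the argument list for the Saxon run described above."""
    work = PurePosixPath(work_dir)
    return [
        "java", "-jar", str(saxon_jar),   # run the Saxon jar
        "-it:main",                       # start at the template named "main"
        f"-o:{work / odd_name}",          # output odd file
        str(work / "oddbyexample.xsl"),   # stylesheet to run
        f"corpus={work / 'data'}",        # folder of tei data files
    ]


cmd = oddbyexample_command(
    "/System/Library/Java/Extensions/saxon9he.jar",
    "/Users/sarneil/Documents/Projects/french/FrancoToile/oddbyexample",
)
print(" ".join(cmd))
```

The printed line matches the command issued above. On a machine with java and the Saxon jar in place, the list could be handed to subprocess.run(cmd, check=True) to actually run it.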

Once you have the odd file:
Go to
Click the Open existing customization button and browse to the odd file you've just created
Click the start button
In the Customize tab, change the filename to what you want your schema's filename to be (e.g. francotoile) without any extension
Click the save button
In the Schema tab, select RELAX NG schema (XML syntax) not compact syntax
Click the generate button
Roma will generate the file francotoile.rng (using the name you provided and the extension based on the schema format you selected)
Save that file and move it wherever you want it to go.

Where the data files are expecting that rng file to be for francotoile:

Will test shortly.

Permalink 10:46:52 am, by mholmes, 47 words, 96 views   English (CA)
Categories: Activity log; Mins. worked: 90

PCA's directed reading report

Reviewed the extensive (and excellent) work completed by PCA, who is now nearly at the end of the 1854 abstracts. Wrote a number of notes for tweaks and fixes, as well as a couple of requests for further research and the transcription of a mysteriously-untranscribed despatch (V547102A).

Permalink 08:55:38 am, by mholmes, 4 words, 82 views   English (CA)
Categories: Activity log; Mins. worked: 30

NLP lecture video

Did another lecture video.
