Hello everyone,
I had a good day in UVic Special Collections, working on a list of deserters for the justice theme. When I get this in order, I will go back to Victoria City Archives and look through their police documents for these names. Before and after Special Collections closed, I was going through the early magazines of University School. After I finish writing all the captions (pointing people to noteworthy pages in each one, since each magazine is 32 pages), I'll upload them all at once. I haven't uploaded anything yet because I'm waiting for legal clearance to do so. I am also finalizing the list of education-related documents we want from Victoria City Archives, and I will upload that when it is done. I'm still waiting to hear back from the photographer about the pictures of the Vic High memorials, but hopefully he will respond soon.
Monday morning, I have an appointment with the archivist from Glenlyon Norfolk about the records for Norfolk House School. Much of next week I will spend with the St. Michaels archivist, getting documents from them.
See you on Monday!
Leaving early.
It's now running OK. Had to rewrite the XQuery for version 1.0, and tweak various parameters and paths.
Hi,
I've spent the day continuing to build my church database and contact them all (or discern which ones should be contacted). I'm now up to 100 contacts for churches and individuals! I'll be continuing to arrange meetings with them and find relevant information through the coming weeks. I've worked most of today at the HCMC and talked to Greg about our dropbox issues. He says he may look into downloading the dropbox app for these computers here so we can use them the same way as our home computer (see below if you missed that). If he does, it will make our work in the lab a bit easier and less likely to mess up our programs. Now if only he could do something about these darn ergonomic keyboards...!
Kirsten and Ashley, in case you for some reason need to cross reference soldiers with church members, I've started a second page in my archive file for honour rolls. As I find more, I will continue to update it with names by church, and include indication of death if it is shown on the original roll. When I meet with the larger archives next week and the week after, I'll look for membership registers that can also be used for cross referencing.
I am cutting out early today. I just found out my house is being sold and we are having an appraiser in on Monday morning, so I've got to (literally) clean house and get everything in order before then. I'll make up the hours in the coming weeks.
So I've put a PDF into my Archives folder called Unit List and CEF Guide. The CEF Guide is just a visual I cobbled together from a couple of books and sites: a diagram of how the Canadian Expeditionary Force was organized (i.e., what's bigger, a battalion or a brigade? What Division was the 62nd Battery part of?) and lists of all the units active overseas as of November, 1918. This is mostly for my own reference, but I thought I'd share it in case anybody else was interested! The Unit List names and records the activities of militia and CEF units that were mobilized in Victoria, or drew recruits from Victoria, or both, according to Library and Archives Canada. Unfortunately, their records are missing a few details - they do not have places of mobilization or places of recruitment for every unit they describe. The CEF units listed are just those that are recorded in their CEF unit guides as having been mobilized in Victoria and/or formed, at least in part, of Victorian recruits. It goes without saying that not all of the men who joined these units were from Victoria, but there's also a good chance that there were Victorians who signed up with units not listed here. There's got to be a way to find them too, though...
I have to apologize for the sparseness of the information related to the militia units I list - I am honestly not sure how many militia units were in the city during the war years. I couldn't find any lists of wartime militia units from reputable sources, never mind a list that determined each unit's city of origin. I'm really sorry about this, but I will try to find out more about Victoria's militia once I get back to Victoria and into the collections of the Princess Mary's and the 5th Regiment.
When it comes to tagging, at this stage we could enter every unit as a tag and see how much we find - as I said, some of these units recruited in Victoria but mobilized elsewhere, so it's quite possible that the soldiers left nothing behind but an attestation paper or two before they shipped out. From what I've seen so far, there's a fair bit to be found on the 88th Regiment Victoria Fusiliers, the 30th Battalion, the 2nd Canadian Mounted Rifles Regiment, the Gordon Highlanders/50th Regiment of Foot, and the 5th BC Regiment Canadian Artillery. Tags will have to take into account the fact that some units changed names and designations during the war; we could use slashed tags to indicate this (as in, 48th Battalion/3rd Pioneer Battalion) or just make two separate tags and apply them to every item we have pertaining to both the original unit and the renamed unit. I'll leave the decision up to Ashley and Jim!
If anybody finds any problems in the unit list or CEF guide, please tell me! I'll be tweaking them as I learn more, and would be grateful to know of any mistakes I've made!
I had a great day! The Victoria City Archives doesn't have detailed finding aids, so you have to ask the archivists to retrieve a lot of the collections and go through them yourself to figure out what is in there, which is extremely enjoyable. I was there today going through personal records which contained photos of schoolchildren, and after they closed I finished searching the online catalogue for their city records relating to education. I believe I have a complete list of education documents/photos from the city archives, so I will upload that list to Dropbox tomorrow. First, though, I am going to UVic Special Collections to get a list of deserters from the 88th, so I can look them up in the police records for the 'justice' theme. I have also been going through the early yearbooks for University School, and writing captions to point future archive visitors to the most interesting parts. I will try to finish those tomorrow and put them on Dropbox as well. I also contacted the photographer who took the really outstanding shots of the war memorials at Vic High and put them on Flickr, to ask for his permission to put them on the archive, so hopefully he gets back to me soon. Ben, I will email you his contact info.
Trying to figure out why MoEML is running some operations at glacial speed...
All ready for next week now.
Spent most of the day cleaning up the database to remove duplicate copies of files, fix duplicate ids, and similar infelicities which might lead to the slowdown we're seeing in performance. Problems have gone away with regular files and searches, but the Stow 1598 is still proving to be a killer, and we're working with RE to figure out why it's taking so long.
Is anyone interested in following up with the President-Elect of Rotary? This doesn't fall under my two themes, but I will if Ben and Ashley are too busy.
Hannah: I'll do Rotary, but probably after I finish with the churches. The contact info would be greatly appreciated! -Ben
Hi, I'm adding photos to my archive folder from the Saanich Archives online site. Do we have permission to use them yet?
Ashley and I met yesterday to figure out the proper workings of the Dropbox and Excel spreadsheets and as she mentioned it looks like it is working well now. I did some final troubleshooting, reading and experiments with it today and came up with this suggestion in case people haven't done so yet:
Go to the Dropbox website and find the download button in the top right. If you download a Dropbox folder, you have access to all your files right on your computer. This is handy not only for easier navigation, but it also works for saving items as if they were on your own hard drive (open a file from the Dropbox folder on your computer, simply click save when you're finished, and it returns automatically to the Dropbox folder). This is also handy for updating files that only you are working on because (contrary to what I thought yesterday, Ashley) it works even when not connected to the internet. How? Well, it saves the files on your computer and then automatically updates when next connected. This avoids a lot of the issues we were having with multiple copies of one document or having things both on computers and online. Hope this makes sense!
As for me, I was with Hannah at St. Ann's in the morning, met with Ashley about the Excel sheet and to discuss archive requirements in the afternoon, and dropped by the Anglican Diocese archives in preparation for my visit on Monday. I was surprised to find out that Christ Church Cathedral is actually a newer building (1929) and that most of their artifacts are from the Second World War. So the archives will be my primary source for that congregation and the others from the 1913-1919 period. When I was there I ran into a city councillor who was very interested in our project and wants to be kept up to date. Networking! Today as I wait to hear back from churches again, I'm updating my archive folder and continuing background research on the churches so I have a better picture of what to look for in the archives and what Victoria's faith community looked like in 1913. My minutes are for both days because I have to run to my other job right at five today.
Ashley: Was the person you networked with Julie Cormier? I don't know if you were there early enough to meet her. I met with her in the morning and she is very interested in sharing some of her information (her society is in the midst of creating an early-Victoria church walking tour and we will be meeting in a few weeks once that has calmed down for her).
Kirsten: Your Excel sheet is going to be a bit different from everyone else's because you are using a Mac. Let me know if you're having any problems with it; I've left it as is and will take a look at it on your computer. The best thing for us would be getting either Excel for Mac or a program that can save files in Excel for PC format; otherwise Ashley may have problems keeping yours up to date or transferring information.
I've made each of us a separate spreadsheet in our own Dropbox folders. We had multiple versions of the combined spreadsheet in Dropbox yesterday, and I was worried about losing data because more than one of us is making changes at the same time.
I've also updated the tags in each spreadsheet. I think that the tags should be thematic rather than descriptive of the document. For example, I've removed photograph and document. That information should be in the description section. I'm envisioning that the tags will be used to organize the archive. So students will follow from one document to the next via a tag. A student may want to follow the Oak Bay connection from one photo to the next but isn't likely to follow a connection between two photos because they are both photos.
If you are adding new tags to your sheet, think about the scope. For example, I've used religion rather than church to keep the number of tags to a manageable size. I will give everyone time to visit a few archives and add some documents, so that we can see if we need to add more tags, and in a week or two we can finalize the list.
In the meantime, if you add new tags to your sheet, please tell us in your blog post so that we can add them to our own sheets. I want to avoid losing data because there are multiple people working on the same sheet. Kirsten and I have been discussing more descriptive military tags and she will have those for us soon.
Don't feel like you have to fill in all 4 tags. The spaces are there in case you need them, but aren't all necessary.
I've also been talking with Special Collections and library administration about the possibility of using their equipment to digitize documents that some of the municipal archives are unable to digitize. I would like to hear back about this before the meeting I am planning with the archivists from the municipal archives in two weeks.
We've set a tentative date and time to meet with the archives as June 12 at 10am. I'll be calling the archives tomorrow to invite them and arranging for the room.
I joined Ben and Hannah at the Catholic Legacies conference in the afternoon for a tour of the St. Ann's Archives. It was great to meet their archivist Carey, and I also spoke with the chair of the Friends of the Sisters of St. Ann's Academy, who said that she would be happy to help us in any way she can. I will send her an email tomorrow morning explaining a bit more about the project.
This evening and tomorrow morning I will be working on defining my 2 themes for the website so that I will have them to discuss at the meeting on Monday.
*The hours are for today and for the few hours I will work tomorrow morning. I am working for the Congress for the next week so I won't be working on the project much. I will keep up with reading the blog, and feel free to contact me with any questions or concerns you have.
I will update the write-up I am keeping for myself about what I am doing in dropbox, but here are the highlights for the past two days:
I visited Victoria High, where, as I believe someone mentioned in our meeting, they are doing a website about the school during WWI. They seemed worried that we would steal their thunder if we used some of their archival material, but I told them that this was the last thing we wanted to do. I said that if Jim permitted it, we would put some of the most interesting stuff they have on our archive, credit them as a source, and instead of a 481 student doing a microhistory on Vic High, link to their website. I was given a tour of their archive. It contains, first, detailed information on Vic High alumni who served in WWI, most of it borrowed from the Commonwealth War Graves Commission, but supplemented with their own research. They have one soldier’s personal photo albums, and records pertaining to alumnus Bobby Powell, who was a Canadian tennis star, and who might make an interesting feature on our website. They have class registers, provincial exam results, registration cards from 1916, and The Camosun, the school paper from the war years. They also have some of the principal’s correspondence. They do want to work with us, though, and will apply for formal permission from the alumni association, which owns the archival material. In addition, they are extremely enthusiastic about being a pilot school for the educational package, and I promised to put them in touch with Jeremy as soon as he gets here. I also took photos of their war memorials, which I will put on the spreadsheet as soon as Ben and Ashley create the drop-down menu. A few years ago, the archivist had a professional photographer photograph all their memorials. She gave me some information on him, and I am going to try to track him down to ask for permission to use the images.
I met with the archivist from St. Michael’s University School, and she has arranged for me to come back next week. She will share with me photos she has taken of their school trophies from that time period, and the list of names she has of people who left St. Michaels and served in the war. She has also given me permission to borrow digital copies of the School’s yearbook “the Black and Red,” which is on their website, and put it on our archive. The Old Boys’ column in “Black and Red” tracks the alumni serving, and she has it from the WWI years. She has also agreed to let me photograph school clothing from that era, and to let me scan the principal’s diary from the time period.
Since our meeting, I have been going through Victoria City Archives, and looking at detailed finding aids, collections, and what not, figuring out exactly what we will want to put on the archive, and then I will give the list to Jim.
I attended the UVic Catholic Legacy in Victoria Symposium, and got some background information about the Sisters of St. Ann, which will be helpful in the education/medical section. Ashley and I have agreed that when I am visiting the Sisters' archives, I will get the information for her about the medical side of things, because she is supervising, and this will save her some time.
I have been in contact with the School District archivist, and also the Old Cemeteries Society. I spoke to them about sharing what information they have on soldiers buried in Ross Bay for our soldiers database, and they are going to talk to their research committee, and call me back.
The minutes are for yesterday and today.
Fixed some CSS errors in the METR1 and TRIU1 files, which were generating invalid CSS in the redesign project. Also fixed a bunch of old-style uses of @rend in TRIU1, and some XSLT bugs.
The MolSortComparator.jar file was missing from the web application on Peach, and suddenly this began to matter, as all pages in /site/ started failing on the XSLT transformation because of it. I've now imported the Java codebase for this into the MoEML SVN and added the library.
Met with CP and prepared the presentation for next Monday morning.
1. ES added the xml files for fraf7, cltf5, pscf6.
2. ES added transcripts for fraf7, cltf5, pscf6
3. SA uploaded all the latest changes to the website and all seems to work fine. Thumbnails have to be created for fraf7, cltf5, pscf6.
I have a bunch of new users to add to the system and make members of the CanMys blog. When I added the first and logged in as him, I noticed that when I clicked on the Posts button, I was allowed to see posts from the Moses, EMCS and Scraps blogs. Turns out those three blogs had group permissions which made all users of type "Blogger" members, with permission to upload to the media folder but to do nothing else. Those three blogs also have individual users with specific permissions.
I disabled the is-member checkbox for Bloggers in the group permissions for each of those three blogs. That automatically disabled the upload checkbox. There should be no change of behaviour for specific users that are members of those blogs.
Reviewed the second paper on topic modelling.
Putting some graphics together for the presentation and creating the first slides.
More work than time...
Team meeting resulted in these things:
- <ref> will become common. I see no reason why they shouldn't work out-of-the-box, but it needs testing.
- <respStmt> in the header. This allows JJ to put the entire <respStmt> list into order of precedence.
- <hi> for rendering instructions in born-digital documents should be avoided in favour of semantic tagging. I implemented handling for <label> to deal with a specific case in prepare_transcription.xml.
750 page images for RG7 G8C Vol 10 (in three different sizes) have been added to the collection. These cover the British Columbia 1862-63: Despatches from London. These will now be linked into the transcription documents.
Hello all,
This morning I met with the volunteer from Oak Bay Archives who is compiling a list of WWI veterans. He is interested in working together to compile the list so that we can share the information. Caroline Duncan from Saanich Archives was at Oak Bay today and we tentatively planned a meeting with the archivists and main volunteers from the municipal archives in Greater Victoria for June 12. I'm working on making contacts at the UVic library so that we can offer to digitize some materials for Saanich (and possibly other archives that don't have the resources to digitize larger items). I'll keep you up to date on that planning.
In the afternoon I met with a contact from the View Royal Archives who provided me with photos of her father, who was a POW held in Germany for 3 years. She also has an oral history that I now have the transcript for. The audio file is held in the Canadian War Museum so I am going to be in contact with them to see if they have digitized the file from the cassette form. If not, we can include the transcript or see if we are able to digitize it in the library. I'll upload those to dropbox tomorrow.
I've also been contacting band offices today. I haven't had much luck yet but I'm waiting to hear back from some offices.
I uploaded my contact list into my folder on dropbox. Feel free to use any of the contacts if you need them.
Today was fun, went out again to see some churches "in the field." First I visited St. Stephen's out in Saanichton (a beautiful church if anyone wants a photographing spot) and the rector gave me a free copy of their church history book. Their cemetery had some good grave stones and they had a rather large and detailed honour roll which will be very handy for cross-referencing. Then I stopped in for a visit with the rector of St. Mary's who had some amazing things to show me. There are plenty of monuments to look at and the colours of the 88th regiment which are almost falling apart. She had photocopies of church histories and the personal story of the Rev. Andrews who served overseas. She also has some info on a woman's wood carving guild which she said was very special in the community back then.
Spent the afternoon building up some photos in the archive, hope they work! Hannah, I'll be joining you tomorrow, hopefully they got my registration email! HCMC is kicking me out now, see some of you tomorrow.
Ben
In the kerfuffle last week with the eXist server and contained webapps, the Francotoile webapp was somehow corrupted. After an hour or two, we got the instance going again, but then discovered that the password for the admin client no longer worked, so we wouldn't be able to update the webapp. The solution was to replace the instance of the webapp on the server with a copy of it from my local machine.
Basic procedure to replace a corrupted instance of a webapp, e.g. francotoile:
- log in as tapor to tomcat manager on server (peach)
- undeploy webapp
- go back in browser to safe URL (one without undeploy instruction in it)
- ftp in as hcmc to server (peach.hcmc.uvic.ca)
- cd up and down to /usr/local/tomcat-instances/devel/webapps/
- delete old folder
- upload new folder (same name as old folder)
- refresh webapp listing in tomcat manager
- app should appear, click deploy
Hi everyone,
I found a place on campus to borrow a digital camera yesterday, so I could get started. I then headed down to Victoria City Archives. I had a different archivist than before, and they told me that I should not take photos of anything until they have worked out a licensing agreement with Jim. Instead, I was instructed to go through absolutely everything, make a complete list of everything I would want to scan or photograph, and then submit it to Jim with the paperwork so he could arrange for the payment of fees and what not. So, I am just going through finding aids and inventories, looking at artifacts, and deciding everything that I will photograph right now. I may put this on hold for today though, if I can get a hold of the archivist at Victoria High. They only come in on Tuesday, so I might go down there and check their collection out. At 4 pm, I have an appointment at St. Michael's.
The minutes are from Friday and yesterday.
Hello all,
I've updated the spreadsheet for the documents we're collecting and saved it in the "Archive" folder on Dropbox. I've also made us each a folder there to upload documents. Make sure that all documents uploaded to your folder are entered into the spreadsheet and labeled with the reference number you've assigned them. This could become a mess if we don't keep things organized from the beginning. If you want to add a new tag, or change an existing one, make sure that you add it in all the sheets so that the tags are consistent. It will probably be easier for me to make the tags broader in the future, so if you're not sure about a tag, it is better if it's more specific.
I also went to the View Royal Archives today. One volunteer has compiled a book about her father who was held in Germany as a POW for 3 years. I'm in contact with her sister to get digital copies of the oral histories, letters, and photographs she used.
I've also been in contact with the Central Saanich Archives and they are keen to be involved.
In the afternoon, I started to call the local band offices. I have a few good leads and I will follow them tomorrow!
Ashley
Hi,
I had a field trip day today, going out to Sooke and Colwood. The trip garnered the names of all the soldiers from Jordan River to 17 Mile House (98 soldiers), including which ones died. I also checked out some grave sites and got contact info for historians at the Sooke Museum and Colwood Historical Commission, who have each written books about the area during our time period. In Colwood I met with Dick Emory, whose father served as a signalman in France with the 88th and was wounded there. He had some great artifacts, including the original discharge papers, his father's pay book, and the telegrams sent to his family when he was injured. He also had some good photographs. He'll keep me informed of further finds and any developments from the Historical Commission. My plans for the rest of the week include a date with some regimental colours, checking out some local church honour rolls, and possibly visiting Pearkes' grave. I've been in contact with all the Anglican churches now, so I'll start moving into the other smaller-number denominations this week too.
(I was going to post a photo of what I found today, but my computer won't read my camera card at the moment, so I'll get back to you on that one...)
Trying to get a couple of presentations for July prepared before Congress, so we can concentrate on the big MoEML one afterwards.

The minutes noted below are from Friday, this weekend, and today. I'm still working on a list and lineages of units that were active in Victoria during the war. I was just going to use it for my own reference, but would something like that be of interest to anybody else? I can stick it in the Dropbox if so!
My Dropbox folder has some of the attestation papers and photographs I've found so far - personal favorites include the 30th Battalion parading past the Legislature and swarming over the SS Mary's rigging. Research plans for the week include a visit to the Bay Street Armory, 13 oral histories available online through UVic's special collections, and finishing off a list of items I'll hunt for at the BC Archives.
Unfortunately, I probably won't be able to get to the Maritime Museum or into the hard-copy Special Collections until the end of the week. My grandfather had a heart attack on Friday, and they moved him down here from Nanaimo yesterday. He's heading into open heart surgery tomorrow morning. It won't be his first, and he had surgery for circulation problems in his lower intestine barely a month ago. My dad and aunt are coming down this afternoon, and staying with me tonight. I will try to get some work done here and there... but I'll be spending as much time as I can with my family.
Made more progress with the graphics and presentation for the Dates talk at DH. I think I'm about half-way through.
To fix this problem, run
sudo update-alternatives --config javaws
and choose the java 7 version
Reviewed and commented on the first of two white papers for MVP.
Leaving early.
Created an SVG map for use in the presentation on dating in July.
Created XSLT to add long s to transcriptions, based on previous work on Stow, and ran it on TRIU1. Note to self: it needs to exclude editorial notes. Also did a lot of semi-manual cleanup of encoding in the document, ready for KMF and ZV to start work on it. Noticed a lot of remaining @rend attributes; I've now added a Schematron warning for those, so people convert them to @style.
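For anyone curious about the kind of substitution involved, here is a rough Python sketch. It is not the XSLT that was run on TRIU1, and the rule it applies is a deliberate simplification assumed for illustration: a lowercase "s" becomes a long s whenever another lowercase letter follows it, so word-final "s" is left alone. The function name add_long_s is hypothetical, and the sketch works on plain text only, so it does nothing about excluding editorial notes.

```python
import re

def add_long_s(text):
    """Replace lowercase 's' with long s when a lowercase letter follows.

    Simplified rule, assumed for illustration only: real early modern
    usage has more exceptions than this handles.
    """
    return re.sub(r"s(?=[a-z])", "\u017f", text)

print(add_long_s("his house stands in these streets"))
```

Capital S is never touched, and an "s" before punctuation or a space stays round.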
Another mockup for the Guidelines TOC page rewrite.
Hiya, I spent most of my day on the phone, slowly going through my list. It seems most of the Anglican churches have office hours early in the week, so I didn't get a hold of too many, but I have been invited to the Colwood Historical Association meeting on Monday as well as to Dick Emory's house to see his private collection of artifacts and newspaper clippings from when his father served and was wounded in France! That is probably the most exciting, although I located many churches from the 1910s and learned a bit about the Anglican mission in the 1870s from a very chatty Rector's Assistant. I also located the church where Pearkes is buried and another whose reverend served with the 88th Regiment, Victoria Fusiliers. All churches seemed interested in circulating a poster in the coming weeks.
Another vital piece of information is that the Anglican archives close for July and August, so I'm hoping to arrange multiple visits but may want someone to join me for some of them as those archives hold a lot of info and are only open two half-days a week. I will keep you posted. I guess that takes away all my show and tell for tomorrow, but I'll see you all then!
Ben
PS: because I'm contacting so many groups, I've made a new email for myself just for CGTW. If you want to make this your primary contact info for me it will help me keep things all in one place. It is ben.cgtw@gmail.com
Used XSLT to add an @n attribute to all paragraphs holding the @xml:id of the preceding or first-child milestone element, to enable faster searches on p tags instead of ranges between milestones, while still returning the correct target milestone.
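The idea can be sketched in Python roughly as follows. This is not the XSLT that was actually used. It assumes un-namespaced p and milestone elements (real TEI files would be in the TEI namespace), and it assumes that a milestone at the very start of a paragraph takes precedence over the preceding one. The function name add_milestone_refs is hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree's expanded name for the xml:id attribute
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def add_milestone_refs(root):
    """Set @n on each <p> to the xml:id of its governing milestone."""
    last_id = None  # xml:id of the most recent milestone in document order
    for el in root.iter():  # iter() walks the tree in document order
        if el.tag == "milestone":
            last_id = el.get(XML_ID)
        elif el.tag == "p":
            first = el[0] if len(el) else None
            # A first-child milestone only counts if no text precedes it
            starts_with_milestone = (
                first is not None
                and first.tag == "milestone"
                and not (el.text or "").strip()
            )
            target = first.get(XML_ID) if starts_with_milestone else last_id
            if target is not None:
                el.set("n", target)
    return root

doc = ET.fromstring(
    '<body><milestone xml:id="m1"/><p>a</p>'
    '<p><milestone xml:id="m2"/>b</p><p>c</p></body>'
)
add_milestone_refs(doc)
print([p.get("n") for p in doc.iter("p")])
```

A search can then match on p elements and read the target milestone straight from @n, rather than computing the range between two milestones each time. Paragraphs that come before any milestone are left without an @n.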
Created another mockup of proposed new Guidelines TOC page.
Tested Macs in A103 to make sure no memory problems etc. doing large transformation exercises.
Fixed some bugs and cleaned up some XQuery and XSLT.
Took several restarts of Tomcat and various apps, then intervention by sysadmin to increase the number of files a process can open; now we have much higher speed on all apps.
You may notice that when you add images to a blog post it tries to display them at full size, sometimes cropping an edge. To make them easier to view, here's a trick.
After adding the image to a post, look at what the blog engine dropped in to your post editor. It looks like this:
<div class="image_block"><img src="http://hcmc.uvic.ca/blogs/media/blogs/cgtw/poster4.jpg" alt="" title="" width="987" height="1281" /></div>
The width and height attributes are representative of the pixel size of the image. We can adjust them to make it fit a little better by fiddling the numbers. If we reduce each number by, say, 50% we end up with this:
<div class="image_block"><img src="http://hcmc.uvic.ca/blogs/media/blogs/cgtw/poster4.jpg" alt="" title="" width="494" height="640" /></div>
Notice that these are rounded to the nearest whole number. If you try to keep images no wider than about 400 or 500 pixels, they'll look better in the blog. Also, please note that this does NOT change the size of the original image. It ONLY changes the display size in the blog. Right-clicking and saving will store a full-size version.
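If you'd rather not do the arithmetic by hand, here is a minimal Python sketch that works out proportional values for a target width. It is not part of the blog software. The function name scale_to_width and the 450-pixel default are assumptions chosen for illustration.

```python
def scale_to_width(width, height, max_width=450):
    """Return (width, height) scaled down proportionally so width <= max_width."""
    if width <= max_width:
        # Already narrow enough: leave the dimensions unchanged
        return width, height
    factor = max_width / width
    return max_width, round(height * factor)

# The poster image from the example above is 987 x 1281 pixels
new_w, new_h = scale_to_width(987, 1281)
print(f'width="{new_w}" height="{new_h}"')  # prints width="450" height="584"
```

Scaling both numbers by the same factor keeps the picture from looking stretched. Paste the two resulting values into the width and height attributes of the img tag.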
Hi everyone!
Sorry I did not blog at the end of the day. I started off searching BC Archives' collections for material on police/court activity and war resisters, and then I went down there to talk to the archivist in person. Unfortunately, they told me there would be numerous legal challenges involved in accessing some of the material. I will report what they said in more detail on Friday, but for now I have a basic list of what is open to us. I then looked at what BC Archives has for education, and found some useful school records, some oral histories with Victoria High alumni, and other material. While I was there, I also looked at the music scene in Victoria during our time period so that we might have some audio clips for the website. BC Archives has concert programs from various musical societies, and piano sheet music about Victoria, by Victoria composers and published in Victoria. If we can't find any recordings, I could always record myself playing it and put the clip on the website.
Then I checked out the legislative library, and talked to the lovely reference librarian about what records pertaining to education and the provincial government are there. I think I have an almost complete list of what public schools were in operation during the time period, based on a master's thesis on microfilm she showed me. Today, I am going to Victoria City Archives first. I was hoping to find school board minutes yesterday, although these may not give us the interesting stories we are hoping to feature, so I will keep looking for other material. I have an appointment with the archivist at St. Michael's. I spoke to her on the phone yesterday, and it sounds like they have great records on the school's veterans. I am going to call all the old private schools, and Victoria High, and talk to them about records.
I've spent the day looking for municipal and community archives. So far my list is:
Sooke Region Museum
Metchosin Museum Society
Esquimalt Municipal Archives
View Royal Community Archives
Oak Bay Archives
Saanich Archives
Goldstream Archives
Sidney Museum and Archives
It seems that Langford does not have an archive, as much as I've looked for it. Let me know if there are any I've neglected! I have finding aids for some of these and the rest I'll be calling tomorrow. I've also found some useful material in the BC Archives for medical history. Tomorrow I'll be following up with the Chinese Presbyterian Church and calling local First Nations bands to see if they would be interested in advertising in their newsletter or mailing list.
Wow, things are really kicking off! Love the posters and excited to hear what everyone else found.
As for me, I've started my contact list of churches and social organizations (did you know there are 162 places of worship in this town?) and will be cold-calling them tomorrow. I will be starting with the organizations I know existed 100 years ago, and then as the project moves forward I will start reaching out to newer groups to see if their members have other information - which will be greatly aided by those posters. The things I will look for first are if the groups have lists of members who served, members from the time, any archival materials, and monuments, and from there I will build a list of places worth visiting.
Wish I could join you at the air museum, have a great time. I'm signing off early today to enjoy a birthday dinner, see you all Friday!
Ben


My contention about the change to docUtils.java having caused a regression which broke relative paths for the doc() function was borne out after I changed the file and rebuilt. Reported the bug formally on the bugtracker, and it is now fixed, so I have a fresh trunk build of eXist ready to go for MoEML. I'll deploy this first thing tomorrow before anyone else gets to work.
I am posting this exchange about inferred glosses so that I don't have to think it through all over again in the future!
SMK wrote:
Regarding the search engine, I blogged on 12/12/12:
"ECH's goal for the search engine in the web database is that, if a user searches for "fat", s/he will get results including fat, fatten, fattening, fatty. Our current settings, and our policies for adding inferred glosses, seem to be accomplishing this nicely. An entry which has "fatty" in its def is found by a search for "fat", because it also has an inferred gloss "fat". Searching for "fat*" also returns defs including fat, fatten, fattening, fatty ... but also fatal, fathom, father."
However, we also noticed the converse on 16/04/13:
When I searched for the inflected form “fired”, I also got all the entries with “fire”.
BUT when I search for “fatty” or “fatten”, I don’t get all the entries with “fat”. What is the difference here?
MDH replied:
I think you're just discovering that a stemming analyzer is not an educated human. It doesn't understand semantics; it just knows how to strip off (some) inflectional endings and index the resulting stems, and then how to stem the search input and search the stemmed index with it. You will never find an automated search engine that gives you perfect results.
Right now, the search is paying no attention to whether things are in gloss tags or not; as I understand it, the purpose of the gloss tags is to construct an English-Nxa’amxcin list, not to aid in searching.
The situation with "fatty" is definitely a bit odd; it appears that if you search for that word, it doesn't get stemmed prior to the search, whereas if you search for "fired" it does. Perhaps the stemmer avoids stemming -tty inputs because there are many which shouldn't be stemmed? ("batty", "natty", "patty", for instance.)
SMK continued:
OK, so when I search for fatten, fattened, or fattening, I get the same 5 hits – 3 for “fattening”, one for “fattened”, and one for “fatten” – i.e. everything with the stem “fatten”. It doesn't go all the way down to the root “fat”, and that's fine.
When I search for “fatty”, all I get is the one entry for “fatty”, as you explained above. That's fine too.
We had been adding inferred glosses for the uninflected English stems and roots of attested glosses, e.g.
<def>
<seg>I am <gloss>fattening</gloss> it up</seg><bibl corresp="psn:W">W10.138</bibl>
<seg><gloss subtype="i">fatten</gloss></seg><bibl corresp="psn:ECH">ECH</bibl>
<seg><gloss subtype="i">fat</gloss></seg><bibl corresp="psn:ECH">ECH</bibl>
</def>
Here, <gloss subtype="i">fatten</gloss> adds nothing to the search capabilities, because the stemmer can find “fatten” within “fattening”.
But does this entry with “fattening” get found when I search for “fat” because of the stemmer, or because of the <gloss subtype="i">fat</gloss>? It must be because of the inferred gloss, because the stemmer only stems as far as “fatten”.
In the case of “fatty”, where we know the stemmer doesn't operate on it, it still gets found when I search for “fat” because of the <gloss subtype="i">fat</gloss>.
(“fattening” and “fatty” do NOT get found when I search for “fat” just because they contain the string f-a-t, because “fatal” and “father” are NOT found by a search for “fat”. To find anything with the string f-a-t, I would need to search for “fat*”.)
So the inferred glosses do play a role in improving the search. That said, I don't think we should be going out of our way to add inferred glosses for this reason.
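To make the interaction between the stemmer and the inferred glosses concrete, here is a toy sketch in Python. The stemmer is a crude stand-in written for this illustration (it is not the analyzer the site actually uses), and the entries e1 to e5 are invented. Each entry's definition tokens and inferred glosses are stemmed and indexed, and the query is stemmed before lookup.

```python
def toy_stem(word):
    """Strip a few inflectional endings; a stand-in, not the real analyzer."""
    word = word.lower()
    for suffix in ("ing", "ed"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Drop a final -e so that "fire" and "fired" share a stem.
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]
    return word

def build_index(entries):
    """Map each stem to the ids of entries whose def or inferred gloss has it."""
    index = {}
    for entry_id, (definition, inferred) in entries.items():
        for token in definition.split() + inferred:
            index.setdefault(toy_stem(token), set()).add(entry_id)
    return index

def search(index, query):
    """Stem the query, then look it up in the stemmed index."""
    return sorted(index.get(toy_stem(query), set()))

# Invented entries: (definition text, inferred glosses).
entries = {
    "e1": ("I am fattening it up", ["fat"]),
    "e2": ("fatty", ["fat"]),
    "e3": ("he fired it", []),
    "e4": ("fire", []),
    "e5": ("fatal", []),
}
index = build_index(entries)
for query in ("fat", "fattened", "fatty", "fired"):
    print(query, search(index, query))
```

In this sketch a search for "fat" reaches e1 and e2 only through their inferred glosses, while "fattened" reaches e1 through the stem alone, which is the same pattern described above.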
Much discussion over the last few weeks regarding the placing of gloss tags for generating the Eng-Nx wordlist. I attempt to summarize our conclusions here for future reference.
1) Why do we place inferred glosses (<gloss subtype="i">)?
At various times, we have placed inferred glosses for augmenting the search engine on the website, and for generating the English word list.
We concluded that from here on, we ONLY need to place gloss tags for generating the English word list. Inferred glosses do sometimes enhance the web search engine, but now that the stemming analyzer is in place, we don't need to do any further markup to help it out.
2) How should we tag inflected English words?
Until last week, we had been inferring the root word (or stem where relevant) when a def is an inflected or derived form of an English word, e.g.
<def>
<seg>he is <gloss>fattening</gloss> it up</seg>
<bibl corresp="psn:JM">JM 1.2.3</bibl>
<seg><gloss subtype="i">fatten</gloss></seg>
<bibl corresp="psn:ECH">ECH</bibl>
<seg><gloss subtype="i">fat</gloss></seg>
<bibl corresp="psn:ECH">ECH</bibl>
</def>
This encoding means that this entry will show up three times in the English-Nxa’amxcin wordlist: under fat, under fatten, and under fattening. This seems like overkill, especially when these three words will sort one after the other in the English wordlist anyway.
ECH and SMK decided we would like to see the “fat” entries as follows in the print dictionary:
fat: fat
fatten: fatten, fattened, fattening
fatty: fatty
To accomplish this, we need to reduce the number of gloss tags we place in each entry. Inflected English forms (-ed, -ing) should not be gloss tagged; only their root or stem should be gloss tagged.
So “fattening” would now be gloss-tagged as:
<seg>he is <gloss>fatten</gloss>ing it up</seg>
MDH confirmed that the search engine is ignoring gloss tags, so the stemmer will operate on <gloss>fatten</gloss>ing the same as it would on <gloss>fattening</gloss>. (That is, it will continue to return all results with the stem “fatten” when someone searches for fatten, fattened, or fattening.)
MDH has created two sample Eng-Nx word lists based on the 6 files with “complete” status, one using all the gloss tags, and one omitting the inferred gloss tags. They are in moses/trunk/docs/glosses. We concluded that we don't want to programmatically ignore the inferred glosses, because many of them – especially the synonyms – are worth including. But we can refer to these lists to identify the inflected English words whose gloss tags need to be revised.
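As a quick sanity check on the point that the search engine ignores gloss tags, this Python sketch (an illustration only, not the site's indexing code) shows that the two ways of tagging "fattening" yield identical text once the markup is stripped, so a stemmer working on the text sees the same word either way.

```python
import xml.etree.ElementTree as ET

def visible_text(xml_fragment):
    """Return the text of a fragment with all tags removed."""
    return "".join(ET.fromstring(xml_fragment).itertext())

old_tagging = "<seg>he is <gloss>fattening</gloss> it up</seg>"
new_tagging = "<seg>he is <gloss>fatten</gloss>ing it up</seg>"

print(visible_text(old_tagging))
print(visible_text(old_tagging) == visible_text(new_tagging))
```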
3) How should we tag English phrasal verbs?
Where appropriate, English phrasal verbs will be enclosed in a single gloss tag - e.g., <gloss>go after</gloss>. This will allow us to organize the headwords in the Eng-Nx word list as follows:
go
go after
go down
go up
, etc.
4) How can we distinguish English homophones in glosses?
English homophones in glosses will be distinguished with a secondary word (or phrase) in an @n attribute on the <gloss> tag, e.g. <gloss n="conflagration">fire</gloss>, <gloss n="back of boat">stern</gloss>. These will then be rendered as follows in the print dictionary:
fire (conflagration):
stern (back of boat):
We decided not to use parts of speech for @n values. We will always use synonyms. We need to select synonyms that will be clear to readers in the community.
I have now disambiguated the English homophones listed here, and updated the Notes on Definitions and Gloss Tagging document accordingly. Where one homophone was far more common in the data than the other, I only added an @n value on the less common one - e.g. watch (wristwatch).
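The headword conventions in points 3 and 4 could be sketched roughly like this in Python. This is only an illustration under stated assumptions: the real word lists are generated by MDH's own processing, and the sample <def> below and the function names are invented.

```python
import xml.etree.ElementTree as ET

def headword(gloss):
    """Render a gloss as a headword, adding the @n clarification if present."""
    text = "".join(gloss.itertext()).strip()
    hint = gloss.get("n")
    return f"{text} ({hint})" if hint else text

def english_headwords(entry_xml):
    """Collect the distinct headwords from every gloss tag in an entry."""
    root = ET.fromstring(entry_xml)
    return sorted({headword(g) for g in root.iter("gloss")})

# Invented sample entry, for illustration only.
sample = (
    "<def>"
    '<seg><gloss n="conflagration">fire</gloss></seg>'
    "<seg><gloss>go after</gloss> it</seg>"
    "<seg><gloss>go</gloss></seg>"
    "</def>"
)
print(english_headwords(sample))
```

Because a phrasal verb sits in a single gloss tag, "go after" comes out as one headword that sorts directly after "go".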
ES added transcripts for accf3, fraf8, cltf6
Trying to abstract the combined keyword/text search into a separate library yesterday was very problematic, but I took a simpler approach this morning and simply copied and adapted the code from search.xq into advanced_search.xq. The result seems to be working perfectly -- the keyword/text search is done first to retrieve a set of @xml:ids, then the search is done on those ids, with additional filters provided by the other form controls.
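The two-step flow can be sketched in Python as follows. This is not the XQuery in advanced_search.xq; the record structure, the field names (keywords, text, year, type) and the sample data are all hypothetical, chosen only to show the id set being retrieved first and then filtered.

```python
def keyword_text_search(records, query):
    """Step 1: return the ids of records matching a keyword or the text."""
    q = query.lower()
    return {
        rec_id
        for rec_id, rec in records.items()
        if q in (k.lower() for k in rec["keywords"]) or q in rec["text"].lower()
    }

def advanced_search(records, query, **filters):
    """Step 2: keep only the step-1 ids that also satisfy every filter."""
    ids = keyword_text_search(records, query)
    return sorted(
        rec_id
        for rec_id in ids
        if all(records[rec_id].get(field) == value
               for field, value in filters.items())
    )

# Hypothetical records standing in for documents in the database.
records = {
    "doc1": {"keywords": ["pageant"], "text": "A mayoral show",
             "year": 1613, "type": "primary"},
    "doc2": {"keywords": [], "text": "Notes on the pageant route",
             "year": 1633, "type": "primary"},
    "doc3": {"keywords": ["pageant"], "text": "Essay",
             "year": 1613, "type": "secondary"},
}
print(advanced_search(records, "pageant", type="primary"))
print(advanced_search(records, "pageant", type="primary", year=1613))
```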
Did this through XSL with some cunning language-detection code based on content and context, and it seems to have worked pretty well. The Names page now uses the @xml:lang attribute instead of its own cruder detection code to build output.
It was great meeting you all today and I'm looking forward to working with you all through the summer! I thought I would post one of my favorite newspaper articles from the project I mentioned today. Blayney was the oldest of the Scott brothers and the event that earned him the Distinguished Flying Cross is outlined in the article on the left. It's a pretty unbelievable story!
Happy hunting tomorrow.


Too much to do, not enough time to do it...
PAB wants to combine the simple search (which is actually very complicated behind the scenes, since it does keyword lookups and combines them with supplementary text-searching) with the advanced search filters. This is proving virtually impossible, partly because it's just too messy -- you'd need to retrieve a document set from the keyword search in a separate step, and then filter it -- and partly because I just don't have time to implement it properly before the launch. I'll have a couple more shots at it, but things aren't looking good so far.
Made a few other changes and fixes requested by PAB, and hid the text search box, since it's doing what it says on the box (a text search), and not what PAB wants (a complicated keyword search).
Following a meeting at which we discussed strategy, and decided to focus for now on the Mayoral Pageants, worked with KMF on a range of minor display and rendering issues for primary source documents, including bylines, marginal labels, and text indents.
...on instructions from JS-R.
As planned last week.
Meeting to review the presentation -- my task now is to collapse six slides which begin with the picture of the filecard box into a single stepped diagram illustrating the old encoding process and the horrible binary result.
Started a tutorial based on SNOW1 (for the moment), and in the process of writing the first bit of it, came up against many annoyances in the rendering of egXML blocks; fixed those rendering issues (in three places, site, redesign, and codesharing. Grrr).
Emailed DR with latest changes/additions required for site.
Site in progress.
In Progress: updating site with new course listings 2013-14.
In progress: updating site with new course listings for 2013-14.
Added rendering handling for sp, speaker, and p within sp. The stage tag isn't handled yet. Rolled out changes both to site and to redesign codebases.
The ISE was getting the error
java.lang.NoClassDefFoundError: Could not initialize class sun.awt.X11GraphicsEnvironment
when running xwiki.
It turns out that the existence of quotes in JAVA_OPTS directives causes the option to be ignored. So, for future reference, use -Djava.awt.headless=true instead of -Djava.awt.headless="true" when launching tomcat on a headless server.
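To catch this before it bites again, a small Python check along these lines could scan a JAVA_OPTS string for quoted -D values. It is only a sketch: the regular expression and function name are my own, and it does not know about values that genuinely need quoting (for example, ones containing spaces).

```python
import re

# Matches -Dkey="value" or -Dkey='value' (same quote on both sides).
QUOTED_D_OPTION = re.compile(r"""-D([\w.]+)=(["'])(.*?)\2""")

def find_quoted_options(java_opts):
    """Return (quoted_form, suggested_form) pairs for quoted -D options."""
    return [
        (m.group(0), f"-D{m.group(1)}={m.group(3)}")
        for m in QUOTED_D_OPTION.finditer(java_opts)
    ]

opts = '-Xmx512m -Djava.awt.headless="true"'
for quoted, suggested in find_quoted_options(opts):
    print(f"replace {quoted} with {suggested}")
```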
...on RL's instructions.
Since SNOW1 was a bit of a mess at the beginning, because of the encoders following obsolete examples, I've manually encoded the title page as an example.
Also found a problem with METR1 which was not really a bug, nor an encoding invalidity: a body element which goes straight to content (e.g. a head) with no intervening div is not invalid, but it triggered rendering problems because it was completely unexpected. As it happens, the encoding should not have been that way -- other divs appear later in the body -- but it wasn't technically wrong, so it would be good to figure out a way to prevent this through the schema or more likely through Schematron. We could change the content model of body so that it can only have divs, of course.
Section 2 is now down to 6 slides, with more detail and more extensive notes.
Following Sarah's post, I've done the following:
Here are a few requests for the Names page on the website:
DONE -exclude Lexical Suffix entries
DONE -fix the display of sic/corr, so that only “Wenatchi” displays, not “WenatcheeWenatchi” (See for example the entry for “Sam George”.)
DONE -put flora (plants) and fauna (animals) in the link text at the top of the page
-separate out the sorting into Nx-Eng and Eng-Nx pages. Ideally, users should be able to view the complete list, or any of the six lists by name type, sorted either by Nxa'amxcin name or by English name. The present setup with Nx and Eng names mixed together in the Name column is somewhat confusing. Continue to sort the Nx-Eng lists based on name tags in prons. For the present, exclude name tags in orths when generating these lists. Sort the Eng-Nx lists based on name tags in defs.
PENDING ECH'S FURTHER DISCUSSION WITH CCT:
Please also generate a printable version of the six lists of names by type. These only need to be sorted alphabetically by Nxa'amxcin name - i.e. only include the name tags within prons when generating these lists. Ideally they would be spreadsheets with the following columns:
Name (pron:seg type= “p”)
Source (following bibl ... if the pron:seg type= “p” is NOT subtype=“i”)
Definition (all defs)
Pronunciation (pron:seg type= “n”)
Source (following bibl)
Word Parts (hyph)
Running very fast to stay in same place...
Did some tasks from yesterday and some new ones:
<group> have now been converted to <div>s. (The only exception is stow_1633, which probably does need <group>.)
I've implemented the advanced search as a separate page, and got it basically working, although some missing bits in the encoding mean that it's not finding everything it should (e.g. dates are missing @whens sometimes).
1. ES corrected location coordinates for cltf6, aacf3, fraf8
2. ES added transcripts (non annotated) for fraq 7, fraq8, fraq9
1309 page images for CO 60 Vol 13 (in three different sizes) have been added to the collection. These cover the British Columbia 1862: Despatches to London. These will now be linked into the transcription documents.
Work arising from the Providence meeting.
I have these tasks coming out of the team meeting today:
On late duty.
I've spent the whole day working on getting a more flexible and successful build system for eXist. This is what I've added to Greg's script:
Found a number of problems with eXist, which I've reported, including a bad one once the webapp is running: you can no longer call transform:transform with a relative path to the XSLT file, otherwise you get an error. A full path from /db seems to work.
ES added about ten new videos and XML data files, so I had to create a thumbnail image for each. I ran each file in the player.xql file, stopped the video, captured a bit of the screen to a png file, edited that to 88x66 px (the size that all of them seem to be), added them to the SVN repository, and uploaded them to the production site and the copy of the site on my Mac.
While doing that, I noticed extraneous thumbnail files in the images (as opposed to the images/thumbnails) folder, so deleted those from the servers and from the repository.
Leaving early.
We've been running the live db with open access since the last time I rebuilt it, so in the process of doing other updates (such as rolling out the Java sorting collations) I've also added back the protection that we had before. In the process of doing this, I got bitten by the horrible eXist bug which enables you to lock yourself out of the admin account if you edit the admin user and forget to retype the password into the two password boxes (the effect is that you end up with a random admin password that you can never discover). As a result, I had to remove the server version of the app and replace it with a refreshed version of my local copy. This failed the first few times -- Tomcat tries to auto-deploy the app before it's completely uploaded the dbx files, so the uploaded .filepart files can not be renamed to overwrite the ones created by the live startup. It took two or three shots to get this problem solved. The only way seems to be to let it deploy, but stop it immediately in the Tomcat manager; then delete all the dbx, lock and log files; then upload them again; then restart it in the manager.
1) For the linguists' dictionary, we would like to see:
first phonemic representation in bold <orthography in angle brackets> [narrow transcription(s) in square brackets], for both forms and cits - e.g.:
ʔáyx̣ʷt <ʔáyx̌ʷt> [ʔáyəx̣ʷt]
√ʔáyx̣ʷ-t
1. be tired
2. tired, worn out
• √ʔáyx̣ʷ-tl kɬʔámnc
<√ʔáyx̌ʷ-tl kɬʔámnč>
[√ʔáyəx̣ʷ-t ləkɬəʔámənč]
he is tired of waiting (for you / me)
2) On the website, we would ultimately like things sorted by orthography.
ES noted that recent changes she'd made weren't appearing on the production site at francotoile.uvic.ca.
I had a connection in the exist admin client that used pear.hcmc.uvic.ca as the domain. I thought that would be dead, but when the connection succeeded, I assumed that domain name was forwarding to the current instance. Wrong. Obviously there is another instance somewhere on "pear" that is still running.
Created a new connection in the admin client using tomcat-devel.hcmc.uvic.ca as the domain and that worked. Also, the webapp in the new instance is francotoile and not francotoile21 as it was in the old instance.
In poking through the files, also noticed a connection string using lettuce.uvic.ca, so changed that to hcmc.uvic.ca and it seems to be working.
Updated the lastpass records.
This morning we decided that a simple and quick way to distinguish between homographs with different meanings is required to make the English lookup part of the dictionary less confusing. This will be achieved by adding a clarificatory word or phrase in the @n attribute of a gloss. Glosses will then be presented in the E-to-M view with this clarification in parentheses. Processing on the website will need to be changed to take account of this, and the print dictionary rendering will also have to be written with this in mind.
Wrestling with similarity metric algorithm...
I've now figured out how to create an extension module for eXist, following the instructions here. These are some things I've learned:
build.sh extension-modules, then drop that jar into an existing eXist instance (although if the new jar was built with a substantially different version from the rest of the code, there could well be problems).
<module uri="http://hcmc.uvic.ca/ns/usm" class="org.exist.xquery.modules.unisimmetric.UniSimMetricModule" /> along with the other modules.
I'm not yet happy with my module, and I'm still working on it. In particular, I'm not happy with the scores it's generating, and I think this might be something to do with other bits that get included in the GZIP stream, such as a header; if I can figure out how big those are, I can remove them from the calculation. The highest difference I seem to get is around 0.53 with completely dissimilar strings, so it seems as though the results are being compressed into a range much smaller than 0-1.
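One possible way to test the header hypothesis, sketched here in Python rather than in the Java of the module itself, is to compute the normalized compression distance with raw DEFLATE output, which carries no container header or checksum. The formula used is the usual NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)); the function names and sample strings are invented, and this is not the module's code.

```python
import zlib

def compressed_size(data, raw=True):
    """Size of DEFLATE-compressed data; raw=True omits header and checksum."""
    wbits = -15 if raw else 15  # negative wbits asks zlib for a raw stream
    compressor = zlib.compressobj(9, zlib.DEFLATED, wbits)
    return len(compressor.compress(data) + compressor.flush())

def ncd(x, y, raw=True):
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x, raw), compressed_size(y, raw)
    cxy = compressed_size(x + y, raw)
    return (cxy - min(cx, cy)) / max(cx, cy)

# Invented sample data: one repetitive text, one unrelated byte sequence.
similar = b"the quick brown fox jumps over the lazy dog " * 20
different = bytes(range(256)) * 4

print(ncd(similar, similar), ncd(similar, different))
print(ncd(similar, similar, raw=False), ncd(similar, different, raw=False))
```

Comparing the raw and non-raw scores shows how much a fixed per-stream overhead shifts the result; the effect is likely to be largest on short strings, where the overhead is a bigger share of the compressed size.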
After an update to DSM 4.2 rutabaga no longer allowed rsync backups, failing with:
sh: rsync: not found
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: remote command not found (code 127) at io.c(605) [Receiver=3.0.9]
After much wailing and gnashing of teeth we discovered that non-interactive users do not have /usr/syno/bin in their path (it *is* in their path if they shell in to the NAS, so they can run rsync *from* the NAS when shell'd in).
So, that's an easy fix, says us: add a symlink to /usr/syno/bin/rsync in a logical spot that *is* in a non-interactive path, like /usr/bin.
Problem: admin user cannot su root (error message = su: must be suid to work properly), so cannot create symlink.
Answer: TURN ON TELNET AND LOG IN AS ROOT USING THE WORST POSSIBLE METHOD!!! Then, you make the symlink and turn off telnet - quick!
Late duty, then fighting with @W(*%&($^ Rutabaga which has forgotten how to do rsync backups. Still not solved. GRRR.
I've written out the prose of the Oxford talk. Still remaining to do before July:
This morning we got nets to set up B047 on the switch. Some time after that 3 machines lost the ability to get a DHCP address - I have no idea if there is a causal relationship between these things.
After much mucking about, it *looks* like it might have been a communication problem between the DHCP server and the machines.
Even forcibly releasing the DHCP lease didn't make any difference.
In the end, I booted the machine with a LiveCD and fiddled with enabling/disabling the network (in the network manager). I got a proper IP and rebooted in the installed OS. That seemed to break it out of the loop.
A bit perplexing and aggravating because I don't actually know what the problem was...
1. ES added transcripts for fraq10, fraf6
2. ES has edited Liette's video, and given it to SA. Corresponding xml file has also been added.
3. ES asked SA to upload all new additions to the production site in order to see whether the editing done with Audacity works properly.
Sent in FMIS report May 2nd re furniture removal. (p/up by May 8, 2013)
May 8: followed up on sent FMIS request re furniture removal specific pick up time. Furniture removed May 8th, 11:00am.
Computer facility:
Received computer usage requests from several projects for May-August 2013
New schedule updated and posted online
SA and I met with SA (Rel.St) to discuss migration of current Religious Studies site over to Cascade format.
Discussion:
- RELS current information and design to be replicated basically in Cascade
- discussed various Cascade requirements
Next steps:
- sent info. request to SA (RELS) required for outline
- HCMC: currently preparing RELS structure outline in readiness for submission for approval
On late duty.
Team meeting, at which we discussed the use of ISE's facsimile viewer in MoEML (which will be easy enough to do, although it's based on a traditional db, and we'll have to replace that with proper TEI facsimile encoding).
People also asked me to clarify how the EEBO linking works, so I've done that in the transcriptions documentation file, and I've also implemented the display of little page-images linking to the EEBO pages. Also, during today, <address> and <addrLine> were added to the schema, with some basic display rendering.
Met with PAB and made a number of fixes:
We also made a plan for an advanced search, which I'll document in more detail here before I try to implement it.
When making modifications a couple of weeks ago (see post), I changed only the list view and not the timetable view. I didn't realize that the dropdown for which courses to print affected only the list view (in the code it is located in the active_area code and not in the view-specific code in manage_calendars).
I added code to
- manage_calendars (at about line 12985 - that file is ridiculously big) to add the option to the select in the dropdown
- manage_calendars.php (at about line 118, in the else if(strcmp("display_table",$do_what)==0) branch) to check the setting of the dropdown and take appropriate action
Notice that in the timetable view, the effect of changing the setting takes place immediately in the view, then that view is printed; in the list view, changing the setting does not change the display but does correctly filter what gets printed. Not sure if that inconsistency is a bug or a feature which reflects how those two views are used.
Leaving early.
The problem of duplicate @xml:id attributes on entries has now become a serious issue for the print dictionary building, because I'm unable to process the entire collection properly to produce the book; to build the dictionary I have to use XInclude to create a single XML source file, and when I do that there are over 1600 duplicate ids which prevent some of the processing steps from being successful.
I've taken a quick look at where the duplicates tend to be concentrated, by adding the files in alphabetical order and looking to see how many duplicates occur with each addition. These files create no problems (i.e. they have no duplicates among themselves):
affix_glot-ix.xml affix_k-m.xml affix_n-t.xml affix_u-CAPS.xml c.xml c-glot.xml c-rtr.xml glottal.xml h.xml h-phar-part1.xml h-phar-part2.xml l-affric.xml lex-suff.xml new-data-2013.xml p-glot.xml phar-w.xml qw-glot.xml s-rtr.xml t-glot.xml xw.xml
When I add the remaining files, one by one (and only one at a time), these are the results:
k.xml: 100 duplicates; k-glot.xml: 18; kw.xml: 2; kw-glot.xml: 2; l.xml: 3; l-fric.xml: 6; m.xml: 3; n.xml: 97; p.xml: 7; particles.xml: 4; pron.xml: 2; q.xml: 4; q-glot.xml: 3; qw.xml: 1; rescued.xml: 54; s.xml: 2; t.xml: 20; ww-glot.xml: 4; x.xml: 3; x-uvul.xml: 4; yy-glot.xml: 4
What I'm going to do is develop the dictionary output using only the valid files, and then add the others in as they get fixed. In the meantime, it might be worth having a go at some of the low-hanging fruit (the ones with only two or three duplicates). More will show up as we add those in, of course -- there will be duplicates across the currently-excluded files as well as those that they share with the "good" files. So the dictionary PDFs will shrink in size, but I'll be able to start doing things like generating page-references that depend on xml:ids.
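For anyone wanting to hunt duplicates in their own files before committing, a check along these lines would do it. This is a Python sketch, not the process used to get the counts above; the file names and contents are made-up samples, and in practice you would read the real files from disk instead of passing strings.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def find_duplicate_ids(documents):
    """Map each repeated xml:id to the list of files in which it occurs."""
    seen = defaultdict(list)
    for filename, xml_text in documents.items():
        for elem in ET.fromstring(xml_text).iter():
            xml_id = elem.get(XML_ID)
            if xml_id is not None:
                seen[xml_id].append(filename)
    return {xml_id: files for xml_id, files in seen.items() if len(files) > 1}

# Made-up sample files: one id shared across files, one repeated within a file.
documents = {
    "file_a.xml": '<body><entry xml:id="a1"/><entry xml:id="a2"/></body>',
    "file_b.xml": '<body><entry xml:id="a2"/><entry xml:id="k1"/>'
                  '<entry xml:id="k1"/></body>',
}
for xml_id, files in find_duplicate_ids(documents).items():
    print(xml_id, files)
```

The result lists both kinds of duplicate: ids repeated inside one file and ids shared between files, which are the ones that only show up once the files are combined through XInclude.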
Lucene-based fuzzy matching seems to be very broken in the build of eXist I'm using, and in any case it's based on Levenshtein distance, so I've implemented a crude version of the USM/NCD algorithm in XQuery. It's a long way from ideal, though, because it's using base64 versions of strings rather than compressing the actual strings (this is all I can do with eXist's exposed gzip access); using zip seems to be punitive because it would require creating a file on the filesystem or in the db and compressing that. I think a simpler approach would be to take my Java class and strip out all the command-line stuff it contains, then call that directly from XQuery (see the xqSearchUtils java project and the way it's called from the Despatches XQuery for an example). A jar file with a simple XQuery module interface might be very handy indeed.
Media queries...
1. SA found a solution for cutting the soundtrack to the millisecond: use Audacity! The program was installed on POMME.
2. ES entered & committed the transcripts for cltq3, fraq11, fraq12, fraq13
The call is out, and mine are done.