Judy and I helped Sada, his RA, and Alicia Brown get familiar with the system. Surprised at Alicia's lack of training, and began reviewing possibilities for training CS staff (night users). (She has been teaching in the lab for years but has little knowledge of basic features.) Also thought about CALL staff training (night-shift employees). Will discuss with Ali.
Yesterday evening the Research Collective on computer-mediated communication met for the first time. Present were Peter, Scott, Andrew Rippin, Claire Carlin, John Lutz, Ray Siemens, Catherine Caws, Marie Claude, Ulf Scheutze.
Ulf and Catherine convened the group in an effort to stimulate interest in a research collective. The group didn't settle on any firm directions, but:
- discussed the meaning of computer-mediated communication
- decided on a monthly meeting agenda
- agreed to provide discussion topics (theoretical) for the next meeting
- agreed to have one showcase (coordinated with ours) yearly
- agreed to put out a newsletter once a year
- agreed to expand group to other faculties
- discussed having invited speakers
Action items included:
- creation of a listserv
- creation of a blog (which may take place within our own tool?)
Next meeting is Feb. 28th, Martin's Place, 6:30.
Any of you would be welcome to join us. Most participants are language faculty. Right now a theoretical-discussion "salon" style seems to be the consensus. I put forth the idea of having a formal relationship with NWALL, and Peter suggested ways in which the HCMC could support the collective. There may be technical duties for certain events, but nothing has been determined yet.
The Graves project will have to be moved to the new eXist. Began that process today by getting a user/group set up, pushing the files into the db, and testing the XQuery. Files went into the db OK, and the front page works, but the rest of the site is scuppered by the fact that it can't find the DTD. We added it to catalog.xml and put it in the entities directory, but the error I'm getting still suggests that entities can't be expanded:
org.exist.xquery.XPathException: XPST0003 : Invalid character in entity name (=) or missing ;
Need to get back to this in future. It has to be done.
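For reference, this is a rough sketch of the kind of OASIS catalog entry involved; the public and system identifiers and the graves.dtd filename below are placeholders rather than the project's actual values, and if I remember rightly eXist only consults the catalog when conf.xml points at it under the validation settings, which is worth double-checking.

<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- map the DTD's public identifier to the copy in the entities directory -->
  <public publicId="-//EXAMPLE//DTD Graves//EN" uri="entities/graves.dtd"/>
  <!-- fallback mapping by system identifier -->
  <system systemId="graves.dtd" uri="entities/graves.dtd"/>
</catalog>

That said, the '=' in the message above looks more like an unescaped ampersand in a URL or query string being read as an entity reference, so that's another thing to check.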
Project proposal, still a work in progress:
Project Title: Devonshire MS

Objective(s): For this first phase of the project, the objective is to produce an online electronic edition which presents both a clean version of the text and the scholarly apparatus that would be found in a conventional print edition. Our model is an edition in two volumes, one containing the text and one containing the apparatus. We aim to present these side-by-side, both scrolling under the user's control. The text will show line-numbering (every 5 lines), while the apparatus will refer to line numbers. The full MS will be chunked for display at the poem level (each poem is a <div> tag in the TEI XML).

There is an issue with "special" characters, which are currently represented by parenthetic symbols rather than actually encoded. For display, we want to use Unicode characters, but we also want to generate a normalized option for general readers. Meanwhile, the original XML must not be edited, for the present at least. Therefore, the first task is to use Transformer to create a replace sequence which finds each special-character signal and replaces it with a sequence of elements including the normalized reading and the Unicode character(s); a sketch of the kind of replacement we have in mind follows this proposal. This replace sequence can be run on the XML markup and the result used for our project, while the original file remains untouched. Whenever edits are made to the original, we can run the sequence again in Transformer, until eventually the characters are permanently replaced. The site should also provide access to the original XML file as XML, and plain-readable CSS-rendered versions of the original and the Unicode-ized version, using Karin's CSS.

Following this process, we will examine the range and types of annotation in the document, and decide on the various categories of apparatus to be displayed. Meanwhile, Cara will be producing some data based on collation of other witnesses. This will be in the form of discrete XML files, linked with the poems in the original document using id-based mechanisms, which we will then link into the site in a simple manner. This lays the groundwork for a more research-oriented process which experiments with various ways to display collations. That research work will take place after the goals for this phase of the project have been completed.
Lead(s): Ray Siemens

Contact email: karindar@uvic.ca

Team members: Karin Armstrong, Cara Leitch
Benefits: The first benefit is electronic publication of the Devonshire MS. This may help get a print version published. After that, the work we do on ways to present multiple witnesses will be valuable; few people are working on this, and what has been produced so far is a long way from satisfactory.

Scope: I don't know what "Scope" is supposed to mean, but let's ignore it.

Constraints, risks: The initial online presentation needs to resemble a conventional edition to some degree, so that it doesn't scare off the press who may want to publish the print edition.

Resources required: Programming (Martin Holmes); hosting (TAPoR machines, Greg Newton consulting); existing tools and software (no new purchases envisaged).

Time line: Initial site is to be available by April 26, 2007. If collation work is in a suitable state to be used in time, then it will be included, using a simple linking mechanism.
[Note: as we work on refining this, I'll increment the minutes spent.]
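Here is a minimal sketch of the kind of replacement the Objective(s) above describe; the parenthetic (th) signal, the sample line, and the P5-style choice/orig/reg elements are illustrative assumptions only, not the project's settled encoding.

<!-- hypothetical line as it stands, with the special character signalled parenthetically -->
<l>As (th)e scribe wrote it</l>

<!-- after the Transformer replace sequence: the Unicode character (thorn, U+00FE)
     and a normalized reading sit side by side, so either can be chosen at display time -->
<l>As <choice><orig>þ</orig><reg>th</reg></choice>e scribe wrote it</l>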
I've collated Ray's and Karin's responses here:
1. No, it's not a research playground. What's needed is to establish display principles for an electronic edition that, at this stage of the project, reads like a print edition of the same standard. The research playground is a step after this one.
The first step is to use CSS or XSL to display the encoded text in a way that not only captures, readably, the basic information one sees in a print text, but also begins to suggest the bibliographic detail that the encoding attempts to represent (a rough CSS sketch appears after these responses). The best model for this would be a print edition established using diplomatic editing principles with collation. (Later on we'll add things like notes and dynamic rendering.)
2. Users: people who want a good print text. This isn't the final end point, but it's a good mid-stage to aim for (and essential).
3 and 4: Rudimentary tools only; these will be focused on much later in the project.
5 and 6: Right now, Cara is exploring tools and strategies for the generation and representation of witnesses. Collate seems the most likely of the collation tools we will use, but, as casual conversation with you and others has suggested, we will probably also want to explore the TEI's scholarly apparatus markup (a sketch of what this might look like appears after these responses). Could we explore this more with you? Cara will likely be collating the texts within the month.
7. We would like to have a reading text before the end of term. But it's important to stress that this is a mid-stage development in a 2-3 year project lifespan.
8. We'd like to discuss this with you, but more generally we'd like to use open solutions.
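Two rough sketches related to the responses above. First, for the display step in response 1, this is the general sort of CSS one might use to render the TEI XML directly in a browser (linked via an xml-stylesheet processing instruction). It is not Karin's actual stylesheet; the element names are just common TEI ones, chosen for illustration.

/* hide the metadata */
teiHeader { display: none; }
/* poems, stanzas and headings as blocks */
div, lg, head { display: block; margin-top: 1em; }
/* one verse line per display line, lightly indented */
l { display: block; margin-left: 2em; }
/* deletions struck through, additions raised */
del { text-decoration: line-through; }
add { vertical-align: super; font-size: smaller; }

Second, for responses 5 and 6, this is the shape of the (P5-style) TEI critical apparatus markup I had in mind when mentioning the scholarly apparatus; the witness sigla and readings are invented for illustration, not taken from Cara's collation.

<listWit>
  <witness xml:id="D">Devonshire MS</witness>
  <witness xml:id="X">a second witness (placeholder)</witness>
</listWit>
<!-- within a poem: -->
<l>the <app><lem wit="#D">hert</lem><rdg wit="#X">herte</rdg></app> is set</l>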
Spent some time helping Karin with her preliminary CSS rendering of the Devonshire MS -- looks pretty good. There's now a formal proposal for a project to produce an online edition, which we kicked around a little in discussion and by email. This is what I need to know before I can start planning:
- Is this primarily a research playground, in which we investigate various methods of presenting a scholarly edition online, or does it have one set of clear aims known in advance? If it's a research thing, then we'll need to set aside quite a lot of time to do some background reading on the state of the art; Alan Galey will be a good person to talk to here, and we should also look at Stan Ruecker and his team's work.
- What are the main types of user we expect to access the text? (Lit scholars, undergrads, general readers, historians, etc.)
- What features does each of these user types want to see?
- What level of sophistication do we want in searches? For instance, are we going to undertake the creation of wordlists to handle stemming, or elaborate the markup with lemmas?
- We know there are other witnesses to some components of the text. What kind of data view do we imagine for presenting the variants?
- Will the multi-witness view be limited to the segments of the text to which it applies? For instance, if one particular poem has ten witnesses, will a collation view only be available when viewing that particular poem in isolation, or will a larger view of the continuous MS also have to present variants?
- Is there a time-line for the project? What deliverables are there, and what are their deadlines?
- Is there a requirement or a preference to use or avoid particular software, techniques or approaches?
I'd like to get Ray's detailed responses to these questions so that I can get down to a more detailed plan, assuming Scott approves the project.
This is essentially the same as the task described here, except that the XQuery ports have already been done. Katakana is no longer really relevant, but it's a great testbed for complicated XQueries that really stress the server and for range indexes, and it's a fine example project, so we should keep it live.
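As a reminder of what the range-index side involves, this is roughly the shape of a collection.xconf definition in eXist (stored under /db/system/config, mirroring the collection path). The element names below are placeholders rather than the Katakana app's actual markup, and the exact attribute syntax (path vs. qname) varies between eXist versions.

<collection xmlns="http://exist-db.org/collection-config/1.0">
  <index>
    <!-- a string-typed range index on a hypothetical element -->
    <create path="//entry/reading" type="xs:string"/>
  </index>
</collection>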
Presented the Image Markup Tool to John Lutz's History 481 class, and then floated while they did a practice markup task. Found one bug in the Web view output for the Markup Tool, documented on the IMT blog.