Customizing ODD

I've begun customizing the ISE3 ODD so that we can have a "standoff" element to store all of the database-like material that the other Endings projects have been putting in the teiHeader. It is based on the stdf proposal, but is less concerned with linguistic annotation.

Basically, the standoff element can contain members of model.listLike, plus listBibl (which is part of model.biblLike) and spanGrp (for annotations).

Note to self: Adding the standoff element (or any custom-namespace element) between the teiHeader and the text element requires adding that namespace to the @defaultExceptions attribute on the schemaSpec element.


Developing local search engine

One of our project goals is to investigate whether a local search "engine" can be built without any server-side support and, if so, how large a site can get before the approach becomes impractical. Today I did the first half of this work, with the Keats site as a pilot (because it's modern(ish) English, it's of a size which is not huge but not trivial, and it doesn't have any back-end and probably shouldn't). This is what I've got so far:

  • XSLT tokenizes all the content files, duplicating some bits to create simplistic weighting. I attempt to preserve proper names by retaining capitalization for all words which don't appear in a small English word-list (40,000 words).
  • A Python Porter stemmer stems all the non-proper-name tokens.
  • XSLT amalgamates all the token-counts and their source documents.
  • XSLT generates a separate JSON file for each token, containing a list of all documents containing it, and how many hits there are in that document.
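
The indexing steps above can be sketched in a few lines of Python. This is only a rough illustration and not the actual XSLT pipeline: COMMON_WORDS is a tiny hypothetical stand-in for the 40,000-word list, crude_stem is a placeholder that is NOT the Porter algorithm, the weighting-by-duplication step is left out, and the JSON layout ("token" and "docs" keys) is an assumption.

```python
import json
import re
from collections import defaultdict

# Hypothetical stand-in for the small English word-list (assumption).
COMMON_WORDS = {"the", "a", "of", "to", "on", "wrote", "odes", "autumn"}


def crude_stem(word):
    """Placeholder stemmer: NOT Porter, it only strips a few common suffixes."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Yield index tokens: proper names keep their case, other words are lowercased and stemmed."""
    for word in re.findall(r"[A-Za-z]+", text):
        if word[0].isupper() and word.lower() not in COMMON_WORDS:
            yield word  # capitalized and not in the word-list: treat as a proper name
        else:
            yield crude_stem(word.lower())


def build_index(docs):
    """Map each token to {document id: hit count}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token][doc_id] = index[token].get(doc_id, 0) + 1
    return dict(index)


def index_to_json_files(index):
    """Return {file name: JSON string}, one entry per token; writing to disk is left to the caller."""
    return {
        f"{token}.json": json.dumps({"token": token, "docs": hits}, sort_keys=True)
        for token, hits in index.items()
    }
```

Keeping one small file per token means the browser only ever has to fetch the files for the terms actually searched, which is what makes the approach workable without a server.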

Next, we write a search engine interface in which we use JavaScript to:

  • stem each search term (unless it's a proper name). I've found a JS implementation of the Porter stemmer.
  • retrieve the JSON files for each of the search terms
  • unify them to get hit counts for each individual document
  • display (paged?) results
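
The "unify" and paging steps might look something like the sketch below. It is written in Python purely for illustration (the real interface will be JavaScript), and it assumes each retrieved JSON file has already been parsed into a mapping from document id to hit count; the function names merge_hits and page are hypothetical.

```python
from collections import Counter


def merge_hits(token_hits):
    """Sum per-document hit counts across all search terms, best matches first."""
    totals = Counter()
    for hits in token_hits:
        totals.update(hits)  # Counter.update adds counts rather than replacing them
    # Sort by descending score, then by document id so that ties are stable.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


def page(results, page_number, page_size=10):
    """Return one page of results; page numbers start at 1."""
    start = (page_number - 1) * page_size
    return results[start:start + page_size]
```

Simply summing the counts treats every search term as equally important; anything cleverer (for example, favouring documents that match all the terms) would be a later refinement.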

Should be doable in a few hours.

Keats: developing local search engine

I'll document this on the Endings blog, because it's really part of that project.

Fixes and updates

Fixed a number of issues with the now-live site, including the removal of the old stats.htm page in favour of the new statistics.htm, and some tweaks and updates to documentation to take account of changes.

Corrections to Arthur text

Entered author's proofing corrections.


MDH: 401 + 1 = 402 hours G&T + 1.5 days CTO

Wrestling with whitespace in CSS.

Keats: solved a problem in documentation

This is a note to self: the TEI rendering of ODD to HTML documentation has screwed-up whitespace inside the egXML elements. The solution is to add this to the CSS:

  white-space: pre-line;

This should probably be added into the TEI Stylesheets, via a ticket. Took me a while to figure it out.

March 15 2018

Today I continued editing and marking up Pausanias 2. I created PITY2, PAUS3, EPIO1, ARES3, and (orgs) DORI6 and ARIC1. I also added ISCH1 as a son of ELAT1, because I read online that Ischys is the brother of Polyphemus, whom we had attributed to the man known as ELAT1; please update this if you come across contradictory information. I also added to my lists and worked up to 2.27.6.

Starting on Mariage documentation

I'm finally getting around to writing proper ODD-based documentation for the Mariage project, starting with a description of the use of Image Markup in the project, prompted by a request to talk about it. I've done an intro, and the section on the IMT and image markup in the project. More later...

March 11 - 15

James Albert Thompson seems to have two entries.
Have been working on Ontario. Down to ~3,100.

