28/06/17

Permalink 10:51:10 pm, by jtakeda, 126 words, 13 views   English (CA)
Categories: Activity log; Mins. worked: 1230

ISE3 Editor packages

Been busy trying to get everything finished before vacation, so I've been forgetting to blog. Summary of minutes below.

Editor packages should be good to go. Everyone has schemas, xprs, and tools in their project files, with the tools doing fairly decent lemma checking. I'm fairly confident that the ODD is good enough to start working with. I was able to get annotation and collation templates ready as well. I also added DJ's files from a few years ago to the repo and started chipping away at them; they're good and interesting TEI experiments that need a fair bit of wrangling to get into ISE3 TEI. A good exercise, though, since it's contributing greatly to the ODD and schematron.

Minutes summary: 23: 330min; 27: 360min; 28: 540min (360 in office)

26/06/17

Permalink 04:52:30 pm, by jtakeda, 77 words, 11 views   English (CA)
Categories: Activity log; Mins. worked: 630

ISE3: Editor package

Discussed issues regarding witnesses with JJ, MH, and MT today and hashed out exclusion/inclusion of witnesses. Sorted that out, got the ODD + taxonomies transform working and then I started work on a new SGML->TEI build that creates a full edition package for editors. Almost done--just need to create the XPR file for each package. Next will be the editor build tools. Edit: Worked an extra 180 minutes at home getting editor build tools up and running.

22/06/17

Permalink 11:28:55 pm, by jtakeda, 138 words, 16 views   English (CA)
Categories: Activity log; Mins. worked: 390

ISE3

Finished documenting the lemma checker and discussed editor tools with MT. Also continued work on the ODD and discussed best linking practice in terms of docs and TLNs. MT and I decided on a few prefixDefs, the main ones being:
  • ident="doc" | matchPattern="(.+)(#.+)?" | replacementPattern="http://ise3.uvic.ca/$1$2". This gets us around TLNs not yet having explicit xml:ids (which we also decided will only be local to the document, not project-wide).
  • ident="tln" | matchPattern="(.+)" | replacementPattern="iseH5_FM.xml#tln-$1". This will only be used in the context of @to and @from in the apparatus files. The prefix will be defined for each apparatus/collation document so that the link will refer to a specific TLN in the text file (i.e., iseH5_FM will change).
  • And simple ones to refer to people, document types, and glyphs
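
A minimal sketch of how the two main prefixDefs above would expand a pointer, written in Python rather than in the project's XSLT. The helper name resolve_pointer and the sample pointers are hypothetical; the patterns and replacement strings are the ones listed above.

```python
import re

# Prefix definitions as listed in the post. The tln target file name
# (iseH5_FM.xml) is the post's example and would differ per document.
PREFIX_DEFS = {
    "doc": (r"(.+)(#.+)?", "http://ise3.uvic.ca/$1$2"),
    "tln": (r"(.+)", "iseH5_FM.xml#tln-$1"),
}


def resolve_pointer(pointer: str) -> str:
    """Expand a prefixed pointer such as 'tln:123' (hypothetical helper)."""
    prefix, sep, rest = pointer.partition(":")
    if not sep or prefix not in PREFIX_DEFS:
        return pointer  # not one of our prefixes; leave untouched
    match_pattern, replacement = PREFIX_DEFS[prefix]
    match = re.fullmatch(match_pattern, rest)
    if match is None:
        return pointer
    # TEI replacement patterns use $1, $2; Python templates use \g<1>, \g<2>.
    template = re.sub(r"\$(\d)", r"\\g<\1>", replacement)
    return match.expand(template)


print(resolve_pointer("tln:123"))
print(resolve_pointer("doc:iseH5_FM#tln-10"))
```

Because (.+) is greedy, the first group in the doc pattern swallows any #fragment and the second group stays empty; the expanded URL comes out the same either way.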

21/06/17

Permalink 04:31:13 pm, by jtakeda, 38 words, 37 views   English (CA)
Categories: Activity log; Mins. worked: 375

Lemma matching

Revamped lemma matching code to make it much more efficient. It has now been split into 3 XSLTs, chained together in a build process. The lemma match function has also been rewritten so that it reports errors more accurately.

Permalink 04:17:07 pm, by mholmes, 14 words, 15 views   English (CA)
Categories: Activity log; Mins. worked: 120

Discussions on lemma matching

Posting time spent discussing lemma matching and milestone insertion over the past two days.

20/06/17

Permalink 10:53:45 pm, by jtakeda, 236 words, 27 views   English (CA)
Categories: Activity log; Mins. worked: 450

Lemma matching and ODD tutorial

Met with MH and MT about ODD creation. We've decided that we're going to try to do most of the documentation in the ODD itself and create ISE-TEI guidelines from the standard transform (wrapped in the ISE's styling). Most of the day was spent working on the apparatus matching code, which has preoccupied my thoughts for a while. I have an XSLT in the Git repo that matches lemmas and seems to be working; it's finding errors that are truly errors (incorrect ranges, bad characters, etc.). The process is:
  • Tokenize the entire source text in 'c' elements with generated @xml:ids
  • Look at a TLN and see if we can find the right following characters that string together the proper phrase
  • If there's a match, add it to the @to/@from attributes in the span/app (depending on the context)
  • Then, in a final pass, get rid of all the c elements and add anchors if there is an apparatus entry that references the character xml:id
There's a lot of working with preceding nodes and ensuring characters are following the right TLN, and all the nodes are being processed twice (first to find the beginning anchor and then again to find the ending anchor). This isn't the most efficient approach, but I think it will work out well. The next step is to integrate this into an Editor build tool as a diagnostic.
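
The first three steps of that process can be sketched roughly as follows. This is a simplified Python illustration, not the XSLT in the repo: the function names, the id format (c1, c2, ...), and the sample lines are assumptions, and the final pass that drops the c elements and inserts anchors is left out.

```python
from typing import Dict, List, Optional, Tuple

Token = Tuple[str, str, str]  # (generated id, character, TLN it belongs to)


def tokenize(lines: Dict[str, str]) -> List[Token]:
    """Give every character of every TLN's text a generated id."""
    tokens: List[Token] = []
    for tln, text in lines.items():
        for ch in text:
            tokens.append((f"c{len(tokens) + 1}", ch, tln))
    return tokens


def match_lemma(tokens: List[Token], tln: str, lemma: str) -> Optional[Tuple[str, str]]:
    """Return (from_id, to_id) for a lemma that begins in the given TLN, else None."""
    if not lemma:
        return None
    chars = "".join(ch for _, ch, _ in tokens)
    in_tln = [i for i, (_, _, t) in enumerate(tokens) if t == tln]
    if not in_tln:
        return None  # unknown TLN
    pos = chars.find(lemma, in_tln[0])
    if pos == -1 or pos > in_tln[-1]:
        return None  # lemma does not start in this TLN: report as an error
    return tokens[pos][0], tokens[pos + len(lemma) - 1][0]


# Hypothetical sample text keyed by TLN.
tokens = tokenize({"1": "Who's there?", "2": "Nay, answer me."})
print(match_lemma(tokens, "1", "there"))
print(match_lemma(tokens, "1", "answer"))
```

The returned pair stands in for the values that would go into @from and @to; a None result corresponds to the kind of error (wrong range, bad characters) the checker is meant to surface. Lines are joined without separators here, which is a simplification.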

19/06/17

Permalink 04:40:48 pm, by jtakeda, 85 words, 17 views   English (CA)
Categories: Activity log; Mins. worked: 435

Annotations and collations

Long day with lots of work done on annotations and collations. We have a fairly solid structure up and running. ise:rdg/@resp values are becoming tei:rdg/@wit values that point to a tei:witness/@xml:id in the header of the document; the witness in turn uses @corresp to point to a series, etc. Annotations have become a list of tei:notes with spans, glosses, and other notes. Still have to figure out iembeds, which will be worthwhile when converting the XWiki documents. This is all coming together nicely.

17/06/17

Permalink 05:42:35 pm, by jtakeda, 54 words, 18 views   English (CA)
Categories: Activity log; Mins. worked: 390

ISE3 mtg (JJ, MT, DJ)

Had a whole-day meeting with JJ, MT, and DJ to go over ISE3 implementation. Lots of great stuff discussed, most of which is documented in Asana and the GitHub repo. Good headway made with annotations and collations, too; I think we've discussed them enough to start writing some code that processes them.

16/06/17

Permalink 04:16:53 pm, by jtakeda, 69 words, 29 views   English (CA)
Categories: Activity log; Mins. worked: 160

Work on Annotations

Talking at length with MT about annotations and collations. He gave me a good run-through of how apparatus work and we made some progress thinking about how it will be implemented. We're still not sure about which method of annotation is best for the project, particularly since we're sort of wedded to string-matching. We've been wading through the TEI guidelines trying to find the most appropriate method for attachment.

15/06/17

Permalink 09:10:04 pm, by jtakeda, 105 words, 29 views   English (CA)
Categories: Activity log; Mins. worked: 795

ISE3 Conversion (420 + 375)

Post for June 14 and 15: working on ISE3 TEI conversion. Long time spent trying to deal with Unicode characters that were being garbled in OSX; solved by creating a small XSLT for conversion that uses xsl:analyze-string to tokenize each character and fn:string-to-codepoints to check whether the character should be escaped. Seems to be working well now. Discussed file structure with JJ and MT and came to this conclusion: each edition gets its own folder (named with the ISE work id without the 'ise' prefix) that has documents, etc. Note: we are not future-proofing the ISE to handle more than one edition per work.
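
The character-escaping idea can be illustrated in Python as below. This is only a sketch of the technique, not the XSLT itself: the function name is hypothetical, and the rule used here (escape anything outside ASCII as a hexadecimal character reference) is an assumption, since the post does not say which code points were escaped.

```python
def escape_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with a hexadecimal character reference."""
    out = []
    for ch in text:
        codepoint = ord(ch)  # same role as fn:string-to-codepoints on one character
        # Assumption: only code points above the ASCII range need escaping.
        out.append(ch if codepoint < 128 else f"&#x{codepoint:X};")
    return "".join(out)


print(escape_non_ascii("caf\u00e9"))  # the last character is U+00E9
```

Working one character at a time mirrors the tokenize-then-check approach described above, at the cost of some speed on large files.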


ISE

Internet Shakespeare Editions
