Fall 2019 LEMDO Project Plan
Team: RAs at UVic (Ashley Howard, Kate LeBere, Chris Horne); RA at UNR (Sarah Johnson); Programmer (Joey Takeda); Director (Janelle Jenstad)
Project Title: Remediation and Documentation
Outcomes
- Some ISE editions will be fully remediated and available on LEMDO (by RAs).
- One DRE edition will be fully remediated and available on LEMDO (by JJ).
- One QME edition will be fully remediated and available on LEMDO so that we can resolve performance notes encoding and video embedding for the LEMDO platform (by JT).
Objectives
- Clean up six editions.
- Clean up the metadata for all the files in those six editions.
- Get the digital surrogates for those six editions (and metadata for the digital surrogates) into UVic ContentDM.
- Republish those six editions on LEMDO.
- Finish the documentation for all the parts of the edition.
Deliverables
See Asana LEMDO-Tech space for details.
Tasks
See Asana LEMDO-Tech space for details.
Texts
- JJ does 1HW
- JT does FV
- AH does AYL
- CH does H5
- KLB does Oth
- SJ does TBA
Process
- We all remediate one or more OS texts. Pick a facsimile from the ones available, document the choice in the metadata, and make sure the Library adds each one to ContentDM.
- JJ and AH finish the documentation for OS texts.
- We all remediate one or more M texts.
- JJ and AH finish the documentation for M texts.
- We all remediate an annotation file.
- JJ, AH, and JT finish the documentation for annotations.
- We all remediate a collation file.
- JJ, AH, and JT finish the documentation for collations.
- We all remediate one or more critical paratext files (secondary). This task will entail cleaning up bibliographies and adding entries to BIBL1.
- JJ, AH, and JT finish the documentation for critical paratexts.
- We all remediate one or more supplementary texts (hybrid primary/secondary).
- JJ, AH, and JT finish the documentation for supplementary texts.
Necessary Information
Types and Origins of Texts to Be Remediated
- OS: Old-spelling transcriptions of primary texts (from IML or EEBO-TCP)
- M: Modern-spelling texts of primary texts (from IML)
- Critical paratexts (from xWiki or from the old ISE, QME, and DRE sites)
- Annotations (XML files written in an IML-XML hybrid)
- Collations (XML files written in an IML-XML hybrid)
- Supplementary materials (from xWiki or from the old ISE, QME, and DRE sites). These texts were treated as critical paratexts in the ISE Platform; LEMDO treats them as primary texts.
- Title pages from QME (already in good TEI; need LEMDO teiHeader)
Remediation Processes
There are four ways in which we convert texts for LEMDO.
- IML to LEMDO TEI-XML: OS and M texts.
IML files are stored as .txt files.
JT and/or MH run multiple conversion/cleanup cycles to produce rough LEMDO TEI.
RAs clean up the TEI and check the accuracy of the transcription simultaneously.
- EEBO-TCP TEI-XML P5 texts to LEMDO TEI-XML: OS texts only.
We get EEBO-TCP files from GitHub. They are already encoded in TEI P5.
JT runs a conversion process that produces relatively good LEMDO TEI.
RAs clean up the TEI and check the accuracy of the transcription simultaneously.
- xWiki custom HTML to LEMDO TEI: Secondary texts, About pages, Paratextual materials
RAs clean up the files, with varying amounts of copyediting and checking. Generally, the editor of the edition and/or the Anthology Lead will be involved with checking.
RAs encode the files using the LEMDO tagset for born-digital files.
Some texts were programmatically converted.
We might have to copy and paste the text into our own XML files.
- XML to TEI-XML conversion: Annotations and Collations
These files are stored as .xml files. They are encoded in a boutique IML-XML language.
JT does a conversion that embeds pointers to these annotations and collations (about 60% successful with the best texts).
RAs check the pointers.
RAs manually add pointers for annotations and collations that could not be resolved programmatically.
RAs clean up the encoding of the annotations/collations.
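The pointer check that RAs do after conversion could be partly scripted. The sketch below is illustrative only and is not the project's actual tooling. It assumes, hypothetically, that each annotation is a TEI note element whose target attribute holds "#" plus the xml:id of an element in the text file. The sample data, element choices, and function names are invented for the example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def collect_ids(text_xml):
    """Return the set of every xml:id value found in the text file."""
    root = ET.fromstring(text_xml)
    return {el.attrib[XML_ID] for el in root.iter() if XML_ID in el.attrib}


def unresolved_pointers(annotation_xml, known_ids):
    """Return the targets of notes that do not resolve to a known xml:id."""
    root = ET.fromstring(annotation_xml)
    missing = []
    for note in root.iter("{%s}note" % TEI_NS):
        target = note.get("target", "")
        # Assumed convention (hypothetical): "#some-id" points into the text.
        if not target.startswith("#") or target[1:] not in known_ids:
            missing.append(target)
    return missing


# Hypothetical sample data standing in for a text file and an annotation file.
TEXT = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<l xml:id="l1">First line</l><l xml:id="l2">Second line</l>
</body></text></TEI>"""

NOTES = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<note target="#l1">resolves</note>
<note target="#l9">does not resolve</note>
<note>has no pointer yet</note>
</body></text></TEI>"""

if __name__ == "__main__":
    # Notes listed here would still need a pointer added or fixed by hand.
    print(unresolved_pointers(NOTES, collect_ids(TEXT)))
```

A report like this would only flag pointers that are missing or do not resolve. Whether a resolved pointer lands on the right passage still needs an RA's judgment.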