For thorough documentation on how to encode primary sources, see Encode a Primary Source Transcription. The following documentation explains how we encode semi-diplomatic transcriptions
of primary source texts—in particular, the semi-diplomatic transcriptions housed in
MoEML’s library. The purpose of these revised guidelines is to 1) standardize the encoding of our
library texts, and to 2) limit and simplify the CSS required to adequately render
these texts.
In our library texts we encode:
Front matter (<titlePage>)
Textual gaps (<supplied>)
Page breaks with linked facsimile images (<pb>)
Woodcut images (<figure>)
Foreign words (<foreign>)
Dates, names, organizations, and toponyms (<date>, <name>, <ref>)
In our library texts we do not encode:
Line beginnings throughout prose (<lb>)
Formeworks (i.e., running titles, signatures, and catchwords)
Last-word wraps
Printer’s ornaments or line rulings
To lessen the amount of time spent on CSS, we have created a set of standard renditions
for our library texts. Within <tagsDecl>, there are standardized renditions for headings, dropcaps, indents, and marginal labels.
Semi-diplomatic transcriptions are transcriptions of texts that are not modernized
or corrected for clarity. These transcriptions are not as strict as facsimile transcriptions
which attempt to replicate the exact layout of the page. Rather, our goal is to normalize
and regularize the features of the text that cannot be adequately captured through
encoding (e.g., spacing, font-size, typographical ligatures) while retaining other
significant features such as spelling, punctuation, abbreviations, and typographical
errors. Our conventions for semi-diplomatic transcriptions can be found here. In summary, we:
Silently normalize the long ſ
Silently expand typographical ligatures (e.g., fl)
Preserve capitalization, italicization, interchangeable characters (i.e., u/v, i/j,
vv/w), vowel digraphs (i.e., æ, œ), nasal tildes over vowels (i.e., ã, ẽ, ĩ, õ, ũ),
macrons over vowels (i.e., ā, ē, ī, ō, ū), and quotation marks
Close up extra spaces between words and punctuation marks
Preserve the line breaks in verse but not in prose
If you run across a unique character while transcribing, you may be able to find it
as a Unicode character. For example, note the fleuron in this heading:
See The Magnificent Entertainment sig. A3r for this image in context.
In this case, the encoder can use the Unicode character U+2767:
<head>❧ A DEVICE (projected downe, but till now not <hi style="font-style:italic;">publisht) that should have served at his Maiesties first accesse to the Citie</hi>.</head>
If you cannot find an appropriate Unicode character for the character you need to
transcribe, bring it to the MoEML team so a protocol can be established.
A common non-standard character that appears in early modern texts is a thorn (þ)
that looks like a small Latin letter y with a reversed hook above:
See The Queen’s Majesty’s Passage sig. A4v for this image in context.
If you run across this character in your text, you will need to add a <charDecl> to your document. General information about encoding non-standard characters can
be found here. Since we have already written a <char> for this particular figure, all you need to do is paste the following <char> into the <charDecl> of your document:
<char xml:id="QMPS1_ye">
  <localProp name="name" value="LATIN SMALL LETTER Y WITH REVERSED HOOK ABOVE"></localProp>
  <desc>An abbreviated form of <mentioned>the</mentioned>. This character takes the form of a small latin letter y with a reversed hook above. The closest Unicode character we have to represent this is a small latin letter y with a combining left half ring above. This character appears twice in the text, which is in black letter gothic.</desc>
  <localProp name="entity" value="yesup"></localProp>
  <mapping type="standard">y͑</mapping>
  <mapping type="simplified">ye</mapping>
  <mapping type="medieval">þe</mapping>
  <mapping type="modern">the</mapping>
</char>
Make sure to change the xml:id on <char> to match your document and update the prose to reflect how many times the character
appears throughout the text. When you come across this character in the text, transcribe
it as y͑ (regular y + U+0351) and wrap it in a <g> element whose @ref attribute points to the xml:id of your <char> (e.g., "#QMPS1_ye"):
<p>eche cōteining <g ref="#QMPS1_ye">y͑</g> title of those two princes. And these personages wer so set, <g ref="#QMPS1_ye">y͑</g> the one of thē ioyned han-</p>
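To see how the declaration and its uses relate, here is a minimal sketch of where each one lives in a file. It assumes the standard TEI layout, in which <charDecl> is a child of <encodingDesc> in the header. The elided parts and the shortened paragraph are placeholders, not a complete MoEML document.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <!-- fileDesc and other header content elided -->
    <encodingDesc>
      <charDecl>
        <!-- the full <char xml:id="QMPS1_ye"> declaration goes here -->
      </charDecl>
    </encodingDesc>
  </teiHeader>
  <text>
    <body>
      <!-- each <g> points back to the declaration through @ref -->
      <p>eche cōteining <g ref="#QMPS1_ye">y͑</g> title of those two princes.</p>
    </body>
  </text>
</TEI>
```

The "#" in @ref marks a pointer to an xml:id in the same file, so the value after it must match the xml:id on your <char> exactly.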
While transcribing early modern texts, you will likely stumble across something that
does not lend itself well to encoding. It is up to the MoEML team to decide on a case-by-case basis how these irregularities should be encoded. In
our library texts, we want to avoid the use of extensive in-line CSS. For example,
note this passage from The Magnificent Entertainment:
See The Magnificent Entertainment sig. E4r for this image in context.
Now see how it was transcribed:
<p><name ref="mol:AGLA1" style="font-style:italic;">Aglaia</name>, <name ref="mol:THAL1" style="font-style:italic;">Thalia</name>, <name ref="mol:EUPH1" style="font-style:italic;">Euphrosine</name>, } Figuring { Brightnesse, or Maiestie. Youthfulnes, or florishing. Chearfulnes,
or gladnes.</p>
This encoding avoids the use of extensive CSS while preserving the author’s intent.
When you are transcribing a library text—especially if you are working from an EEBO
TCP transcription—you will need to supply textual gaps. Our documentation on how to
supply gaps can be found here.
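As a minimal sketch of what a supplied reading looks like, assuming TEI's <supplied> element with its optional @reason attribute: the gap below is invented for illustration (the wording is borrowed from the title page of The Magnificent Entertainment quoted later in this guide), and the attribute value is an assumption rather than MoEML's documented practice.

```xml
<!-- hypothetical gap: the letters "tie" are imagined to be unreadable in the source -->
<p>through his Honourable Ci<supplied reason="illegible">tie</supplied> (and Chamber) of London</p>
```

Consult the linked documentation for the attributes MoEML actually requires on <supplied>.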
If you are only encoding an excerpt from a primary source text, it is unlikely that
you will need to encode front matter. If you are encoding a full text that includes
a title page and other preliminaries (i.e., a dedicatory epistle, a letter to the
reader, an introduction), you will want to nest this information in <front>. Our documentation on how to encode front matter can be found here. Here is the title page from A Remembrance of the Worthy Show and Shooting by the Duke of Shoreditch without entity tagging or styling:
<front>
  <pb facs="https://search.proquest.com/eebo/docview/2240956608/pageLevelImage/?imgSeq=25" n="D1r" xml:id="REME2_sig_D1r"/>
  <titlePage>
    <docTitle>
      <titlePart type="main">A REMEMBRANCE Of the worthy SHOW and SHOOTING BY THE DUKE of SHOREDITCH, AND HIS ASSOCIATES THE Worshipful Citizens of London, UPON Tuesday the 17th of September, 1583.</titlePart>
      <titlePart type="desc">Set forth according to the Truth thereof, to the everlasting Honour of the Game of Shooting in the Long bow.</titlePart>
    </docTitle>
    <docAuthor>By W. M.</docAuthor>
    <docImprint>London, Printed in the Year 1682.</docImprint>
  </titlePage>
</front>
Note that we do not use <lb> elements to add padding between lines. Guidelines on how to style title pages with
standardized renditions can be found below. All text that appears after the title page and other preliminaries should be nested
within <body>.
After you have encoded the basic structure of your text, you will need to add page
breaks. In our library texts, we:
Mark all page breaks with the <pb> element
Link to the facsimile image of each page with a @facs attribute on the <pb> element
Note each page’s signature number with an @n attribute on the <pb> element
Add an xml:id to each <pb> element so we can create links to specific pages throughout the website
Our documentation on how to link to facsimile pages on EEBO and EEBA can be found
here. Once you have linked to the correct facsimile image, you will need to add an @n attribute with the page’s signature number and an @xml:id attribute with the page’s xml:id (i.e., xml:id of the text + sig + signature number of page):
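The <pb> that opens the title page of A Remembrance of the Worthy Show and Shooting, quoted in the front matter example in this guide, appears to follow this pattern. The breakdown in the comment is a reading of that one example, not an official template.

```xml
<!-- xml:id = xml:id of the text (REME2) + "_sig_" + signature number (D1r); @n repeats the signature number -->
<pb facs="https://search.proquest.com/eebo/docview/2240956608/pageLevelImage/?imgSeq=25" n="D1r" xml:id="REME2_sig_D1r"/>
```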
If the text you are encoding is a broadside, it will not have any signature numbers.
To exclude broadsides from our diagnostic that requires an @n attribute on all <pb> elements (see our diagnostics here), you will need to give them the mdt category mdtPrimarySourceLibraryBroadside. More information about document categories can be found here.
In our library texts we no longer encode printer’s ornaments or line rulings. If there
is a woodcut image in the text, you can use <figure> to describe the image. For example, see the description given of this woodcut in The Great Boobee:
See The Great Boobee for this image in context.
<figure>
  <figDesc>Woodcut of a traveller with black hat, satchel, and walking stick being approached by man in black clothes and cape, with ruffled white cuffs and prominent white collar. Both men are bearded with moustaches. The pair appear on a white background, with shaded ground beneath their feet.</figDesc>
</figure>
Currently our standard page width is "34em", which allows for easy reading. In CSS, you can use absolute length units (e.g.,
cm, mm) or relative length units (e.g., em, rem) to describe length. Relative length
units specify their length in relation to another length property. We use relative
length units at MoEML because they scale better when aspects of rendering—such as
browser size—change. An em specifies its length in relation to font-size. "34em", therefore, means that the page width will be 34 times the size of the font. In the
future, set page widths may be created for different book sizes (i.e., folio, quarto,
octavo, broadside).
A quick way to add simple styling to your text is with the CSS property "text-align". This property can be used to align text to the left, right, or center and is particularly
useful when styling lines of text that are not headings:
<l style="text-align: center;">Vnicus à Fato surgo non Degener Hæres.</l>
While we want to use CSS to describe how a text looks, we do not want to add CSS that
takes a lot of guesswork and tweaking on the part of the encoder. Our overriding concern
when encoding primary source texts is to tell the truth. For example, we cannot discern exactly how many ems an indent or dropcap may be, especially when we are working with scans of facsimiles.
Standardized renditions, therefore, provide a quick way to style the main components
of a text (i.e., headings, dropcaps) for the reader.
Below is the <tagsDecl> from The Magnificent Entertainment:
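As a rough sketch of the shape such a declaration takes, assuming TEI's <rendition> element with scheme="css" inside <tagsDecl>: the xml:ids follow the rendition names used in this guide, but every CSS value is a placeholder assumption, not MoEML's actual styling, apart from the italic marginal labels, which the guide describes. Only the six renditions named in this guide are shown.

```xml
<encodingDesc>
  <tagsDecl>
    <!-- CSS values below are illustrative placeholders, not MoEML's real values -->
    <rendition xml:id="MAGN3_mainHead" scheme="css">font-size: 1.5em; text-align: center;</rendition>
    <rendition xml:id="MAGN3_subHead" scheme="css">font-size: 1.2em; text-align: center;</rendition>
    <rendition xml:id="MAGN3_indentedLine" scheme="css">margin-left: 1em;</rendition>
    <rendition xml:id="MAGN3_indentedLineExtra" scheme="css">margin-left: 2em;</rendition>
    <!-- the guide notes that font-style: italic is included for this text's marginal labels -->
    <rendition xml:id="MAGN3_lmlabel" scheme="css">font-style: italic;</rendition>
    <rendition xml:id="MAGN3_rmlabel" scheme="css">font-style: italic;</rendition>
  </tagsDecl>
</encodingDesc>
```

Each rendition is then referenced from the text with a @rendition attribute whose value is "#" plus the xml:id, as in the examples that follow.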
Note that there are currently seven standardized renditions. These renditions should
be pasted into the <tagsDecl> of all future library texts. If you believe that a rendition should be tweaked or
a new rendition should be created, bring your proposal to the MoEML team.
There are two different renditions for headings: "mainHead" and "subHead". The "mainHead" rendition is used for substantial titles and headings and the "subHead" rendition is used for less substantial titles and subheadings. If your text does
not have a title page, it is likely that you will use "mainHead" to style the title. If your text does have a title page, you can use "mainHead" and "subHead" to style individual <titlePart> elements, depending on what the text calls for:
<docTitle>
  <titlePart rendition="#MAGN3_mainHead" type="main">THE MAGNIFICENT Entertainment:</titlePart>
  <titlePart rendition="#MAGN3_subHead" type="desc">Giuen to <name ref="mol:JAME1">King <hi style="font-style:italic;">Iames</hi></name>, <name ref="mol:ANNE2">Queene <hi style="font-style:italic;">Anne</hi></name> his wife, and <name ref="mol:HENR9" style="font-style:italic;">Henry Frederick</name> the Prince, vpon the day of his Maiesties Trvumphant Passage (from the <ref target="mol:TOWE5">Tower</ref>) through his Honourable Citie (and Chamber) of <ref target="mol:LOND5" style="font-style:italic;">London</ref>, being the <date when-custom="1603-03-15" datingMethod="mol:julianSic" calendar="mol:julianSic">15. of March. 1603</date>.</titlePart>
</docTitle>
Note that "mainHead" and "subHead" can be used throughout a text. In Nine Worthies of London, substantial headings appear every time a worthy is introduced:
See Nine Worthies of London sig. B3r for this image in context.
The differing font sizes of the two lines of this heading (“Sir William Wallworth Fishmon-” versus “er, sometime Maior of London”) demonstrate the type of CSS styling that we do not want to do in our library texts.
In this case, we would style the entire heading with "mainHead":
<head rendition="#NINE2_mainHead">Sir William <hi style="font-style: italic;">Wallworth</hi> Fishmonger, sometime Maior of London.</head>
Indents appear throughout the library texts in both poetry and prose. The "indentedLine" rendition is most often used on the <p> element to indent paragraphs that do not begin with a dropcap. The "indentedLineExtra" rendition is often used in conjunction with the "indentedLine" rendition to indent poetry. As the names suggest, "indentedLineExtra" will indent a line more than "indentedLine". A good example of these renditions being used together is this poem in The Magnificent Entertainment:
See The Magnificent Entertainment sig. F2r for this image in context.
<lg>
  <l><hi style="font-style:italic;">Troynouant</hi> is now no more a Citie:</l>
  <l rendition="#MAGN3_indentedLineExtra">O great pittie! is’t not pittie?</l>
  <l rendition="#MAGN3_indentedLine">And yet her Towers on tiptoe stand,</l>
  <l rendition="#MAGN3_indentedLine">Like Pageants built on Fairie land,</l>
  <l rendition="#MAGN3_indentedLineExtra">And her Marble armes,</l>
  <l rendition="#MAGN3_indentedLineExtra">Like to Magicke charmes,</l>
  <l rendition="#MAGN3_indentedLine">binde thousands fast vnto her,</l>
  <l>That for her wealth &amp; beauty daily wooe her,</l>
  <l rendition="#MAGN3_indentedLine">yet for all this, is’t not pittie?</l>
  <l><hi style="font-style:italic;">Troynouant</hi> is now no more a Cittie.</l>
</lg>
The renditions "lmlabel" and "rmlabel" are used to style the marginal labels that appear in many early modern texts. Note
that in The Magnificent Entertainment’s <tagsDecl>, font-style: italic is included within the rendition since most of the text’s marginal labels are italicized.
If most of the marginal labels in your text are not italicized, font-style: italic can be removed.
These renditions are placed directly on the <label> element:
<lg>
  <l>And put his bounty off with a demurre.</l>
  <label place="margin-left" rendition="#COLD2_lmlabel">* An vnconscionable Broker.</label>
  <l>The third a Broker*, a base Houndsditch hound,</l>
</lg>
Note that we differentiate between typefaces within a text, but not from text to text. For example, note how we encode Roman typeface in a mostly Blackletter Gothic
text:
See The Queen’s Majesty’s Passage sig. B1r for this image in context.
<p>wreathe was written the name, and title of the same, which was. <hi style="font-style: italic;">The vniting of the two howses of Lancastre and Yorke</hi>. Thys pageant was grounded vpon the Queenes maiesties name.</p>
Now note how we encode italics in a mostly Roman text:
See The Cold Tearm for this image in context.
<lg>
  <l>Then there past Wherries in a month and more,</l>
  <l>’Twixt <hi style="font-style: italic;">Essex</hi>, <hi style="font-style: italic;">Middl’sex</hi>, <hi style="font-style: italic;">Kent</hi> and <hi style="font-style: italic;">Surry</hi> shore.</l>
  <l>And though for two mon’ths time, that fell together,</l>
</lg>
In both cases, we use the <hi> element, @style attribute, and "font-style:italic" value to mark the change in typeface.
While superscripts are not very common in early modern texts, you may stumble across
some that need to be styled. Many superscripts appear throughout The Praise and Virtue of a Jail and Jailers:
See The Praise and Virtue of a Jail and Jailers sig. 2M1v for this image in context.
We would style this superscript as follows:
<lg>
  <l>That it in History is not enrold.</l>
  <l>And <hi style="vertical-align:super; font-size: 50%;">h</hi> Woodstreet Counters age we may deriue,</l>
  <l>Since Anno fifteene hundred fifty fiue.</l>
</lg>
We tag all dates, names, organizations, and toponyms in our library texts. For a brief
overview of how to tag these entities, see Tagging Dates, Companies, Toponyms, and People. While this quickstart is directed at those encoding Survey of London, the principles are the same for those encoding library texts.
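Drawing only on tags that already appear in the title page of The Magnificent Entertainment quoted in this guide, entity tagging looks like this in practice. The comments give a reading of that example; what each mol: pointer resolves to is not spelled out here, and no organization is tagged in that example, so none is shown.

```xml
<titlePart>
  <!-- person: <name> with a @ref pointer -->
  <name ref="mol:JAME1">King <hi style="font-style:italic;">Iames</hi></name>
  <!-- toponym: <ref> with a @target pointer -->
  <ref target="mol:TOWE5">Tower</ref>
  <!-- date: <date> with the attributes used in the title-page example -->
  <date when-custom="1603-03-15" datingMethod="mol:julianSic" calendar="mol:julianSic">15. of March. 1603</date>
</titlePart>
```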
Provider: University of Victoria
Database: The Map of Early Modern London
Content: text/plain; charset="utf-8"
TY - ELEC
A1 - LeBere, Kate
ED - Jenstad, Janelle
T1 - Encode a Library Text
T2 - The Map of Early Modern London
ET - 7.0
PY - 2022
DA - 2022/05/05
CY - Victoria
PB - University of Victoria
LA - English
UR - https://mapoflondon.uvic.ca/edition/7.0/encode_library_text.htm
UR - https://mapoflondon.uvic.ca/edition/7.0/xml/standalone/encode_library_text.xml
ER -