Scholarly contribution

Conventions for Diplomatic Transcriptions

Regularization Practices

Our practice is to preserve most of the typographical, orthographical, and compositorial features of the original text. To do this, our transcribers, editors, and encoders follow these conventions:
Textual Component: Rule
Long ſ: We preserve the long ſ.
Capitalization: We preserve the capitalization of characters in the source, including the second upper-case letter after a woodblock dropped capital.
Italicization: We preserve the italicization of words by tagging them with a <hi> element whose @style value is "font-style: italic;". We consider italicization to be a bibliographical code rather than a linguistic code (see note 1).
Interchangeable Characters: We retain the interchangeable u/v and i/j and the use of vv for w.
Ligatures: We retain the vowel digraphs using the appropriate Unicode characters (e.g., æ). We silently expand typographical ligatures (e.g., ﬂ is transcribed as fl).
Nasal Tildes: We retain the nasal tilde over vowels (e.g., õ) using the appropriate Unicode characters.
Spacing Within Lines: We close up extra spaces between words and punctuation marks. However, we retain the spacing in authorial initials, such as A. M. (for Anthony Munday). We add a single space after a comma when the comma has been used to separate two words.
Lineation: We preserve the line breaks in verse sections. We also preserve the line wrapping in the prose sections of some works in our library, principally the mayoral pageant books. Prose line breaks are encoded with a self-closing <lb/> element. Verse line breaks are not encoded separately: each verse line is an <l> element contained by an <lg> element.
Hyphenation: We preserve the hyphenation of words, both within and at the end of lines.
Quotation Marks: We retain all quotation marks in the text using the appropriate Unicode characters. We do not use the <quote> element for quotations in primary-source texts. Note that the MoEML Guide to Editorial Style calls for curly apostrophes and straight double quotation marks in both transcriptions and born-digital texts.
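
The encoding rules for italicization, lineation, and hyphenation can be sketched in a short TEI fragment. This is an illustration only, not an excerpt from a MoEML transcription: the wording is invented, the wrapping <div> and <p> elements are assumptions, and the placement of <lb/> at the end of each source line is one possible reading of the rule rather than a documented requirement.

```xml
<!-- Hypothetical sample text; only hi/@style, lb, lg, and l come from the conventions described here. -->
<div>
  <!-- Prose with preserved line wrapping: one lb element per source line break. -->
  <!-- The end-of-line hyphen in "Tri-umph" is kept, as are the long s, u for v, and vv for w. -->
  <p>The ſhevv vvas called <hi style="font-style: italic;">Londons</hi> Tri-<lb/>umph, deuiſed by A. M.<lb/></p>
  <!-- Verse: each line is an l element inside an lg element, so no lb is needed. -->
  <lg>
    <l>Nor can the Citie boaſt a fairer day,</l>
    <l>Then vvhen her Maior paſſeth on his vvay.</l>
  </lg>
</div>
```

In this sketch the italic word carries only a @style value, which matches the treatment of italicization as a bibliographical rather than a linguistic code, and the spacing in the initials A. M. is left as it would appear in the source.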

Notes

  1. For definitions of bibliographical code and linguistic code, see Encode a Primary Source Transcription.