An Encoding Model for Librettos: the Opera Liber DTD

Elena Pierazzo
pierazzo@ital.unipi.it
University of Pisa

Opera librettos are a very peculiar literary genre. Often considered an ancillary part of the opera, merely the plot through which the music can express its power and beauty, or a pretext for singers to show their abilities and the potential of their voices, the libretto is a little-studied area of literature. A considerable number of web sites currently present collections of librettos in several formats (doc, pdf, html, gif, txt). However, they generally do not cite their source or even state which version of the text they are based upon; furthermore, in most cases they do not respect the editorial traditions of the libretto.

Librettos have some peculiar structural characteristics. They can be considered a subcategory of drama texts, but they differ from non-musical drama texts in two main ways:

1. the presence of Concertato sections
2. the extreme fragmentation of the versification.

Concertato is a musical term that has passed into the libretto tradition, where it normally denotes a scene or part of a scene performed simultaneously by different characters, each singing a different text, including several cues and stage directions. The number of simultaneous sequences can range from a minimum of two to a maximum of seven or eight, as in the following example taken from Falstaff (music by Giuseppe Verdi, libretto by Arrigo Boito).

[Pages 77-78 of the Falstaff libretto]

In the libretto the versification is extremely fragmentary: as in drama, verses and stanzas are usually split according to the different cues; furthermore, fin de siècle librettos admit different metres that can change at any moment, even within a cue.

An important point is that the libretto printed and distributed to the public can differ markedly from the text sung on stage.
In the score, verses and words are adapted to the musical progression, and for that reason they can be stretched, repeated, modified, cut or added to. The libretto is often conceived as a support for the spectator: portions of text suppressed in the score, stage directions, comments and notes that have no counterpart in the performed opera can all help the spectator to follow the plot.

All these peculiarities need to be taken seriously into consideration before starting any encoding. First, the printing tradition of librettos has fixed certain conventions for representing these characteristics. Second, the public to which a digital collection of librettos is addressed will expect its reading habits to be taken into account.

Over the last two years, a research project named "L'Opera prima dell'Opera" (The text/literary source before the staging of Opera) has created a digital library of librettos called Opera Liber, freely available on the Net (currently at but will be soon transferred to ). Opera Liber is a portal for the study and documentation of Italian librettos of the period 1870-1920, including works of the main Italian composers such as Verdi, Puccini, Leoncavallo, Mascagni, Ponchielli and many others. The main resource of the web site is the collection of texts, available both for reading and for linguistic querying. The texts have been encoded in TEI XML and are managed and queried using the native XML database eXist.

The Opera Liber DTD is a customization of the TEI P4 DTD, fully documented on the web site. It consists of a mixed base (verse and drama) and additional tag sets for figures, transcription of primary sources, linking, and names and dates. Some customizations of the DTD have been made, following the prescriptions of Chapter 29 of the Guidelines for Electronic Text Encoding and Interchange (Sperberg-McQueen & Burnard).
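
As a rough illustration of what such a tag set selection can look like, the following is a minimal sketch of a TEI P4 document type declaration that enables a mixed verse and drama base together with the additional tag sets listed above. It is not taken from the Opera Liber documentation: the system identifier `tei2.dtd` and the extension file names `operaliber.ent` and `operaliber.dtd` are hypothetical placeholders, and the actual project files may be organised differently.

```xml
<!-- Sketch only: a TEI P4 DOCTYPE selecting the tag sets named in the text.
     The system identifier and both extension file names are hypothetical. -->
<!DOCTYPE TEI.2 PUBLIC "-//TEI P4//DTD Main Document Type//EN" "tei2.dtd" [
  <!-- Use the XML version of the P4 DTD -->
  <!ENTITY % TEI.XML         'INCLUDE'>
  <!-- Mixed base: verse and drama -->
  <!ENTITY % TEI.mixed       'INCLUDE'>
  <!ENTITY % TEI.verse       'INCLUDE'>
  <!ENTITY % TEI.drama       'INCLUDE'>
  <!-- Additional tag sets -->
  <!ENTITY % TEI.figures     'INCLUDE'>
  <!ENTITY % TEI.transcr     'INCLUDE'>
  <!ENTITY % TEI.linking     'INCLUDE'>
  <!ENTITY % TEI.names.dates 'INCLUDE'>
  <!-- Local modifications (hypothetical file names) -->
  <!ENTITY % TEI.extensions.ent SYSTEM 'operaliber.ent'>
  <!ENTITY % TEI.extensions.dtd SYSTEM 'operaliber.dtd'>
]>
```

In the P4 scheme, the two `TEI.extensions` entities are the usual hook for local modifications such as renamed, added or redefined elements, which is where project-specific changes of the kind mentioned above would normally be placed.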
In creating the encoding model, the main problem was to find a correct encoding for Concertatos. The Concertato can certainly be considered a kind of structural division, even if not at the same level as the usual structural divisions (such as acts and scenes, encoded by the TEI