<?xml version="1.0" encoding="UTF-8"?>
<TEI.2 id="poster_156_meschini">
   <teiHeader>
      <fileDesc>
         <titleStmt>
            <title>Classifying the Chimera</title>
            <author>
               <name reg="Meschini, Federico">Federico Meschini</name>
            </author>
            <respStmt>
               <resp>Marked up by </resp>
               <name reg="Holmes, Martin">Martin Holmes</name>
               <lb/>
               <name reg="Baer, Patricia">Patricia Baer</name>
            </respStmt>
         </titleStmt>
         <publicationStmt>
            <p>Marked up to be included in the ACH/ALLC 2005 Conference Abstracts book.</p>
         </publicationStmt>
         <sourceDesc>
            <p>None</p>
         </sourceDesc>
      </fileDesc>
      <profileDesc>
         <textClass>
            <classCode>paper</classCode>
            <keywords>
               <list>
                  <item>digital libraries</item>
                  <item>TopicMap</item>
                  <item>TEI</item>
               </list>
            </keywords>
         </textClass>
      </profileDesc>
      <revisionDesc>
         <list>
            <item>MDH: Created from John Bradley's XML <date value="2005-03">March 2005</date>
            </item>
            <item>MDH: Entered proofing corrections from RS. <date value="2005-05-19">19 May 2005</date>
            </item>
         </list>
      </revisionDesc>
   </teiHeader>
   <text>
      <front>
         <docTitle n="Classifying the Chimera">
            <titlePart>Classifying the Chimera</titlePart>
         </docTitle>
         <docAuthor>
            <name reg="Meschini, Federico">Federico Meschini</name>
            <address>
               <addrLine>fmeschini@tin.it</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">Tuscia University</titlePart>
      </front>
      <body>
         <div0>
            <p>While the concept of the <term>Digital Library</term> (<term>DL</term>) is extremely vague, its paradigm, the actual implementation, is, if possible, even more elusive.</p>
            <p>From the late 1980s through the 1990s, the primary concern in the field of textual digital resources was a basic (but not simple) one: how can we put texts into computers? How can we encode, manage and store them? What are digital texts made of?</p>
            <p>Other important problems were recognized (visualization, for example) but were temporarily set aside for practical reasons. The Text/Data relationship is truly that of Castor and Polydeuces: is Text a particular kind of Data, or Data a particular kind of Text? The main choice for text encoding, and the winning one in the long term, was the powerful <term>Standard Generalized Markup Language</term> (<term>SGML</term>) and, more specifically, the rules (or rather, <emph>guidelines</emph>) established by the <title>Text Encoding Initiative</title> (<title>TEI</title>), which can now be considered the <hi rend="emph">de facto standard</hi> for encoding humanities texts in digital format.</p>
            <p>The transition from SGML to the <term>eXtensible Markup Language</term> (<term>XML</term>) was less an evolution than a sort of Copernican revolution, in two main respects: the introduction of new stylesheet technology for displaying XML files, in particular the powerful <term>Extensible Stylesheet Language Transformations</term> (<term>XSLT</term>), and the wide diffusion and development of open-source software able to manage XML documents. This transition has of course also taken place in the <title>TEI</title>: starting with the P4 version of the Guidelines, XML is the technology used. As a logical consequence, various open-source software tools for implementing a Digital Library with texts encoded in TEI/XML are now available.</p>
            <p>These tools fall into two main categories. The first comprises software created in <hi rend="emph">hard</hi> Information Technology contexts (Database Management Systems, Native XML Databases, Publishing Frameworks, etc.); these programs need to be adapted to the specific aims of a digital library<note n="1">E.g., <title level="m">Apache Cocoon</title>
                  <xptr to="http://cocoon.apache.org/"/>, <title level="m">Apache AxKit</title>
                  <xptr to="http://axkit.org/"/>, <title level="m">eXist</title>
                  <xptr to="http://exist.sourceforge.net/"/>
               </note>. The second category, and this is a new trend, comprises tools developed in academic contexts<note n="2">
                  <title level="m">Anastasia</title>
                  <xptr to="http://anastasia.sourceforge.net/"/>, <title level="m">teiPublisher</title>
                  <xptr to="http://teipublisher.sourceforge.net/docs/index.php"/>, <title level="m">XPhilologic</title>
                  <xptr to="http://barkov.uchicago.edu/xphilo/"/>
               </note>, either created from scratch or built on programs from the first group as their core; in both cases they are intended for use with digital cultural resources and, above all, TEI-encoded texts. Compared to the first group, the added value of these tools is that they often provide features specific to the textual field<note n="3">See for example the <title level="m">TAPoRware</title> set for textual analysis <xptr to="http://cheiron.mcmaster.ca/~taporware/"/> or the <title level="m">Versioning Machine</title>
                  <xptr to="http://mith2.umd.edu/products/ver-mach/"/> for the comparison of different versions and editions of the same text.</note>.</p>
            <p>The choice is now much wider than it was just a few years ago, but this can also make selecting the right tool confusing. As knowledge management experts well know, too much information, if not well structured and organized, is equivalent to no information at all and is therefore useless. Each piece of software has its own peculiarities, which should be evaluated against the characteristics of the texts being encoded and the general needs and aims of the project of which the digital library is part. What is the main aim of the project? Is it visualization, perhaps with multiple output formats? Is it search, with some form of advanced textual analysis? Is it possible to combine all these aspects? And if somebody has already found a solution to our problems, how can we find it on the net?</p>
            <p>In an attempt to solve these problems or, better, to share them in order to reach common solutions, the 2003 TEI Members Meeting hosted the first meeting of the <title>TEI Presentation Tools Special Interest Group</title>
               <note n="4">
                  <xptr to="http://www.tei-c.org/Members/2003-Nancy/mm17.html#tap-sig"/>
               </note>. The <title>Presentation Tools SIG</title> has two main initiatives: creating and maintaining a list of tools, and assembling a sample collection of texts for testing.</p>
            <p>The first version of the tool list was presented at the TEI Members Meeting in Baltimore in October 2004. The list is itself a TEI/XML document, currently published as HTML by means of an XSLT stylesheet<note n="5">Available online at <xptr to="http://miro.acs.its.nyu.edu/tei_cms/show.php"/>.</note>. Its structure is simple: the tools are presented in alphabetical order, each with a short description and links to various implementations. The descriptions and links give an idea of the distinctive features of each tool: for example, that <title level="m">XPhilologic</title> is very good at full-text search and document retrieval<note n="6">See for example the demo on the <title level="m">Brown Women Writers Project</title> collection at <xptr to="http://barkov.uchicago.edu/xphilo/search.brownwwp.html"/>.</note> and that <title level="m">Apache Cocoon</title> can be deployed as an XML framework for producing output in multiple formats from the same TEI document<note n="7">A good implementation of <title level="m">Cocoon</title> with TEI can be found at <xptr to="http://www.nzetc.org/"/>.</note>. Likewise, <title level="m">Anastasia</title> makes it possible to publish an electronic text/image edition of a medieval manuscript<note n="8">See the <title level="m">Caxton’s Canterbury Tales</title> at <xptr to="http://www.cta.dmu.ac.uk/Caxtons/"/>.</note>, and <title level="m">eXist</title> is a powerful native XML database that can be used for queries and searches<note n="9">See the <title level="m">Digital Quaker Collection</title> at <xptr to="http://esr.earlham.edu/dqc/"/>.</note>.
            </p>
            <p>But this is not enough. A list can provide some information, but what is really needed (and has been planned from the beginning) is a higher level of classification, which becomes more necessary as the number of such tools increases<note n="10">See the presentations on this subject at ALLC/ACH 2004 <xptr to="http://www.hum.gu.se/allcach2004/AP/"/>, among others: Kumar, Amit et al., <title level="m">teiPublisher: a repository management system for TEI documents</title>
                  <xptr to="http://www.hum.gu.se/allcach2004/AP/html/prop118.html"/>; Matthew Zimmerman, <title level="m">Using AMP technology (Apache, MySQL, PHP) for XML publication</title>
                  <xptr to="http://www.hum.gu.se/allcach2004/AP/html/prop156.html"/>; Stephen Ramsay, Geoffrey Rockwell, Stéfan Sinclair, <title level="m">TAPoRware: Simple Portal Tools for Text Analysis</title>
                  <xptr to="http://www.hum.gu.se/allcach2004/AP/html/prop136.html"/>
               </note>. Is a given tool written in Java or in Perl? What are the requirements for running it? Is it XML-aware? Does it support XSLT transformations? Are the texts stored in the file system or in a database? What are its distinctive features? Can it be integrated with other software to extend its capabilities? Can it be customized? Clearly, a simple list cannot answer all these questions.</p>
            <p>There has been much discussion about the kind of classification to apply to the tool list. In my opinion it should be based on practical rather than theoretical principles, on empirical and pragmatic observation, and it should include links to as many concrete implementations of these tools as possible, so as to highlight best practices and the particular features of each digital library.</p>
            <p>A good way of implementing this classification would be to use the standard <title>ISO 13250</title>
               <note n="11">
                  <xptr to="http://www.isotopicmaps.org/rm4tm/"/>
               </note>, or TopicMaps, and its <term>XML Topic Maps</term> (<term>XTM</term>) syntax<note n="12">
                  <xptr to="http://www.topicmaps.org/xtm/1.0/"/>
               </note>. A TopicMap is based on the definition of general topics, the concrete occurrences of each topic, and the associations between different topics. In my opinion it is therefore the best way to obtain a complete classification covering all aspects, from the most technical, such as the programming languages used or the technical requirements, to the features for visualization, text search and analysis.<note n="13">For an introduction to TopicMaps see Steve Pepper, <title level="m">The TAO of Topic Maps, finding the way in the age of infoglut</title>
                  <xptr to="http://www.gca.org/papers/xmleurope2000/papers/s11-01.html"/>.</note>
            </p>
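            <p>As a minimal sketch of these three building blocks, an XTM 1.0 fragment describing a single tool might look like the following. The element names are those of the XTM 1.0 syntax; the topic identifiers (<code>tool</code>, <code>implementation</code>, <code>written-in</code>, <code>java</code>, <code>exist</code>) and the choice of association type are hypothetical and are not taken from the actual tool list.</p>
            <eg><![CDATA[
<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
   <!-- Typing topics; all ids here are illustrative assumptions -->
   <topic id="tool">
      <baseName><baseNameString>Presentation tool</baseNameString></baseName>
   </topic>
   <topic id="implementation">
      <baseName><baseNameString>Implementation</baseNameString></baseName>
   </topic>
   <topic id="written-in">
      <baseName><baseNameString>Written in</baseNameString></baseName>
   </topic>
   <topic id="java">
      <baseName><baseNameString>Java</baseNameString></baseName>
   </topic>
   <!-- A topic for one tool, with an occurrence pointing to a concrete implementation -->
   <topic id="exist">
      <instanceOf><topicRef xlink:href="#tool"/></instanceOf>
      <baseName><baseNameString>eXist</baseNameString></baseName>
      <occurrence>
         <instanceOf><topicRef xlink:href="#implementation"/></instanceOf>
         <resourceRef xlink:href="http://esr.earlham.edu/dqc/"/>
      </occurrence>
   </topic>
   <!-- An association between two topics: the tool and its programming language -->
   <association>
      <instanceOf><topicRef xlink:href="#written-in"/></instanceOf>
      <member><topicRef xlink:href="#exist"/></member>
      <member><topicRef xlink:href="#java"/></member>
   </association>
</topicMap>
]]></eg>
            <p>In a fuller map each member of an association would probably also carry a role (the <code>roleSpec</code> element), so that questions such as <q>which tools are written in Java?</q> can be answered by following associations rather than by reading descriptions.</p>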
            <p>What is now a TEI document should therefore be reworked into an XTM document by identifying, separating, organizing, linking and classifying all the information that is currently presented in a linear structure.</p>
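            <p>A first, purely mechanical step of such a conversion could perhaps be sketched in XSLT 1.0 as follows. The sketch assumes, hypothetically, that each tool in the TEI list is an <code>item</code> inside a <code>list type="tools"</code>, containing a <code>name</code> and one or more <code>xptr</code> elements; the real structure of the list may well differ, and the classification work proper (typing and associating the topics) would still have to be done by hand or by further rules.</p>
            <eg><![CDATA[
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.topicmaps.org/xtm/1.0/"
    xmlns:xlink="http://www.w3.org/1999/xlink">

   <xsl:output method="xml" indent="yes"/>

   <!-- Wrap the result in a topicMap and declare one typing topic -->
   <xsl:template match="/">
      <topicMap>
         <topic id="tool">
            <baseName><baseNameString>Presentation tool</baseNameString></baseName>
         </topic>
         <!-- Assumed input structure: list[@type='tools']/item -->
         <xsl:apply-templates select="//list[@type='tools']/item"/>
      </topicMap>
   </xsl:template>

   <!-- Each list item becomes a topic; each link becomes an occurrence -->
   <xsl:template match="item">
      <topic id="{generate-id()}">
         <instanceOf><topicRef xlink:href="#tool"/></instanceOf>
         <baseName>
            <baseNameString><xsl:value-of select="normalize-space(name)"/></baseNameString>
         </baseName>
         <xsl:for-each select="xptr">
            <occurrence><resourceRef xlink:href="{@to}"/></occurrence>
         </xsl:for-each>
      </topic>
   </xsl:template>

</xsl:stylesheet>
]]></eg>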
            <p>Once created, the XTM file representing the TopicMap can be used and navigated in several ways. Since it is an XML file, the same technologies used for TEI texts can be applied to it, but there is also dedicated software that can exploit the great potential of this standard, such as the <title level="m">Omnigator</title> from <title level="m">Ontopia</title>
               <note n="14">
                  <xptr to="http://www.ontopia.net/omnigator/models/index.jsp"/>
               </note>, or <title level="m">TM4J</title>
               <note n="15">
                  <xptr to="http://tm4j.org/"/>
               </note>, an open-source Java package developed specifically for creating, manipulating and publishing topic maps.</p>
            <p>TopicMap technology was first presented in relation to the <title>TEI</title> at the 2003 meeting, and interest in it is growing within this community because it can add a semantic metadata layer to digital collections<note n="16">John Bradley, <title level="a">A Model for Text Analysis Tools</title>
                  <xptr to="http://llc.oupjournals.org/cgi/content/abstract/18/2/185"/>
               </note>. Moreover, because different XTM documents, each representing a different map, can be merged, the Presentation Tools TopicMap could be integrated with maps on other subjects, for example the textual content<note n="17">See John A. Walsh, <title level="a">Topic Maps and TEI-Encoded Literary Texts</title>, <xptr to="http://drh2004.ncl.ac.uk/abstract.php?abstract=177"/>
               </note> or the documentation of the local views of the DTDs<note n="18">Stuart Brown, <title level="a">A Topic Map for the TEI</title>
                  <xptr to="http://www.tei-c.org/Members/2003-Nancy/index.html#SB-abs"/>
               </note>, thus creating the basis for the definition of what could become a <soCalled>TEI Ontology</soCalled>.
            </p>
         </div0>
      </body>
      <back>
         <div type="Bibliography">
            <head>Bibliography</head>
            <listBibl>
               <biblStruct>
                  <monogr>
                     <title level="m">
                        <name reg="Anastasia">Anastasia</name>
                     </title>
                     <imprint/>
                  </monogr>
                  <note>
                     <xptr crdate="2005-05-19" to="http://anastasia.sourceforge.net/"/>
                  </note>
               </biblStruct>
               <biblStruct>
                  <monogr>
                     <author/>
                     <title level="m">
                        <name reg="teiPublisher">teiPublisher</name>
                     </title>
                     <imprint/>
                  </monogr>
                  <note>
                     <xptr crdate="2005-05-19" to="http://teipublisher.sourceforge.net/docs/index.php"/>
                  </note>
               </biblStruct>
               <biblStruct>
                  <monogr>
                     <author/>
                     <title level="m">
                        <name reg="XPhilologic">XPhilologic</name>
                     </title>
                     <imprint/>
                  </monogr>
                  <note>
                     <xptr crdate="2005-05-19" to="http://barkov.uchicago.edu/xphilo/"/>
                  </note>
               </biblStruct>
               <biblStruct>
                  <monogr>
                     <author/>
                     <title level="m">
                        <name reg="TAPoRware">TAPoRware</name>
                     </title>
                     <imprint/>
                  </monogr>
                  <note>
                     <xptr crdate="2005-03-11" to="http://cheiron.mcmaster.ca/~taporware/"/>
                  </note>
               </biblStruct>
               <biblStruct>
                  <monogr>
                     <author/>
                     <title level="m">
                        <name reg="Versioning Machine">Versioning Machine</name>
                     </title>
                     <imprint/>
                  </monogr>
                  <note>
                     <xptr crdate="2003-12-09" to="http://mith2.umd.edu/products/ver-mach/"/>
                  </note>
               </biblStruct>
               <biblStruct>
                  <analytic>
                     <author>
                        <name reg="Kumar, Amit, et al.">Amit Kumar et al.</name>
                     </author>
                     <title level="a">
                        <name reg="teiPublisher: a repository management system for TEI documents">teiPublisher: a repository management system for TEI documents</name>
                     </title>
                  </analytic>
                  <monogr>
                     <title level="u">Paper delivered at the ALLC/ACH 2004 Conference, Göteborg</title>
                     <imprint>
                        <date value="2004">2004</date>
                     </imprint>
                  </monogr>
                  <note>
                     <xptr crdate="2005-05-19" to="http://www.hum.gu.se/allcach2004/AP/html/prop118.html"/>
                  </note>
               </biblStruct>
               <biblStruct>
                  <analytic>
                     <author>
                        <name reg="Zimmerman, Matthew">Matthew Zimmerman</name>
                     </author>
                     <title level="a" type="WWW document">Using AMP technology (Apache, MySQL, PHP) for XML publication</title>
                  </analytic>
                  <monogr>
                     <title level="u">Paper delivered at the ALLC/ACH 2004 Conference, Göteborg</title>
                     <imprint>
                        <date value="2004">2004</date>
                     </imprint>
                  </monogr>
                  <note>
                     <xptr crdate="2004" to="http://www.hum.gu.se/allcach2004/AP/html/prop156.html"/>
                  </note>
               </biblStruct>
               <biblStruct>
                  <analytic>
                     <author>
                        <name reg="Ramsay, Stephen">Stephen Ramsay</name>
                     </author>
                     <author>
                        <name reg="Rockwell, Geoffrey">Geoffrey Rockwell</name>
                     </author>
                     <author>
                        <name reg="Sinclair, Stéfan">Stéfan Sinclair</name>
                     </author>
                     <title level="a" type="WWW document">TAPoRware: Simple Portal Tools for Text Analysis</title>
                  </analytic>
                  <monogr>
                     <title level="u">Paper delivered at the ALLC/ACH 2004 Conference, Göteborg</title>
                     <imprint>
                        <date value="2004">2004</date>
                     </imprint>
                  </monogr>
                  <note>
                     <xptr crdate="2004" to="http://www.hum.gu.se/allcach2004/AP/html/prop136.html"/>
                  </note>
               </biblStruct>
               <biblStruct>
                  <analytic>
                     <author>
                        <name reg="Pepper, Steve">Steve Pepper</name>
                     </author>
                     <title level="a" type="WWW document">The TAO of Topic Maps, finding the way in the age of infoglut</title>
                  </analytic>
                  <monogr>
                     <title level="u">Paper delivered at the XML Europe 2000 Conference, Paris</title>
                     <imprint>
                        <date value="2000">2000</date>
                     </imprint>
                  </monogr>
                  <note>
                     <xptr crdate="2005-05-19"
                           to="http://www.gca.org/papers/xmleurope2000/papers/s11-01.html"/>
                  </note>
               </biblStruct>
               <biblStruct>
                  <analytic>
                     <author>
                        <name reg="Bradley, John">John Bradley</name>
                     </author>
                     <title level="a" type="WWW document">A Model for Text Analysis Tools</title>
                  </analytic>
                  <monogr>
                     <title level="j">Literary and Linguistic Computing</title>
                     <imprint>
                        <biblScope type="vol">18.2</biblScope>
                        <biblScope type="pages">185-207</biblScope>
                        <date value="2003">2003</date>
                     </imprint>
                  </monogr>
                  <note>
                     <xptr crdate="2005-05-19"
                           to="http://llc.oupjournals.org/cgi/content/abstract/18/2/185"/>
                  </note>
               </biblStruct>
               <biblStruct>
                  <analytic>
                     <author>
                        <name reg="Walsh, John A.">John A. Walsh</name>
                     </author>
                     <title level="a" type="WWW document">Topic Maps and TEI-Encoded Literary Texts</title>
                  </analytic>
                  <monogr>
                     <title level="u">Paper delivered at the Digital Resources for the Humanities Conference, Newcastle upon Tyne</title>
                     <imprint>
                        <date value="2004">2004</date>
                     </imprint>
                  </monogr>
                  <note>
                     <xptr crdate="2005-05-19" to="http://drh2004.ncl.ac.uk/abstract.php?abstract=177"/>
                  </note>
               </biblStruct>
               <biblStruct>
                  <analytic>
                     <author>
                        <name reg="Brown, Stuart">Stuart Brown</name>
                     </author>
                     <title level="a" type="WWW document">A Topic Map for the TEI</title>
                  </analytic>
                  <monogr>
                     <imprint>
                        <publisher>TEI Consortium</publisher>
                        <date value="2003">2003</date>
                     </imprint>
                  </monogr>
                  <note>
                     <xptr to="http://www.tei-c.org/Members/2003-Nancy/index.html#SB-abs"/>
                  </note>
               </biblStruct>
            </listBibl>
         </div>
      </back>
   </text>
</TEI.2>