<?xml version="1.0" encoding="UTF-8"?>
<TEI.2 id="panel_14_siemens">
   <teiHeader>
      <fileDesc>
         <titleStmt>
            <title>Theory and Practice in Literary Textual Analysis Tools</title>
            <author>
               <name reg="Siemens, Ray">Ray Siemens</name>
            </author>
            <author>
               <name reg="Rockwell, Geoffrey">Geoffrey Rockwell</name>
            </author>
            <author>
               <name reg="Schreibman, Susan">Susan Schreibman</name>
            </author>
            <author>
               <name reg="Jockers, Matthew">Matthew Jockers</name>
            </author>
            <respStmt>
               <resp>Marked up by </resp>
               <name reg="Holmes, Martin">Martin Holmes</name>
               <lb/>
               <name reg="Baer, Patricia">Patricia Baer</name>
            </respStmt>
         </titleStmt>
         <publicationStmt>
            <p>Marked up to be included in the ACH/ALLC 2005 Conference Abstracts book.</p>
         </publicationStmt>
         <sourceDesc>
            <p>None</p>
         </sourceDesc>
      </fileDesc>
      <profileDesc>
         <textClass>
            <classCode>panel</classCode>
            <keywords>
               <list>
                  <item>text analysis</item>
                  <item>literary criticism</item>
                  <item>disciplinary integration of computing tools</item>
               </list>
            </keywords>
         </textClass>
      </profileDesc>
      <revisionDesc>
         <list>
            <item>MDH: Created from John Bradley's XML <date value="2005-03">March 2005</date>
            </item>
            <item>MDH: Proofed by Ray Siemens <date value="2005-04-02">2 April 2005</date>
            </item>
            <item>MDH: Fixed capitalization error caught on final proof by SLA <date value="2005-05-27">27 May 2005</date>
            </item>
         </list>
      </revisionDesc>
   </teiHeader>
   <text>
      <front>
         <docTitle n="Theory and Practice in Literary Textual Analysis Tools">
            <titlePart>Theory and Practice in Literary Textual Analysis Tools</titlePart>
         </docTitle>
         <docAuthor>
            <name reg="Siemens, Ray">Ray Siemens</name>
            <address>
               <addrLine>siemens@uvic.ca</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">University of Victoria</titlePart>
         <docAuthor>
            <name reg="Rockwell, Geoffrey">Geoffrey Rockwell</name>
            <address>
               <addrLine>georock@mcmaster.ca</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">McMaster University</titlePart>
         <docAuthor>
            <name reg="Schreibman, Susan">Susan Schreibman</name>
            <address>
               <addrLine>sschreib@umd.edu</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">University of Maryland</titlePart>
         <docAuthor>
            <name reg="Jockers, Matthew">Matthew Jockers</name>
            <address>
               <addrLine>mjockers@stanford.edu</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">Stanford University</titlePart>
      </front>
      <body>
         <div0>
            <head>Panel Description</head>
            <p>Through discussion of several exemplary literary textual analysis tools, participants on this panel explore elements of the literary studies community's reaction to textual analysis computer tool development -- and, particularly, how theorists perceive the development of tools as an activity that supports, tests, models, and expands upon their work.  Panel contributors challenge the oft-perceived disparity between the <soCalled>lower</soCalled> criticism (enumerative, bibliographic, re-presentative, &amp;c.) in which most computing tools that we use have their origins and the <soCalled>higher</soCalled> criticism often associated with thematically-oriented literary critical theory.</p>
            <p>
               <hi rend="bold">Geoffrey Rockwell</hi>, McMaster U (presenter)<lb/>
               <hi rend="bold">Matt Jockers</hi>, Stanford U (presenter)<lb/>
               <hi rend="bold">Susan Schreibman</hi>, U Maryland (presenter)<lb/>
               <hi rend="bold">Ray Siemens</hi>, U Victoria (chair and respondent)<lb/>
            </p>
         </div0>
         <div0 type="EmbeddedDoc">
            <div1>
               <head>Interrupting the Machine to Think About It</head>
               <p rend="Presenter">Geoffrey Rockwell</p>
               <ab>
                  <cit>
                     <q>A machine may be defined as a <hi rend="italics">system of interruptions</hi> or breaks (<hi rend="italics">coupures</hi>). . . Every machine, in the first place, is related to  a continual material flow (<hi rend="italics">hylè</hi>) that it cuts into.</q>
                     <bibl>Deleuze and Guattari 36</bibl>
                  </cit>
               </ab>
               <p>Text analysis tools (and, for that matter, any form of analysis) perform two types of operations. They interrupt the flow of continuous analog information in order to break it down into samples that can be quantified, and then they synthesize new eruptions out of those samples. Even the representation of a text in digital form is a matter of machined sampling and quantitative representation, whether you choose to represent a printed page as pixels or characters.</p>
               <p>This interrupting and breaking down is a process that constrains what computer-based tools can do, and that is the first point of this paper. The sampling and quantization also make it possible to develop synthetic processes that create new hybrid artefacts, like text visualizations or sonoric representations; that is the second point of this paper.</p>
               <p>Finally, the breaking down (and not the transparent functioning)  is the (error) message of the textual machine. We know the machine when it fails, when it is in error, and when it delivers monstrous results. To stand back and look at a machine, as opposed to looking through it, is to think through ambitious failure.</p>
               <p>Such a thinking through a computer is pragmatic theorizing in a tradition of thinking while tinkering -- a thinking often provoked by what is at hand. What is proposed is a theory of computer assisted text analysis that addresses the way such ruptures stress interpretation. Development happens in rupture, both the programming development that scripts computers and the performance of thinking (about machines and texts) called developing a theory.</p>
               <p>In the meantime, <hi rend="italics">The Bug</hi> that mocks us and interrupts our demonstrations is also what provokes reflection and adaptation. We wouldn't want it any other way, except at the moment of machined  interruption, for which reason a demonstration of TAPoRware text analysis tools will interrupt this paper.</p>
            </div1>
            <div1>
               <head>Bibliography</head>
               <listBibl>
                  <biblStruct>
                     <monogr>
                        <author>
                           <name reg="Ullman, Ellen">Ellen Ullman</name>
                        </author>
                        <title level="m">The Bug</title>
                        <imprint>
                           <publisher>Nan A. Talese</publisher>
                           <pubPlace>New York</pubPlace>
                           <date value="2003">2003</date>
                        </imprint>
                     </monogr>
                  </biblStruct>
                  <biblStruct>
                     <monogr>
                        <author>
                           <name reg="Deleuze, Gilles">Gilles Deleuze</name>
                        </author>
                        <author>
                           <name reg="Guattari, Félix">Félix Guattari</name>
                        </author>
                        <title level="m">Anti-Oedipus: Capitalism and Schizophrenia</title>
                        <respStmt>
                           <resp>Trans.</resp>
                           <name reg="Hurley, Robert">Robert Hurley</name>
                        </respStmt>
                        <imprint>
                           <publisher>University of Minnesota Press</publisher>
                           <pubPlace>Minneapolis</pubPlace>
                           <date value="1983">1983</date>
                        </imprint>
                     </monogr>
                  </biblStruct>
                  <biblStruct>
                     <monogr>
                        <author>
                           <name reg="Yan, Lian">Lian Yan</name>
                        </author>
                        <author>
                           <name reg="Rockwell, Geoffrey">Geoffrey Rockwell</name>
                        </author>
                        <title level="m" type="WWW document">TAPoRware</title>
                        <imprint/>
                     </monogr>
                     <note>
                        <xptr crdate="2005-03-22" to="http://taporware.mcmaster.ca/"/>
                     </note>
                  </biblStruct>
               </listBibl>
            </div1>
         </div0>
         <div0 type="EmbeddedDoc">
            <div1>
               <head>Visualizing the Hypothetical, Encoding the Argument</head>
               <p rend="Presenter">Susan Schreibman</p>
               <p>The <title level="m">Versioning Machine</title> (<title level="m">VM</title>) <xptr to="http://www.mith2.umd.edu/products/ver-mach"/> was launched at ACH/ALLC 2002 as a tool to display multiple witnesses of deeply encoded text. It was designed as a presentation tool so that editors could engage with the challenging work of textual editing, rather than becoming experts in other technologies, such as XSLT, JavaScript and CSS, all components of the <title level="m">Versioning Machine</title>. The application allows encoders who utilize the <title level="m">Text Encoding Initiative</title>’s Parallel Segmentation method of encoding to view their documents through a browser-based interface which parses the text into its constituent documents (at present the <title level="m">VM</title> works best with <title level="m">Internet Explorer</title> 6.0 and higher, but it also works with <title level="m">Firefox</title> for PC and Mac). The <title level="m">Versioning Machine</title> also provides several features for the end user to engage with texts, including highlighting a structural unit (paragraphs, lines, or divs) across the witness set, synchronized scrolling, and the ability to display a robust typology of notes. </p>
               <p>The <title level="m">TEI</title>’s Critical Apparatus tagset (as outlined in Chapter 19 of the <title level="m">TEI</title>’s <title level="m">Guidelines</title>) provides a method for capturing variants across a witness set. This highly structured encoding brings together in one document any number of witnesses which an editor considers versions of the same work. The encoding enabled by parallel segmentation provides a typology for indicating what structural units of text, or parts of structural units, belong to each witness. In this way, content which appears in more than one version of the work is encoded once, with attribute values indicating which witness or witnesses it belongs to. It is an extremely efficient way of encoding in that the editor is spared the repetitious work of encoding content which persists over multiple witnesses, as would be necessary if each witness were encoded as a separate document. </p>
               <p>The apparatus element or <hi rend="code">&lt;app&gt;</hi> acts as a container element binding together the various readings, which are encoded within a reading <hi rend="code">&lt;rdg&gt;</hi> element. Attribute values indicate which witness or witnesses a particular structural unit (a paragraph or line, for example), or subunit, belongs to (See figure 1.). </p>
               <ab>
                  <hi rend="code">&lt;lg n="1"&gt; <lb/>
                  &lt;l n="1"&gt; <lb/>
                  &lt;app&gt; <lb/>
                  &lt;rdg wit="a1 a2 a3 a4 pub"&gt;The sun burns out,&lt;/rdg&gt; <lb/>
                  &lt;/app&gt; &lt;/l&gt; <lb/>
                  &lt;l n="2"&gt; <lb/>
                  &lt;app&gt; <lb/>
                  &lt;rdg wit="a1"&gt;The world withers,&lt;/rdg&gt; <lb/>
                  &lt;rdg wit="a3 a4"&gt;The world withers,&lt;milestone unit="stanza"/&gt;&lt;/rdg&gt; <lb/>
                        &lt;rdg wit="a2 pub"&gt;The world withers&lt;milestone unit="stanza"/&gt;&lt;/rdg&gt; <lb/>
                     &lt;/app&gt; &lt;/l&gt;</hi>
               </ab>
               <p rend="caption">Figure 1. A fragment of parallel segmentation encoding</p>
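               <p>The witness sigla referenced by the <hi rend="code">wit</hi> attribute in figure 1 (<hi rend="code">a1</hi>, <hi rend="code">pub</hi>, &amp;c.) would themselves be declared in a witness list elsewhere in the document. A minimal declaration in the <title level="m">TEI P4</title> Critical Apparatus tagset might read as follows; the witness descriptions here are illustrative only, not those of an actual edition:</p>
               <ab>
                  <hi rend="code">&lt;witList&gt; <lb/>
                  &lt;witness sigil="a1"&gt;First authorial draft&lt;/witness&gt; <lb/>
                  &lt;witness sigil="pub"&gt;First published version&lt;/witness&gt; <lb/>
                  &lt;/witList&gt;</hi>
               </ab>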
               <p>When parsed in the <title level="m">Versioning Machine</title>, the title of the text, along with the first few lines of the aforementioned fragment, is rendered as follows for the first three versions:</p>
               <figure rend="ImageLink">
                  <head>Figure 2: The title of ‘Autumn’ rendered in the Versioning Machine</head>
                  <p>
                     <xref>panel_14_siemens_2.pdf</xref>
                  </p>
                  <figDesc>Figure 2: The title of ‘Autumn’ rendered in the Versioning Machine</figDesc>
               </figure>
               <p>In their 1998 article <title level="a">Introduction: quo vadimus?</title>, Lessard and Levison argue that computational humanities research has not achieved a level of acceptance because of the differences between <cit>
                      <q>opposing intellectual paradigms, the scientific and the humanistic</q>
                   </cit>. The scientific, they argue, is based on the formulation of hypotheses, the collection of data, and controlled testing and replication. The humanistic paradigm, they argue, is based on argument from example, <cit>
                      <q>where the goal is to bring the interlocutor to agreement by coming to see the materials at hand in the same light</q>
                      <bibl>263</bibl>
                   </cit>.</p>
               <p>While the <title level="m">Versioning Machine</title> was designed as a visualization tool, it is no less importantly an environment within which editors realize a theory of the text, bringing readers to an understanding of the work as embodied in its multiple witnesses. It can thus be seen, within Lessard and Levison’s humanistic paradigm, as a tool for presenting a reading of the work through its editing and encoding, itself a primary theoretical event <cit>
                      <bibl>McGann 75</bibl>
                   </cit>. Moreover, this primary event can be illuminated and explicated through more traditional scholarly apparatus, such as annotation, adding an additional layer of textual analysis.</p>
               <p>Thus the <title level="m">Versioning Machine</title> provides a venue not only to realize contemporary editorial theory, but to challenge it. It meets the requirement that Stéfan Sinclair outlines in his 2003 article <title level="a">Computer-Assisted Reading: Reconceiving Text Analysis</title> in that it is a tool which is relevant to literary critics’ current approaches to textual criticism <cit>
                     <bibl>178</bibl>
                  </cit>. The <title level="m">Versioning Machine</title> is an active editing environment: it has been used by encoders editing texts as different as Renaissance plays and Dadaist poetry. The <title level="m">Versioning Machine</title> is a tool which takes as its premise that the goal of much contemporary editing is not to create a definitive edition, but rather a <cit>
                     <q>hypothesis</q>
                  </cit> of the text <cit>
                     <bibl>Kane-Donaldson as quoted in McGann 77</bibl>
                  </cit>, which can be read alongside an unedited edition of the text (that is, a reproduction of an image of the text in documentary form; McGann 77, Siemens). As such, it makes visible encoding as criticism, providing an environment to challenge our approaches to complex texts in terms of theories of encoding, as well as contemporary editorial theory. </p>
            </div1>
            <div1>
               <head>Bibliography</head>
               <listBibl>
                  <biblStruct>
                     <analytic>
                        <author>
                           <name reg="Lessard, G.">G. Lessard</name>
                        </author>
                        <author>
                            <name reg="Levison, M.">M. Levison</name>
                        </author>
                        <title level="a">Introduction: quo vadimus?</title>
                     </analytic>
                     <monogr>
                        <title level="j">Computers and the Humanities</title>
                        <imprint>
                           <biblScope type="vol">31.4</biblScope>
                           <biblScope type="pages">261-269</biblScope>
                           <date value="1998">1998</date>
                        </imprint>
                     </monogr>
                  </biblStruct>
                  <biblStruct>
                     <monogr>
                        <author>
                           <name reg="McGann, Jerome">Jerome McGann</name>
                        </author>
                        <title level="m">Radiant Textuality: literature after the World Wide Web</title>
                        <imprint>
                           <publisher>Palgrave</publisher>
                           <pubPlace>New York</pubPlace>
                           <date value="2001">2001</date>
                        </imprint>
                     </monogr>
                  </biblStruct>
                  <biblStruct>
                     <analytic>
                        <author>
                           <name reg="Schreibman, Susan">Susan Schreibman</name>
                        </author>
                        <author>
                           <name reg="Kumar, Amit">Amit Kumar</name>
                        </author>
                        <author>
                           <name reg="McDonald, Jarom">Jarom McDonald</name>
                        </author>
                        <title level="a">The Versioning Machine</title>
                     </analytic>
                     <monogr>
                        <title level="j">Literary and Linguistic Computing</title>
                        <imprint>
                           <biblScope type="vol">18.1</biblScope>
                           <biblScope type="pages">101-107</biblScope>
                           <date value="2003">2003</date>
                        </imprint>
                     </monogr>
                  </biblStruct>
                  <biblStruct>
                     <analytic>
                        <author>
                           <name reg="Siemens, Ray">Ray Siemens</name>
                        </author>
                        <title level="a">‘Unediting and Non-Editions’ The Theory (and Politics) of Editing</title>
                     </analytic>
                     <monogr>
                        <title level="j">Anglia</title>
                        <imprint>
                           <biblScope type="vol">119.3</biblScope>
                           <biblScope type="pages">423-455</biblScope>
                           <date value="2001">2001</date>
                        </imprint>
                     </monogr>
                  </biblStruct>
                  <biblStruct>
                     <monogr>
                        <editor>
                           <name reg="Sperberg-McQueen, C.M.">C.M. Sperberg-McQueen</name>
                        </editor>
                        <editor>
                           <name reg="Burnard, L.">L. Burnard</name>
                        </editor>
                        <title level="m">TEI P4: Guidelines for Electronic Text Encoding and Interchange</title>
                        <imprint>
                           <publisher>Text Encoding Initiative Consortium</publisher>
                           <date value="2002">2002</date>
                        </imprint>
                     </monogr>
                     <note>
                        <xptr crdate="2004-10-09" to="http://www.tei-c.org/P4X/"/>
                     </note>
                  </biblStruct>
                  <biblStruct>
                     <analytic>
                        <author>
                           <name reg="Sinclair, Stéfan">Stéfan Sinclair</name>
                        </author>
                         <title level="a">Computer-Assisted Reading: Reconceiving Text Analysis</title>
                     </analytic>
                     <monogr>
                        <title level="j">Literary and Linguistic Computing</title>
                        <imprint>
                           <biblScope type="vol">18.2</biblScope>
                           <biblScope type="pages">175-184</biblScope>
                           <date value="2003">2003</date>
                        </imprint>
                     </monogr>
                  </biblStruct>
                  <biblStruct>
                     <analytic>
                        <author>
                           <name reg="Vetter, Lara">Lara Vetter</name>
                        </author>
                        <author>
                           <name reg="McDonald, Jarom">Jarom McDonald</name>
                        </author>
                        <title level="a">Witnessing Dickinson's Witnesses</title>
                     </analytic>
                     <monogr>
                        <title level="j">Literary and Linguistic Computing</title>
                        <imprint>
                           <biblScope type="vol">18.2</biblScope>
                           <biblScope type="pages">151-165</biblScope>
                           <date value="2003">2003</date>
                        </imprint>
                     </monogr>
                  </biblStruct>
               </listBibl>
            </div1>
         </div0>
         <div0 type="EmbeddedDoc">
            <div1>
               <head>Electronic Text Analysis and a New Methodology for Canonical Research</head>
               <p rend="Presenter">Matt Jockers</p>
               <p>Using a combination of <soCalled>typical</soCalled> text analysis tools (concordance and collocation) and other custom tools developed by the author, this paper demonstrates that conventional <soCalled>higher</soCalled> criticism, with its fashionable and thematically-oriented theoretical approaches, fails as a means of assessing and generalizing about canons and genres of literature. Drawing on a case study of the canon of Irish-American prose, the paper employs a quantitative and, indeed, scientific methodology to offer a radical reinterpretation of the canon.</p>
               <p>In support of this research the author collected, coded, and categorized a database collection of prose literature comprising over 750 individual works written by some 280 different authors. The collection spans a period of 300 years and is nearly comprehensive in its scope and coverage of the prose canon and genre of Irish-American ethnic literature. In addition to the usual metadata associated with electronic archives, each work in the collection is tagged with metadata related to the nature of the work: geographic setting (East or West of the Mississippi), regional setting (Northeast, Southwest, Mountain, Pacific, etc.), information about whether the work is set in an urban or rural environment, as well as data specific to the author of each text. Using his own <title level="m">Corpus Analysis Tools Suite</title> (<title level="m">CATools</title>), a set of analytic tools developed in PHP and MySQL for both semantic and quantitative text-analysis of materials housed within a relational database structure, the author has mined the material to reveal latent chronological, semantic, and geographic trends within the overall canon from its beginnings in the late 18th century to the present.</p>
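               <p>A query over such a metadata structure might, for example, tally works with urban settings in the West by decade. The following SQL sketch is purely illustrative: the table and column names are hypothetical, and are not those of <title level="m">CATools</title>:</p>
               <ab>
                  <hi rend="code">SELECT FLOOR(pub_year / 10) * 10 AS decade, COUNT(*) AS works <lb/>
                  FROM work_metadata <lb/>
                  WHERE setting = 'urban' AND geography = 'West' <lb/>
                  GROUP BY decade <lb/>
                  ORDER BY decade;</hi>
               </ab>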
               <p>The results of this work not only challenge the best available scholarship on the subject of Irish-American literature, but further challenge the efficacy of contemporary and fashionable theoretical approaches to literature that are based on the <soCalled>close-readings</soCalled> of texts. In making the case for a re-evaluation of the Irish-American canon, the paper challenges the fundamental methodology of traditional literary study, and demonstrates in clear and indisputable terms that a quantitative and, indeed, scientific analysis of the literary data is not only valuable to the study of a genre or a canon of literature but essential if we are ever to go beyond the mere <soCalled>readings</soCalled> and interpretations of texts.</p>
            </div1>
         </div0>
         <div0 type="EmbeddedDoc">
            <div1>
               <head>Response</head>
               <p rend="Presenter">Ray Siemens</p>
            </div1>
         </div0>
      </body>
   </text>
</TEI.2>