<?xml version="1.0" encoding="UTF-8"?>
<TEI.2 id="session_301_rockwell">
   <teiHeader>
      <fileDesc>
         <titleStmt>
            <title>TAPoR: Five views through a text analysis portal (COCH/COSH Allied Association Session)</title>
            <author>
               <name reg="Rockwell, Geoffrey">Geoffrey Rockwell</name>
            </author>
            <author>
               <name reg="Sinclair, Stéfan">Stéfan Sinclair</name>
            </author>
            <author>
               <name reg="Chartrand, James">James Chartrand</name>
            </author>
            <respStmt>
               <resp>Marked up by </resp>
               <name reg="Holmes, Martin">Martin Holmes</name>
            </respStmt>
         </titleStmt>
         <publicationStmt>
            <p>Marked up to be included in the ACH/ALLC 2005 Conference Abstracts book.</p>
         </publicationStmt>
         <sourceDesc>
            <p>None</p>
         </sourceDesc>
      </fileDesc>
      <profileDesc>
         <textClass>
            <classCode>session</classCode>
            <keywords>
               <list>
                  <item>text analysis</item>
                  <item>tools</item>
                  <item>portals</item>
               </list>
            </keywords>
         </textClass>
      </profileDesc>
      <revisionDesc>
         <list>
            <item>MDH: Marked up from submission from Ray Siemens <date value="2005-05-04">4 May 2005</date>
            </item>
            <item>MDH: RS proofed and signed off without changes <date value="2005-05-18">18 May 2005</date>.</item>
            <item>MDH: Structure refactored: SLA caught bad nesting at final proof <date value="2005-05-27">27 May 2005</date>.</item>
         </list>
      </revisionDesc>
   </teiHeader>
   <text>
      <front>
         <docTitle n="TAPoR: Five views through a text analysis portal (COCH/COSH Allied Association Session)">
            <titlePart>
               <title level="m">TAPoR</title>: Five views through a text analysis portal (COCH/COSH Allied Association Session)</titlePart>
         </docTitle>
         <docAuthor>
            <name reg="Rockwell, Geoffrey">Geoffrey Rockwell</name>
            <address>
               <addrLine>georock@mcmaster.ca</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">McMaster University</titlePart>
         <docAuthor>
            <name reg="Sinclair, Stéfan">Stéfan Sinclair</name>
            <address>
               <addrLine>sgsinclair@gmail.com</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">McMaster University</titlePart>
         <docAuthor>
            <name reg="Chartrand, James">James Chartrand</name>
            <address>
               <addrLine>jc.chartrand@mcmaster.ca</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">OpenSky Solutions</titlePart>
      </front>
      <body>
         <div0>
            <head>A. Session Introduction</head>
            <p>The <title level="m">TAPoR</title> project began as an effort to create a portal where users could manage texts and tools and then run tools on texts. The Alpha version of the <title level="m">TAPoR</title> portal nicely demonstrated the potential of this simple workbench paradigm. <title level="m">TAPoR.2</title> builds on the individual project paradigm to make the portal useful for research communities. It does this in a number of ways:</p>
            <list type="ordered">
               <item>We have developed a <title>Try It</title> first encounter interface for new users, casual users, and just-in-time users. This interface has been developed in close coordination with usability researchers and is now going into extensive testing.</item>
               <item>
                  <title level="m">TAPoR.2</title> allows user information to be saved for groups or made public in a fashion similar to community information portals like <title level="m">del.icio.us</title> (<xptr to="http://del.icio.us"/>) and <title level="m">CiteULike</title> (<xptr to="http://www.citeulike.org"/>). Some types of information, such as the News built into <title level="m">TAPoR</title> from the beginning, have always been intended for public viewing. We have not only extended the sharing model to all types of information managed, but have also added communal editing to selected types of information, especially documentation, with a wiki-like editing interface.</item>
               <item>We have extended the project paradigm to allow interfaces to be created that can be integrated into other projects and web sites. Thus advanced users can create projects that are styled to look like part of a different project.</item>
               <item>We have developed a tool developers' interface so that tools can be added as web services and documentation entered quickly. We have also used the community building features of the portal to develop <title level="m">TA!DA!</title>, or the <title level="m">TAPoR Developers Association</title>, a site for the developer community.</item>
               <item>We have developed <title level="m">TEA</title>, the <title level="m">TAPoR Engine of Association</title>, which is designed to help the serendipitous exploration of texts, references, links, people, projects and tools. <title level="m">TEA</title> combs and visualizes topic maps which associate items across users.</item>
            </list>
            <p>In this session we are going to present the portal from five views that move from a conventional first encounter view of a tool portal to an inverted view of the portal as a research community association engine. These five views will be presented as three coordinated papers.</p>
         </div0>
         <div0 type="EmbeddedDoc">
            <div1>
               <head>B1. <title level="m">TAPoR</title>: First Encounters</head>
               <p rend="Presenter">Geoffrey Rockwell</p>
               <p>The first paper will demonstrate the first encounter interface, <title>Try It</title>. Woven into this presentation will be a discussion of the usability research and testing that led to this interface hypothesis. It is our hope that this encounter interface will be of use to novices and advanced, but casual, users. It is an interface that doesn’t require a portal account so it can be used occasionally and it is optimized for ease of use and successful results.</p>
               <p>Rockwell will then demonstrate the basic user account paradigm for people who want to use the portal for sustained text analysis projects. He will demonstrate how, from a first encounter, one can get a myTAPoR account with which to organize links to texts, organize tools, and manage projects.</p>
            </div1>
         </div0>
         <div0 type="EmbeddedDoc">
            <div1>
               <head>B2. <title level="m">TAPoR</title>: Developing Encounters</head>
               <p rend="Presenter">Stéfan Sinclair</p>
               <p>The second paper will demonstrate and discuss the Tool Developers interface and the community tools designed to assist developers. In this context Sinclair will discuss the first <title level="m">TAPoR</title> “hackers ball” funded by the Social Sciences and Humanities Research Council of Canada through a grant led by Stéfan Sinclair. He will also discuss the technical design of the underlying tool broker and the data interfaces that allow results to be saved to a Data Bench for use as an input text for a different tool. This component of the presentation will end with a blatant attempt to enlist attendees in <title level="m">TA!DA!</title> so we can enrich the tools collection.</p>
               <p>The portal must bring together the text analysis community. In particular, the portal must make it as easy as possible for researchers who have existing tools, or want to write new tools — in their preferred programming language — to make the tools available through the portal. Web services provide a standard language and protocol to enable communication between different programming languages, and therefore are a very appropriate vehicle for connecting text analysis tools together through the portal. Further, most programming languages provide tools to publish existing program code as web services with little or no modification, and little extra setup. In some cases the tools will take an existing program function and create the entire infrastructure needed to make the function available over the internet: the web server, the code to listen for remote requests and translate them into calls to the local program code, and code to package the results up and return them to the original caller.</p>
               <p>Text analysis tools provided as web services are easier to combine in simple (<soCalled>piped</soCalled>) combinations, but can also be combined in very sophisticated arrangements (using scripting) — without requiring that the user learn new programming languages or run through elaborate setup procedures.</p>
            </div1>
         </div0>
         <div0 type="EmbeddedDoc">
            <div1>
               <head>B3. <title level="m">TAPoR</title>: Community Encounters</head>
               <p rend="Presenter">James Chartrand</p>
               <p>The third paper will discuss the underlying technologies deployed in the portal so as to show how the portal can be rethought as a community association engine. We chose <title level="m">Apache Cocoon</title> as our web development framework for the portal. <title level="m">Cocoon</title> satisfies several of our objectives. <title level="m">Cocoon</title> provides a basic portal implementation geared towards custom development. <title level="m">Cocoon</title> is open source. Much of <title level="m">Cocoon</title> is made up of code donated from large-scale software projects, code that has gone through numerous development cycles on large systems. <title level="m">Cocoon</title> is actively maintained and supported by hundreds of developers. <title level="m">Cocoon</title> is therefore stable, secure, and scalable. In addition, <title level="m">Cocoon</title> runs on Java and can therefore run without modification on <title level="m">Linux</title>, <title level="m">Windows</title> and the <title level="m">Mac</title>, allowing new projects to install the portal with ease.</p>
               <p>The portal must provide a uniform and single point of access for text analysis tools, but must also engender an online community of knowledge. We chose Topic Maps for knowledge management because they are adaptable, simple, and standards based. Topic Maps can be thought of as a very rich index: one that doesn't just point into texts, but can describe relationships between almost any objects or ideas. In our case, the relationships are between texts, between tools, between texts and tools, between projects, between projects and tools, between projects and users, between users and texts, and so on. Topic Maps also make the portal more adaptable to the needs of other projects outside the text analysis community.</p>
               <p>In the context of underlying technologies James Chartrand will demonstrate the portal again, but now from the viewpoint of how it can be used to develop a research group or project taking advantage of the incorporated technologies. He will demonstrate the deep skinning features that allow users to create views that suit their research, their groups, or their projects. In this context he will illustrate how the <title level="m">TAPoR</title> portal is, from one perspective, just a web of associations between links, notes, tools, and topics.</p>
            </div1>
         </div0>
         <div0>
            <head>C. Issues</head>
            <p>There are a number of key issues that underlie all three papers.</p>
            <list type="lower-roman">
               <item>Peer review of tools and academic credit. In a panel on <title level="a">Peer Review of Humanities Computing Software</title>, organized by Stéfan Sinclair for ACH/ALLC 2003 in Athens, Georgia, we presented some models for how review of tools could be supported. <title level="m">TAPoR</title>, as a public portal that gives access to tools that run elsewhere as web services, can be a site for the review and documentation of software tools. We will present a documentation interface that allows public comments and reviews of tools and that could serve some of the need for a peer review system.</item>
               <item>Open source. A popular paradigm for the creation and maintenance of community tools is to release them as open source under one of the various licenses available. We will discuss the way in which the portal as software is open source and the ways individual tools can be made available or protected. Likewise we will discuss the need for authentication for selected texts which cannot be made available openly.</item>
               <item>Humanities software development. The portal must, fundamentally, meet the needs of a research community, needs that are, by definition, not yet completely defined, since research evolves. To that end, we have adopted an "agile" development process that involves regular meetings and storytelling. This approach has proven extremely effective. We have avoided getting bogged down in over-analysis and excessive documentation, and at the same time have been able to adapt development cycles to meet the evolving needs of the project. Adaptability is particularly important for a research project like this, where midstream research outcomes can open new paths or close others.</item>
               <item>Stories. The <soCalled>story</soCalled> is the fundamental unit of work in our process. Stories are informal descriptions of how the end-user would like to use the portal. Stories can be written in whatever style makes sense for the user. Stories and other documentation are kept in the <title level="m">TAPoR</title> Wiki, which is a shared development space. The stories are then broken down by the OpenSky Solutions team into <soCalled>tasks</soCalled> that are assigned time estimates.</item>
               <item>Adaptability. An important objective of the project is to enable other projects to adapt the portal and to contribute to its development. We have, therefore, organized the development process around standards that make it straightforward not only to download and install the portal, but also to set up the development environment. Our goal is to ensure continued development of the portal.</item>
            </list>
         </div0>
         <div0>
            <head>D. Conclusions</head>
            <p>The <title level="m">TAPoR</title> Portal is fundamentally conceived of and designed to be an extensible, network-based research environment. As such, it has been crucial to devise mechanisms for enriching the portal by allowing developers and users to encounter the portal, use it, and adapt it for others. It is worth emphasizing how this approach differs from the development of text analysis tools of the past, such as <title level="m">OCP</title> and <title level="m">TACT</title>, that are essentially pre-defined workstation-based programs. <title level="m">TAPoR</title>, by contrast, seeks to accommodate unknown and unanticipated resources. Such flexibility requires considerable engineering to ensure compatibility between disparate texts and tools. We will present a model for such flexibility, but recognize that it will need testing and scrutiny to become genuinely useful.</p>
         </div0>
      </body>
   </text>
</TEI.2>