<?xml version="1.0" encoding="UTF-8"?>
<TEI.2 id="panel_184_short">
   <teiHeader>
      <fileDesc>
         <titleStmt>
            <title>Keyword Extraction in Information Retrieval</title>
            <author>
               <name reg="Short, Harold">Harold Short</name>
            </author>
            <author>
               <name reg="Deegan, Marilyn">Marilyn Deegan</name>
            </author>
            <author>
               <name reg="Hunyadi, Laszlo">Laszlo Hunyadi</name>
            </author>
            <author>
               <name reg="Baker, Paul">Paul Baker</name>
            </author>
            <author>
               <name reg="Archer, Dawn">Dawn Archer</name>
            </author>
            <author>
               <name reg="McEnery, Tony">Tony McEnery</name>
            </author>
            <respStmt>
               <resp>Marked up by </resp>
               <name reg="Holmes, Martin">Martin Holmes</name>
               <lb/>
               <name reg="Baer, Patricia">Patricia Baer</name>
            </respStmt>
         </titleStmt>
         <publicationStmt>
            <p>Marked up to be included in the ACH/ALLC 2005 Conference Abstracts book.</p>
         </publicationStmt>
         <sourceDesc>
            <p>None</p>
         </sourceDesc>
      </fileDesc>
      <profileDesc>
         <textClass>
            <classCode>panel</classCode>
            <keywords>
               <list>
                  <item>keyword extraction</item>
                  <item>information retrieval</item>
                  <item>textual analysis</item>
               </list>
            </keywords>
         </textClass>
      </profileDesc>
      <revisionDesc>
         <list>
            <item>MDH: Created from John Bradley's XML <date value="2005-03">March 2005</date>
            </item>
            <item>MDH: Marked up <date value="2005-03-23">23 March 2005</date>
            </item>
            <item>MDH: RS proofed and signed off without changes <date value="2005-05-18">18 May 2005</date>.</item>
         </list>
      </revisionDesc>
   </teiHeader>
   <text>
      <front>
         <docTitle n="Keyword Extraction in Information Retrieval">
            <titlePart>Keyword Extraction in Information Retrieval</titlePart>
         </docTitle>
         <docAuthor>
            <name reg="Short, Harold">Harold Short</name>
            <address>
               <addrLine>harold.short@kcl.ac.uk</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">King's College London</titlePart>
         <docAuthor>
            <name reg="Deegan, Marilyn">Marilyn Deegan</name>
            <address>
               <addrLine>marilyn.deegan@kcl.ac.uk</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">King's College London</titlePart>
         <docAuthor>
            <name reg="Hunyadi, Laszlo">Laszlo Hunyadi</name>
            <address>
               <addrLine>hunyadi@ling.arts.klte.hu</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">University of Debrecen</titlePart>
         <docAuthor>
            <name reg="Baker, Paul">Paul Baker</name>
            <address>
               <addrLine>p.baker@lancaster.ac.uk</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">Lancaster University</titlePart>
         <docAuthor>
            <name reg="Archer, Dawn">Dawn Archer</name>
            <address>
               <addrLine>d.archer@lancaster.ac.uk</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">University of Central Lancashire</titlePart>
         <docAuthor>
            <name reg="McEnery, Tony">Tony McEnery</name>
            <address>
               <addrLine>a.mcenery@lancaster.ac.uk</addrLine>
            </address>
         </docAuthor>
         <titlePart type="affil">Lancaster University</titlePart>
      </front>
      <body>
         <div0>
            <p>Session chair: Harold Short.</p>
         </div0>
         <div0>
            <div1>
               <head>Keyword extraction: an overview</head>
               <p rend="Presenter">Laszlo Hunyadi</p>
               <p>With the ever-increasing amount of information available in all areas of life, including the economy, science, education and culture, there is a pressing need to retrieve, process and present this information as effectively as possible. Since the bulk of this information is textual, information retrieval is mainly concerned with texts. This talk outlines the approaches, principles and techniques used in textual data retrieval based on keyword extraction, and analyses the capabilities of the various approaches to show where each is appropriately used.</p>
            </div1>
         </div0>
         <div0>
            <div1>
               <head>Keyword extraction in the project <title level="m">Forced Migration Online</title>
               </head>
               <p rend="Presenter">Marilyn Deegan</p>
               <p>
                  <title level="m">Forced Migration Online</title> (<xptr to="http://www.forcedmigration.org/"/>) provides instant access to a wide variety of online resources dealing with the situation of forced migrants worldwide. As one of the most comprehensive textual resources dealing with humanitarian issues across a large number of countries, it faces the challenges of multilingualism, multiculturalism and the essential requirement to be up-to-date and informative. The organisation and retrieval of textual information, as well as operability, therefore have high priority in the design and functioning of the system. This talk presents the essentials of this online resource, including its pioneering beginnings and its prospects for future development.</p>
            </div1>
         </div0>
         <div0>
            <div1>
               <head>Querying Keywords: Questions of difference, frequency and sense</head>
               <p rend="Presenter">Paul Baker</p>
               <p>This paper examines issues to do with interpreting keyword lists <cit>
                     <bibl>Scott 1999</bibl>
                  </cit>, such as over-attending to lexical differences whilst ignoring
differences in word usage and/or similarities between texts. Using a
variety of techniques (e.g. analysis of key clusters or annotated
data), I show how researchers can use keyword analyses to obtain a more
accurate picture of the distinctive features of their texts or corpora.</p>
            </div1>
         </div0>
         <div0>
            <div1>
               <head>Love - a familiar or a devil? An exploration of key domains in Shakespeare's Comedies and Tragedies</head>
               <p rend="Presenter">Dawn Archer, Jonathan Culpeper, Paul Rayson</p>
               <p>Love is a common theme in Shakespeare's works. In this paper, we show
   how the <title level="m">UCREL Semantic Annotation Scheme</title> (henceforth <title level="m">USAS</title>), a software
program for automatic dictionary-based content analysis, can help us to
explore the semantic field of 'love' within a selection of Shakespeare's
   plays. Specifically, we will explore three love-tragedies (<title level="m">Othello</title>, <title level="m">Antony
      and Cleopatra</title>, and <title level="m">Romeo and Juliet</title>) and three love-comedies (<title level="m">A Midsummer
         Night's Dream</title>, <title level="m">The Two Gentlemen of Verona</title> and <title level="m">As You Like It</title>) to
determine differences in their (re)presentation of <soCalled>love</soCalled>. We will also
discuss how the semantic field of <soCalled>love</soCalled> co-occurs with different
domains in the plays, and assess the implications this has for our
understanding of <soCalled>love</soCalled> as a concept. This research builds on (i)
   Jonathan Culpeper's work on keywords in Shakespeare, using <title level="m">Wordsmith</title>
                  <cit>
                     <bibl>Culpeper 2002</bibl>
                  </cit>, (ii) Paul Rayson's comparisons of key word and key
domain analysis <cit>
                     <bibl>Rayson 2003</bibl>
                  </cit>, and (iii) Dawn Archer and Paul Rayson's
work on the identification of key domains in refugee literature, using
   <title level="m">USAS</title>
                  <cit>
                     <bibl>Archer and Rayson forthcoming</bibl>
                  </cit>.</p>
            </div1>
         </div0>
         <div0>
            <div1>
               <head>Key words and the analysis of discourses in historical contexts</head>
               <p rend="Presenter">Tony McEnery</p>
               <p>This paper examines the use of keywords to approach the discourse of
moral panic evident in the writings of the Society for the Reformation
of Manners in late seventeenth- and early eighteenth-century England. The
keyword approach, I will argue, allows one to populate a model of moral
panic discourse, while simultaneously showing how, in that historical
context, links were forged between concepts which, while unlinked then,
have become naturalised as being linked in modern English. By showing
how keywords relate to discourse, and ultimately to a process whereby
meanings and objects become linked, the paper will argue that keywords
are important tools for the historical linguist in studying the shifting
patterns of word association in language.</p>
            </div1>
         </div0>
      </body>
      <back>
         <div type="Bibliography">
            <head>Bibliography</head>
            <listBibl>
               <biblStruct>
                  <analytic>
                     <author>
                        <name reg="Archer, D.">D. Archer</name>
                     </author>
                     <author>
                        <name reg="Rayson, P.">P. Rayson</name>
                     </author>
                     <title level="a">Using the UCREL automated semantic analysis system to investigate differing concerns in refugee literature</title>
                  </analytic>
                  <monogr>
                     <editor>
                        <name reg="Deegan, M.">M. Deegan</name>
                     </editor>
                     <editor>
                        <name reg="Hunyadi, L.">L. Hunyadi</name>
                     </editor>
                     <editor>
                        <name reg="Short, H.">H. Short</name>
                     </editor>
                     <title level="m">The Keyword Project: Unlocking Content Through Computational Linguistics</title>
                     <imprint>
                        <publisher>Office for Humanities Communication Publications</publisher>
                        <date>Forthcoming</date>
                     </imprint>
                  </monogr>
               </biblStruct>
               <biblStruct>
                  <analytic>
                     <author>
                        <name reg="Culpeper, J.">J. Culpeper</name>
                     </author>
                     <title level="a">Computers, language and characterisation: An analysis of six characters in Romeo and Juliet</title>
                  </analytic>
                  <monogr>
                     <editor>
                        <name reg="Melander Marttala, Ulla">Ulla Melander Marttala</name>
                     </editor>
                     <editor>
                        <name reg="Ostman, Carin">Carin Ostman</name>
                     </editor>
                     <editor>
                        <name reg="Kyto, Merja">Merja Kyto</name>
                     </editor>
                     <title level="m">Papers from the ASLA symposium, Conversation in life and literature</title>
                     <imprint>
                        <publisher>Association Suedoise de Linguistique Appliquee</publisher>
                        <pubPlace>Uppsala</pubPlace>
                        <date value="2002">2002</date>
                        <biblScope type="pages">11-30</biblScope>
                     </imprint>
                  </monogr>
               </biblStruct>
               <biblStruct>
                  <analytic>
                     <author>
                        <name reg="Rayson, P.">P. Rayson</name>
                     </author>
                     <title level="a">Matrix: A statistical method and software tool for linguistic analysis through corpus comparison</title>
                  </analytic>
                  <monogr>
                     <imprint>
                        <publisher>Ph.D. thesis, Lancaster University</publisher>
                        <date value="2003">2003</date>
                     </imprint>
                  </monogr>
               </biblStruct>
            </listBibl>
         </div>
      </back>
   </text>
</TEI.2>