Wrote the indexing system for teiJournal
Pulled a long day on this one, starting from scratch: designing the system, then writing the XML data for the DB, the XQuery to pull it out, and the XSLT to configure and link it. This is how it works:
There's a file called indexes.xml that sits in the settings/default subcollection of the database. This is a TEI file which contains a list of items, like this:
<text>
<body>
<head>Indexes</head>
<list>
<item>
<name>People</name>
<code lang="xpath" n="find">/text//name[(@type='person') or not(@type)]</code>
<code n="display">if ($hit/surname and $hit/forename) then (concat($hit/surname, ', ', $hit/forename)) else (string($hit))</code>
</item>
<item>
<name>Organizations</name>
<code lang="xpath" n="find">/text//name[@type='org']</code>
</item>
[... and so on.]
Each item defines one index that will be created and displayed. The <name> element specifies the heading which will show up on the page, the first <code> element (n="find") holds the XPath which finds all the XML nodes you want to index, and the second, optional <code> element (n="display") holds an XPath expression which formats each node found, referring to it as $hit.
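To make the structure concrete, here is a rough Python sketch of reading those definitions into one record per index. It is not how teiJournal does it (the real work happens in XQuery). The sample string is the fragment above with closing tags added so that it parses, and the function name read_index_definitions is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The fragment shown above, completed with closing tags so it parses.
# The real indexes.xml has more items; only the two shown are used here.
SAMPLE = """<text>
<body>
<head>Indexes</head>
<list>
<item>
<name>People</name>
<code lang="xpath" n="find">/text//name[(@type='person') or not(@type)]</code>
<code n="display">if ($hit/surname and $hit/forename) then (concat($hit/surname, ', ', $hit/forename)) else (string($hit))</code>
</item>
<item>
<name>Organizations</name>
<code lang="xpath" n="find">/text//name[@type='org']</code>
</item>
</list>
</body>
</text>"""


def read_index_definitions(xml_text):
    """Return one dict per <item>: heading, find XPath, optional display XPath."""
    root = ET.fromstring(xml_text)
    definitions = []
    for item in root.iter("item"):
        # Key each <code> element by its n attribute ("find" or "display").
        codes = {code.get("n"): code.text for code in item.findall("code")}
        definitions.append({
            "name": item.findtext("name"),
            "find": codes["find"],
            "display": codes.get("display"),  # None when no display rule is given
        })
    return definitions


definitions = read_index_definitions(SAMPLE)
for definition in definitions:
    print(definition["name"], "->", definition["find"])
```

An index with no display rule, like Organizations here, comes back with display set to None, so whatever consumes the records has to fall back to the plain string value of the hit.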
Next, an XQuery document called indexes.xq parses this file and constructs actual queries from it, which it throws at the database. The results are all the actual indexes, again in the form of a TEI file with lots of lists and items. The XQuery also reads a parameter from the URL and marks the selected index by adding an attribute to it. That file is passed to indexes.xsl, which figures out which index has been selected from that attribute, and then renders each distinct item in the list, in order, followed by links to each of the documents that contain it.
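The query-construction step might look something like the following, sketched in Python rather than XQuery. This is a guess at the general shape, not the code in indexes.xq: the function name build_index_query, the collection path /db/articles, and the fallback to string($hit) when no display rule exists are all assumptions.

```python
# Assumed fallback when an index has no display rule: use the hit's string value.
DEFAULT_DISPLAY = "string($hit)"


def build_index_query(find, display=None, collection="/db/articles"):
    """Assemble an XQuery string for one index definition.

    `find` and `display` are the XPath strings from indexes.xml.
    `collection` is a made-up database path used only for illustration.
    """
    display_expr = display or DEFAULT_DISPLAY
    # Doubled braces emit the literal { } of an XQuery enclosed expression.
    return (
        f"for $hit in collection('{collection}'){find}\n"
        f"return <item>{{{display_expr}}}</item>"
    )


query = build_index_query("/text//name[@type='org']")
print(query)
```

For the Organizations index this yields a two-line FLWOR expression that wraps the string value of each hit in an <item> element. Since the XPath comes from a settings file in the database rather than from user input, pasting it into a query string is a reasonable trade for the flexibility.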
The XSLT took a while, because some items need to be rendered using existing templates (e.g. abbreviations), while others are plain text and just need to be displayed. Also, this uses some of the grouping features introduced in XSLT 2.0, which are relatively new to me.
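The grouping itself boils down to collecting hits under their display text and listing the documents behind each one, which is the kind of job xsl:for-each-group does in XSLT 2.0. Here is the same idea in a few lines of Python. The hits and document ids are invented, and group_hits is a hypothetical name.

```python
from collections import defaultdict

# Invented hits: (display text, id of the document containing the hit).
hits = [
    ("Smith, Jane", "article2"),
    ("Doe, John", "article1"),
    ("Smith, Jane", "article1"),
    ("Smith, Jane", "article2"),  # repeat within one document, listed once
]


def group_hits(hits):
    """Group hits by display text; return sorted (label, sorted doc ids) pairs."""
    grouped = defaultdict(set)
    for label, doc_id in hits:
        grouped[label].add(doc_id)  # a set drops repeats within a document
    return [(label, sorted(docs)) for label, docs in sorted(grouped.items())]


index = group_hits(hits)
for label, docs in index:
    print(label, "->", ", ".join(docs))
```

Each distinct item appears once, in alphabetical order, followed by the documents that contain it, which is the shape the rendered index page needs.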
Right now, each index item is followed by a list of links to the articles that contain it. My last task for this week is to make those links, along with the links to articles found through searches in the Contents/Search page, display with the relevant text highlighted. Once that's done, I think phase one is complete.