ACH/ALLC Conference 2005
June 15 - June 18, 2005

Doing Text Analysis on the ACH Abstracts

The conference abstracts are all encoded as TEI P4 XML, and we have made an effort to provide URL access to individual abstracts as well as to the whole collection, in both XML and plain-text formats. We would like to encourage researchers to perform text-analysis operations on the collection; it represents a snapshot of the state of humanities computing in 2005, and much could be learned from treating it as a textbase for analysis. You can access the documents as described below:

Accessing XML
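
Once the XML for an abstract has been retrieved, a basic word-frequency count might look like the sketch below. The sample document is an invented stand-in, not an abstract from the collection, and the function name word_frequencies is hypothetical. The sketch relies on the fact that TEI P4 elements carry no XML namespace, so the text element can be found by its bare name. Real documents may declare a DTD or entities that Python's built-in parser does not resolve, so treat this as a starting point rather than a finished tool.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical stand-in for one abstract; NOT taken from the real collection.
SAMPLE_TEI = """<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt><title>A Sample Abstract</title></titleStmt>
      <publicationStmt><p>Illustration only.</p></publicationStmt>
      <sourceDesc><p>Invented for this example.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>Text analysis of encoded text reveals patterns in the text.</p>
      <p>Encoded abstracts support <hi rend="italic">analysis</hi> at scale.</p>
    </body>
  </text>
</TEI.2>"""


def word_frequencies(tei_xml: str) -> Counter:
    """Count lowercase word tokens inside the <text> element of a TEI P4 document."""
    root = ET.fromstring(tei_xml)
    # TEI P4 uses no namespace, so a bare element name is enough here.
    text_el = root.find("text")
    if text_el is None:
        return Counter()
    # itertext() walks all nested elements, so inline markup such as <hi> is included.
    content = " ".join(text_el.itertext())
    return Counter(re.findall(r"[a-z]+", content.lower()))


freqs = word_frequencies(SAMPLE_TEI)
print(freqs.most_common(3))
```

Restricting the count to the text element keeps header metadata (titles, publication statements) out of the results; the same function could be applied to each abstract in turn and the counters summed to profile the whole collection.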

Accessing plain text

Demonstration usages

Although not strictly text-analysis operations, these are a couple of examples of transformations and renderings created from the XML feeds:

How the system works

The program page, author page, title list and keyword list, along with the text-analysis feeds described above, are all based on the same underlying set of XML documents. This is how we built the system: