JD pointed me at an OAI feed from ContentDM, which is exactly what I need for my metadata harvesting. This is my plan:
I've started work on an XSLT stylesheet to do the job. The purpose of the stylesheet is to process detailed OAI metadata records, which use Dublin Core identifiers, into teiHeader elements suitable for adding to the TEI documents in the Despatches project.
The OAI metadata is in the file oai_from_contentdm.xml, and originates in the UVic Library's ContentDM system. It contains 261 records relating to Early BC Maps, most of which describe maps that are also in the Colonial Despatches project collection. The ContentDM metadata is well-organized and has been considerably enhanced, so we're going to take that data and generate new teiHeader elements for our TEI files from it.
The first stage is to create a mapping between each of the fields in the OAI data and the location in the teiHeader where we propose to store it.
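As a rough illustration of what such a mapping could look like, here is a minimal sketch in Python (rather than XSLT, purely so it can be run standalone). The target locations in FIELD_MAP, the function name build_tei_header, and the sample record are all assumptions made up for illustration; they are not the project's actual mapping, and the sketch makes no attempt to guarantee a valid element order inside fileDesc.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical mapping: Dublin Core element name -> chain of TEI element
# names below teiHeader. These targets are illustrative assumptions only.
FIELD_MAP = {
    "title": ["fileDesc", "titleStmt", "title"],
    "creator": ["fileDesc", "titleStmt", "author"],
    "publisher": ["fileDesc", "publicationStmt", "publisher"],
    "date": ["fileDesc", "publicationStmt", "date"],
    "description": ["fileDesc", "sourceDesc", "p"],
}


def build_tei_header(dc_record):
    """Build a teiHeader element from the children of one oai_dc:dc record."""
    header = ET.Element(f"{{{TEI_NS}}}teiHeader")
    for field in dc_record:
        local_name = field.tag.split("}")[-1]  # drop the namespace part
        path = FIELD_MAP.get(local_name)
        text = (field.text or "").strip()
        if path is None or not text:
            continue  # unmapped or empty fields are skipped in this sketch
        parent = header
        # Reuse container elements (fileDesc, titleStmt, ...) if they exist.
        for name in path[:-1]:
            tag = f"{{{TEI_NS}}}{name}"
            child = parent.find(tag)
            if child is None:
                child = ET.SubElement(parent, tag)
            parent = child
        leaf = ET.SubElement(parent, f"{{{TEI_NS}}}{path[-1]}")
        leaf.text = text
    return header


# Made-up sample record; real records come from oai_from_contentdm.xml.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Sample map title</dc:title>
  <dc:creator>Sample cartographer</dc:creator>
  <dc:rights>Not mapped in this sketch</dc:rights>
</oai_dc:dc>"""

header = build_tei_header(ET.fromstring(SAMPLE))
print(ET.tostring(header, encoding="unicode"))
```

Keeping the mapping in one table like this makes it easy to review field by field before committing it to the stylesheet, where each entry would presumably become its own template.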
The files involved in the process are:
- oai_from_contentdm.xml (OAI record set)
- ../xml/maps/*.xml (TEI documents for each of the maps)
- map_lookup.xml (a simple XML document which hopefully provides enough data to allow the transformation process to retrieve the correct TEI document for each record in the OAI data. This lookup will be based on a number of factors, including Penfold number, title, and descriptive information. Creating this file is the next stage in the process.)
- ../xml/maps_enhanced/*.xml (from each TEI document we have, create an enhanced version which incorporates the original @xml:id and metadata, as well as the facsimile element with data about the image file, but also builds in the metadata gleaned from the OAI file. These files will eventually replace the original TEI files in the Despatches site, once the Map Gallery code has been rewritten to work with them.)
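To make the role of the lookup file concrete, here is a small sketch in Python of how a record might be resolved to its input and output files. The lookup format shown (a map element with oai and tei attributes), the identifiers, the file names, and the function names are all hypothetical, since map_lookup.xml has not been created yet; only the two directory paths come from the list above.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

# Hypothetical lookup format: one <map> per OAI record, pairing the OAI
# identifier with the name of the matching TEI file. All values are made up.
LOOKUP = """<lookup>
  <map oai="oai:example.org:maps/1" tei="sample_map_1.xml"/>
  <map oai="oai:example.org:maps/2" tei="sample_map_2.xml"/>
</lookup>"""

SOURCE_DIR = PurePosixPath("../xml/maps")
ENHANCED_DIR = PurePosixPath("../xml/maps_enhanced")


def load_lookup(lookup_xml):
    """Return a dict of {OAI identifier: TEI file name}."""
    root = ET.fromstring(lookup_xml)
    return {m.get("oai"): m.get("tei") for m in root.iter("map")}


def plan_paths(oai_id, lookup):
    """Return (source path, enhanced path) for a record, or None if unmatched."""
    name = lookup.get(oai_id)
    if name is None:
        return None  # OAI record with no corresponding TEI document
    return SOURCE_DIR / name, ENHANCED_DIR / name


lookup = load_lookup(LOOKUP)
print(plan_paths("oai:example.org:maps/1", lookup))
print(plan_paths("oai:example.org:maps/99", lookup))
```

Returning None for unmatched records matters here because not every one of the OAI records is expected to correspond to a map in the Despatches collection.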