Dean Rehberger and Joy Palmer
This paper examines the role secondary repositories can play in enhancing access and interaction for students and scholars in the humanities. While access to online resources has steadily improved in the last decade, online archives and digital libraries remain difficult to use, particularly for students and novice users (Arms). In some cases, substantial resources have been put into massive digitization initiatives that have opened rich archives of sources to a wide range of users. Yet the traditional cataloging and dissemination practices of libraries and archives make it difficult for these users to locate these sources and use them effectively, especially within the scholarly and educational contexts of the humanities. Many digital libraries around the country, large and small, have made admirable efforts toward creating user portals and galleries to enhance the usability of their holdings, but these efforts are often expensive and labor intensive, and they often speak directly to only a small segment of users.
To address these problems, we begin with the assumption that access and preservation are mutually dependent concepts. Preservation and access can no longer be thought of in terms of stand-alone files or individual digital objects; rather, they must directly impact the ways in which users reuse, repurpose, combine, and build complex digital objects. This assumption relies on a more complex meaning for the term access. Many scholars in the field have called for a definition of access that goes beyond search interfaces to the ability of users to retrieve information "in some form in which it can be read, viewed, or otherwise employed constructively" (Borgman 57). Access thus implies four related conditions that go beyond the ability to link to a network:
- equity: the ability of 'every citizen' and not simply technical specialists to use the resources;
- usability: the ability of users to easily locate, retrieve, use, and navigate resources;
- context: the conveyance of meaning from stored information to users, so that it makes sense to them;
- interactivity: the capacity for users to be both consumers and producers of information.
The keys to enhancing access for specific user groups, contexts, and disciplines are to build secondary repositories with resources and tools that allow users to enhance and augment materials (Shabajee), share their work with a community of users (Waller), and easily manipulate the media with simple and intuitive tools (or at least with interfaces that match existing, well-known applications). Users will also need portal spaces that escape the genre of link indexes and become flexible work environments in which they can act as interactive producers (Miller).
Herbert Van de Sompel has proposed a successful system (the OpenURL/SFX framework for context-sensitive reference linking) for disaggregating reference linking services from e-publishing. In his framework, the service of providing links between references and across e-publishers' digital repositories is separated from the services provided by the e-publishers themselves. In so doing, the service provides "seamless interconnectivity between ever-increasing collections of heterogeneous resources", freeing primary repositories from the difficult and expensive task of ensuring links to references while giving users greater access to resources and increasing the value of the digital object (Van de Sompel). Similarly, we propose the concept of secondary repositories that would be responsible for handling secondary metadata, extended materials and resources, interactive tools, and application services. This information would be cataloged, stored, and maintained in a repository outside of the primary repository that holds the digital object. The comments and observations generated by users in this context are usually highly specialized, because such metadata is created from discipline-specific, scholarly perspectives (as an historian, social scientist, teacher, student, enthusiast, etc.) and for a specific purpose (research, publishing, teaching, etc.). Even though the information generated by a secondary repository directly relates to digital objects in primary repositories, secondary repositories remain distinctly separate from the traditional repository. The information gathered in secondary repositories would rarely be used in the primary cataloging and maintenance of the object. Primary repositories would continue to be responsible for preservation, management, and long-term access, but would be freed from creating time-consuming and expensive materials, resources, services, and extended metadata for particular user groups.
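The separation proposed here can be illustrated with a minimal sketch. It assumes only that each digital object in a primary repository has a stable identifier; the class names, field names, and identifiers below are hypothetical and do not describe an existing system. The secondary repository stores user-contributed annotations keyed by that identifier and never holds or modifies the object itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One piece of secondary metadata about an object held elsewhere (hypothetical model)."""
    object_id: str     # identifier of the digital object in the primary repository
    contributor: str   # who supplied the annotation
    perspective: str   # e.g. "historian", "teacher", "student"
    purpose: str       # e.g. "research", "publishing", "teaching"
    note: str          # the comment or observation itself


class SecondaryRepository:
    """Keeps annotations keyed by primary-object identifier; stores no primary objects."""

    def __init__(self):
        self._by_object = defaultdict(list)

    def annotate(self, annotation):
        """Record an annotation against the object it refers to."""
        self._by_object[annotation.object_id].append(annotation)

    def annotations_for(self, object_id, perspective=None):
        """Return annotations for an object, optionally limited to one perspective."""
        found = self._by_object.get(object_id, [])
        if perspective is None:
            return list(found)
        return [a for a in found if a.perspective == perspective]


# Illustrative data only: the identifiers and notes are invented.
repo = SecondaryRepository()
repo.annotate(Annotation("primary:item-001", "user-a", "historian", "research", "Context note."))
repo.annotate(Annotation("primary:item-001", "user-b", "teacher", "teaching", "Classroom use note."))

teacher_notes = repo.annotations_for("primary:item-001", perspective="teacher")
```

Because the only link between the two stores is the object identifier, the primary repository's catalog record is untouched by anything users contribute, which is the division of responsibility described above.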
In line with digital library best practices, digitized sources are typically cataloged to describe their bibliographic information, along with technical, administrative, and rights metadata. While these practices are essential for preserving the digital object and making it available to users, they unfortunately do so in a language and guise often difficult to understand within the context of use (Lynch 2003). Even though the author's name, the title of the work, and keywords are essential for describing and locating a digital object, this kind of information is not always what users rely on most to ascertain the relevance of a digital object. For instance, K-12 teachers often do not have specific authors or titles in mind when searching for materials for their classes. Teachers more frequently search in terms of grade level, the state and national standards that form the basis of their teaching, or broad overarching topics derived from the required content and benchmark standards (e.g., core democratic values or textbook topics); such searches tend to return too many results for the information to be of value.
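As a hedged illustration of the kind of search this implies, the sketch below assumes a secondary repository that records, for each primary object, the grade levels, standards, and topics that teachers have associated with it. The record structure, the standard codes ("STD-A", "STD-B"), and the object identifiers are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeachingRecord:
    """Hypothetical teaching-oriented metadata attached to one primary object."""
    object_id: str
    grade_levels: frozenset
    standards: frozenset
    topics: frozenset


def find_for_classroom(records, grade, standard=None, topic=None):
    """Return ids of objects whose teaching metadata matches every criterion given."""
    matches = []
    for record in records:
        if grade not in record.grade_levels:
            continue
        if standard is not None and standard not in record.standards:
            continue
        if topic is not None and topic not in record.topics:
            continue
        matches.append(record.object_id)
    return matches


# Illustrative data only.
records = [
    TeachingRecord("primary:item-001", frozenset({4, 5}), frozenset({"STD-A"}),
                   frozenset({"core democratic values"})),
    TeachingRecord("primary:item-002", frozenset({8}), frozenset({"STD-A"}),
                   frozenset({"core democratic values"})),
    TeachingRecord("primary:item-003", frozenset({5}), frozenset({"STD-B"}),
                   frozenset({"local history"})),
]

fifth_grade_std_a = find_for_classroom(records, 5, standard="STD-A")
```

Combining grade level with a standard or topic narrows a broad query to the handful of objects that other teachers have already judged relevant, without requiring an author or title.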
While cursory studies have indicated these access issues, very little is yet known about archival use or how these users express their information needs (Duff; Duff & Johnson). For digital libraries to begin to fulfill their potential, much research is needed to better understand the processes by which primary repositories are accessed and how information needs are expressed. For example, research needs to address the ways in which teachers integrate content into their pedagogy so that bridges can be built from digital repositories to the educational process, bridges that greatly facilitate the ability of teachers and students to access specific information within the pedagogical process. Recent research strongly suggests that students need conceptual knowledge of information spaces that allows them to create mental models for strategic and successful searches. As with any primary source, the materials in digital libraries do not literally 'speak' for themselves and impart wisdom; they require interpretation and analysis (Bowker & Star; Duff; Duff & Johnson). Allowing communities of users to enhance metadata and actively use, reuse, repurpose, combine, and build complex digital objects can help users to contextualize the information they find, draw from deeper resources within the digital library, and find more meaningful relationships between digital objects and their needs. Thinking in terms of a distributed model (similar to the open source software community), one that allows users both easier access to materials and a greater range of search criteria and also provides opportunity for active engagement in the generation of metadata and complex digital objects, promises to help us rethink our most basic assumptions about user access and long-term preservation.
Collections can also benefit by defining communities of users. For example, with the recent release of secret White House tapes (http://millercenter.virginia.edu/), the sheer number of tapes and hours makes adequate cataloging of content impossible, and it is difficult to determine the context and the people involved (or even what is said, given the poor quality of many tapes). Historians and scholars allowed access to the collections (a more regulated and highly defined set of experts) could supply information about content and context as well as set terms for debates over more questionable areas of interpretation (e.g., when sound quality makes passages inaudible). While metadata gathered in these ways would need to be qualified (maintained in a secondary repository) because of the lack of quality control, the process could make large quantities of data that are key to many disciplines in the humanities more available and usable.