The Virtual Lightbox for Museums and Archives: A Distributed Solution for Structured Data Reuse Across Multiple Visual Resources

Amy Smith, a.c.smith@reading.ac.uk, University of Reading
Brian Fuchs, fuchs@mpiwg-berlin.mpg.de, Max Planck Institute for the History of Science, Berlin
Leif Isaksen, li103@soton.ac.uk, University of Southampton

The Virtual Lightbox for Museums and Archives (VLMA) is a tool for collecting and reusing distributed visual archives via RDF syndication and P2P technology. It aims to help students and scholars locate and export groups of learning objects and their metadata, which can then be reused and shared among users and user groups.

The VLMA is a response to the challenge of contextualization. It grew out of the experience of digitizing a small university collection, the Ure Museum of Classical Archaeology, and integrating its contents into a humanities portal with diverse content, the ECHO project. Research and teaching required access to external resources to provide context and points of comparison, while integration into a large portal with diverse objects demonstrated the need for a means of assembling groups of objects from diverse sources independently of their online presentation. It also became apparent that online collections, as they currently stand, cannot easily meet students' need to collect objects from diverse online sources for reuse in other contexts (e.g. presentations), as opposed to integration into a collection's website.

Recent technological innovations, such as metadata unification (e.g. Dublin Core), distributed metadata (e.g. Open Archives Initiative), and meta-metadata (e.g. CIDOC's Conceptual Reference Model), have indeed done much to ease the difficulties of content integration. In all of these cases, however, content integration has remained almost exclusively the province of the data provider, who is responsible for repackaging harvested material.
Thus, even when suitable metadata is available, collecting and reusing distributed content requires Herculean efforts on the part of the individual user. And because the material collected is heterogeneous, the results of those efforts typically cannot themselves be reused in any significant fashion.

The VLMA seeks to complement other methods of content integration with a point-of-reuse approach. In this approach, collections with intrinsically heterogeneous metadata sets are syndicated via RDF and then collected (browsed, stored, viewed, and reused) at the peer/client level. The idea is to add a distributed variant to the integration options currently available to collection publishers and users, one in which each peer determines its own strategy for metadata integration and content reuse. The problem of how to integrate collection metadata then becomes a question of reuse and syndication at the level of the individual user rather than the provider, with content federation strategies flourishing or withering according to the current needs of the user community.

Syndication of content can take several forms: a lecture, a student paper, or actual resyndication across the network in the form of a new collection. The last possibility is particularly exciting, as it provides an easy method for adding value to published content. For example, an online scholarly article that discusses related artworks in diverse collections might provide a unified set of illustrations rather than, as is necessary at present, laboriously linking to the various websites on which these objects are illustrated. It also offers a simple way of creating thematically related collections with distributed content.

The VLMA method for content syndication is designed to be simple. A content producer seeds the network by syndicating already published content with a syndication tool that writes RDF to a lightbox namespace.
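
As a rough illustration of what writing RDF to a lightbox namespace could look like, the following Java sketch serializes one service, one collection object, and one image as N-Triples statements. It is not the VLMA syndication tool: the namespace URI, the class names (Service, CollectionObject, Image), and the property names (hasObject, hasImage, title) are all hypothetical, since the text does not give the actual vocabulary. Only the rdf:type URI is standard.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: every lightbox URI and property name here is hypothetical. */
final class LightboxRdfSketch {
    // Placeholder namespace; the real lightbox namespace URI is not given in the text.
    static final String LB = "http://example.org/lightbox#";
    // Standard RDF type predicate.
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    /** One N-Triples statement whose object is a resource. */
    static String resourceTriple(String subject, String predicate, String object) {
        return "<" + subject + "> <" + predicate + "> <" + object + "> .";
    }

    /** One N-Triples statement whose object is a plain string literal. */
    static String literalTriple(String subject, String predicate, String value) {
        // Escape backslashes first, then quotes and line breaks.
        String escaped = value.replace("\\", "\\\\")
                              .replace("\"", "\\\"")
                              .replace("\n", "\\n")
                              .replace("\r", "\\r");
        return "<" + subject + "> <" + predicate + "> \"" + escaped + "\" .";
    }

    /** Builds the fragment for a service exposing one object that has one image. */
    static List<String> syndicate(String serviceUri, String objectUri,
                                  String title, String imageUri) {
        List<String> triples = new ArrayList<>();
        triples.add(resourceTriple(serviceUri, RDF_TYPE, LB + "Service"));
        triples.add(resourceTriple(serviceUri, LB + "hasObject", objectUri));
        triples.add(resourceTriple(objectUri, RDF_TYPE, LB + "CollectionObject"));
        triples.add(literalTriple(objectUri, LB + "title", title));
        triples.add(resourceTriple(objectUri, LB + "hasImage", imageUri));
        triples.add(resourceTriple(imageUri, RDF_TYPE, LB + "Image"));
        return triples;
    }

    public static void main(String[] args) {
        // Example URIs are placeholders, not real collection addresses.
        for (String triple : syndicate(
                "http://example.org/museum/service",
                "http://example.org/museum/object/1",
                "Black-figure cup",
                "http://example.org/museum/object/1/front.jpg")) {
            System.out.println(triple);
        }
    }
}
```

Keeping each unit as its own small group of statements means a peer could store, merge, or resyndicate fragments independently, which is one plausible reading of the point-of-reuse approach described here.
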
The basic units of the lightbox namespace are services, collection objects, and images, each of which is represented by an RDF fragment. In the simplest instantiation of a service, a consumer browses online objects in a collection and captures them to the lightbox. The lightbox then displays the images and metadata sets associated with each captured object and "syndicates" them as a local collection, which appears in the service hierarchy alongside other collection browsers that have been discovered on the network. The consumer then has several reuse options, such as annotation, publication, export, and local storage, which allow syndication with added value.

The VLMA is an open-source tool, written in Java under the GPL, and funded in its first phase (through March 2005) by JISC. A functioning prototype applet is available from the project's website. A crucial feature of the current implementation is an RDF store web service, implemented with Sesame, an open-source RDF database, as its backend. Such a service not only reduces the size of the applet but also significantly reduces latency in RDF harvesting and querying, since clients of visual collections typically access the same RDF material. The project has also agreed to pursue parallel development with Virtual Lightbox, developed at the Maryland Institute for Technology in the Humanities (MITH), and has designed its code in a modular fashion so that developments at MITH can be easily incorporated.

Bibliography

Ure Museum of Classical Archaeology. University of Reading.
Virtual Lightbox for Museums and Archives. Joint Information Systems Committee & Max Planck Institute for the History of Science.
Virtual Lightbox. Maryland Institute for Technology in the Humanities.
ECHO. European Commission.
Sesame. openRDF.org.