Digital preservation - are metadata really crucial?
Panel 3, ECDL 2003, 7th European Conference on Research and Advanced Technology for Digital Libraries, Trondheim, Norway, 20 August 2003
Chair: Andreas Rauber (Institute of Software Technology & Interactive Systems, Vienna Technical University, Austria)
Participants: Catherine Owen (Performing Arts Data Service, University of Glasgow, UK), Steve Knight (National Library of New Zealand), Michael Day (UKOLN, University of Bath, UK)
Opening statement by Michael Day (UKOLN)
As someone who has been talking about 'metadata for digital preservation' for over six years, I would strongly argue that metadata are crucial for digital preservation. In this brief opening statement, I would like to make three short comments on the status of digital preservation metadata initiatives at the present time:
- Firstly, we need - and have always needed - far more practical experience of the data that we call preservation metadata. Because of the many roles that it is intended to fulfil - supporting preservation strategies, the integrity of objects, rights management, access control, etc. - preservation metadata schemas tend to become large in scale but also risk being based on assumptions that have not been rigorously tested in practice. We tried hard when developing the Cedars outline specification [1] to 'walk through' the elements with reference to 'real' objects, but even this process could not help us prove that the schema would enable the successful preservation of these objects. We need far more practical experience of implementing preservation metadata, and these experiences need to feed back into the production and further revision of relevant standards. More controversially, I might also suggest that an overemphasis on the high-level information model defined in the Reference Model for an Open Archival Information System (OAIS) [2] has not always been entirely helpful in designing practical metadata schemas.
- Secondly, the costs of metadata are unknown, but are assumed to be high. On the other hand, the costs of data rescue are likely to be even higher - and the risk of data loss would be much greater. That said, schema developers need to be careful about imposing unnecessary costs on the preservation process. We might characterise this as needing to identify the 'right metadata.'
- Talk of costs brings me to my third (and final) point. There is already a good deal of metadata 'out there,' e.g. in Web pages or stored in databases. We need to take advantage of this. Metadata is already being captured and created to support all kinds of functionality - resource description, access control, object integrity, etc. - and, wherever possible, preservation systems need to take advantage of this, preferably in an automated way. There is a need for tools that automatically generate some metadata, that can extract it from other schemas on ingest into a repository, and that can capture metadata about preservation processes enacted thereafter. Of course, there will be dependencies on the unique identification of objects and problems dealing with the complexity and granularity of others. Also, there is a possible need for registries of metadata and formats [3, 4] that can help manage information about the schemas themselves. However, I have no time to explore these issues any further in this short introduction.
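The third point, on automatic generation and capture, can be illustrated with a minimal sketch. It assumes a single file on local disk, and every field name in it (`identifier`, `size_bytes`, `sha256`, `format_guess`, `events`) is hypothetical: none of them is taken from the Cedars specification or the OAIS model. The format guess relies only on the file extension, which is far weaker than a lookup in the kind of format registry mentioned above.

```python
import hashlib
import mimetypes
import os
import tempfile
from datetime import datetime, timezone


def file_sha256(path):
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def generate_metadata(path):
    """Derive basic technical metadata from the file itself on ingest.

    All field names are illustrative assumptions, not a standard schema.
    """
    mime_type, _ = mimetypes.guess_type(path)  # extension-based guess only
    return {
        "identifier": os.path.basename(path),
        "size_bytes": os.path.getsize(path),
        "sha256": file_sha256(path),
        "format_guess": mime_type or "unknown",
        "events": [],
    }


def record_event(record, event_type, outcome):
    """Append a timestamped note about a preservation process."""
    record["events"].append({
        "type": event_type,
        "outcome": outcome,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return record


def check_fixity(path, record):
    """Recompute the checksum, compare it, and log the result as an event."""
    unchanged = file_sha256(path) == record["sha256"]
    record_event(record, "fixity_check", "pass" if unchanged else "fail")
    return unchanged


if __name__ == "__main__":
    # Demonstration on a throwaway file.
    with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
        tmp.write(b"example object")
    record = generate_metadata(tmp.name)
    record_event(record, "ingest", "pass")
    print(check_fixity(tmp.name, record), record)
    os.remove(tmp.name)
```

Nothing here requires manual cataloguing: the values are derived from the object, and each later process adds its own event entry. Metadata that only a person can supply, such as rights information, would still need another source.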
References
[1] Metadata for digital preservation: the Cedars project outline specification. Leeds: Cedars Project, 2000: http://www.leeds.ac.uk/cedars/metadata.html
[2] Reference model for an Open Archival Information System (OAIS). CCSDS 650.0-B-1 (2002): http://ssdoo.gsfc.nasa.gov/nost/isoas/
[3] Margaret Hedstrom, "Research challenges in digital archiving and long-term preservation." NSF Post Digital Library Futures Workshop, Chatham, Mass., USA, 15-17 June 2003: http://www.sis.pitt.edu/~dlwkshop/paper_hedstrom.html
[4] Stephen L. Abrams and David Seaman, "Towards a global digital format registry." 69th IFLA General Conference and Council, Berlin, Germany, 1-9 August 2003: http://www.ifla.org/IV/ifla69/papers/128e-Abrams_Seaman.pdf
Maintained by: Michael Day, UKOLN, University of Bath.
Last updated: 27-Aug-2003.