Preserving electronically-held data for posterity is a complex and challenging problem. The instability of the storage media, and the rapid obsolescence of the equipment needed to read the data, pose problems which can seem insoluble in the face of accelerating technological change and current financial constraints. For archival purposes, hopes centre on optical discs, which carry guarantees of up to 30 years' life, with one manufacturer claiming 100 years (ref20). The hardware and software needed to read these products, however, are usually unsupported within a decade.
Documents which are delivered on a physical medium will present conservation problems in respect of the substrate itself (a CD-ROM, for example, will presumably need to be copied to a fresh disc before it deteriorates beyond the point of readability - the same order of problem as that posed by acidic paper). With magnetic media, the signal on the substrate tends to attenuate fairly rapidly, and it is customary to rewrite records on magnetic tape annually, in order to preserve them uncorrupted. As well as attenuation of the record, magnetic media are vulnerable to corruption by magnetic and electrostatic fields, and to physical and chemical changes to the plastic substrate, attributable in some cases to bad storage. It might reasonably be expected that the suppliers of electronic publications should be able to give their customers some idea of the estimated or probable lifespan of their products. As some media are more durable than others, it is conceivable that materials destined for libraries could be supplied on the most durable medium, an electronic equivalent of "library binding".
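Routine re-copying of this kind only preserves a record if each new copy is checked against the one it replaces. The following is a minimal sketch of such a check, assuming the records are ordinary files on a filesystem rather than tape; the function names `sha256_of` and `refresh_copy` are illustrative and do not come from any archive's actual procedures.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_copy(old: Path, new: Path) -> str:
    """Copy a record to fresh storage and verify the copy bit for bit.

    Hypothetical helper: returns the digest of the verified copy, or
    raises OSError (after deleting the bad copy) if the digests differ.
    """
    expected = sha256_of(old)
    new.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(old, new)
    actual = sha256_of(new)
    if actual != expected:
        new.unlink()
        raise OSError(f"refresh of {old} failed verification")
    return actual
```

A comparison of this kind detects only corruption introduced during the copy. To detect deterioration of the old copy itself, the digest would need to be recorded when the record is first accessioned and compared at every subsequent refresh.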
One response to these problems is the "shell concept": stripping away software and machine dependence from a file, documenting it as necessary to preserve provenance and meaning, and converting it to a common character format (ref21), such as ASCII or Unicode, which will then require periodic refreshment by re-copying to the most stable medium currently available. Data archives which have developed since the 1960s, notably those in Denmark, Austria, the UK and the Netherlands, have traditionally adopted this approach (ref22). At the ESRC Data Archive, for example, all data is stored at some point as plain ASCII text, and care is taken to employ house standards that avoid any lock-in of data.
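The conversion step of the "shell concept" can be illustrated in a few lines. This is a sketch under stated assumptions, not a description of any archive's house standard: it assumes the source is a text file whose original character encoding is known, it uses UTF-8 as the common character format, and the function name `to_shell_format` and the layout of the provenance record are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def to_shell_format(source: Path, source_encoding: str, out_dir: Path) -> Path:
    """Convert a text file to UTF-8 and write a provenance record beside it.

    Hypothetical helper: returns the path of the converted file.
    """
    raw = source.read_bytes()
    # Fails loudly on bytes that are invalid in the declared encoding.
    text = raw.decode(source_encoding)

    out_dir.mkdir(parents=True, exist_ok=True)
    converted = text.encode("utf-8")
    target = out_dir / (source.stem + ".txt")
    target.write_bytes(converted)

    # Document what the file was, so that meaning and provenance survive
    # the loss of the original software and hardware.
    provenance = {
        "original_name": source.name,
        "original_encoding": source_encoding,
        "original_sha256": hashlib.sha256(raw).hexdigest(),
        "converted_sha256": hashlib.sha256(converted).hexdigest(),
    }
    sidecar = out_dir / (source.stem + ".provenance.json")
    sidecar.write_text(json.dumps(provenance, indent=2), encoding="utf-8")
    return target
```

Recording a digest of both the original and the converted bytes gives later refresh cycles a fixed point against which to check that the stored copy has not changed.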
The costs associated with this commitment to recopy are high, and there may be problems of copyright. Moreover, whilst such techniques may be feasible for straightforward textual and statistical data, in other cases severing the link between the software and the records renders the records themselves meaningless. "Archiving the computer" is not a practical option. Representative examples of obsolete equipment are already being collected and displayed by specialist museums, but the prospects of keeping this equipment in working order are poor; such ventures are necessarily limited in scope and location, and they cannot adequately address the problem of access.
For systems which display inter-dependence between software and data, standards need to be developed for the taking of meaningful "snapshots" for archival purposes, and system software developers need some incentive to meet such standards. Similar standards and procedures are required in respect of the many systems where the data held is transient. These include both electronic publications distributed by non-material means (networks and broadcasting), and also most office systems, which will otherwise bequeath no trace of their activity to the long-term future. Unless these standards and procedures for archival capture and sampling can be achieved, a mixed, and many would suggest inadequate, policy will be required, based on an acceptance that the only reliable way to preserve some electronic records is to make printouts on acid-free paper or silver halide microfilm (ref23).
For documents which are delivered over networks, and which may have no physical existence other than as files on a hard disc somewhere in the world, conservation reduces to a need to ensure that they continue to be available - not necessarily from the original host. This implies either the development of a set of "Rules of the Game" to be accepted by all who make documents available, or alternatively the development of a structure, possibly at national or regional levels, of collectors and archivers of electronic materials.
Even if standards were in place, the communications structure to enable the proper accessibility of archived data does not yet exist: it is hoped that the opening-up of SuperJANET to non-academic users will soon be a reality.
Whether or not electronically-held data has been published in any traditional or legal sense, the techniques required to ensure its indefinite preservation and availability in the future are broadly similar. In pointing out the absence of a national centre for data archiving it is not intended to obscure achievements to date. The ESRC Data Archive at the University of Essex is the largest British repository of accessible computer-readable data relating to social and economic affairs. It seems likely that others will follow: a recent report (ref24) recommends the creation of a humanities data archive along similar lines. There is clearly a need for a national archive of publications on CD-ROM.
It is hoped that research currently under way at the British Library into the longevity of magnetic media will continue. In recent years, this has linked with practical experiments at the Public Record Office, which is preparing to establish a computer-readable data archive (CRDA) with the intention of accessioning data from government departments by the end of 1995. There is, however, a clear need now for more focused cooperation in the development of data archiving in the UK, to prevent duplication of effort and resources, and to define and foster the adoption of much-needed standards and procedures for archival data capture and storage.