Collection Level Description
1. Introduction
This study reviews existing practice for providing collection
level descriptions in the library, archival, museum
and Internet communities.
It originated from discussions at MODELS workshops [MODELS], where the need for a review of different approaches to collection level description was identified, particularly in the context of phase 3 of the Electronic Libraries Programme [ELIB]. The study was taken forward by UKOLN as a MODELS recommendation.
At the simplest level one can think of a 'collection' as being
any aggregation of individual 'items' (also known as
objects or resources). Items
may be physical or digital. Physical items include books, journals,
museum artefacts, photographs, papers etc. Digital items include
Web pages, databases, images, etc.
In some cases digital items are surrogates of
physical items; in others the digital item is the primary (and only) manifestation of the item.
Some collections are
catalogues (metadata) for other collections. For example, a library
catalogue, which is itself a collection, typically describes the items
in one or more collections within a library.
Collections may be grouped by
type, by subject area, by geographic location of resources or
according to some other criteria.
Collections may be permanent or transient.
For example, a collection of Web resources may exist only long enough to
transfer information about the collection from one application to
another.
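The relationships described above can be sketched as a small data model. This is a minimal illustration only, not a schema proposed by this study: the class names, field names and sample titles (`Item`, `Collection`, `surrogate_of`, `describes`, `transient`, "Map room" and so on) are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Item:
    """An individual object; it may be physical or digital."""
    title: str
    digital: bool = False
    # For a digital surrogate, the physical item it stands in for.
    surrogate_of: Optional["Item"] = None


@dataclass
class Collection:
    """Any aggregation of items (or of other collections)."""
    name: str
    members: List[Union[Item, "Collection"]] = field(default_factory=list)
    # Set for catalogues: the collections whose items this one describes.
    describes: List["Collection"] = field(default_factory=list)
    # True for collections that exist only briefly, e.g. during a transfer.
    transient: bool = False

    def items(self) -> List[Item]:
        """Flatten nested collections into a single list of items."""
        found: List[Item] = []
        for member in self.members:
            if isinstance(member, Collection):
                found.extend(member.items())
            else:
                found.append(member)
        return found


book = Item("Printed atlas")
scan = Item("Scanned atlas", digital=True, surrogate_of=book)
web_page = Item("Project home page", digital=True)  # digital only, no physical original

maps = Collection("Map room", members=[book])
holdings = Collection("Library holdings", members=[maps, scan, web_page])

# A catalogue is itself a collection: its items are metadata records.
catalogue = Collection(
    "Library catalogue",
    members=[Item("Record: Printed atlas", digital=True)],
    describes=[holdings],
)

print([item.title for item in holdings.items()])
```

Modelling a catalogue as an ordinary collection with a `describes` link, rather than as a separate type, mirrors the observation that a catalogue is itself a collection.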
Section 2 of this study discusses in more detail
what the term 'collection' means, first from the
perspective of libraries, archives and museums and then
as the term has more recently come to be used on the
World Wide Web.
This study uses the following terminology:
- Collection
- An aggregation of resources.
Collections are exemplified by the following non-exhaustive list:
Internet catalogues (e.g. Yahoo);
subject gateways (e.g. SOSIG, OMNI, ADAM, EEVL, etc.);
library, museum and archival catalogues;
Web indexes (e.g. Alta Vista);
collections of text, images, sounds, datasets, software, other material or combinations of these (this includes databases, CD-ROMs and collections of Web resources);
collections of events (e.g. a lecture series);
library and museum collections;
archives.
A variety of mechanisms for
providing collection level descriptions are described in
section 3 of this study.
- Item
- An individual object, for example a Web page, an image file, an audio file,
or a movie.
Items are often referred to as
resources, objects, documents or document-like objects (DLOs).
The dividing line between collections and items is somewhat vague because
items may
themselves be collections of other objects.
For example, a Web page
may be a collection of text, images, applets, etc. However, because
the component parts are intended to be rendered together as a whole, a Web page
is typically treated as an item rather than a collection.
Description of individual objects
is well established within the curatorial traditions.
Consider, for example, the MARC records used in libraries to
describe books and journals.
On the Internet,
resource description is less well established, but a variety of
mechanisms either have been or are being developed, including GILS [GILS]
and the Dublin Core [DC].
These mechanisms for resource
description are only described in detail in this study insofar as
they may be used to provide collection level descriptions.
- Service
- An application level service and associated protocol.
Services provide the mechanism for end-users or,
in the case of digital resources,
end-users' client software, to gain access to
collections and their component items.
Services may be physical (for example, a library or museum service) or digital
(for example, a Z39.50 server).
Access to digital services is currently
based, in most cases, on the client-server model, though a move
towards distributed object models is likely in the future.
Digital service descriptions typically provide the client
with enough information to connect to the server. The information
they contain is dependent on the particular protocol in use but is
often as simple as a machine name, a port and a database name.
Section 4 of this report considers a variety of mechanisms that
have been, or currently are being, developed to describe
application level services.
- Service Provider
- An organisation or individual who manages and provides access
to resources or collections of resources. Describing organisations
and individuals is well understood and a variety of mechanisms for
providing directories, often called 'white-pages services', have
been developed, both within ISO (X.500 [X500]) and by the Internet
community (LDAP [LDAP], WHOIS++ [RFC1835]). This report does not discuss
service provider description in any detail.
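As noted under 'Service', a digital service description is often as simple as a machine name, a port and a database name. The following is a minimal sketch of such a description, assuming a hypothetical Z39.50 server. The class `ServiceDescription`, the helper `parse_address`, the `host:port/database` string form and the host, database and provider values are illustrative assumptions, not part of any of the mechanisms reviewed in this study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceDescription:
    """Enough information for a client to connect to a server."""
    host: str                       # machine name
    port: int                       # TCP port
    database: Optional[str] = None  # database name, if the protocol uses one
    provider: Optional[str] = None  # organisation or individual running it

    def address(self) -> str:
        """Render as host:port[/database] (an illustrative format only)."""
        base = f"{self.host}:{self.port}"
        return f"{base}/{self.database}" if self.database else base


def parse_address(text: str) -> ServiceDescription:
    """Parse the illustrative host:port[/database] form back into fields."""
    location, _, database = text.partition("/")
    host, _, port = location.partition(":")
    return ServiceDescription(host=host, port=int(port), database=database or None)


z_service = ServiceDescription(
    host="z3950.example.org",               # hypothetical machine name
    port=210,                               # port registered for Z39.50
    database="books",                       # hypothetical database name
    provider="Example University Library",  # hypothetical service provider
)

print(z_service.address())
```

Other protocols would need different fields, since the information a description contains depends on the protocol in use.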
Andy Powell, UKOLN