Metadata: an overview of current resource description practice
Work Package 3 of Telematics for Research project DESIRE (no. 1004)

Dublin Core

Note: see also the entry for Warwick Framework

Environment of use

Documentation

'Dublin Core' is shorthand for the Dublin Metadata Core Element Set, a core list of metadata elements agreed at the OCLC/NCSA Metadata Workshop in March 1995. The workshop report forms the documentation for the Dublin Core element set [1].

Constituency of use

The workshop was organised by OCLC and the National Center for Supercomputing Applications (NCSA) to progress development of a metadata record to describe networked electronic information. It followed on from joint meetings and discussions of the American Library Association. The workshop brought together a range of interested parties from different professional backgrounds and subject disciplines, all of whom had been involved with metadata issues. The motivation for progressing Dublin Core has been to reach a consensus among stakeholders on a minimal resource description which can be used for the benefit of all involved in the creation, search and retrieval of electronic resources. There has been high commitment and involvement from a range of professions (publishers, computer specialists, librarians and information workers) and sectors (library utilities, software producers, service providers, libraries).

The Dublin Core is positioned as a simple information resource description. Importantly, however, it also aims to provide a basis for semantic interoperability between other, probably more complicated, formats. A third target use is to provide the basis for resource-embedded description, initially with HTML documents.

Ease of creation

The objective of Dublin Core is to define a simple set of data elements so that authors and publishers of internet documents can create their own metadata records without extensive training. The Dublin Core approach places the level of bibliographic control midway between the detailed approaches of MARC and 'structured' TEI, and the automatic indexing of locator services such as Lycos. It is acknowledged that the Dublin Core is a minimal set, and that many 'publishers' or metadata producers may wish to augment this simple set with more specialised data.

Progress towards international standardisation

Initial attempts to include consideration of Dublin Core elements as part of an IETF working group were not taken forward, on the grounds that the content of metadata records is outside the scope of IETF standards. However, the Dublin Core elements have been considered by USMARC as central to their development of the USMARC record, so the impact has already been seen in the formation of other metadata.

Ambitions to actualise Dublin Core were carried forward by a second international workshop, which took place at the University of Warwick, UK, in April 1996, sponsored by UKOLN and OCLC. This workshop looked at the implementation of Dublin Core and the requirements for extensibility, change control and dissemination. The need for a registration agency was discussed at this meeting.

Format issues

Designation and encoding

The Dublin Core is a set of elements that can be used to describe a resource but there was initially no attempt to prescribe an encoding method or record structure. During the Dublin Core workshop there was an explicit decision taken not to define syntax at this stage.

However, certain principles were established for further development of the element set. Of particular relevance to encoding and designation are the principles of:

• extensibility: the core set can be extended with further elements to describe intrinsic data of particular relevance to a particular community

• optionality: all elements are optional

• repeatability: all elements are repeatable

• modifiability: any element can be modified by one or more qualifiers
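As a minimal sketch of what these four principles imply for a record structure, the following Python fragment models a description as an open list of elements. The class and method names (Element, Record, add, values, extensions) and the 'FundingBody' extension element are assumptions made for illustration only; the workshop itself defined no record structure or syntax.

```python
from dataclasses import dataclass, field

# The core element names listed in this review.
CORE_ELEMENTS = {
    "Title", "Author", "Publisher", "OtherAgent", "Date", "ObjectType",
    "Language", "Subject", "Coverage", "Identifier", "Form", "Source",
    "Relation",
}


@dataclass
class Element:
    name: str
    value: str
    # Modifiability: any element may carry one or more qualifiers.
    qualifiers: dict = field(default_factory=dict)


@dataclass
class Record:
    # Optionality: a record may hold any subset of elements, even none.
    elements: list = field(default_factory=list)

    def add(self, name, value, **qualifiers):
        # Repeatability: the same element name may be added many times.
        self.elements.append(Element(name, value, dict(qualifiers)))
        return self

    def values(self, name):
        return [e.value for e in self.elements if e.name == name]

    def extensions(self):
        # Extensibility: names outside the core set are accepted, not rejected.
        return [e for e in self.elements if e.name not in CORE_ELEMENTS]


record = Record()
record.add("Title", "Metadata: an overview of current resource description practice")
record.add("Subject", "metadata").add("Subject", "resource description")
record.add("OtherAgent", "Weibel, Stuart L.", role="editor")
record.add("FundingBody", "hypothetical community-specific element")
```

Keeping the elements in a flat list rather than a fixed set of fields is one way to honour optionality and repeatability at the same time; a real implementation might well choose differently.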

The sanctioning of qualifiers is of particular note as it is an attempt to bridge the gap between casual and sophisticated use. Qualifiers can be of two very different types: some indicate external schemes to be applied in processing, e.g. OtherAgent(scheme=TEI); others specify more precise information about the attribute, in effect sub-dividing the element name, e.g. OtherAgent(role=editor). If a scheme qualifier is used, the syntax of that scheme must be applied to the data in that element. So Author (scheme=USMARC) fields will contain data embedded with USMARC tags and sub-field markers, and OtherAgent (scheme=TEI) elements will contain data with TEI mark-up tags embedded. Widespread use of qualifiers could potentially cause severe problems with interoperability.

At the Warwick workshop a decision was taken to develop a concrete syntax for the Dublin Core in the form of an SGML DTD.

Content

Basic descriptive elements

The core element set includes the following bibliographic data elements:

• Title (name of the object)

• Author (person(s) primarily responsible for intellectual content)

• Publisher (agent or agency responsible for making the object available)

• OtherAgent (person(s) such as editors or transcribers, who have made other significant intellectual contributions to the work)

• Date (date of publication)

• ObjectType (genre of the object such as novel, poem, dictionary)

• Language (language of the intellectual content)

The Author element name does not distinguish the form of author (personal/corporate/meeting). Similarly, the OtherAgent element name does not express the precise role of the other agent. It would be possible to use qualifiers to make these more precise distinctions, but the Dublin Core documentation does not attempt to make comprehensive recommendations. Examples of suggested qualifiers are:

Author(scheme=USMARC)=100 1 Doyle, Conan $c Sir, $d 1859-1930

OtherAgent(role=editor)=Weibel,Stuart L.

As soon as such qualifiers are used the complexity of processing the data, and the difficulties for interoperability, will increase.
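To make the processing cost concrete, here is a hedged sketch of reading the informal notation used in the two examples above. The regular expression and the function name parse_element are assumptions for illustration; the notation was never specified as a formal syntax, so this is one possible reading of it rather than a definitive parser.

```python
import re

# Assumed shape: Name(qualifier=value,...)=data, with the qualifier part optional.
_QUALIFIED = re.compile(r"^\s*(\w+)\s*(?:\(([^)]*)\))?\s*=\s*(.*)$")


def parse_element(line):
    """Split one line into (element name, qualifiers dict, raw value)."""
    match = _QUALIFIED.match(line)
    if match is None:
        raise ValueError(f"not an element line: {line!r}")
    name, raw_qualifiers, value = match.groups()
    qualifiers = {}
    for pair in (raw_qualifiers or "").split(","):
        if pair.strip():
            key, _, val = pair.partition("=")
            qualifiers[key.strip()] = val.strip()
    return name, qualifiers, value.strip()


def needs_scheme_parser(qualifiers):
    # A scheme qualifier means the value follows an external syntax
    # (e.g. USMARC tags and sub-field markers) that plain text handling
    # cannot interpret.
    return "scheme" in qualifiers


author = parse_element("Author(scheme=USMARC)=100 1 Doyle, Conan $c Sir, $d 1859-1930")
editor = parse_element("OtherAgent(role=editor)=Weibel,Stuart L.")
```

The role-qualified value can be used as it stands, whereas the scheme-qualified value is only meaningful to software that also understands USMARC, which is where the interoperability difficulty arises.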

Subject description

The core element set includes the data elements:

• Subject (topic addressed by the work)

• Coverage (the spatial and temporal characteristics of the object)

The Subject element can be used for headings controlled by a known classification scheme indicated in the qualifier, or can contain free text. The Coverage element allows spatial or temporal data to be included for geospatial data. This data might be in unstructured form or in a format governed by a known scheme, e.g.

Coverage(type=spatial)=Atlantic ocean

Coverage(type=spatial,scheme=LATLONG)=West=180,East=180,North=90,South=90
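The difference between the two Coverage examples can be sketched as follows. The layout of a LATLONG value is inferred from the single example above and is therefore an assumption, as are the function names parse_latlong and read_coverage.

```python
def parse_latlong(value):
    """Turn 'West=..,East=..,North=..,South=..' into a dict of floats."""
    box = {}
    for part in value.split(","):
        key, sep, number = part.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {part!r}")
        box[key.strip().lower()] = float(number)
    missing = {"west", "east", "north", "south"} - box.keys()
    if missing:
        raise ValueError(f"missing bounds: {sorted(missing)}")
    return box


def read_coverage(value, qualifiers):
    # Only a scheme qualifier tells us the value is machine-readable;
    # otherwise it is treated as unstructured text.
    if qualifiers.get("scheme") == "LATLONG":
        return parse_latlong(value)
    return value


free_text = read_coverage("Atlantic ocean", {"type": "spatial"})
bounding_box = read_coverage(
    "West=180,East=180,North=90,South=90",
    {"type": "spatial", "scheme": "LATLONG"},
)
```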

URIs

The core element set includes the data element:

• Identifier (string or number used to uniquely identify the object)

The data in this element could be an identifier conforming to an internationally recognised scheme (e.g. URL, ISBN) or it could be a local, privately administered number (e.g. university technical report number). A qualifier naming the scheme would be needed to make the identifier generally useful.

Resource format and technical characteristics

The core element set includes the data element:

• Form (the data representation of the object such as PostScript file or Windows executable file)

A constraint on the design of the Dublin Core, accepted by the workshop participants, was that the aim of the element set is to describe 'document like objects' (DLOs).

Administrative metadata

No administrative data is included in the Dublin Core set. A principle of intrinsicality was established at the workshop which constrained the set to include only elements describing the intrinsic properties of the object. It would seem essential for any implementation of Dublin Core to include in a record such information as the record identification, record creation date, etc.
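One way an implementation might hold such administrative data alongside, but separate from, the Dublin Core description is sketched below. The class StoredRecord, its field names and the sample values are all hypothetical; nothing of this kind is defined in the Dublin Core documentation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class StoredRecord:
    record_id: str      # hypothetical administrative field
    created: date       # hypothetical administrative field
    description: dict   # Dublin Core elements only (intrinsic properties)


stored = StoredRecord(
    record_id="rec-0001",
    created=date(1996, 4, 1),
    description={"Title": ["OCLC/NCSA metadata workshop report"]},
)
```

Keeping the wrapper distinct from the description leaves the element set itself restricted to intrinsic properties.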

Provenance/source

The core element set includes the data element:

• Source (objects, either print or electronic, from which the resource is derived)

This element could be used to link different versions of an object which have the same intellectual content, whereas the Relation element would be used to link objects with a different intellectual content.

Host administrative details/Terms of availability/copyright

An agreed constraint on Dublin Core is that extrinsic data such as cost and details of access methods would be excluded from the element set. It was accepted that only elements for resource discovery would be included, not retrieval or request.

Other comments

At the Warwick Workshop it was decided that content-wise the Dublin Core should remain more or less as it was. It should not be indefinitely extended to encompass the variety of current and future metadata requirements which might be raised; rather, these should be addressed in a Warwick Framework type solution.

Ability to represent relationships between objects

The core element set includes the data element:

• Relation (relationship to other objects)

This element describes relationships to other objects with different intellectual content. It allows for a variety of relationships to be identified by use of the qualifier mechanism. Specification of a relationship would require use of at least two qualifiers, e.g.

Relation (type=ContainedIn) (identifier=URL) =http://www.ukoln.bath.ac.uk/metareview.html
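The requirement that a Relation carry at least two qualifiers can be illustrated with a small check on the example notation given for this element. The patterns and the function name parse_relation are assumptions, as is the choice to treat 'type' and 'identifier' as the two required qualifiers; the source gives only the single example.

```python
import re

# Assumed shape: Relation (q1=v1) (q2=v2) =target
_LINE = re.compile(r"^\s*Relation((?:\s*\([^)]*\))*)\s*=\s*(.*)$")
_GROUP = re.compile(r"\(\s*(\w+)\s*=\s*([^)]*?)\s*\)")


def parse_relation(line):
    match = _LINE.match(line)
    if match is None:
        raise ValueError(f"not a Relation line: {line!r}")
    qualifiers = dict(_GROUP.findall(match.group(1)))
    # One qualifier says what kind of relationship this is; the other
    # says how to interpret the string that names the related object.
    for required in ("type", "identifier"):
        if required not in qualifiers:
            raise ValueError(f"Relation needs a {required} qualifier")
    return {
        "type": qualifiers["type"],
        "identifier_scheme": qualifiers["identifier"],
        "target": match.group(2).strip(),
    }


relation = parse_relation(
    "Relation (type=ContainedIn) (identifier=URL) "
    "=http://www.ukoln.bath.ac.uk/metareview.html"
)
```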

Multi-lingual issues

The core element set includes the data element:

• Language (language of the intellectual content)

The problems of use of non-ASCII characters within the record were deliberately not addressed.

Fullness

The fullness of Dublin Core is low, by design. The attempt to compromise with sophisticated use by the qualifier mechanism could potentially lead to highly complex, much fuller records.

Conversion to other formats

MARBI Discussion Paper No 86 (Mapping the Dublin Core elements to USMARC) looks at options and problems in matching Dublin Core to USMARC. Because Dublin Core elements are less specific than MARC, some fields cannot be sufficiently identified to tag them correctly. For example the author field in MARC is identified as being personal or corporate name, whereas Dublin Core does not make this differentiation.
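The tagging problem can be sketched in a few lines. The tag choices below are illustrative only (245 for title, 260 for publication details, and 100/110/111 for personal, corporate and meeting name main entries) and do not reproduce the mapping proposed in the discussion paper; the 'form' argument stands in for a hypothetical qualifier that Dublin Core does not define.

```python
# Illustrative candidate USMARC tags for a few Dublin Core elements.
CANDIDATE_TAGS = {
    "Title": ["245"],
    "Author": ["100", "110", "111"],  # personal, corporate, meeting name
    "Publisher": ["260"],
}

# Hypothetical qualifier values that would resolve the Author ambiguity.
AUTHOR_FORMS = {"personal": "100", "corporate": "110", "meeting": "111"}


def candidate_tags(element, form=None):
    """Return every USMARC tag the element could plausibly map to."""
    if element == "Author" and form in AUTHOR_FORMS:
        return [AUTHOR_FORMS[form]]
    return list(CANDIDATE_TAGS.get(element, []))


def is_ambiguous(element, form=None):
    return len(candidate_tags(element, form)) > 1
```

An unqualified Author yields three candidates, so a converter must either guess or fall back on a less specific field; with extra information about the form of name the choice becomes unambiguous.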

Rules for construction of these elements

No rules have been formulated.

Protocol issues

Not yet applicable.

Implementations

There have been a few early implementations of Dublin Core.

National Document and Information Service: this is a joint project between the National Libraries of Australia and New Zealand. Within this project the Dublin Core elements have been used as the core search attributes for their records, in effect the intersection between their various databases. There has been flexibility in the use of semantics, with mapping of other 'search fields' to the Dublin Core set.
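The idea of using Dublin Core as the intersection between databases can be sketched as a crosswalk from local search fields to core elements. The database names, local field names and sample records below are entirely hypothetical and are not taken from the project described above.

```python
# Hypothetical crosswalks: local field name -> Dublin Core element.
CROSSWALKS = {
    "catalogue": {"main_title": "Title", "creator": "Author", "topic": "Subject"},
    "image_bank": {"caption": "Title", "photographer": "Author"},
}


def to_dublin_core(database, local_record):
    """Project a local record onto the shared Dublin Core elements."""
    crosswalk = CROSSWALKS[database]
    core = {}
    for local_field, value in local_record.items():
        element = crosswalk.get(local_field)
        if element is not None:  # fields with no mapping are not searchable
            core.setdefault(element, []).append(value)
    return core


def search(records, element, term):
    """Case-insensitive substring search on one Dublin Core element."""
    term = term.lower()
    return [
        r for r in records
        if any(term in v.lower() for v in r.get(element, []))
    ]


pooled = [
    to_dublin_core("catalogue", {"main_title": "Harbour surveys", "shelf": "A12"}),
    to_dublin_core("image_bank", {"caption": "Harbour at dawn", "photographer": "A. N. Other"}),
]
hits = search(pooled, "Title", "harbour")
```

A single query on Title then reaches both databases, at the cost of dropping any local field that has no counterpart in the core set.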

DSTC in Australia is using the Dublin Core in the Research Data Network Co-operative Research Centre project for resource discovery.

References

1. Stuart Weibel, Jean Miller, Ron Daniel. OCLC/NCSA metadata workshop report. OCLC, March 1995. <URL: http://www.oclc.org:5046/conferences/metadata/dublin-core-report.html>
