Title: DCMI Abstract Model
Creator: Andy Powell, UKOLN, University of Bath, UK
Creator: Mikael Nilsson, KMR Group, CID, NADA, KTH (Royal Institute of Technology), Sweden
Creator: Ambjörn Naeve, KMR Group, CID, NADA, KTH (Royal Institute of Technology), Sweden
Date Issued: 2003-12-18
Identifier:
Replaces:
Is Replaced By: Not applicable
Latest Version:
Status of Document: This is a DCMI Working Draft.
Description of Document: This document describes an abstract model for DCMI metadata descriptions.
This document specifies an abstract model for DCMI metadata descriptions [DCMI]. The primary purpose of this document is to provide a reference model against which particular DC encoding guidelines can be compared. To function well, a reference model needs to be independent of any particular encoding syntax. Such a reference model allows us to gain a better understanding of the kinds of descriptions that we are trying to encode and facilitates the development of better mappings and translations between different syntaxes.
The abstract model of the resources being described by DCMI metadata descriptions is as follows:
The abstract model of DCMI metadata descriptions is as follows:
The italicised terms used above are defined in the terminology section below. A number of things about the model are worth noting:
The DCMI abstract models for resources and descriptions are shown diagrammatically in figures 1 and 2.
Figure 1
Figure 2
The abstract model described above indicates that each property used in a description must be an attribute of the resource being described. This is commonly referred to as the 1:1 principle - the principle that a DCMI metadata description describes one, and only one, resource.
However, real-world metadata applications tend to be based on loosely grouped sets of descriptions (where the described resources are typically related in some way). Such sets are often referred to as metadata records. For example, a metadata record might comprise descriptions of both a painting and the artist. Furthermore, it is often the case that the record will also contain a description of the metadata record itself (sometimes referred to as 'admin metadata' or 'meta-metadata').
This document defines a DCMI metadata record as follows:
A DCMI metadata value is the physical or conceptual entity that is associated with a property when it is used to describe a resource. For example, the value of the DC Creator property is a person, organisation or service - a physical entity. The value of the DC Subject property is a concept - a conceptual entity. The value of the DC Date property is a point in time - a conceptual entity. The value of the DC Coverage property may be a geographic region or country - a physical entity. Each of these entities is a resource.
The value may be identified using a value URI; the value may be represented by one or more value strings and/or rich values; the value may have some related descriptions - but the value is a resource.
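The relationship between a value and the things that identify, represent or describe it can be sketched as a small data structure. This is only an illustration of the paragraph above: the class and field names are hypothetical, the `example.org` URI is invented, and the sketch should not be read as a normative rendering of the model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Description:
    """A description of exactly one resource (the 1:1 principle)."""
    resource_uri: Optional[str] = None
    statements: List["Statement"] = field(default_factory=list)


@dataclass
class Value:
    """The value is itself a resource. The fields below only identify,
    represent or describe it; none of them *is* the value."""
    value_uri: Optional[str] = None                      # identifies the value
    value_strings: List[str] = field(default_factory=list)  # represent the value
    rich_values: List[str] = field(default_factory=list)    # e.g. marked-up text
    related_descriptions: List[Description] = field(default_factory=list)


@dataclass
class Statement:
    """One property/value pair within a description."""
    property_uri: str
    value: Value


# Hypothetical example: a creator identified by a URI and also
# represented by a single value string.
creator = Value(
    value_uri="http://example.org/org/oxford",  # invented URI
    value_strings=["University of Oxford"],
)
description = Description(
    resource_uri="http://example.org/doc",      # invented URI
    statements=[Statement("http://purl.org/dc/elements/1.1/creator", creator)],
)
```

Keeping `related_descriptions` as a list of `Description` objects mirrors the idea that a description of the value is a separate description of a second resource, not part of the original one.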
The notions of simple and qualified DC are not defined by the abstract model. However, it is useful to consider how these two commonly used concepts can be described using the model:
The process of translating a qualified DC metadata record into a simple DC metadata record is normally referred to as 'dumbing-down'. The process of dumbing-down can be separated into two parts: property dumb-down and value dumb-down. Furthermore, each of these processes can be approached in one of two ways. Intelligent dumb-down takes place where the software performing the dumb-down algorithm has knowledge built into it about the property relationships and values being used within a specific DCMI metadata application. Dumb dumb-down takes place where the software performing the dumb-down algorithm has no prior knowledge about the properties and values being used.
Based on this analysis, it is possible to outline a simple 'dumb-down algorithm' matrix, shown below:
| | Element dumb-down | Value dumb-down |
| Dumb | ignore any property that isn't in the Dublin Core Metadata Element Set [DCMES] | use value URI (if present) or value string as new value string |
| Intelligent | recursively resolve sub-property relationships until one of the 15 properties in the Dublin Core Metadata Element Set [DCMES] is reached, otherwise ignore | use knowledge of the related descriptions or the value string to create a new value string |
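The 'Dumb' row of the matrix might be sketched as follows. This is a minimal illustration, not a reference implementation: the function name and the `(property_uri, value_uri, value_string)` tuple shape are assumptions made for the example, and the `example.org` URI is invented.

```python
DCMES_NS = "http://purl.org/dc/elements/1.1/"

# The 15 properties of the Dublin Core Metadata Element Set.
DCMES = {
    DCMES_NS + name
    for name in (
        "title", "creator", "subject", "description", "publisher",
        "contributor", "date", "type", "format", "identifier",
        "source", "language", "relation", "coverage", "rights",
    )
}


def dumb_dumb_down(statements):
    """Dumb element and value dumb-down.

    `statements` is an iterable of (property_uri, value_uri, value_string)
    tuples, where value_uri and value_string may each be None.
    Returns a list of (property_uri, new_value_string) pairs.
    """
    simple = []
    for property_uri, value_uri, value_string in statements:
        if property_uri not in DCMES:
            continue  # element dumb-down: ignore anything outside DCMES
        # value dumb-down: prefer the value URI, else use the value string
        new_value_string = value_uri if value_uri else value_string
        if new_value_string is not None:
            simple.append((property_uri, new_value_string))
    return simple


statements = [
    (DCMES_NS + "creator", None, "Andy Powell"),
    ("http://purl.org/dc/terms/temporal", None, "Cambrian period"),
    (DCMES_NS + "relation", "http://example.org/related", "A related document"),
]
simple_record = dumb_dumb_down(statements)
# The dcterms:temporal statement is dropped; the relation keeps only its URI.
```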
In all cases, the dumb-down algorithm should also:
Note that software should make use of the DCMI term declarations represented in RDF schema language [DC-RDFS] and the DC XML namespaces [DC-NAMESPACES] to automate the resolution of sub-property relationships.
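The intelligent element dumb-down described in the matrix above amounts to walking sub-property relationships upwards until a DCMES property is reached. The sketch below assumes, for simplicity, that each property has at most one super-property and that the relationships have already been loaded into a dictionary; in practice they would be read from the RDF schema declarations rather than written by hand. The function and variable names are hypothetical.

```python
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"

# A small subset of DCMES, enough for this illustration.
DCMES_SUBSET = {DC + "coverage", DC + "date"}

# Hand-written sub-property table (assumed to come from the schema).
SUB_PROPERTY_OF = {
    DCTERMS + "temporal": DC + "coverage",
    DCTERMS + "created": DC + "date",
}


def resolve_to_dcmes(property_uri, sub_property_of, dcmes):
    """Follow sub-property links until a DCMES property is reached.

    Returns the DCMES property URI, or None if the chain ends (or loops)
    without reaching one, in which case the statement is ignored.
    """
    seen = set()
    current = property_uri
    while current is not None and current not in seen:
        if current in dcmes:
            return current
        seen.add(current)  # guard against cyclic declarations
        current = sub_property_of.get(current)
    return None


resolved = resolve_to_dcmes(DCTERMS + "temporal", SUB_PROPERTY_OF, DCMES_SUBSET)
unknown = resolve_to_dcmes("http://example.org/terms/shoeSize", SUB_PROPERTY_OF, DCMES_SUBSET)
```

Here `resolved` is the URI of dc:coverage and `unknown` is `None`, so a statement using the invented `shoeSize` property would be dropped.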
Particular encoding guidelines (HTML meta tags, XML, RDF/XML, etc.) [DCMI-ENCODINGS] do not need to encode all aspects of the abstract model described above. However, DCMI recommendations that provide encoding guidelines should refer to the DCMI abstract model and indicate which parts of the model are encoded and which are not. In particular, encoding guidelines should indicate whether any rich values or related descriptions associated with a statement are embedded within the record or are encoded in a separate record and linked to it using the record URI.
Appendices B, C and D below provide a summary comparison between the abstract model and the RDF/XML, XML and XHTML encoding guidelines.
This document uses the following terms:
Thanks to Pete Johnston, Mikael Nilsson, the members of the DC Usage Board and the members of the DC Architecture Working Group for their comments on previous versions of this document.
This appendix discusses 'structured values', as they are used in DC metadata applications at the time of writing.
Many existing applications of DC metadata have attempted to encode relatively complex descriptions (i.e. descriptions that contain more than simply a property and its string value). These attempts have been loosely referred to as 'structured values'. It is possible to identify a number of different kinds of structured values. Four are enumerated below. The first two of these are recommended by the DCMI, in the sense that there are a number of existing encoding schemes that define values that conform to these definitions of structured values. The latter two are not currently recommended, but it is likely that they are in fairly common usage across metadata applications worldwide.
These are strings that contain explicitly labelled components. Examples of this kind of structured value include:
<meta name="dcterms:temporal" scheme="dcterms:Period" content="start=Cambrian period; scheme=Geological timescale; name=Phanerozoic Eon;" />
<meta name="dc:creator" content="BEGIN:VCARD\nORG:University of Oxford\nEND:VCARD\n" />
Note that vCard is not currently a DCMI recommended encoding scheme.
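To illustrate what 'explicitly labelled components' means in practice, the content of the first example above can be split into label/value pairs. This is a deliberately naive sketch: the function name is hypothetical, and it ignores any escaping rules that a full labelled-string syntax may define for ';' and '=' inside values.

```python
def parse_labelled_string(content):
    """Split 'label=value; label=value;' into a dictionary.

    Naive: assumes ';' and '=' never appear inside a label or value.
    """
    components = {}
    for part in content.split(";"):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after a trailing ';'
        label, separator, value = part.partition("=")
        if separator:
            components[label.strip()] = value.strip()
    return components


period = parse_labelled_string(
    "start=Cambrian period; scheme=Geological timescale; name=Phanerozoic Eon;"
)
# period["start"] is "Cambrian period"
```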
These are strings that contain implicit components within the string, i.e. the components are determined based solely on their position within the string. Examples of this kind of structured value include:
<meta name="dc:date" scheme="dcterms:W3CDTF" content="2003-06-10" />
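By contrast with labelled strings, the components of the date above are recovered purely from their position. A minimal sketch follows; it handles only the date-only forms (YYYY, YYYY-MM, YYYY-MM-DD), does not check that the month and day are in range, and uses a hypothetical function name.

```python
import re

# Year, optionally followed by month, optionally followed by day.
DATE_ONLY = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def parse_date_components(content):
    """Return year/month/day determined by position, or None if no match."""
    match = DATE_ONLY.match(content)
    if match is None:
        return None
    year, month, day = match.groups()
    return {
        "year": int(year),
        "month": int(month) if month else None,
        "day": int(day) if day else None,
    }


issued = parse_date_components("2003-06-10")
# issued["year"] is 2003, issued["month"] is 6, issued["day"] is 10
```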
These are strings containing 'presentational' or other markup, for example adding paragraph breaks, superscripts or chemical/mathematical markup to a dc:description. It is possible to characterise various kinds of markup as follows:
These are metadata descriptions that describe a second resource (i.e. not the resource being described by the DC description). For example, a related description associated with the value of dc:creator could contain a complete description of the resource author (including birthday, eye-colour and favourite beverage if desired!).
In the past, 'related resource descriptions' have tended to be encoded using XML, vCard (see above) or by inventing multiple 'refinements' of DCMES properties (for example DC.Creator.Address). The RDF/XML encoding of DC (see below) provides us with a more thorough modelling of related metadata records through the use of multiple linked nodes in an RDF graph.
In DC metadata records, the following properties (and their element refinements) are used to provide the name or identifier of a second resource that is related to the resource being described:
In the case of the first three, this is typically done by providing the name (or in some cases the name and a small amount of additional information in order to better identify the person or organisation) of the related resource as the value string.
In the case of the last two, this is typically done by providing the URI (or some other identifier) of the related resource as the value URI. However, where no identifier is available, the name of the related resource can be provided instead (or as well) using the value string.
It should be noted that the value strings of these properties (and their element refinements) are not intended to be used to provide full descriptions of the related resource.
The categories outlined above are not watertight and there are certainly overlaps between them. For example, labelled strings can be viewed as a type of non-XML markup language. In addition, there will be cases where marked-up text (e.g. MathML) can be viewed as a related resource description.
Nevertheless, the purpose of the categorisation used here is to analyse existing usage of complex metadata structures within current DC metadata applications. In the context of the abstract model proposed here, all the types of structured values outlined above form part of the DCMI abstract model:
This appendix discusses the relationship between the DCMI abstract model and the Resource Description Framework (RDF).
RDF currently provides DCMI with the richest encoding environment of the available encoding syntaxes. It is therefore worth taking a brief look at how the abstract model described here compares with the RDF model.
Note that the intention here is not to provide a full and detailed description of how to encode DC metadata records in RDF. Instead, three simple examples of the use of DC in RDF are considered.
Figure 2 shows a simple RDF graph (and the RDF/XML document that represents it). The graph shows a resource with a single property (dc:creator). The value of the property is a second (blank) node, representing the creator of the resource. This second blank node has several properties, used to describe the creator, and an rdfs:label property that is used to provide the value string for the dc:creator property.
Figure 3 shows the same information separated into two graphs. In this case the related description that describes the creator has been more clearly separated from the description of the resource by moving it into a separate RDF/XML document. In order to do this, the node representing the value has been assigned a value URI, allowing the two nodes in the two RDF/XML documents to be treated as representing the same thing. The related description in the second RDF/XML document is linked to the first using the rdfs:seeAlso property and the URI of the RDF/XML document. Note that it is not strictly necessary to separate the two graphs in this way; it is perfectly valid to represent the second graph as a sub-graph of the first, as shown in figure 2. However, for the purposes of this document, the two graphs have been separated in order to differentiate the description from the related description more clearly. In some cases it will be good practice to facilitate this separation anyway, for example in order to serve the second graph from a directory service of some kind.
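The effect of giving the value node a URI can be illustrated without any RDF tooling by treating each graph as a set of (subject, predicate, object) triples. Everything below is an assumption made for illustration: the `example.org` URIs, the affiliation property and the placement of rdfs:seeAlso on the value node are invented and are not taken from the figures, and literals and URIs are both shown as plain strings for brevity.

```python
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

RESOURCE = "http://example.org/doc"                    # invented
CREATOR = "http://example.org/people/1"                # invented value URI
CREATOR_DOCUMENT = "http://example.org/people/1.rdf"   # invented document URI

# First document: the description of the resource.
description = {
    (RESOURCE, DC_CREATOR, CREATOR),
    (CREATOR, RDFS_LABEL, "Example Person"),
    (CREATOR, RDFS_SEE_ALSO, CREATOR_DOCUMENT),
}

# Second document: the related description of the creator.
related_description = {
    (CREATOR, "http://example.org/terms/affiliation", "Example University"),
}

# Because both documents use the same value URI, a plain set union
# joins the two descriptions at the creator node.
merged = description | related_description


def properties_of(graph, subject):
    """Return the (predicate, object) pairs attached to one node."""
    return {(p, o) for s, p, o in graph if s == subject}


creator_properties = properties_of(merged, CREATOR)
```

With a blank node in place of `CREATOR`, the two documents would have no shared identifier to join on, which is why the value URI is needed for the separation.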
Figure 4 shows a second simple RDF graph (and the RDF/XML document that represents it). The graph shows a resource with a single property (dc:subject). The value of the property is a second (blank) node, representing the subject of the resource. This second blank node has an rdfs:label property that is used to provide the value string for the dc:subject property, an rdf:value property that is used to provide the classification scheme notation and an rdf:type property to provide the encoding scheme URI.
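The way the three properties on the blank node map onto the abstract model can be sketched with plain triples. The label, notation and scheme URI below are invented placeholders, not the contents of the figure, and the function name is hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"

# "_:subject" stands in for the blank node; all example data is invented.
graph = [
    ("http://example.org/doc", DC_SUBJECT, "_:subject"),
    ("_:subject", RDFS_LABEL, "Example topic"),
    ("_:subject", RDF_VALUE, "000.0"),
    ("_:subject", RDF_TYPE, "http://example.org/schemes/ExampleScheme"),
]


def read_value(graph, resource, property_uri):
    """Collect the abstract-model aspects of one property's value node."""
    for subject, predicate, node in graph:
        if subject == resource and predicate == property_uri:
            about = {p: o for s, p, o in graph if s == node}
            return {
                "value_string": about.get(RDFS_LABEL),
                "notation": about.get(RDF_VALUE),
                "encoding_scheme_uri": about.get(RDF_TYPE),
            }
    return None


subject_value = read_value(graph, "http://example.org/doc", DC_SUBJECT)
```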
Figure 5 shows the same information separated into two graphs. In this case the related description that describes the subject has been more clearly separated from the description of the resource by moving it into a separate RDF/XML document. In order to do this, the node representing the value has been assigned a value URI, allowing the two nodes in the two RDF/XML documents to be treated as representing the same thing. The related description in the second RDF/XML document is linked to the first using the rdfs:seeAlso property and the URI of the RDF/XML document. Note that it is not strictly necessary to separate the two graphs in this way; it is perfectly valid to represent the second graph as a sub-graph of the first, as shown in figure 4. However, for the purposes of this document, the two graphs have been separated in order to differentiate the description from the related description more clearly. In some cases it will be good practice to facilitate this separation anyway, for example in order to serve the second graph from a terminology service of some kind.
Figure 6 shows a third simple RDF graph (and the RDF/XML document that represents it). The graph shows a resource with a single property (dc:description). The value of the property is a second (blank) node with an rdfs:label property that is used to provide the value string for the dc:description property. The second node also has an rdfs:seeAlso property that links to a rich value - in this case some HTML marked-up text that provides a richer representation of the description. Note that it is possible to embed the marked-up text within a single RDF graph (using rdf:parseType="Literal"). However, this is not shown here.
By re-visiting the second figure from example 2 (figure 5) it is possible to layer the terminology used in the abstract models above over the RDF graph.
This appendix compares the DCMI abstract model with the Guidelines for implementing Dublin Core in XML DCMI recommendation.
Figure 9
Figure 8 shows an example simple DC description encoded according to the XML guidelines above. The example shows how the encoding supports the property (and property URI), value string and value string language aspects of the DCMI abstract model. It should be noted that all the values that are encoded in this syntax are represented by value strings, even those that look, to the human reader, as though they are URIs.
Figure 10
Figure 9 shows an example qualified DC description encoded according to the XML guidelines above. This example shows how the encoding supports the property (and property URI), value string, value string language, value URI, encoding scheme (and encoding scheme URI) and resource class aspects of the DCMI abstract model. Note that the 'dcterms:URI' encoding scheme is used to indicate that the content of the XML element is a value URI. Note also that, although the resource class is indicated, the class URI is not encoded anywhere in this description.
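How a consumer might tell a value URI from a value string in such a record can be sketched with the Python standard library. The record below is invented for illustration: the `record` wrapper element and the `example.org` URI are hypothetical, and the sketch assumes that the encoding scheme is carried in an `xsi:type` attribute and the language in `xml:lang`.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes namespaced attributes in "{namespace}local" form.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

RECORD = """\
<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:dcterms="http://purl.org/dc/terms/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:title xml:lang="en">Example title</dc:title>
  <dc:relation xsi:type="dcterms:URI">http://example.org/related</dc:relation>
</record>
"""


def read_statements(xml_text):
    """Map each child element onto aspects of the abstract model."""
    statements = []
    for element in ET.fromstring(xml_text):
        # "{namespace}local" -> "namespacelocal", i.e. the property URI
        property_uri = element.tag[1:].replace("}", "", 1)
        scheme = element.get(XSI_TYPE)
        text = (element.text or "").strip()
        is_uri = scheme == "dcterms:URI"
        statements.append({
            "property_uri": property_uri,
            "language": element.get(XML_LANG),
            "encoding_scheme": scheme,
            "value_uri": text if is_uri else None,
            "value_string": None if is_uri else text,
        })
    return statements


parsed = read_statements(RECORD)
```

Without the `dcterms:URI` scheme, the content of `dc:relation` would be treated as a value string even though it looks like a URI to a human reader.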
The following aspects of the DCMI abstract model are supported by the Guidelines for implementing Dublin Core in XML recommendation:
The following aspects of the DCMI abstract model are not supported:
The following constraints apply:
This appendix compares the DCMI abstract model with the Expressing Dublin Core in HTML/XHTML meta and link elements DCMI proposed recommendation.
Figure 11
Figure 10 shows an example simple DC description encoded according to the XHTML guidelines above. This example shows how the encoding supports the property (and property URI), value string and value string language aspects of the DCMI abstract model. Again, it should be noted that all the values represented in this encoding syntax are denoted by value strings, even those that look, to the human reader, as though they are URIs.
Figure 12
Figure 11 shows an example qualified DC description encoded according to the XHTML guidelines above. This example shows how the encoding supports the property (and property URI), value string, value string language, value URI, encoding scheme (and encoding scheme URI) and resource class aspects of the DCMI abstract model. Again, note that the 'dcterms:URI' encoding scheme is used to indicate that the content of the XHTML <meta> element is a value URI and that, although the resource class is indicated, the class URI is not encoded anywhere in this description.
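A similar reading of `<meta>` elements can be sketched with the standard library HTML parser. The class name is hypothetical, the second `<meta>` element and its `example.org` URI are invented, and the prefix style of the `name` and `scheme` attributes simply follows the examples used earlier in this document.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect <meta> elements and map them onto abstract-model aspects."""

    def __init__(self):
        super().__init__()
        self.statements = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name is None:
            return
        scheme = attributes.get("scheme")
        content = attributes.get("content")
        is_uri = scheme == "dcterms:URI"
        self.statements.append({
            "property": name,
            "encoding_scheme": scheme,
            "value_uri": content if is_uri else None,
            "value_string": None if is_uri else content,
        })


PAGE = """
<meta name="dc:date" scheme="dcterms:W3CDTF" content="2003-06-10" />
<meta name="dc:relation" scheme="dcterms:URI" content="http://example.org/related" />
"""

meta_parser = DCMetaParser()
meta_parser.feed(PAGE)
meta_statements = meta_parser.statements
```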
The following aspects of the DCMI abstract model are supported by the Expressing Dublin Core in HTML/XHTML meta and link elements proposed recommendation:
The following aspects of the DCMI abstract model are not supported:
The following constraints apply: