Ad-hoc Committee: Timothy Cole (UIUC), Thomas Habing (UIUC), Diane Hillmann (Cornell), Jane Hunter (DSTC), Pete Johnston (UKOLN), Carl Lagoze (Cornell), Andy Powell (UKOLN)
Document Date: 2003-02-28
This document is a follow-on to three efforts:
Publication of the DCMI Usage Board's consolidated description of all current DCMI terms at http://dublincore.org/usage/terms/dc/current-elements/ and http://dublincore.org/usage/terms/dc/current-schemes/.
Publication of the DCMI proposed recommendation "Guidelines for Implementing Dublin Core in XML" at http://dublincore.org/documents/dc-xml-guidelines/.
Joint work between the Open Archives Initiative and the DCMI to define an XML schema for unqualified Dublin Core, available at http://www.openarchives.org/OAI/2.0/oai_dc.xsd. This work was motivated by the requirements of the base metadata format for the OAI Protocol for Metadata Harvesting, but is useful for other applications that exchange unqualified Dublin Core records.
The schemas presented in this document conform to the W3C XML Schema (1.0) recommendations. They are suggested rather than prescribed and may, in fact, co-exist with other schemas for exchanging Dublin Core metadata. XML schemas are interoperability vehicles: the greater the number of applications that agree on a single schema, the easier it is to share Dublin Core metadata. Therefore, while the committee that formulated this proposal hopes that the proposed schemas will be useful to a broad range of applications, we recognize that some applications may require different functionality, provided by different schemas.
Although the schemas presented here are only suggested, the functionality they support is congruent with the qualification model in the Dublin Core Qualifiers document. Applications that employ other schemas to express additional functionality should therefore recognize that doing so compromises interoperability with applications that use these schemas.
The proposed set of schemas meets the following requirements:
Three distinct namespaces and corresponding schemas. The proposal includes three separate schemas that correspond to the three namespaces defined by the Namespace Policy for the Dublin Core Metadata Initiative recommendation at http://dublincore.org/documents/2001/10/26/dcmi-namespace/. These are:
http://purl.org/dc/elements/1.1/, for the 15 Dublin Core elements as defined in http://dublincore.org/documents/dces/.
Schema: dc.xsd
http://purl.org/dc/terms/, for additional elements, element refinements, and encoding schemes, defined in http://dublincore.org/usage/terms/dc/current-elements/ and http://dublincore.org/usage/terms/dc/current-schemes/.
Schema: dcterms.xsd
http://purl.org/dc/dcmitype/, for the DCMI type vocabulary defined in http://dublincore.org/usage/terms/dcmitype/.
Schema: dcmitype.xsd
Restricted to Dublin Core elements and their refinements. The three schemas for the DCMI namespaces declare XML elements to represent the Dublin Core elements and their refinements. The container schemas provided here restrict the elements in a valid instance document to the 15 Dublin Core elements as defined in http://dublincore.org/documents/dces/, additional elements approved by the DCMI Usage Board (e.g., "audience"), and the element refinements as defined in http://dublincore.org/usage/terms/dc/current-elements/. This means that application profiles that mix in elements from other namespaces or metadata vocabularies are not valid according to these container schemas. An application profile schema may import the base schemas proposed here and use them in association with schemas for other non-DCMI namespaces. However, implementers adopting that approach should consider the implications for interoperability with applications based on the container schemas, which specify that only Dublin Core elements and element refinements are valid.
Restricted to "simple literal" values. According to the schemas, the XML elements that represent Dublin Core elements may only have simple "string" values (which may be further restricted in the manner described below), defined via the type SimpleLiteral in the schema. SimpleLiteral is defined to allow the "lang" attribute from the XML namespace, thereby expressing the language of the string that is the element value. Complex values, i.e., additional elements nested within the XML elements that represent Dublin Core elements, are not valid. Various discussions in the DCMI have alluded to such "structured values" for elements, but no consistent data model yet exists. By exploiting features of the XML Schema specification, the proposed schemas are designed so that they can be imported into an extension schema that does allow additional nested elements as values for the Dublin Core elements. We note, however, that instance documents produced by applications that employ such extensions will not be valid according to the schemas proposed here and will therefore not be interoperable except through translation methods (not yet defined here or by the DCMI).
Element refinements are expressed as XML elements. Therefore, within instance documents there is no explicit association between an element refinement (e.g., "alternative") and its base element (e.g., "title"). The association is established at the schema level through an XML Schema feature called substitution groups: the "substitutionGroup" attribute declares an element to be substitutable for another element. For example, the schemas define an XML element "alternative" with a "substitutionGroup" attribute whose value is the namespace-qualified name of the "title" element, thereby establishing the linkage between the element refinement and its base element. The substitution group mechanism is essentially structural: in the general case, its use in XML Schema says nothing about any "semantic" relationship between the elements. In this case, however, we use the substitution group relation to represent the semantic relationship between element and element refinement.
Encoding schemes are expressed as new complexTypes. Each complexType is defined as a "restriction" on the base "SimpleLiteral", which is the default type for all elements and their element refinements. This makes it possible to constrain the string value of an element appropriately; e.g., a type "DCMIType" is defined whose value set is restricted to that defined by the DCMI type vocabulary at http://dublincore.org/usage/terms/dcmitype/. The implication for instance documents is that an "xsi:type" attribute must be specified on an element to indicate that its value conforms to one of the encoding schemes. Due to limitations of the XML Schema specification, it is impossible to restrict encoding schemes to specific elements (e.g., to specify that the type "LCSH" can only be used as the encoding scheme for the value of the "subject" element).
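To make these requirements concrete, the following is a minimal sketch of an instance document. It assumes the qualifieddc container element from the container schema described later in this document; the title, subject, and other values are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<qualifieddc xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:dcterms="http://purl.org/dc/terms/"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- A simple literal value with an optional language tag. -->
  <dc:title xml:lang="en">An Example Report</dc:title>

  <!-- An element refinement is an ordinary XML element; its link to
       dc:title exists only in the schema, not in this document. -->
  <dcterms:alternative xml:lang="en">Example Report, Short Title</dcterms:alternative>

  <!-- An encoding scheme is selected with xsi:type. -->
  <dc:type xsi:type="dcterms:DCMIType">Text</dc:type>
  <dc:subject xsi:type="dcterms:LCSH">Metadata</dc:subject>
</qualifieddc>
```

Note that nothing in the instance prevents xsi:type="dcterms:LCSH" from being placed on an element other than dc:subject; as stated above, the schemas cannot tie an encoding scheme to a particular element.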
Base schemas
These three schemas declare XML elements to represent the Dublin Core elements and element refinements and a number of complexTypes to represent encoding schemes:
Schema: dc.xsd
Target XML Namespace: http://purl.org/dc/elements/1.1/
Schema: dcterms.xsd
Target XML Namespace: http://purl.org/dc/terms/
Schema: dcmitype.xsd
Target XML Namespace: http://purl.org/dc/dcmitype/
Container schemas
These schemas declare XML elements to act as containers for specified subsets of the Dublin Core elements and element refinements declared in the base schemas:
simpledc.xsd
Target XML Namespace: none
qualifieddc.xsd
Target XML Namespace: none
Sample application schemas
These schemas provide examples of how a container schema might be used in an application:
appsimpledc.xsd
Target XML Namespace: [decided by application]
appqualifieddc.xsd
Target XML Namespace: [decided by application]
http://purl.org/dc/elements/1.1/
The schema dc.xsd defines a complexType called SimpleLiteral:
<xs:complexType name="SimpleLiteral">
  <xs:complexContent mixed="true">
    <xs:restriction base="xs:anyType">
      <xs:sequence>
        <xs:any processContents="lax" minOccurs="0" maxOccurs="0"/>
      </xs:sequence>
      <xs:attribute ref="xml:lang" use="optional"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>
The SimpleLiteral complexType is defined in terms of mixed complexContent. However, the cardinality attributes on the xs:any element dictate that this complexType does not permit child elements.
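As a minimal sketch of what this restriction means in practice, the fragment below contrasts content that SimpleLiteral accepts with content that it rejects. The simpledc container element is the one declared in the container schema described later in this document. The ex prefix, its namespace URI, and the names and values shown are hypothetical and appear only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<simpledc xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:ex="http://example.org/hypothetical/">
  <!-- Accepted: character content with an optional xml:lang attribute. -->
  <dc:creator xml:lang="en">Doe, Jane</dc:creator>

  <!-- Rejected: maxOccurs="0" on xs:any leaves no room for child elements,
       so this "structured value" makes the whole document invalid. -->
  <dc:creator>
    <ex:name>Doe, Jane</ex:name>
    <ex:affiliation>Example University</ex:affiliation>
  </dc:creator>
</simpledc>
```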
The fifteen Dublin Core elements in this namespace are represented as XML elements. The schema declares an abstract element any with a type of SimpleLiteral. Because it is declared as abstract, this element cannot be used in an instance document. Each XML element representing a Dublin Core element is declared as a non-abstract element which is substitutable for the any element, e.g.:
<xs:element name="title" substitutionGroup="any"/>
Finally, the schema defines a group elementsGroup and a complexType elementContainer. Together with the dc:any element, these two constructs provide mechanisms by which external schemas can reference the set of elements declared in this schema without referencing each element individually, though it is still possible for an external schema to reference individual elements if desired.
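The following sketch shows how these pieces might fit together in a much reduced schema. It is not a copy of dc.xsd: only two of the fifteen elements are declared, and the content models shown for elementsGroup and elementContainer are assumptions about one reasonable way to declare them. The schemaLocation used to obtain the xml:lang declaration is likewise an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://purl.org/dc/elements/1.1/"
           targetNamespace="http://purl.org/dc/elements/1.1/"
           elementFormDefault="qualified">

  <!-- Makes the xml:lang attribute declaration available (assumed location). -->
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"
             schemaLocation="http://www.w3.org/2001/xml.xsd"/>

  <!-- Character content plus optional xml:lang; no child elements. -->
  <xs:complexType name="SimpleLiteral">
    <xs:complexContent mixed="true">
      <xs:restriction base="xs:anyType">
        <xs:sequence>
          <xs:any processContents="lax" minOccurs="0" maxOccurs="0"/>
        </xs:sequence>
        <xs:attribute ref="xml:lang" use="optional"/>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

  <!-- Abstract head of the substitution group; never appears in instances. -->
  <xs:element name="any" type="SimpleLiteral" abstract="true"/>

  <!-- Two of the fifteen elements; with no type attribute of their own,
       they take the type of the head element, SimpleLiteral. -->
  <xs:element name="title" substitutionGroup="any"/>
  <xs:element name="creator" substitutionGroup="any"/>

  <!-- Assumed shape: zero or more substitutable elements, in any order. -->
  <xs:group name="elementsGroup">
    <xs:sequence>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="any"/>
      </xs:choice>
    </xs:sequence>
  </xs:group>

  <xs:complexType name="elementContainer">
    <xs:choice>
      <xs:group ref="elementsGroup"/>
    </xs:choice>
  </xs:complexType>
</xs:schema>
```

With declarations of this kind, an external schema needs to name only the container type, the group, or the abstract element in order to admit every element that is substitutable for any.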
For example, a schema can simply import the dc.xsd schema and use the elementContainer complexType as the type of an element, which makes the DC elements available as child elements:
<xs:import namespace="http://purl.org/dc/elements/1.1/" schemaLocation="dc.xsd"/>
<xs:element name="simpledc" type="dc:elementContainer"/>
An example of such a schema is provided as simpledc.xsd.
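A complete container schema along these lines could be as small as the sketch below. The surrounding xs:schema element and its attributes are assumptions; the published simpledc.xsd should be consulted for the actual declarations.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- No targetNamespace: the container element belongs to no namespace. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           elementFormDefault="qualified">

  <!-- Brings in the DC element declarations and the elementContainer type. -->
  <xs:import namespace="http://purl.org/dc/elements/1.1/"
             schemaLocation="dc.xsd"/>

  <!-- The container whose children are the Dublin Core elements. -->
  <xs:element name="simpledc" type="dc:elementContainer"/>
</xs:schema>
```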
The simpledc.xsd schema does not use a targetNamespace. It is possible to validate an instance directly against this schema. Where an application wishes to specify a namespace for the container element, it can be assigned when this schema is included in an application schema.
An example of such an application schema is provided as appsimpledc.xsd.
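One way to assign a namespace, sketched here under assumptions, is to include the no-namespace container schema from a schema that declares a targetNamespace; the included declarations then take on that namespace. The namespace URI http://example.org/myapp/ is hypothetical, and the sketch is not a copy of appsimpledc.xsd.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://example.org/myapp/"
           targetNamespace="http://example.org/myapp/"
           elementFormDefault="qualified">

  <!-- simpledc.xsd has no targetNamespace, so its simpledc element is
       placed in the including schema's namespace (hypothetical URI). -->
  <xs:include schemaLocation="simpledc.xsd"/>
</xs:schema>
```

An instance validated against such a schema would then use the application namespace on the container element, while the child elements remain in the Dublin Core namespace.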
An example of an instance document which validates against that application schema is provided as testsimpledc.xml.
An example of an instance document which fails to validate against that application schema is provided as testsimpledc2.xml. (dcterms:modified is not permitted.)
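For illustration, an instance along the following lines would be expected to fail validation against a simple application schema. The namespace http://example.org/myapp/ for the container element is a hypothetical choice and the values are invented; the document is a sketch, not a copy of testsimpledc2.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<simpledc xmlns="http://example.org/myapp/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:dcterms="http://purl.org/dc/terms/">
  <!-- Permitted: elements from the http://purl.org/dc/elements/1.1/ namespace. -->
  <dc:title>An Example Report</dc:title>
  <dc:date>2002-07-09</dc:date>

  <!-- Not permitted in the simple container: an element refinement
       from the http://purl.org/dc/terms/ namespace. -->
  <dcterms:modified>2002-07-10</dcterms:modified>
</simpledc>
```

Removing the dcterms:modified element leaves a document of the kind that the simple container is intended to accept.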
http://purl.org/dc/terms/
The schema dcterms.xsd imports the schema dc.xsd. The Dublin Core elements and element refinements in this namespace are all represented as XML elements, and importing the dc.xsd schema makes the any abstract element and the SimpleLiteral complexType available for use. Importing the dc.xsd schema also enables the indication of relationships between DC element refinements and the elements that they refine, using substitutionGroups.
An XML element which represents a DC element in this namespace is declared as substitutable for the any abstract element:
<xs:element name="audience" substitutionGroup="dc:any"/>
And an XML element which represents a DC element refinement is declared as substitutable for the element it refines:
<xs:element name="alternative" substitutionGroup="dc:title"/>
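Placed in context, the two declarations above might sit in a schema skeleton like the following. This is a reduced sketch rather than the published dcterms.xsd: the attributes on xs:schema are assumptions, and only the two elements already shown are declared.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns="http://purl.org/dc/terms/"
           targetNamespace="http://purl.org/dc/terms/"
           elementFormDefault="qualified">

  <!-- The import makes dc:any, dc:title and dc:SimpleLiteral referenceable. -->
  <xs:import namespace="http://purl.org/dc/elements/1.1/"
             schemaLocation="dc.xsd"/>

  <!-- A new element: substitutable directly for the abstract head. -->
  <xs:element name="audience" substitutionGroup="dc:any"/>

  <!-- An element refinement: substitutable for the element it refines,
       and therefore also accepted wherever dc:any is referenced. -->
  <xs:element name="alternative" substitutionGroup="dc:title"/>
</xs:schema>
```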
Encoding schemes are mechanisms for constraining the "value spaces" of DC elements and element refinements. In this schema, they are represented as named complexTypes derived from the SimpleLiteral complexType. For example, the complexType corresponding to the encoding scheme for "W3CDTF" is as follows:
<xs:complexType name="W3CDTF">
  <xs:simpleContent>
    <xs:restriction base="dc:SimpleLiteral">
      <xs:simpleType>
        <xs:union memberTypes="xs:gYear xs:gYearMonth xs:date xs:dateTime"/>
      </xs:simpleType>
      <xs:attribute ref="xml:lang" use="prohibited"/>
    </xs:restriction>
  </xs:simpleContent>
</xs:complexType>
N.B. Some schema-validating XML parsers may not support this construct. See notes.
The use of one of these complexTypes is specified by the use of the xsi:type attribute in the instance document. The value of the xsi:type attribute is a QName corresponding to the name of the complexType:
<dc:date xsi:type="dcterms:W3CDTF">2002-07-09</dc:date>
Use of this datatype means that a validating parser will check that the element content conforms to one of the built-in date/time types named in the union (xs:gYear, xs:gYearMonth, xs:date, or xs:dateTime).
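As an illustration of this check, the sketch below lists one value for each member type of the union, followed by a value that would be rejected. It assumes the qualifieddc container element described later in this document; the dates themselves are arbitrary.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<qualifieddc xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:dcterms="http://purl.org/dc/terms/"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- Accepted: each value matches one member type of the union. -->
  <dc:date xsi:type="dcterms:W3CDTF">2002</dc:date>                 <!-- xs:gYear -->
  <dc:date xsi:type="dcterms:W3CDTF">2002-07</dc:date>              <!-- xs:gYearMonth -->
  <dc:date xsi:type="dcterms:W3CDTF">2002-07-09</dc:date>           <!-- xs:date -->
  <dc:date xsi:type="dcterms:W3CDTF">2002-07-09T12:30:00Z</dc:date> <!-- xs:dateTime -->

  <!-- Rejected: slashes match none of the four lexical forms. -->
  <dc:date xsi:type="dcterms:W3CDTF">1963/08/17</dc:date>

  <!-- Not checked as a date: without xsi:type the value is only a
       SimpleLiteral, so any character string is accepted. -->
  <dc:date>1963/08/17</dc:date>
</qualifieddc>
```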
Not all of the complexTypes associated with encoding schemes impose such "tight" validation. For example, the complexType for "LCSH" prescribes only that the element content is a character string:
<xs:complexType name="LCSH">
  <xs:simpleContent>
    <xs:restriction base="dc:SimpleLiteral">
      <xs:simpleType>
        <xs:restriction base="xs:string"/>
      </xs:simpleType>
      <xs:attribute ref="xml:lang" use="prohibited"/>
    </xs:restriction>
  </xs:simpleContent>
</xs:complexType>
In theory at least, it is possible to define a complexType which enumerates all the possible values of a Library of Congress Subject Heading, but it would be impractical to validate against such a list. However, the principle of validating against an enumerated list of values is illustrated in the schema dcmitype.xsd for the DCMI Type Vocabulary (see next section).
An example schema which takes this approach for ISO639-2 language codes is available at http://dli.grainger.uiuc.edu/publications/metadatacasestudy/dc_schemas/iso639-2.xsd.
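The enumeration approach can be sketched for a deliberately tiny value list as follows. This sketch is not taken from the schema at the URL above: the target namespace, the type names ExampleLanguageCode and ExampleISO639-2, the three-code list, and the schemaLocation for the xml:lang declaration are all hypothetical. The complexType follows the same derivation pattern as W3CDTF and LCSH, so the caveat about parser support for that construct applies here as well.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns="http://example.org/schemes/"
           targetNamespace="http://example.org/schemes/"
           elementFormDefault="qualified">

  <xs:import namespace="http://purl.org/dc/elements/1.1/"
             schemaLocation="dc.xsd"/>
  <!-- Needed for the xml:lang attribute reference below (assumed location). -->
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"
             schemaLocation="http://www.w3.org/2001/xml.xsd"/>

  <!-- The closed list of permitted values (three codes only, for brevity). -->
  <xs:simpleType name="ExampleLanguageCode">
    <xs:restriction base="xs:string">
      <xs:enumeration value="eng"/>
      <xs:enumeration value="fre"/>
      <xs:enumeration value="ger"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- The encoding scheme type, selectable in instances via xsi:type. -->
  <xs:complexType name="ExampleISO639-2">
    <xs:simpleContent>
      <xs:restriction base="dc:SimpleLiteral">
        <xs:simpleType>
          <xs:restriction base="ExampleLanguageCode"/>
        </xs:simpleType>
        <xs:attribute ref="xml:lang" use="prohibited"/>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>
```

An instance would select the type with xsi:type on, for example, a dc:language element, and a value outside the enumerated list would then fail validation.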
Like the dc.xsd schema, the dcterms.xsd schema defines a group elementsAndRefinementsGroup as a means of referring to all the elements and element refinements. A complexType elementOrRefinementContainer is also defined.
A schema can simply import the dcterms.xsd schema and use the elementOrRefinementContainer complexType as the type of an element, which makes the DC elements and element refinements available as child elements:
<xs:import namespace="http://purl.org/dc/terms/" schemaLocation="dcterms.xsd"/>
<xs:element name="qualifieddc" type="dcterms:elementOrRefinementContainer"/>
An example of such a schema is provided as qualifieddc.xsd.
Like the simpledc.xsd schema, the qualifieddc.xsd schema does not use a targetNamespace. An implementation may validate directly against this schema or it may specify a namespace for the container element by including this schema in an application schema.
An example of such an application schema is provided as appqualifieddc.xsd.
An example of an instance document which validates against that application schema is provided as testqualifieddc.xml.
An example of an instance document which fails to validate against that application schema is provided as testqualifieddc2.xml. ('1963/08/17' is not a valid W3CDTF date.)
http://purl.org/dc/dcmitype/
The dcmitype.xsd schema contains only a named simpleType, which defines an enumerated list of values for the DCMI Type Vocabulary. This simpleType is referenced in a complexType in the dcterms.xsd schema.
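In an instance document, the effect might look like the sketch below. It assumes the qualifieddc container element and the dcterms:DCMIType complexType described earlier in this document; "Book" is used only as an example of a string that is not a term in the DCMI Type Vocabulary.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<qualifieddc xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:dcterms="http://purl.org/dc/terms/"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- Accepted: "Text" is a term in the DCMI Type Vocabulary. -->
  <dc:type xsi:type="dcterms:DCMIType">Text</dc:type>

  <!-- Rejected: "Book" is not in the enumerated list. -->
  <dc:type xsi:type="dcterms:DCMIType">Book</dc:type>

  <!-- Accepted: without xsi:type the value is an unconstrained string. -->
  <dc:type>Book</dc:type>
</qualifieddc>
```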