DCMI and ODRL – a discussion paper

Andy Powell
UKOLN, University of Bath

Introduction

This discussion paper considers some of the issues related to bringing together the use of DC metadata and ODRL.  It takes a DCMI-centric view, since, of the two standards, DC is the one with which the author has most familiarity.  It starts with a summary of the capabilities of DC metadata and then presents some potential use cases that describe how the combined use of DC and ODRL may benefit the end-user.

Finally, it outlines some of the issues that the ODRL-DCMI working group will have to address.

Summary of DCMI

DCMI provides a growing set of metadata terms (elements, element refinements, encoding schemes and controlled vocabularies) and three encoding syntaxes (XHTML, XML and RDF) that allow those terms to be used in a wide variety of resource discovery applications.  These are underpinned by the DCMI Abstract Model, which provides a description of the key entities that make up DC metadata records.

A number of DCMI terms are relevant to encoding rights-related information.  These are summarised here:

Contributor

URI: http://purl.org/dc/elements/1.1/contributor
Label: Contributor
Definition: An entity responsible for making contributions to the content of the resource.
Comment: Examples of a Contributor include a person, an organisation, or a service. Typically, the name of a Contributor should be used to indicate the entity.
Type of Term: element
Status: recommended
Date Issued: 1999-07-02

Creator

URI: http://purl.org/dc/elements/1.1/creator
Label: Creator
Definition: An entity primarily responsible for making the content of the resource.
Comment: Examples of a Creator include a person, an organisation, or a service. Typically, the name of a Creator should be used to indicate the entity.
Type of Term: element
Status: recommended
Date Issued: 1999-07-02

Publisher

URI: http://purl.org/dc/elements/1.1/publisher
Label: Publisher
Definition: An entity responsible for making the resource available.
Comment: Examples of a Publisher include a person, an organisation, or a service. Typically, the name of a Publisher should be used to indicate the entity.
Type of Term: element
Status: recommended
Date Issued: 1999-07-02

Rights

URI: http://purl.org/dc/elements/1.1/rights
Label: Rights Management
Definition: Information about rights held in and over the resource.
Comment: Typically, a Rights element will contain a rights management statement for the resource, or reference a service providing such information. Rights information often encompasses Intellectual Property Rights (IPR), Copyright, and various Property Rights. If the Rights element is absent, no assumptions can be made about the status of these and other rights with respect to the resource.
Type of Term: element
Status: recommended
Date Issued: 1999-07-02

Access Rights

URI: http://purl.org/dc/terms/accessRights
Label: Access Rights
Definition: Information about who can access the resource or an indication of its security status.
Comment: Access Rights may include information regarding access or restrictions based on privacy, security or other regulations.
Type of Term: element-refinement
Refines: http://purl.org/dc/elements/1.1/rights
Status: conforming
Date Issued: 2003-02-15

Date Copyrighted

URI: http://purl.org/dc/terms/dateCopyrighted
Label: Date Copyrighted
Definition: Date of a statement of copyright.
Type of Term: element-refinement
Refines: http://purl.org/dc/elements/1.1/date
Status: conforming
Date Issued: 2002-07-13

License

URI: http://purl.org/dc/terms/license
Label: License
Definition: A legal document giving official permission to do something with the resource.
Comment: Recommended best practice is to identify the license using a URI. Examples of such licenses can be found at http://creativecommons.org/licenses/.
Type of Term: element-refinement
Refines: http://purl.org/dc/elements/1.1/rights
Status: conforming
Date Issued: 2004-06-14

Provenance

URI: http://purl.org/dc/terms/provenance
Label: Provenance
Definition: A statement of any changes in ownership and custody of the resource since its creation that are significant for its authenticity, integrity and interpretation.
Comment: The statement may include a description of any changes successive custodians made to the resource.
Type of Term: element
Status: conforming
Date Issued: 2004-09-20

Rights Holder

URI: http://purl.org/dc/terms/rightsHolder
Label: Rights Holder
Definition: A person or organization owning or managing rights over the resource.
Comment: Recommended best practice is to use the URI or name of the Rights Holder to indicate the entity.
Type of Term: element
Status: conforming
Date Issued: 2004-06-14

Of these, it is worth noting that Creator, Contributor and Publisher are used to indicate an entity (an ‘agent’ in DCMI terminology) that is or was responsible for creating the resource or making it available.  However, there are no guarantees about what rights any of these entities hold over the resource, since any rights they once held may have been sold, given away or simply revoked.

It is also worth noting that the Date element (which Date Copyrighted refines) is defined very broadly and may not indicate a date that is relevant in the context of the ‘rights’ associated with the resource.

Similarly, the Rights element is defined very broadly and might be used to provide a variety of rights-related information, for example a simple copyright statement such as “(c) University of Bath, 2005” or a link to a complex ODRL rights statement.

Provenance will typically be used to hold a human-readable description of the custodial history of the resource.  This may be useful to end-users wishing to trace the rights-holding agents associated with a resource, but it is unlikely to be structured well enough to be usefully parsed by software applications.

In recognition of these kinds of problems, DCMI created some additional rights-related terms that are more specific in nature: Rights Holder, Date Copyrighted and License.  For example, by combining Rights Holder, Date Copyrighted, Rights and License it is possible to encode (in a machine-readable way) information about who holds rights over a resource, when it was copyrighted and which licence it is made available under, as follows:

<dcterms:rightsHolder>University of Bath</dcterms:rightsHolder>
<dcterms:dateCopyrighted>2005</dcterms:dateCopyrighted>
<dc:rights>(c) University of Bath, 2005</dc:rights>
<dcterms:license>http://creativecommons.org/licenses/by/2.0/</dcterms:license>

The fragment of XML shown here is based on the DCMI guidelines for encoding DC in XML.
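To illustrate what ‘machine-readable’ means in practice, the following is a minimal Python sketch of how a software application might read the four properties from the fragment above.  The <metadata> wrapper element, the function name and the dictionary keys are assumptions made for illustration only; the namespace URIs are those of the terms listed earlier.

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the term URIs listed in the summary of DCMI.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# The <metadata> wrapper is a hypothetical container element; it is not
# part of the fragment shown in the paper.
RECORD = """
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:dcterms="http://purl.org/dc/terms/">
  <dcterms:rightsHolder>University of Bath</dcterms:rightsHolder>
  <dcterms:dateCopyrighted>2005</dcterms:dateCopyrighted>
  <dc:rights>(c) University of Bath, 2005</dc:rights>
  <dcterms:license>http://creativecommons.org/licenses/by/2.0/</dcterms:license>
</metadata>
"""


def extract_rights(xml_text):
    """Return the rights-related property values found in a DC XML record."""
    root = ET.fromstring(xml_text.strip())

    def text_of(path):
        # Return the stripped text of the first matching child, or None.
        element = root.find(path, NS)
        if element is None or element.text is None:
            return None
        return element.text.strip()

    return {
        "rights_holder": text_of("dcterms:rightsHolder"),
        "date_copyrighted": text_of("dcterms:dateCopyrighted"),
        "rights": text_of("dc:rights"),
        "license": text_of("dcterms:license"),
    }


if __name__ == "__main__":
    for name, value in extract_rights(RECORD).items():
        print(name, "=", value)
```

Note that the sketch only recovers the literal values.  It says nothing about what the licence URI resolves to, which is the subject of the paragraphs that follow.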

Note that there is some replication of information in the metadata above between the Rights and Rights Holder elements.  This is because the Rights Holder element does not specifically indicate the copyright holder and therefore the Rights element has been used to provide a more ‘traditional’ copyright statement.

Finally, it is worth noting that although the Creative Commons URI used above identifies a particular CC licence, there are no guarantees about what will be made available at such a URI: there may be a machine-readable version of the licence (encoded using ODRL, XrML or some other markup language), a human-readable statement about the licence, the full legal text of the licence, or nothing at all.  In the context of a DC metadata description, a machine-readable description of the licence under which the resource is made available is known as a ‘related description’.
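The uncertainty described above can be sketched as a small Python function that an application might apply after attempting to retrieve a licence URI.  This is an illustration under stated assumptions, not a recommended algorithm: the function name, the two sets of media types and the idea that the media type alone is a sufficient hint are all hypothetical, and the sketch cannot distinguish a human-readable summary from the full legal text.

```python
# Hypothetical groupings of media types; a real application would need
# to inspect the content itself rather than rely on these labels.
MACHINE_READABLE = {"application/rdf+xml", "application/xml", "text/xml"}
HUMAN_READABLE = {"text/html", "application/xhtml+xml", "text/plain"}


def classify_licence_representation(status_code, content_type):
    """Guess what kind of licence representation a licence URI returned.

    status_code  -- HTTP status code of the response
    content_type -- value of the Content-Type header, or None if absent
    """
    if status_code != 200 or not content_type:
        return "nothing"
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in MACHINE_READABLE:
        return "machine-readable"
    if media_type in HUMAN_READABLE:
        return "human-readable"
    return "unknown"


if __name__ == "__main__":
    print(classify_licence_representation(200, "text/html; charset=utf-8"))
    print(classify_licence_representation(200, "application/rdf+xml"))
    print(classify_licence_representation(404, "text/html"))
```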

Potential ‘use cases’

This section outlines some possible use cases in which DC metadata and ODRL might be combined to provide an enhanced service to the end-user.  Note that these use cases are somewhat aspirational and may not reflect what is currently possible in the real world.

  1. A lecturer discovers a Web page using Google and wishes to use some of the information it contains as part of a distance-learning module that he is creating.  The lecturer’s browser automatically downloads the ODRL licence under which the resource is made available and uses it to present a pop-up window containing a summary of what the lecturer is allowed to do with that particular resource.
  2. A researcher uses her library meta-search engine to search a set of high-quality bibliographic services that are relevant to her area of research.  The search engine uses Z39.50 and SRW to cross-search multiple targets.  When displaying each of the search results, the meta-search engine examines the returned metadata record to see if there is an ODRL rights statement attached or embedded.  In those cases where such a licence is available, the meta-search engine indicates this by placing an icon next to the search result.  When the researcher clicks on the icon, a summary of the licence is presented.
  3. A public librarian uses an image search engine to discover images of his local area.  The image search engine uses the OAI-PMH to harvest metadata from image archives around the world.  Having discovered an image that looks to be of use, the librarian checks the copyright information in the OAI metadata record to determine the names of all the rights holders.  These are used as a set of starting points for seeking permission to use the image on the library Web site.
  4. A student is provided with a ‘resource list’ for one of her university course modules.  The list is delivered through the university Learning Management System (LMS) and is based on the IMS Resource List Interoperability (RLI) specification.  The LMS uses the ODRL licence that is linked to each list item to determine whether the student is allowed to re-use parts of the resource in her own work.
  5. A journalist uses his favourite RSS aggregator to keep track of a number of media blogs and news feeds that are of interest to him.  The aggregator looks for rights information in each item in these channels and uses this to indicate whether the journalist is allowed to re-purpose any of the resources that are available for use in his own stories.
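As an illustration of the fifth use case, the following Python sketch shows how an aggregator might look for rights information in each item of a feed.  The feed content, the function name and the choice to prefer a License value over a plain Rights value are assumptions made for this example; the sketch only reports which items carry rights information and does not attempt to interpret the licence itself.

```python
import xml.etree.ElementTree as ET

NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# A hypothetical RSS 2.0 feed in which one item carries a DC License
# property and the other carries no rights information at all.
FEED = """
<rss version="2.0"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:dcterms="http://purl.org/dc/terms/">
  <channel>
    <title>Example media blog</title>
    <item>
      <title>Story one</title>
      <dcterms:license>http://creativecommons.org/licenses/by/2.0/</dcterms:license>
    </item>
    <item>
      <title>Story two</title>
    </item>
  </channel>
</rss>
"""


def items_with_rights(feed_xml):
    """Return (title, rights information or None) for each feed item."""
    root = ET.fromstring(feed_xml.strip())
    results = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        licence = item.findtext("dcterms:license", default=None, namespaces=NS)
        rights = item.findtext("dc:rights", default=None, namespaces=NS)
        # Prefer the more specific License value when both are present.
        results.append((title, licence or rights))
    return results


if __name__ == "__main__":
    for title, rights_info in items_with_rights(FEED):
        marker = rights_info if rights_info else "no rights information"
        print(title, "->", marker)
```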

Issues

The encoding guidelines currently endorsed by the DCMI have different capabilities with respect to the DCMI Abstract Model, i.e. not all features of the model can be encoded in all the syntaxes.  The RDF encoding can encode almost all aspects of the model and is therefore the richest.  The XML and XHTML guidelines do not support all parts of the model.  As a result, a description that can be encoded in one syntax cannot always be encoded in the others.

A very simple model of the key entities related to the combined use of DCMI metadata and ODRL may be represented as follows:

Figure 1

It is important to remember that the ‘value URI’ (the URI that identifies the value of a DCMI property) associated with the DC License property is the URI of the licence under which the resource is made available.  This may or may not be the same as the URI of the ODRL encoding of that licence.

Finally, it should be noted that there has been a recent thread on the dc-architecture mailing list about whether it is possible and/or sensible to simply merge together fragments of XML from non-DC XML applications such as ODRL with DC XML metadata records.  The author is firmly of the view that it is not sensible to do this, unless the two XML fragments conform to the same underlying ‘model’, such as that provided by RDF.

Therefore, with the exception of those cases where both DC metadata and ODRL are encoded using RDF/XML, it seems more sensible to consider how to link together, rather than merge, DC metadata descriptions and ODRL documents.
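One way to picture linking rather than merging is sketched below in Python.  The DC description is left untouched and carries only the licence URI, while a separate lookup records where, if anywhere, an ODRL encoding of that licence can be found.  The class, the function, the lookup table and the example.org address are all hypothetical; the sketch makes no claim about how the working group will choose to express such links.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LicenceLink:
    """Pairs a licence with the location of its ODRL encoding, if known."""
    licence_uri: str          # value URI of the DC License property
    odrl_uri: Optional[str]   # location of an ODRL document, or None


# Hypothetical lookup from licence URI to the URI of an ODRL encoding.
# The two URIs are deliberately different, as the paper notes they may be.
ODRL_LOCATIONS = {
    "http://creativecommons.org/licenses/by/2.0/":
        "http://example.org/odrl/cc-by-2.0.xml",
}


def resolve_odrl(dc_record, odrl_locations):
    """Link a DC description to an ODRL document without merging them.

    dc_record is a plain mapping of property names to values; it is
    read but never modified.
    """
    licence_uri = dc_record.get("license")
    if licence_uri is None:
        return None
    return LicenceLink(licence_uri, odrl_locations.get(licence_uri))


if __name__ == "__main__":
    record = {
        "rights_holder": "University of Bath",
        "license": "http://creativecommons.org/licenses/by/2.0/",
    }
    print(resolve_odrl(record, ODRL_LOCATIONS))
```

Keeping the two documents separate means that each can be validated against its own model, and a licence with no known ODRL encoding simply yields a link whose odrl_uri is empty.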