How to use this tutorial
This tutorial is intended for those who are interested in more technical aspects of the OAI-PMH, although the Overview and the History and Development of OAI-PMH, together with the Glossary, are suitable for those who simply require some general background information. Each part builds on the material in the earlier parts, so a good approach is to work through the parts in order, referring to the glossary as required. In addition to the Glossary, you will find key terms defined within each part of the tutorial. Sets of quick quiz questions for the introductory sections help you to check whether you've picked up key points along the way.
Overview (this part) introduces the basic concepts underlying the OAI and the OAI-PMH. Use this part to gain an understanding of what the OAI-PMH is, and what it does and does not provide. History and Development of OAI-PMH covers the emergence of the Open Archives Initiative, showing how it grew from roots in several earlier initiatives, and discussing the nature of the problems for which it aims to provide solutions. This part also surveys the development of the protocol (including the evolving nature, aims and technical components) from the Santa Fe Convention, through OAI-PMH v.1.0/1.1, to OAI-PMH v.2.0.
The rest of the tutorial contains more technical material. The Main Technical Ideas of OAI-PMH introduces and explains in some detail the key technical elements of the protocol. Implementing OAI-PMH explains how to implement the protocol as a Data Provider and as a Service Provider, covering both the steps needed for a local implementation and several examples of freely available, adaptable tools. XML Schemas and Record Formats provides an overview of the implementation of a Data Provider metadata set, including coverage of XML schema and how to support multiple record formats.
Basic OAI concepts and features
--- Open Archives Initiative (OAI) ---
The essence of the open archives approach is to enable access to Web-accessible material through interoperable repositories for metadata sharing, publishing and archiving. It arose out of the e-print community, where a growing need for a low-barrier way to provide access across fairly heterogeneous repositories led to the establishment of the Open Archives Initiative (OAI). The OAI develops and promotes a low-barrier interoperability framework and associated standards, originally to enhance access to e-print archives, but now taking into account access to other digital materials. As the OAI mission statement says: "The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content."
Many communities are beginning to benefit, or could potentially benefit, from the open archives approach. The Internet and the growing mass of material in digital format have broadened the potential clientele of many repositories of information. Material can be accessed more widely and also exploited for purposes different from those that originally motivated the creation of the repositories. Moreover, the possibility of accessing multiple repositories enables the construction of new kinds of services that can better serve the needs of users. An additional incentive is the potential for cost-saving inherent in new models of the scholarly communication process that could be supported with this approach.
As an organisation, the OAI has included an Executive for management, and Steering and Technical Committees for policy direction and evaluation of protocol developments. The Digital Library Federation (DLF), the Coalition for Networked Information (CNI), and the National Science Foundation (NSF) have funded the OAI. While the Executive and the funders are USA-based, the success of the OAI is firmly grounded in the participation of a community of people from around the world, particularly Europe as well as North America. Now that there is a well-developed and stable second version of the protocol, there may be less need to keep control in the hands of a very small number of people who can take independent and speedy decisions. That need has to be weighed against the perception of stability and authority that control through a standards body such as ISO would confer, and this possibility has been discussed within the OAI.
--- OAI Protocol for Metadata Harvesting (OAI-PMH) ---
The OAI Protocol for Metadata Harvesting (OAI-PMH) defines a mechanism for harvesting records containing metadata from repositories. The OAI-PMH gives data providers a simple technical option for making their metadata available to services, based on the open standards HTTP (Hypertext Transfer Protocol) and XML (Extensible Markup Language). The metadata that is harvested may be in any format that is agreed by a community (or by any discrete set of data and service providers), although unqualified Dublin Core is specified to provide a basic level of interoperability. Thus, metadata from many sources can be gathered together in one database, and services can be provided based on this centrally harvested, or "aggregated", data. The link between this metadata and the related content is not defined by the OAI protocol. It is important to realise that OAI-PMH does not provide a search across this data; it simply makes it possible to bring the data together in one place. In order to provide services, the harvesting approach must be combined with other mechanisms.
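To make the HTTP-plus-XML mechanism concrete, the following minimal Python sketch builds a harvesting request and reads unqualified Dublin Core titles out of a response. The verb `ListRecords`, the `metadataPrefix` value `oai_dc`, and the XML namespaces come from the OAI-PMH v.2.0 specification rather than from this part of the tutorial. The base URL `http://example.org/oai`, the record identifier, and the sample response (simplified, and not fetched from any real repository) are hypothetical and used only for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH v.2.0 responses and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, simplified response; a real one carries further elements.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2003-10-14</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical e-print</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def build_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL: the base URL plus HTTP GET arguments."""
    return base_url + "?" + urlencode({"verb": verb, **arguments})


def parse_records(xml_text):
    """Extract (identifier, titles) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    results = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = [title.text for title in record.iterfind(".//dc:title", NS)]
        results.append((identifier, titles))
    return results


# The base URL is a placeholder, not a real repository.
url = build_request("http://example.org/oai", "ListRecords", metadataPrefix="oai_dc")
records = parse_records(SAMPLE_RESPONSE)
```

Note that nothing here searches the metadata: the sketch only requests records and extracts them, which is all the protocol itself covers.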
Much promise is seen for the use of the protocol within an open archives approach. Support for a new pattern for scholarly communication is the most publicised potential benefit. Perhaps most readily achievable are the goals of surfacing 'hidden resources' and low-cost interoperability. Although the OAI-PMH is technically very simple, building coherent services that meet user requirements remains complex. The OAI-PMH could become part of the infrastructure of the Web, as taken-for-granted as HTTP now is, if a combination of its relative simplicity and proven success by early implementers in a service context leads to widespread uptake by research organisations, publishers, and "memory organisations".
Seven key definitions
Open Archives Initiative (OAI)
OAI is an initiative to develop and promote interoperability standards that aim to facilitate the efficient dissemination of content.
Archive
The term "archive" in the name Open Archives Initiative reflects the origins
of the OAI in the e-prints community where the term archive is generally accepted
as a synonym for repository of scholarly papers. Members of the archiving profession
have justifiably noted the strict definition of an "archive" within their domain;
with connotations of preservation of long-term value, statutory authorization
and institutional policy. The OAI uses the term "archive" in a broader sense:
as a repository for stored information. Language and terms are never unambiguous
and uncontroversial and the OAI respectfully requests the indulgence of the
professional archiving community with this broader use of "archive".
(OAI definition quoted from FAQ on OAI Web site)
OAI Protocol for Metadata Harvesting (OAI-PMH)
OAI-PMH is a lightweight harvesting protocol for sharing metadata between
services.
Protocol
A protocol is a set of rules defining communication between systems. FTP (File
Transfer Protocol) and HTTP (Hypertext Transfer Protocol) are examples of other
protocols used for communication between systems across the Internet.
Harvesting
In the OAI context, harvesting refers specifically to the gathering together
of metadata from a number of distributed repositories into a combined data store.
Data Provider
A Data Provider maintains one or more repositories (web servers) that support
the OAI-PMH as a means of exposing metadata.
(OAI definition quoted from FAQ on OAI Web site)
Service Provider
A Service Provider issues OAI-PMH requests to data providers and uses the metadata
as a basis for building value-added services.
(OAI definition quoted from FAQ on OAI Web site)
In this way, a Service Provider is "harvesting" the metadata exposed
by Data Providers.
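The relationship between the last three definitions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real implementation: each Data Provider is stood in for by a hypothetical function returning already-parsed records (in practice a Service Provider would issue OAI-PMH requests over HTTP), and the names `eprints_a`, `eprints_b`, `harvest`, and `search_titles` are invented for this example.

```python
# Hypothetical stand-ins for two Data Providers. Each returns
# (identifier, metadata) pairs as if already harvested and parsed.
def eprints_a():
    return [("oai:a.example.org:1", {"title": "Metadata harvesting in practice"})]


def eprints_b():
    return [
        ("oai:b.example.org:7", {"title": "Scholarly communication models"}),
        ("oai:b.example.org:9", {"title": "Harvesting hidden resources"}),
    ]


def harvest(data_providers):
    """Gather records from every Data Provider into one combined data store."""
    store = {}
    for name, list_records in data_providers.items():
        for identifier, metadata in list_records():
            # Remember which repository each record came from.
            store[identifier] = {"source": name, **metadata}
    return store


def search_titles(store, term):
    """A value-added service built on the aggregated metadata."""
    term = term.lower()
    return sorted(
        identifier
        for identifier, metadata in store.items()
        if term in metadata["title"].lower()
    )


# The Service Provider harvests first, then offers search over its own store.
store = harvest({"A": eprints_a, "B": eprints_b})
hits = search_titles(store, "harvesting")
```

The search runs against the Service Provider's combined store, not against the Data Providers, which reflects the division of labour described in the definitions above.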
Sources of further information
The rest of this tutorial.
Open Archives Initiative (OAI official Web site)
http://www.openarchives.org/
Open Archives Forum (OA-Forum Web site)
http://www.oaforum.org/
Quick Quiz Questions
Answer and 'Mark' each question separately. Feedback is provided for each marked answer. Once you have marked a question, you can get a further 'Explanation' of the answers. When you have finished, check your total marks for the questions you tried. The marks are provided only to you; they are not stored when you leave this page.
Copyright © 2003 University of Bath. All rights reserved.
Author: Leona Carpenter (co-ordinating author) for OA-Forum and UKOLN
Last modified: 14 Oct 2003 16:36 Authored in CALnet