Interoperability between metadata formats

IAFA Templates and USMARC

Michael Day
UKOLN: The UK Office for Library and Information Networking,
University of Bath, Bath, BA2 7AY, United Kingdom
http://www.ukoln.ac.uk/
m.day@ukoln.ac.uk

August 1996


1. Mapping of IAFA Templates to USMARC

In the context of interoperability, it would be useful to investigate the compatibility between IAFA templates and MARC. The assumption is that metadata supplied by an information provider as an IAFA template could be converted automatically into a corresponding MARC record and displayed, and if necessary saved, in that format. Conversely, it might be possible to search a MARC database and retrieve the results in IAFA template format.

For this to happen, IAFA templates must map onto MARC and vice versa. Of the various MARC formats available, USMARC has been chosen because it has recently been adapted to cope with networked information and because it has a very large user base. In this draft, OCLC MARC, which differs in minor ways from official USMARC, has been used. Further work is planned to map IAFA templates to USMARC.

2. Mapping of IAFA Templates to OCLC MARC

Table 1 (below) provides a preliminary mapping from IAFA templates to MARC. The IAFA template fields are taken from Deutsch, et al. (1995) as adapted for the ROADS project (ROADS, 1995).

Please note that the table applies only to the IAFA templates for documents, data-sets, mailing list archives, Usenet archives, software packages, images and other objects, as described in Deutsch, et al. (1995, § 8.4.4). Information on OCLC MARC is taken from OCLC (1993).


Table 1. Mapping from IAFA templates to OCLC MARC

IAFA Template                 OCLC MARC
----------------------------  ------------------------------------------------
Handle                        File:
Category                      655 Index Term - Genre/Form, or
                              516 Type of Computer File or Data Note
Title                         245$a Title Statement
URI-v*                        856 Electronic Location and Access
Short-Title                   246 Varying Form of Title
Alternative-Title             246 Varying Form of Title
Author - (USER*)              100 Main Entry - Personal Name, or
                              110 Main Entry - Corporate Name, or
                              700 Added Entry - Personal Name, or
                              710 Added Entry - Corporate Name, or
                              245$c Title Statement
Admin - (USER*)               710 Added Entry - Corporate Name
Source                        500 General Note
Requirements                  538 System Details Note
Citation                      524 Preferred Citation of Described Materials Note
Publication-Status            500 General Note
Publisher - (ORGANISATION*)   260$b Publication, Distribution, etc.
Copyright                     500 General Note
Creation-Date                 260$c Publication, Distribution, etc.
Discussion                    500 General Note
Keywords                      653 Index Term - Uncontrolled
Version-v*                    250 Edition Statement
Format-v*                     538 System Details Note
Size-v*                       256 Computer File Characteristics
Language-v*                   Lang:
Character-Set-v*              500 General Note
ISBN                          020 ISBN
ISSN                          022 ISSN
Last-Revision-Date-v*         260$c Publication, Distribution, etc.
Subject-Descriptor-Scheme     See below
Subject-Descriptor-v*         050 Library of Congress Call Number
                              080 UDC Number
                              082 Dewey Decimal Call Number
                              084 Other Call Number
                              090 Locally Assigned LC
                              092 Locally Assigned Dewey
                              098 Other Classification Schemes
                              6XX Subject Added Entries
To-Be-Reviewed-Date           No equivalent
Record-Last-Verified-Email    No equivalent
Record-Last-Verified-Date     No equivalent
Comments                      No equivalent
Destination                   No equivalent
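
A mapping of this kind can be held as a simple lookup table. The following Python sketch is illustrative only: the names IAFA_TO_MARC and marc_targets are hypothetical, only a subset of the rows in Table 1 is included, and it assumes that variant suffixes are always written as "-v" followed by digits.

```python
import re

# Subset of Table 1: IAFA field name -> candidate OCLC MARC fields.
# An empty list means that the table gives no equivalent.
IAFA_TO_MARC = {
    "Title": ["245$a"],
    "Short-Title": ["246"],
    "Keywords": ["653"],
    "Requirements": ["538"],
    "Format": ["538"],
    "Version": ["250"],
    "ISBN": ["020"],
    "ISSN": ["022"],
    "URI": ["856"],
    "Copyright": ["500"],
    "Comments": [],
    "Destination": [],
}

# Assumption: a variant is marked by a trailing "-v" plus digits (e.g. "URI-v1").
_VARIANT = re.compile(r"-v\d+$")


def marc_targets(field_name):
    """Return candidate MARC fields, or None if the name is not in the subset."""
    base = _VARIANT.sub("", field_name)
    return IAFA_TO_MARC.get(base)
```

Returning None for unknown names, rather than an empty list, keeps locally added fields distinguishable from fields that the table explicitly marks as having no equivalent.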


3. Comments

Most of the IAFA template fields map onto at least one MARC field, although the "fit" is not perfect. A more detailed account of the mapping can be found in Appendix 1. Before moving on to more detailed issues of data loss and other incompatibilities, five major observations can be made:

  1. Many of the data elements end up merely as General Notes, where the information would probably be displayed but could not be searched.
  2. IAFA templates allow a certain flexibility with regard to author names. This contrasts markedly with USMARC's insistence on AACR2-style entries and observance of authority control. This probably constitutes the biggest problem with the mapping process.
  3. IAFA templates have two entries for dates: "Creation-Date" and "Last-Revision-Date". In the AACR2 environment of USMARC, any significant revision would have to be counted as a new edition, requiring a new record.
  4. IAFA templates are designed to be adaptable. Users are able to add new data elements if they feel that it is appropriate. For example, OMNI have added to their basic template an extended set of administrative metadata ("Record-Last-Modified", "To-Be-Reviewed-Date", etc.) and some new data elements ("Access-Policy", "Access-Times", "Charging-Policy" and "Registration"). This raises the question of how such new data elements could be included in any mapping procedure if they are felt to contain important information.
  5. The creators of IAFA templates are not encouraged to include all data elements, but just the ones relevant to the networked information being described. It is possible that a minimal level IAFA template would not generate enough information to make a MARC record viable.

A viable MARC record could be created from the information in an IAFA template, although the intervention of a human cataloguer would probably improve the MARC record and ensure that problem areas like author names are in the format defined by AACR2.

Data Loss

Inevitably, there will be some loss of data when mapping from IAFA Templates to USMARC. This data loss falls into two main categories:

  1. Personal and organisational data. IAFA Templates allow for a quite detailed profile of an author, administrator, sponsor, publisher or host organisation of a network resource. If required, it is possible to include postal addresses and telephone/fax numbers. This information has never been part of the USMARC format, and is unlikely to become so in the near future. All of this data, where it has been input, will be lost when mapping to USMARC.
  2. System and administrative data. The IAFA templates used by OMNI have several fields that contain data about the record itself, for example the 'Handle' and 'Record-Last-Modified-Date' fields. It is unlikely that this data would be necessary in a USMARC format record. Of a slightly different nature is the 'Destination' field, which could alert browsers of the bibliographical record to the subject information gateway itself.
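
One way to make this loss visible is to report, for a given template, which fields fall into each category. The following is a minimal sketch, assuming the template has already been read into a Python dict. The function name classify_loss and the two lists are hypothetical and deliberately incomplete: the record-level names are those mentioned in this document, and only the contact suffixes seen in the OMNI sample are listed.

```python
import re

# Assumption: a variant is marked by a trailing "-v" plus digits.
_VARIANT = re.compile(r"-v\d+$")

# Category 2: fields describing the record itself.
RECORD_FIELDS = {
    "Handle", "Destination", "To-Be-Reviewed-Date",
    "Record-Last-Modified-Date", "Record-Last-Modified-Email",
    "Record-Created-Date", "Record-Created-Email",
}

# Category 1: personal and organisational detail. Postal and telephone
# suffixes would be added here once their exact names are confirmed.
CONTACT_SUFFIXES = ("-Job-Title", "-Email")


def classify_loss(template):
    """Group field names into record data, contact data and the remainder."""
    groups = {"record": [], "contact": [], "kept": []}
    for name in template:
        base = _VARIANT.sub("", name)
        if base in RECORD_FIELDS:
            # Checked first, so "Record-Created-Email" is not treated as contact data.
            groups["record"].append(name)
        elif base.endswith(CONTACT_SUFFIXES):
            groups["contact"].append(name)
        else:
            groups["kept"].append(name)
    return groups
```

Fields in the "kept" group are only candidates for mapping; whether each one actually has a MARC destination still depends on Table 1.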

Information lacking in IAFA Templates

On the other hand, an IAFA template will not translate completely to a "full" USMARC format record. IAFA templates do not contain the controlled terms needed by a format primarily based on AACR2 conventions. The problems fall into five main categories:

  1. Author

    Multiple Authors: When there is more than one 'Author-Name' there could be a problem of where to map the multiple authors. Perhaps, if there are no more than three authors, then the first could be the main entry and the others added entries. Some form of addition to 245 Title Statement would also be desirable, e.g.:

    [245 10 OCLC/NCSA Metadata Workshop report / $c Stuart Weibel, Jean Godby, Eric Miller]

    Even more of a problem is when no author is given; maybe just the details of an administrator, owner, sponsor or publisher. These latter could be included in an added entry.

    Personal/Corporate Authors: Deutsch, et al. (1995) suggest that personal name fields should conform to a particular format, based on BibTeX, so that they can be parsed. Parsed personal names could, possibly, be added in the relevant fields for main entry (100) and added entry (700), although they will probably not conform to the Authority Control lists in use elsewhere in the catalogue. Although it could be assumed that 'Author-Name' and 'Admin-Name' would usually be personal names (700) and those of 'Owner-Name' and 'Sponsoring-Name' corporate names (710), this does not have to be the case and could cause confusion.

  2. Variable Field 260. USMARC, following AACR2, has strict rules about publisher details. Field 260 'Publication, Distribution, Etc. (Imprint)' is a mandatory field for all levels. The usual pattern of this field would be: 260 Place of publication : $b Name of publisher, $c Date of publication. If an IAFA template contained 'Publisher-City', 'Publisher-Name' and 'Creation-Date', a MARC record of sorts could be produced, although it would not strictly follow AACR2 rules. In practice, however, IAFA templates will not include all of these, so the majority of records could just read: 260 [S.l. : $b s.n., $c 199-?]. Many IAFA templates will identify 'Owner' rather than 'Publisher' fields. In that case, the 'Owner' fields could usefully be mapped to the 260 field when no publisher details are given.

  3. Preponderance of Notes. Most of the data given in the IAFA templates will, inevitably, only map to notes. This has always been a problem with computer files. It is worth mentioning that, in MARC, notes cannot be searched.

  4. Added entries without explanatory notes. Where an Administrator, Owner or Sponsor of a network resource is mentioned in an IAFA template, their name could be mapped to the appropriate added entry format in 700 or 710. However, the same data would also have to be mapped to a general note field (500) so that the added entry is justified by the catalogue record.

  5. Incomplete records. It should be stressed that IAFA templates do not have each and every field completed. There is also a certain amount of flexibility concerning the creation of new fields. This complicates the mapping process somewhat. If there is no consistency in approach from those who produce the templates, whether they be professional cataloguers or data providers, any mapping can only be provisional.
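
The author and imprint rules suggested in points 1 and 2 above can be sketched in Python as follows. This is an illustration under stated assumptions, not a cataloguing tool: the function names are hypothetical, names are left in the order given rather than converted to AACR2 headings, the treatment of more than three authors is an assumption that the text above does not cover, and each missing imprint element is bracketed separately instead of in one bracketed span.

```python
def allocate_authors(names):
    """Assign author names to 100/700 fields as (tag, name) pairs."""
    if not names:
        return []
    if len(names) <= 3:
        # First author is the main entry; the others are added entries.
        return [("100", names[0])] + [("700", n) for n in names[1:]]
    # Assumption: with more than three authors there is no main entry
    # and only the first author receives an added entry.
    return [("700", names[0])]


def statement_of_responsibility(names):
    """Build the '/ $c ...' addition to the 245 Title Statement."""
    return ("/ $c " + ", ".join(names)) if names else ""


def build_260(city=None, publisher=None, date=None, owner=None):
    """Build a 260 field, using placeholders for any missing element."""
    place = city or "[S.l.]"
    # Owner details are used only when no publisher is given.
    name = publisher or owner or "[s.n.]"
    when = date or "[199-?]"
    return f"260 {place} : $b {name}, $c {when}"
```

For instance, calling allocate_authors with the three names from the 245 example above would give one 100 entry followed by two 700 entries; none of them would have been checked against an authority file.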

4. An Example of a MARC record created from an IAFA template

A recognisable USMARC record could be created from an IAFA template. The following example is taken from a record used in one of the ROADS-based services (OMNI).


Table 2. Sample record: mapping to USMARC

Template-Type: DOCUMENT
Handle: 83381296-7713
Title: Better care of the child with cancer
URI-v1: http://www.dundee.ac.uk/MedEd/webupdate/child/cancer.htm
URI-v2: http://www.dundee.ac.uk/MedEd/webupdate/child/cancer.htm
Author-Name-v1: Richard Stevens MRCP MRCPath
Author-Job-Title-v1: Consultant Oncologist
Author-Email-v1: update@dundee.ac.uk
Description: An article on childhood cancers from Web Update, covering a summary of the various types of malignancy, aetiology, improvements in survival rates in recent years.
Publisher-Name-v1: Centre for Medical Education, University of Dundee
Subject-Descriptor-v1: 616-006
Subject-Descriptor-v2: QZ275
Subject-Descriptor-Scheme-v1: UDC
Subject-Descriptor-Scheme-v2: NLM
To-Be-Reviewed-Date: 960901
Destination: omniuk
Record-Last-Modified-Date: Tue, 03 Jun 1996 14:49:55 +0000
Record-Last-Modified-Email: unknown@mail-address
Record-Created-Date: Tue, 03 Jun 1996 14:42:44 +0000
Record-Created-Email: unknown@mail-address

Those items that have an OCLC MARC equivalent can be mapped to create the following record:

File: d
060 QZ275
080 616-006
100 Richard Stevens MRCP MRCPath
245 00 Better care of the child with cancer $h [computer file] / $c Richard Stevens MRCP MRCPath
256 Computer file
260 [s.l.] : $b Centre for Medical Education, University of Dundee, $c [n.d.]
520 An article on childhood cancers from Web Update, covering a summary of the various types of malignancy, aetiology, improvements in survival rates in recent years.
856 7 $a www.dundee.ac.uk $d /MedEd/webupdate/child $f cancer.htm $u http://www.dundee.ac.uk/MedEd/webupdate/child/cancer.htm $2 http


Notes:

  1. File: d is derived from the "Template-Type" DOCUMENT.
  2. The NLM and UDC classifications first have to be identified from the "Subject-Descriptor-Scheme-v*" field, and the text then taken from "Subject-Descriptor-v*".
  3. The author's name is in the wrong format for a MARC 100 Main Entry field. This is an important problem.
  4. 245 $h and 256 could be added as defaults in the MARC record.
  5. Although Dundee is mentioned in the "Publisher-Name-v*", it will not map in the format expected for MARC field 260.
  6. It should be noted that certain parts of this sample IAFA Template will not map to USMARC, and this information will be lost in the translation. Most notably, information like the author's institution, job-title, address and contact details finds no place in USMARC. Also missing will be the database-specific information like the "Handle", and information on the creation and modification of the record.
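
The first steps of such a conversion, reading the template and deriving an 856 field from a URI, could be sketched in Python as below. This is a rough illustration only. The function names parse_template and build_856 are hypothetical, each field is assumed to occupy a single "Name: value" line, and the choice and order of 856 subfields is an assumption modelled on the sample record rather than taken from the USMARC documentation.

```python
import posixpath
from urllib.parse import urlsplit


def parse_template(text):
    """Read 'Name: value' lines into a dict, skipping blank lines."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at the first colon only, so URLs and times stay intact.
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def build_856(uri):
    """Split a URI into host ($a), path ($d), file name ($f) and method ($2)."""
    parts = urlsplit(uri)
    directory, filename = posixpath.split(parts.path)
    return (
        f"856 7 $a {parts.netloc} $d {directory} $f {filename} "
        f"$u {uri} $2 {parts.scheme}"
    )
```

Applied to the "URI-v1" value of the sample template, build_856 yields a field of the same general shape as the 856 line in the sample record. Multi-line values and repeated field names would need extra handling that this sketch leaves out.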

5. References


6. Acknowledgements

UKOLN is funded by the British Library Research and Innovation Centre, the Joint Information Systems Committee (JISC) of the UK Higher Education Funding Councils, as well as by project funding from JISC's eLib Programme and the European Union. UKOLN also receives support from the University of Bath, where it is based.




This work was carried out for the Resource Organisation And Discovery in Subject-based services (ROADS) project funded by the Electronic Libraries (eLib) Programme.

More information on ROADS can be found on the project's Web pages: <URL:http://www.ilrt.bris.ac.uk/roads/>


Maintained by: Michael Day of UKOLN The UK Office for Library and Information Networking, University of Bath.
Document created: 6-Aug-1996
Last updated: 12-Aug-1998
