Interoperability between metadata formats

Appendix 1:
Detailed comments on mapping IAFA templates to OCLC MARC

Michael Day
UKOLN: The UK Office for Library and Information Networking,
University of Bath, Bath, BA2 7AY, United Kingdom
http://www.ukoln.ac.uk/
m.day@ukoln.ac.uk

August 1996


To main report: Interoperability between metadata formats: IAFA Templates and USMARC


Template Type:

"These templates all contain the same fields, but have different "Template-Type" values. Suggestions for these types include:

Other names may be added to future releases of this document." (definitions from Deutsch, et al., 1995)

USMARC: This could match with the Fixed-field element "File", which defines "type of computer file" as: a. numeric, b. programs, c. representational, d. document, m. combination, u. unknown and z. other. Something like:


Category:

Type of object.
This defines the category of the object, e.g. "Technical Report" or "Conference Paper", "General Guide" or "User Manual" (Deutsch, et al. (1995)).

USMARC: There is no easy match here with USMARC. In the Rutgers-Princeton Inventory of Machine-Readable Texts in the Humanities, there was a perceived need for "genre" terms, so controlled entries from "Genre terms: a thesaurus for use in rare book and special collections cataloguing" were added to MARC field 690 Local Subject Added Entry - Topical Term (Hoogcarspel, 1994, p. 30). Use of this field, however, really depends on the use of controlled terms, so it might be better to map "Category" to field 655 Index Term - Genre/Form or field 516 Type of Computer File or Data Note.


Title:

Complete title of the object

USMARC: 245 Title Statement. The main issue is how subtitles could be included. It is probably easiest to place all the text in 245$a; otherwise, punctuation would have to be identified so that 245$b could be inserted automatically. There could also be problems with the indicators. First indicator: there would probably not be a need for a title added entry unless there is an entry under 246. Second indicator: if the IAFA title has characters before the first significant word (e.g. articles or punctuation), the number of these characters will have to be noted here, or the software should identify the most common cases. Subfield $h [computer file] would need to be added after the title statement.
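The second-indicator rule above could be sketched roughly as follows. This is only an illustration in Python: the list of initial articles, the function names and the plain-text rendering of the field are assumptions made for the example, not part of IAFA or USMARC. The first indicator is simply left at "0" unless the caller asks for a title added entry.

```python
# Hypothetical list of English initial articles; a real converter would
# need a list for each language it expects to meet.
INITIAL_ARTICLES = ("the ", "an ", "a ")


def nonfiling_count(title: str) -> int:
    """Return the number of leading characters to ignore in filing (0-9)."""
    skipped = 0
    # Skip leading punctuation such as quotation marks or brackets.
    while skipped < len(title) and not title[skipped].isalnum():
        skipped += 1
    rest = title[skipped:].lower()
    for article in INITIAL_ARTICLES:
        if rest.startswith(article):
            skipped += len(article)
            break
    # The indicator is a single digit, so cap the value at 9.
    return min(skipped, 9)


def title_to_245(title: str, title_added_entry: bool = False) -> str:
    """Render an IAFA Title as a 245 field with all the text in $a."""
    first = "1" if title_added_entry else "0"
    second = str(nonfiling_count(title))
    return f"245 {first}{second} $a{title} $h[computer file]"
```

Under these assumptions, title_to_245("The Internet Guide") would give 245 04 $aThe Internet Guide $h[computer file]. Subtitles are deliberately left inside $a, as suggested above.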


URI-v*:

Description of access to object. The "v" means that multiple URIs can be put in an IAFA template.

USMARC: 856$u Uniform Resource Locator. Assuming that the URI is a URL or similar, it can be mapped directly onto 856 Electronic Location and Access, subfield $u.
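A minimal sketch of this mapping is given below. It assumes that a parsed template is held as a Python dictionary and that the variant fields are named "URI-v0", "URI-v1" and so on; both of these, and the function name, are assumptions made for the example. A value is treated as "a URL or similar" simply if it has a scheme, and the 856 indicators are left out.

```python
from urllib.parse import urlparse


def uris_to_856(template: dict) -> list:
    """Create one 856 field per URI variant found in the template."""
    fields = []
    for key, value in template.items():
        # Hypothetical naming: a plain "URI" field or "URI-v0", "URI-v1", ...
        if key == "URI" or key.startswith("URI-v"):
            value = value.strip()
            # Only values with a scheme (http, ftp, ...) are mapped.
            if urlparse(value).scheme:
                fields.append(f"856 $u{value}")
    return fields
```

Values without a scheme are silently skipped here; a real converter would probably want to report them instead.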


Short-Title:

Summary title (if the Title is very long).

USMARC: 246 Varying Form of Title. Follow the rules for 246. The first indicator states whether there is an added entry, so, assuming that there will be a need for one, the first indicator (default) would be "1". The second indicator would probably be "3" for Other title (unspecified). 246 13 $a would do as a default. An added-entry 740 (Uncontrolled Related/Analytical Title) could also be automatically created from "Short-Title".


Author-(USER*):

Description/contact information about the authors/creators of the object. The USER can refer to another template which contains some or all of the following data (from Deutsch, et al., 1995, § 7.5.1 and 7.5.2):

USMARC: Apart from the "Name", "Organization-Name", and possibly the "Department" elements, there is no need for most of these in MARC, so the rest of the data given is effectively lost. The Author-Name needs to be mapped to either 100 Main Entry - Personal Name or 110 Main Entry - Corporate Name. If Main Entries are deemed inappropriate here, then the names could be mapped to the Added Entries in 700 and 710, remembering as well that there is only one 100 field in each record, so multiple authors would have to be placed under 700 in any case. There is a further problem with the format of the names. IAFA Templates suggest that person name fields should conform to a particular format, based on BibTeX, so that they can be parsed. Assuming that data entries conform to this standard, correctly parsed names could presumably be created to fit the rigours of data entry for 1XX and 7XX name fields, but they may not conform to the Authority Control lists in use elsewhere in the catalogue. Another entry could be made for the author(s) at 245$c in the Title Statement.
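The handling of multiple personal authors could be sketched along these lines. The example is deliberately simplified and rests on several assumptions: names arrive either as "Forenames Surname" or already inverted as "Surname, Forenames", the function names are invented for the illustration, and indicators and authority control are ignored altogether.

```python
def invert_name(name: str) -> str:
    """Turn 'Forenames Surname' into 'Surname, Forenames'."""
    name = " ".join(name.split())  # collapse stray whitespace
    # Leave names that are already inverted, or single-word names, alone.
    if "," in name or " " not in name:
        return name
    forenames, surname = name.rsplit(" ", 1)
    return f"{surname}, {forenames}"


def authors_to_fields(names: list) -> list:
    """Map the first author to 100 and all later authors to 700."""
    fields = []
    for position, name in enumerate(names):
        tag = "100" if position == 0 else "700"
        fields.append((tag, invert_name(name)))
    return fields
```

Names with particles or suffixes (which BibTeX-style formats allow) would need more careful parsing than this, and the resulting headings would still have to be checked against any authority file in use.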


Admin-(USER*):

Description/contact information about the administrators/maintainers of the object.

USMARC: Much the same applies for these entries as for the above. If possible, the Organization-Name should be mapped to a 710 Added Entry - Corporate Name, although it may be difficult to ensure that it follows the correct format for that field. The relationship of the administrators to the text could be placed in a 500 General Note field.


Source:

Information as to the source of the object.

USMARC: As 537 Source of Data Note is now obsolete, OCLC suggest using field 500 General Note for "source of data" notes.


Requirements:

Any requirements for the use of the object. A free text description of any hardware/software requirements necessary to use the object.

USMARC: 538 System Details Note.


Description:

Description (that is, "abstract" in the case of documents) of the object.

USMARC: Probably best in field 520 Summary, etc. Note.


Bibliography:

A bibliographic entry for the object.

USMARC: No direct parallel. The closest is 524 (recommended below), but it could go in 500.


Citation:

The citation for the object when used in other works.

USMARC: 524 Preferred Citation of Described Materials Note. This would just be free text after 524$a.


Publication-Status:

Current publication status of object (draft, published etc.).

USMARC: Probably free text in 500 General Note.


Publisher-(ORGANIZATION*):

Description/contact information about object publisher.

USMARC: Only the "Name" of the publisher would be needed, placed in field 260$b (260$a could be taken from Organization-City).


Copyright:

The copyright statement. Any additional information on the copying policy may be included.

USMARC: No parallel in USMARC, but could be included as 500 General Note.


Creation-Date:

The creation date for the object.

USMARC: 260$c Date of publication, distribution, etc.


Discussion:

Free text description of possible discussion forums (USENET groups, mailing lists) appropriate for this object.

USMARC: No parallel in USMARC, but could be included as 500 General Note.


Keywords:

Appropriate keywords for this object.

USMARC: Could all go under 653$a Index Term - Uncontrolled.


Version-v*:

A version designator for the object.

USMARC: 250 Edition Statement, as used for software, e.g.: 250 Version 5.01


Format-v*:

Formats in which the object is available.

USMARC: Field 538 System Details Note is the closest match.


Size-v*:

Length of object in bytes (octets).

USMARC: 256 Computer File Characteristics. Maybe the system could generate the term "Computer data" (or "Computer program") followed by (size in KB).
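Generating such a statement from the byte count might look something like the sketch below. The function name, the choice of 1 KB as 1024 bytes and the decision to round up are all assumptions made for the illustration.

```python
def size_to_256(size_in_bytes: int, is_program: bool = False) -> str:
    """Build a 256 statement from an IAFA Size value given in bytes."""
    # Ceiling division, so that a small file is never shown as 0 KB.
    kilobytes = max(1, -(-size_in_bytes // 1024))
    term = "Computer program" if is_program else "Computer data"
    return f"256 $a{term} ({kilobytes} KB)"
```

Whether the term should be "Computer data" or "Computer program" would have to come from elsewhere in the template, for example from the Template-Type.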


Language-v*:

The name of the language in which the object is written. For documents this would be the natural language. For software this would be the programming language.

USMARC: A text note could go in field 546, although this is more relevant for translations. The best solution would be for the controlled language used in this field to be converted to the three-letter codes used in the Fixed-field element "Lang", e.g. "eng".
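The conversion suggested above could be sketched as a simple lookup. The table here is a small illustrative subset of the three-letter language codes, and the fallback to a 546 note for unrecognised values (including programming languages) is an assumption of the example rather than a recommendation from either specification.

```python
# Illustrative subset only; a real converter would load the full code list.
LANGUAGE_CODES = {
    "english": "eng",
    "french": "fre",
    "german": "ger",
    "spanish": "spa",
    "italian": "ita",
}


def map_language(value: str) -> tuple:
    """Return ('Lang', code) if the name is recognised, else a 546 note."""
    name = value.strip()
    code = LANGUAGE_CODES.get(name.lower())
    if code is not None:
        return ("Lang", code)
    # Unrecognised values are kept as free text.
    return ("546", name)
```

This only works if the IAFA field uses a predictable vocabulary; free-text entries such as "English and French" would fall through to the note.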


Character-Set-v*:

The character set of the object. This should be a well-known value, for example "ASCII" or "ISO Latin-1".

USMARC: No parallel in USMARC, but could be included as 500 General Note.


ISBN-v*:

The International Standard Book Number of the object.

USMARC: 020 International Standard Book Number. The main problem with direct transcription is that ISBNs are entered into USMARC format without hyphens and with capital X.
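The transcription problem is easy to handle in software. The sketch below strips hyphens and spaces and converts a lower-case "x" to upper case; the check-digit test for ten-character ISBNs is an optional extra added for the example and is not something the mapping itself requires. The function names are hypothetical.

```python
DIGITS = "0123456789"


def normalise_isbn(raw: str) -> str:
    """Remove hyphens and spaces and upper-case a final 'x'."""
    return raw.replace("-", "").replace(" ", "").upper()


def is_valid_isbn10(isbn: str) -> bool:
    """Check the modulus-11 check digit of a normalised 10-character ISBN."""
    if len(isbn) != 10:
        return False
    if any(ch not in DIGITS for ch in isbn[:9]):
        return False
    if isbn[9] not in DIGITS + "X":
        return False
    values = [int(ch) for ch in isbn[:9]]
    values.append(10 if isbn[9] == "X" else int(isbn[9]))
    # Weights run from 10 down to 1 across the ten positions.
    total = sum(w * v for w, v in zip(range(10, 0, -1), values))
    return total % 11 == 0
```

A converter could write the normalised value to 020$a and flag any ISBN that fails the check for manual review.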


ISSN-v*:

The International Standard Serial Number of the object.

USMARC: 022 International Standard Serial Number


Last-Revision-Date-v*:

Last date that the object was revised.

USMARC: If there is any major revision, AACR2 would treat it as a new edition. Presumably, something would have to be arranged with "Creation-date" details.


Subject-Descriptor-Scheme-v*:

Name of classification scheme used in the corresponding Subject-Descriptor-v* field. Deutsch, et al. (1995b) suggest that "well known" classification schemes should be used, or there should be a reference to the scheme's specification. It would probably be best if a controlled vocabulary could be identified for the main classification and indexing schemes, e.g. LC, UDC, DDC, LCSH.

USMARC: See below.


Subject-Descriptor-v*:

A classification mark for the resource.

USMARC: There are plenty of codes for classification schemes in MARC, the most popular are given here:

If the IAFA template can use controlled terms for the main classification schemes, it could map directly to USMARC, e.g.:
Subject-Descriptor-Scheme-0: UDC
Subject-Descriptor: 971.1/.2

would become

080 971.1/.2

However, there might be a problem knowing which Subject Added Entry is required for LCSH.
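A lookup of this kind could be sketched as below. Only the UDC-to-080 pairing comes from the example above; the other entries (050 for LC, 082 for DDC, 650 for LCSH) are filled in as assumptions about the usual USMARC fields, and the choice of 650 for LCSH is only a default, given the problem just noted. The names used are hypothetical.

```python
# Scheme names follow the controlled terms suggested earlier (LC, UDC, DDC, LCSH).
SCHEME_TO_TAG = {
    "LC": "050",    # assumption: Library of Congress call number
    "UDC": "080",   # as in the example above
    "DDC": "082",   # assumption: Dewey Decimal call number
    "LCSH": "650",  # assumption: defaults to a topical subject added entry
}


def map_subject(scheme: str, descriptor: str):
    """Return 'TAG descriptor' for a known scheme, or None otherwise."""
    tag = SCHEME_TO_TAG.get(scheme.strip().upper())
    if tag is None:
        return None
    return f"{tag} {descriptor.strip()}"
```

With this table, map_subject("UDC", "971.1/.2") reproduces the 080 971.1/.2 line shown above. Descriptors from an unrecognised scheme return None and would need to be handled separately.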


Acknowledgements

UKOLN is funded by the British Library Research and Innovation Centre, the Joint Information Systems Committee of the UK Higher Education Funding Councils, as well as by project funding from JISC's eLib Programme and the European Union. UKOLN also receives support from the University of Bath, where it is based.



This work was carried out for the Resource Organisation And Discovery in Subject-based services (ROADS) project funded by the Electronic Libraries (eLib) Programme.

More information on ROADS can be found on the project's Web pages: <URL:http://www.ilrt.bris.ac.uk/roads/>


Maintained by: Michael Day of UKOLN: The UK Office for Library and Information Networking, University of Bath.
Document created: 5-Aug-1996
Last updated: 12-Aug-1998