ROADS Templates: how they are used
Rachel Heery
Resource description within the ROADS software uses a simple record structure based on attribute:value pairs. This record structure is commonly referred to as the ROADS template and is derived from the IAFA template definitions contained in the IETF Internet Draft 'Publishing information on the internet with anonymous FTP' <URL:http://info.webcrawler.com/mak/projects/iafa/iafa.txt>. This document, published in October 1995, specifies record structures (or templates) that can be used as the basis for descriptions of the contents of anonymous FTP archives. At that time anonymous FTP archives were the most popular method of making information available over the network to the academic community. Although web servers have superseded FTP archives, the guidelines remain valid as there is a strong parallel in the resources that need to be described.
ROADS templates fit well with the characteristics of Internet resources and the environment in which they are indexed. It is appropriate to describe network resources by means of a simple record because Internet resources tend to be volatile: their content and location may be short-lived. Often networked resources are in the nature of pre-print reports; they may be working documents or early drafts of work that will eventually be published in a more traditional forum. Typically the effort available for describing Internet resources is limited, so if human effort is involved, record creation must of necessity be straightforward.
The ROADS templates provide a `lightweight' means of describing resources. The attribute names are in text rather than numeric tags, no fixed codes are used, and there are minimal rules for formulating the content of the various data elements. If appropriate, records can be created with a minimal set of attributes, offering the possibility of record creation by the author or web server administrator.
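As a rough illustration of the attribute:value structure described above, the sketch below reads one record into a Python dictionary. The sample record, its handle and its URI are invented for illustration. The attribute names Template-Type and Handle, and the parsing rules (one attribute per line, indented lines continuing the previous value), are assumptions rather than a statement of how the ROADS software itself reads templates.

```python
def parse_template(text):
    """Parse one attribute:value record into a dict of attribute -> list of values."""
    record = {}
    last = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace() and last is not None:
            # Assumed convention: an indented line continues the previous value.
            record[last][-1] += " " + line.strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # ignore lines that carry no attribute name
        last = name.strip()
        # Keep a list per attribute so that repeated attributes are not lost.
        record.setdefault(last, []).append(value.strip())
    return record


# Hypothetical record, invented for this example.
SAMPLE = """\
Template-Type: SERVICE
Handle: example-0001
Title: An Example Subject Gateway
URI: http://www.example.org/
Description: A hypothetical service record,
  continued on a second line.
Keywords: example; gateway
"""

record = parse_template(SAMPLE)
print(record["Template-Type"][0])
print(record["Description"][0])
```

Only the first colon on a line separates the attribute name from its value, so values that themselves contain colons, such as URIs, are kept intact.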
Within the UK there are other implementations, using software other than ROADS, which make use of the IAFA template structure. The ones we know about are the Parallel Computing Archive (IPCA) based at HENSA, created by Dave Beckett, and NetEC <URL:http://cs6400.mcc.ac.uk/NetEc.html>, created by Thomas Krichel, which gives access to papers and reports in economics.
Within the original IAFA guidelines various template types are defined, each template being a logical grouping of data elements appropriate for the description of different genres of resource. The guidelines suggest the following template types:
In addition templates are defined for clusters of data elements which are commonly used to describe people or groups:
ROADS has made use of a subset of these defined templates and has added one type. The templates available by default in ROADS are:
At the moment one new template type has been introduced into the defaults, PROJECT, and the template outline is included with the ROADS software release. This template was needed by a ROADS user to allow a database of project records to be built up. The content was discussed with other ROADS users and it was thought to be of general use so is now included in the release. Other template types can be added by individual users to their own system if they wish, and if the template type could usefully be added to the ROADS software release this can be done after discussion with other ROADS users. Within an individual system, as well as creating new template types, the administrator can delete template types that are not required.
The IPCA parallel computing archive has introduced two other template types: EVENT to allow conferences and exhibitions to be described, and DIRECTORY to allow for collection or directory level descriptions.
Statistics were gathered by Jon Knight at Loughborough University in Spring 1996, giving figures for usage of different template types and attributes by the eLib subject services using ROADS (OMNI and SOSIG), as well as NetEC and IPCA.
Note: UKOLN are currently gathering more up-to-date usage figures (Autumn 1996). The initial summaries are available here [these are currently unavailable: 11-Feb-1997]:
- EEVL
- IPCA
- NetEC
- OMNI
- SOSIG
- eLib Projects (UKOLN)
The statistics showed which template types were being chosen to describe resources. The eLib services had databases made up of over 70% SERVICE templates, with much lower percentages of DOCUMENT and MAILARCHIVE template types. Other templates were hardly used. IPCA and NetEC, by contrast, used the DOCUMENT template for the majority of their database entries.
Template type (% of records) | SOSIG | OMNI | NetEC | IPCA |
Document | 4 | 9 | 100 | 84 |
Service | 79 | 73 | ||
Mailarchive | 14 | 17 | ||
Software | 1 | 1 | 9 | |
Organization | 8 | |||
SiteInfo | 8 | |||
Dataset | 2 | |||
User | 6 |
The template usage figures reflect the difference in the types of resource described by the different services. Whereas the eLib subject services concentrate on describing quality resources at the `collection' or `service' level, IPCA and NetEC tend to deal with individual reports, papers and articles. At present the eLib subject services tend towards description of resources at a high level of granularity, often at the web server level, which itself can be viewed as a service or a collection of documents. The predominance of the SERVICE template type in the ROADS services has led to some discussion as to whether we could introduce another template type, perhaps the Logical Archive template type (LARCHIVE) defined in the IAFA Guidelines. This might be used to distinguish records describing collections of documents from records giving details of a service. The LARCHIVE could be used to describe logically grouped sets of pages on a particular server.
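Usage figures of the kind discussed above can be produced by tallying the template type of each record in a database. The following is a minimal sketch, assuming records have already been read into dictionaries and that the type is held in an attribute called Template-Type (an assumption). The sample data is invented and does not reproduce the 1996 statistics.

```python
from collections import Counter


def template_type_usage(records):
    """Return {template type: percentage of records}, rounded to whole numbers."""
    counts = Counter(r.get("Template-Type", "UNKNOWN").upper() for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}  # an empty database has no usage figures
    return {ttype: round(100 * n / total) for ttype, n in counts.items()}


# Invented database of 20 records, for illustration only.
records = (
    [{"Template-Type": "SERVICE"}] * 15
    + [{"Template-Type": "DOCUMENT"}] * 3
    + [{"Template-Type": "MAILARCHIVE"}] * 2
)

usage = template_type_usage(records)
for ttype, percent in sorted(usage.items(), key=lambda item: -item[1]):
    print(f"{ttype:12} {percent:3d}%")
```

Because each percentage is rounded independently, the figures for a real database may not sum to exactly 100.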
The attributes for the various template types are listed separately. All the `document-like' templates (DOCUMENT, IMAGE, SOUND etc) contain the same attributes, whereas there are different sets for SERVICE, USER and ORGANIZATION.
The implementation of IAFA templates within ROADS has led to some amendment and enhancement of the attributes within each template, while remaining compliant with the original guidelines. In a similar way those working on white pages applications using WHOIS++ compliant directory service software have enhanced IAFA templates with further attributes for personal names and contact details.
Within ROADS the SERVICE template has been enhanced with more bibliographic fields, e.g. to allow for title and subject terms. A field allowing alternative titles has been added to templates to cope with the difficulty in locating the definitive title on web pages. As well as keywords, two additional attributes are available in all templates: Subject-Descriptor and Subject-scheme. This means traditional classification and subject headings can be used, and the scheme in use recorded. It was agreed that a variety of administrative metadata was required to enable the database to be managed effectively. This required the addition of a number of attributes such as Record-Last-Verified, Record-Last-Verified-Email, Comments, Destination, Record-creation-date.
As with template types, the individual service administrator can edit the ROADS defaults to add or delete attributes as required.
The statistics show that the eLib services use relatively few attributes to describe resources. By far the majority of templates contain ten or fewer descriptive attributes. (Administrative metadata used for the management of the database is excluded from these figures, as all records contain this information.) Six essential fields for resource description are used in nearly all SERVICE and DOCUMENT records. These attributes are:
URI, Subject Descriptor, Subject Scheme, Description, Title, Keywords
There is then some variance between the two template types in the other most used attributes:
DOCUMENT | SERVICE |
Author | Admin-Email |
Copyright owner | Category |
Admin-Email | Owner |
Publisher | Copyright Owner |
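The six core attributes, and the observation that most records carry few descriptive attributes once administrative metadata is set aside, lend themselves to a simple check. The sketch below is one possible way to do it, not part of the ROADS software. The attribute spellings follow this article, names are compared case-insensitively because the spelling varies, and the sample record is invented.

```python
# Core descriptive attributes named in the article.
CORE = ["URI", "Subject-Descriptor", "Subject-Scheme",
        "Description", "Title", "Keywords"]

# Administrative attributes named in the article (excluded from the count).
ADMIN = {"Record-Last-Verified", "Record-Last-Verified-Email",
         "Comments", "Destination", "Record-Creation-Date"}


def missing_core(record):
    """List the core attributes that are absent or empty in a record."""
    present = {name.lower() for name, value in record.items() if value}
    return [attr for attr in CORE if attr.lower() not in present]


def descriptive_count(record):
    """Count non-empty attributes, ignoring administrative metadata."""
    admin = {name.lower() for name in ADMIN}
    return sum(1 for name, value in record.items()
               if value and name.lower() not in admin)


# Invented record for illustration.
example = {
    "Title": "Example Gateway",
    "URI": "http://www.example.org/",
    "Description": "An invented record.",
    "Keywords": "example",
    "Admin-Email": "admin@example.org",
    "Record-Last-Verified": "1996-10-01",
    "Comments": "",
}

print(missing_core(example))
print(descriptive_count(example))
```

A real record would also carry structural attributes such as its template type; this sketch makes no special provision for them.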
Amongst ROADS users discussions are planned on refining the default attributes, and also on ensuring consensus on what sort of information should form the content of each attribute.
Future plans for ROADS include compliance with the whois++ directory service protocol and investigation of integration with the Common Indexing Protocol (CIP). This will allow for query routing and inclusion of ROADS databases in a mesh of summary indexes (centroids). As part of this development ROADS will include a whois++ server which will `serve' whois++ compliant templates. The development of the whois++ and CIP protocols is of interest to ROADS, and any implications for record content and structure will be significant.
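To give a feel for how a centroid can support query routing, the sketch below treats a centroid as the set of unique words occurring under each attribute in a database, and forwards a query only to databases whose centroid contains the query word. This is a simplified reading of the idea, with an assumed tokenisation and invented database names and records. It does not implement the whois++ or CIP wire formats.

```python
import re


def build_centroid(records):
    """Collect, per attribute, the set of unique lower-cased words in its values."""
    centroid = {}
    for record in records:
        for attribute, value in record.items():
            words = re.findall(r"[a-z0-9]+", value.lower())
            centroid.setdefault(attribute, set()).update(words)
    return centroid


def might_match(centroid, attribute, word):
    """True if the database behind this centroid could hold a matching record."""
    return word.lower() in centroid.get(attribute, set())


def route_query(centroids, attribute, word):
    """Return the names of the databases worth querying, in sorted order."""
    return sorted(name for name, centroid in centroids.items()
                  if might_match(centroid, attribute, word))


# Two invented databases, each summarised by its centroid.
centroids = {
    "medicine-db": build_centroid(
        [{"Title": "Guide to Clinical Trials", "Keywords": "medicine; trials"}]),
    "economics-db": build_centroid(
        [{"Title": "Working Papers in Economics", "Keywords": "economics; papers"}]),
}

print(route_query(centroids, "Keywords", "economics"))
print(route_query(centroids, "Title", "astronomy"))
```

A centroid match only says that a database may hold a relevant record; the query still has to be run against that database to find out.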
As part of the ROADS project it would be useful to keep a central register of all extensions to the ROADS default template types and attributes.
Although the original IAFA guidelines give some guidance on the construction of the attribute content, the references are located in several disparate documents, not all of which are easy to locate. These guidelines are not focused on the description of the resources typically described by the subject services. In order to allow for effective interoperability, simplified `cataloguing rules' are required not only for ROADS records but for other simple Internet records such as Dublin Core.
It seems certain that for the foreseeable future a number of metadata formats will be used for creating, describing, exchanging, searching, and retrieving information about Internet resources. Indeed it is not surprising that for all these functions different sorts of metadata are in use. Given that the management of information on the Internet is at a relatively early stage, and that the constituencies involved with the Internet differ in professional background, academic subject area, and purpose of use, similarly diverse forms of metadata can be expected to fulfill the varied requirements.
To facilitate record exchange and provide a background for interoperability, it will be necessary to map ROADS templates to other formats. This is already being done from the ROADS format to whois++, although there is minor variance between the formats. Some work has already begun at UKOLN and Loughborough University; it will investigate mapping ROADS templates to USMARC and the Z39.50 bib-1 use attribute set in the context of interoperability with the Z39.50 protocol. The recent efforts to define an encoding for embedding the Dublin Core element set into HTML arise from the desire to promote author or `publisher' created metadata. If the means to embed metadata in an effective way can be provided, then the use of Dublin Core for this purpose may become widespread and may be considered useful in the context of creating metadata for ROADS services. Some work has been undertaken by the ROADS project to investigate mapping from the Dublin Core element set to ROADS templates. Other work in the area of mapping and conversion will no doubt arise in the future.
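A mapping between formats of the kind discussed above can be sketched as a simple lookup table. The example below maps a handful of ROADS attributes to Dublin Core elements and renders them as HTML meta tags. The mapping is illustrative only and is not the one produced by the ROADS project; the ROADS attribute names are those used in this article, the Dublin Core element names follow present-day usage (for example `creator`), and the `DC.` prefix is one common embedding convention. The sample record is invented.

```python
from html import escape

# Hypothetical mapping, for illustration; unmapped attributes are dropped.
ROADS_TO_DC = {
    "Title": "title",
    "Author": "creator",
    "Keywords": "subject",
    "Description": "description",
    "Publisher": "publisher",
    "URI": "identifier",
}


def to_dublin_core(record):
    """Return (element, value) pairs for the non-empty attributes that have a mapping."""
    return [(ROADS_TO_DC[name], value)
            for name, value in record.items()
            if name in ROADS_TO_DC and value]


def to_meta_tags(record):
    """Render the mapped elements as HTML meta tags, one per line."""
    return "\n".join(
        f'<meta name="DC.{element}" content="{escape(value, quote=True)}">'
        for element, value in to_dublin_core(record))


# Invented record for illustration.
example_record = {
    "Title": 'Reports & "Papers"',
    "URI": "http://www.example.org/",
    "Comments": "internal note",
}

print(to_meta_tags(example_record))
```

Inverting the dictionary gives the opposite direction, from Dublin Core to ROADS attributes. A real mapping would also have to decide what to do with attributes that have no counterpart in the target format.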
Maintained by: Michael Day of UKOLN, University of Bath
Last updated: 05-Oct-1998.