Selection criteria for quality controlled information gateways
Work Package 3 of Telematics for Research project DESIRE (RE 1004)
[Customised introduction to each mail]
1. Are documents/services actively selected for inclusion in your service/gateway? Yes/No

If *No* you will probably not need to fill in any more details. If *Yes* would it be possible to get some answers to the following? Feel free to expand if necessary.

2. Who does the selection?
3. What criteria are used and are they published?
4. What issues do you think have been raised by the selection process?
5. Could resource selection be improved in any way? Have any refinements been made to the selection process?
6. How do you ensure the up-to-dateness of the information resources you include with relation to:
   a. new resources?
   b. currently included resources?
7. Are there any constraints on the selection process?
8. How can selection be applied to resource descriptions created by gathering software?
9. Have you found any publications/WWW sites particularly useful in formulating resource selection guidelines?

If these questions are not relevant, please include any additional comments below.

Thank-you very much for your help with this.

Michael Day
Research Officer
UKOLN
University of Bath
Bath BA2 7AY, UK.
E-mail: lismd@bath.ac.uk
Replies were received from: ADAM, BUBL, EELS, EEVL, NBW (KB),
OMNI, RUDI and SOSIG.
The following issues were identified:
Granularity is central to the problem of quality control. The
basic problem is knowing at what level to catalogue a resource.
Briefly, resources could be catalogued at a high level (say a
service, or collection of documents) or a low level (an individual
document or data-set). Higher-level cataloguing takes less time
and effort than lower-level cataloguing, but serves those searching
the database less well. Granularity is not, strictly speaking, a
quality criterion, but judgements in this area will have an impact
on the selection criteria used.
One of the responses stated:
"Granularity - The problem of defining what a resource is. This is a much wider issue than the quality one, and goes to the heart of the problem of indexing things on the Internet. Resources are often inter-linked so there is a problem with knowing where to aim the selection process. The key question is: at what level should selection take place? Individual Files? The Server? This may not seem to be particularly relevant to quality selection issues, but quality guidelines may be different for different levels".
Another respondent noted that there was a "lack of in-depth work to formalise criteria according to different types of resources (e.g. electronic journals versus image databases, etc.)".
As there is likely to be some overlap between different fields in a particular subject gateway, the same resource could be unintentionally included more than once in the same database, assuming that different people are doing the indexing. This need not be a problem if all the "cataloguers" check the database before adding a new resource. For similar reasons, there will be some duplication of effort in subject areas which straddle more than one of the subject gateways, e.g. architecture in EELS/ADAM/RUDI; sociology of health care in SOSIG/OMNI.
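The check before adding a new resource could in principle be automated. The following is a minimal sketch only, assuming that records are keyed by URL; the `Catalogue` class, its field names and the normalisation rules (for instance, treating a trailing slash as insignificant) are illustrative assumptions and do not describe how any of the gateways surveyed actually work.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_url(url: str) -> str:
    """Reduce a URL to a canonical form so trivially different spellings compare equal."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep an explicit port only when it is not the default for the scheme.
    default_ports = {"http": 80, "https": 443}
    port = parts.port
    netloc = host if port in (None, default_ports.get(scheme)) else f"{host}:{port}"
    path = parts.path or "/"
    # Assumption: ".../dir" and ".../dir/" point at the same resource.
    if len(path) > 1 and path.endswith("/"):
        path = path.rstrip("/")
    # A fragment only addresses a position inside a page, so drop it.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


class Catalogue:
    """Hypothetical in-memory catalogue that refuses duplicate resources."""

    def __init__(self):
        self._records = {}

    def add(self, url: str, description: str, cataloguer: str) -> bool:
        """Add a record; return False if the resource is already catalogued."""
        key = normalise_url(url)
        if key in self._records:
            return False
        self._records[key] = {"description": description, "cataloguer": cataloguer}
        return True

    def __len__(self) -> int:
        return len(self._records)


catalogue = Catalogue()
first = catalogue.add("http://WWW.Example.org/arch/", "Architecture links", "cataloguer-a")
second = catalogue.add("http://www.example.org:80/arch#top", "Arch. resources", "cataloguer-b")
print(first, second, len(catalogue))
```

A check of this kind only catches the same URL entered twice. It would not detect mirrored copies of a resource, nor overlap between separate gateways, which would need a shared index or manual comparison.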
One of the eLib subject gateways questioned exclusion criteria. They mentioned that resources excluded by the selection process could be of some interest to users of the database. Specifically, resources which consist almost entirely of links to other resources may have some use in the user community. Similarly, badly organised sites may be better than nothing, e.g. poor image libraries might be better than none in the context of an arts subject gateway. If subject gateways only select good quality resources, to what extent do they take away the user's ability to decide for themselves whether to consult a resource?
Several subject gateways noted the need for increased co-operation with the information providers themselves. Some services already contact information providers to fill in gaps in their knowledge about a resource (for example, if a resource is missing a date) and would be prepared to contact them again to check whether resources were going to be updated regularly. Others would like information providers to contact them with details of new or updated sites when necessary, or to comment on the resource description made by the service.
One respondent noted that criteria currently used were not validated with reference to their suitability for the network environment or to the subject areas being covered by the database itself. One of the subject gateways (OMNI) is conducting research into end user perceptions of information quality.
It was generally agreed that robotic gatherers could not be used as the sole arbiter for selection. Selection is a complex action needing some human intervention. However, it might be possible for a gatherer to make a 'pre-selection', which the subject-specialist cataloguers could then approve and describe. Alternatively, gatherers could be used to retrieve document-level resources from sites already acknowledged to be of high quality. If embedded metadata were included in the original resources being described, a metadata-aware gatherer could take this information and place it in the relevant parts of the resource description for later approval by the subject gateways. This might speed up the cataloguing of resources. One service suggested some type of automated 'current awareness' service, using automated link and content checkers, to identify new or altered pages on selected relevant sites.
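To make the idea of a metadata-aware gatherer concrete, the sketch below reads embedded `<meta>` elements from an HTML page and uses them to pre-fill a draft resource description that is marked as awaiting human approval. It is an illustration under stated assumptions: the element names looked for (`DC.title`, `DC.description`, `description`, `keywords`), the record fields and the `pending-approval` status are hypothetical choices, and fetching the page over the network is left out.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta name=... content=...> pairs and the <title> text."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            # Store names in lower case so "DC.title" and "dc.title" match.
            self.meta[attrs["name"].lower()] = attrs["content"].strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def draft_record(url: str, html: str) -> dict:
    """Build a draft description from embedded metadata, for later approval."""
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    meta = collector.meta
    keywords = [k.strip() for k in meta.get("keywords", "").split(",") if k.strip()]
    return {
        "url": url,
        # Prefer explicit metadata; fall back to the page title.
        "title": meta.get("dc.title") or collector.title.strip(),
        "description": meta.get("dc.description") or meta.get("description", ""),
        "keywords": keywords,
        # The gatherer never publishes: a cataloguer must approve the draft.
        "status": "pending-approval",
    }


page = (
    "<html><head><title> Sample page </title>"
    '<meta name="DC.title" content="Sample resource">'
    '<meta name="keywords" content="architecture, urban design">'
    "</head><body></body></html>"
)
print(draft_record("http://example.org/sample", page))
```

The design keeps the human in the loop: the gatherer only proposes field values, and the selection decision itself remains with the subject specialist.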
Checking that a resource is up-to-date is problematic. Most of the services currently use an automated link checker, but a link checker only confirms that a link still resolves, and there is no consistent way of ensuring that a site is kept up-to-date. With a large database, it is not practical to check every resource individually. Here there is a need for either increased co-operation between subject gateways and information providers or the use of a review-by date in resource descriptions, as currently used by SOSIG for conference information.
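A review-by date can be acted on mechanically. The sketch below assumes that each resource description carries an optional `review_by` field; that field name, the record layout and the sample data are hypothetical and are not taken from SOSIG's actual record format.

```python
from datetime import date


def due_for_review(records, today):
    """Return records whose review-by date has been reached, oldest first."""
    overdue = [
        r for r in records
        if r.get("review_by") is not None and r["review_by"] <= today
    ]
    return sorted(overdue, key=lambda r: r["review_by"])


# Hypothetical sample records; only some carry a review-by date.
records = [
    {"title": "Conference announcement", "review_by": date(1998, 3, 1)},
    {"title": "Departmental home page", "review_by": None},
    {"title": "Call for papers", "review_by": date(1998, 1, 15)},
    {"title": "Summer school listing", "review_by": date(1998, 9, 1)},
]

for record in due_for_review(records, today=date(1998, 4, 2)):
    print(record["review_by"].isoformat(), record["title"])
```

Records without a review-by date are simply skipped here, so time-limited material such as conference information can be flagged without forcing a review date onto every resource.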
A couple of the subject gateways commented that knowing exactly what was relevant and what constituted "quality" in a particular subject area was difficult, even when subject-specialist librarians were doing the selection. Another was concerned about the weight given to the provenance of information in the selection process. It was noted that resource selection was largely subjective. One respondent noted an "initial lack of confidence on behalf of librarians / information professionals to undertake evaluation activities in the network environment". The same person also noted that little work has been done on "developing procedures and methodologies to enable subject specific contribution to the process".
The main constraints mentioned were the number of staff and the amount of time allocated to the selection process. One respondent noted that the learning curve associated with the selection process was in itself time-consuming. Other perceived constraints are the large amount of potentially useful information available on the Internet and the lack of background information supplied with some resources.
One of the services mentioned the problem of copyright. Although this is usually a problem for persons who want to add copyright information to their own services, it is worth noting that many current lists of selection criteria do not stipulate that a resource included in a subject gateway should not break international copyright law.
Some of the subject gateways have a particular interest in resources
from particular geographical areas. For example, EELS concentrates
on Nordic resources and the eLib services on the UK and Europe.
Different selection criteria might need to be devised for resources
perceived to be of particular utility for the international user
community.
Page maintained by: UKOLN Metadata Group,
Last updated: 2-Apr-1998