Selection criteria for quality controlled information gateways
Work Package 3 of Telematics for Research project DESIRE (RE 1004)

Introduction

This report is a study of the selection criteria employed for selective information gateways (e.g. subject gateways) on the Internet. It has been produced as part of the Indexing and Cataloguing activities for Workpackage 3, and concentrates on the subject gateways run by DESIRE (Workpackage 3) partners (EELS, KB, SOSIG) and the UK Electronic Libraries (eLib) Programme subject gateways (ADAM, EEVL, OMNI, RUDI, SOSIG) - see Appendix 1 for short descriptions of these services. Selective gateways add value to Internet information because they can choose resources from the Internet with regard to subject matter or quality criteria. Libraries and librarians have an interest in this area and have been widely represented in the creation and maintenance of these services.

Aims and objectives

The aim of this task was to develop quality selection criteria and methods for use by subject gateways. This work may additionally be applicable to other selective Internet services, for example those primarily based on geographical or linguistic criteria.

The main objectives were to provide:

standards and procedures for resource selection by subject gateways.
methods (and tools) for the development of quality-controlled information catalogues.
quality mechanisms to permit the quality of information and information services to be monitored and reported.
a framework for quality assessment and control within subject gateways.

Approach

The study started with a 'state of the art' review, to capture materials and views in the areas of quality control and selection criteria. The review covered:

quality models and methods being used in other fields, notably industry, management and information science
selection criteria and quality methods currently in use by subject gateways

The resulting studies can be found as appendices to the main report:

Definitions, models and methods of quality currently in use (Appendix III)
Selection criteria used by selective subject gateways (Appendix IV)
Selection criteria of other selective Internet services (Appendix V)
Selection criteria found by literature review (Appendix VI)
User surveys carried out by the selective subject gateways (Appendix VII)
E-mail survey of the selective subject gateways (Appendix VIII)

An initial review of literature and current practice indicated that the quality mechanisms in place for selective Internet services were fairly rudimentary when compared to developments in the commercial 'customer oriented' sector, where emphasis has been on developing systems of continuous improvement. It was decided that a quality model designed specifically for Internet subject gateways should be developed, which would provide a framework within which to implement continuous improvement processes. In parallel with the development of this conceptual model, detailed work would be done on the 'resource selection' process within subject gateways. A comprehensive list of quality selection criteria was to be developed. This was to be mapped iteratively onto the subject gateway model.

The model and the list of quality criteria would be developed in the light of the 'state of the art', but would also be tested by existing subject gateways to ensure that they were of practical use in the field.

Desired outcomes

The study aimed to generate two end products:

A generalised graphical model of a functioning subject gateway that would enable a systematic approach to quality issues in the provision, development, control, monitoring and analysis of a subject gateway.

A structured list of selection criteria that could be used as a reference tool by subject gateways and enable new and evolving subject gateways to produce their own tailored selection schemes without having to reinvent wheels.

The conceptual model aimed to be both comprehensive and generalised, not constrained to any particular subject area. Derived from a 'rich picture' description of the activities necessary to achieve the objectives of a subject gateway, it would identify key points at which quality criteria are and may be employed; for example resource selection, training to enable users to make intelligent and informed searches, and quality requirements of information providers.

The list of selection criteria aimed to be comprehensive and flexible. It would be a reference tool that subject gateways could use to assist the definition of the most appropriate criteria for their specific service. As an internal test the list was to map directly onto the quality model, to enable services to apply quality mechanisms to the process of selection.

This report describes the background and evolution of these two products, and gives reference to the sources and testing used in this development.

The subject gateway model

Background to the development of the model

Subject gateways consciously emphasise the importance of skilled human involvement in the assessment and 'quality control' of their selected Internet resources. The core activity - selecting and attributing meaning to those resources - is a human activity. Subject gateways are currently run as academic services and carry out activities that do not lend themselves to automation (recognising however the importance of complementary developments in automated resource harvesting to the growth of subject gateways).

From the outset the view was taken that defining quality processes and criteria for a subject gateway involved more than simply listing the right questions to ask about potential resources. It was agreed that a more rigorous framework was needed on which to hang the activities, processes and associated quality aspects. Such a framework would provide a useful tool for the specification, implementation, development and evaluation of any subject gateway, making the substance of this report more generally applicable. It was hoped that in looking at quality from a broad base, such a model could prove valuable to a subject gateway as it could be used to:

stimulate discussion and aid problem solving by providing agreed points of focus
aid grouping and classification of various quality and control criteria
identify possible quality improvement tools and methods
enable the testing of existing and subsequent specifications for subject gateways
help to identify and stratify training needs
provide a basis for specification and design to anyone setting up a subject gateway
provide for flexible development once established

Subject gateways use a technology (hard) system and a human activity (soft) system. The human actions need to be interpreted (understood) before designing or adapting technology that effectively supports the overall system. We decided to apply the analytical approach of Soft Systems Methodology (SSM) which would allow due consideration of both hard and soft aspects (Checkland 1981; Checkland and Scholes 1990).

See Appendix II for an overview of SSM and its key concepts.

SSM allows the development of a framework for collecting and interpreting information about the overall system, its associated issues and constraints. Ultimately it is used to define the overall system, its boundaries, the tasks performed by technology and by people, and how they interact. More structured, formal techniques can later be employed in specifying and designing the technical aspects of the service.

A model would be developed using SSM to build a framework for Internet subject gateways. The methodology and evolution sections of this report describe how this was done.

Defining a general model for subject gateways

The University of Bristol's Social Science Information Gateway (SOSIG) was used as the subject of the detailed study using SSM. SOSIG was a well-established real system (a subject gateway in action).

The scope of the study was not restricted too early in the process by defining at the outset what the problem areas were. The aim was to capture various individual views about purpose, goals or effectiveness of certain tasks or subsystems, the principal actors and clients involved, the transformations (what the system did) and what the expectations and constraints on the system were. Some of the approaches used were:

identify individual tasks performed
identify existing tools and methods
establish interactions between people and systems
make drawings of structures and layouts
interviews: unstructured, informal ("tell me what you do")
brainstorming issues / structured matrix
creative approaches to root definitions

Information was mainly gathered from informal interviews, brainstorming and existing evaluation studies. This enabled the identification of relevant systems, current concerns and problems and concepts of what a subject gateway did and should be doing and the likely development issues.

Rich pictures were generated to graphically represent the structure, processes and issues that could be relevant to the problem definition. In this case the rich picture rapidly developed into a generalised system model avoiding specific reference to SOSIG. (The distinction between the rich picture and the model was not as easy to maintain as implied by Checkland.)

Having gathered this information and structured a rich picture (graphics, text and an issues matrix), a series of root definitions was written to express the primary and secondary objectives of the subject gateway. These definitions informed the iterative development of a graphical Conceptual Model of the overall system. A small team worked on this so that missing information and conflicts in perceptions were highlighted.

Subject gateways are usually established to fulfil a stated role and provide a certain service. We adopted a primary-task approach where an attempt was made to give a neutral account of the functioning of a subject gateway. We looked at statements of what SOSIG was trying to achieve and portray to the outside world. The root definitions were written specifically as succinct statements that include Checkland's CATWOE components.

It is important to note that the conceptual model produced is a theoretical construct - it does not represent the existing or potential structure of the organisation. A process of comparison or testing is required to link the conceptual model back to the real world. It is this process of testing that will raise issues and imply subsequent action, both for the model (systems world) and for the subject gateway (real world).

Results of the SSM analysis

Rich Pictures

The rich pictures were produced as paper-based drawings with attached notes and a matrix of current and anticipated issues (see Appendix X for examples). Attempts to make these more easily interpretable as rich pictures rapidly led to the production of the more idealised conceptual model.

CATWOE Analysis

Root definitions for a subject gateway were based on work within SOSIG. The CATWOE analysis revealed broad agreement over the individuals involved in the development and current functioning of SOSIG. The following data were derived from informal interviews and discussions with directors, cataloguers, trainers, evaluators and users. The Social Science bias was subsequently removed for inclusion in the generalised conceptual model.

Customers (who benefits)

Charities	FE students
HE students	Information support professionals
Journalists	Librarians
Researchers in Government	Researchers in higher education
Researchers in Industry	Researchers in NGOs
Resource providers	SE students
Self help organisations	Social science practitioners
Statisticians	Subject specialist librarians
'Surfers'	Teaching staff
Trainers	Undergraduates

Actors

Advisory group	Cataloguers
Listeners	Other subject gateways
Researchers	Software developers
Subject specialist librarians	Systems administrators/managers
Trainers	Trusted information providers
User group

Transformations

From:		To:
URL	==>	URL with added value
Vast	==>	Small
Unpredictable	==>	Predictable
Variable quality	==>	High quality
Unmediated	==>	Mediated
Unstructured	==>	Structured
Users with poor search strategies	==>	Users with well developed search strategies
Unsafe environment	==>	Safe environment
Data	==>	Data + meaning = information
Timewasting	==>	Time efficient
Labour intensive	==>	Labour saving
Information hungry	==>	Enlightened/satisfied
Internet unconfident	==>	Internet confident
No metadata record	==>	Rudimentary metadata
No subject sections	==>	Subject sections
Reluctant	==>	Enthusiastic (seeds interest)
Researchers/users with habitual search patterns	==>	Individuals with enlarged horizons of what is possible
Intimidated users	==>	At ease users
No clues	==>	'now I know where to go'
Uncritical information users	==>	Users with well developed critical abilities
Trainees	==>	Trainers, Proselytisers, Sales force
Information user and consumers	==>	Information providers
People requiring Internet presence	==>	People with Internet presence
Average career prospects	==>	Enhanced career prospects for staff and users
US biased	==>	Reduced US bias

Worldview

Information Superhighway - it's the way to the future
Selectively mining information from an unmanageable data source confers economic and intellectual advantage.
The information available over the Internet needs to be controlled, moderated, systematised
Grey literature or incidental information
Ease of electronic publishing allows the dissemination of different views - ones which publishers will not accept
Data is dynamic, immediate and pervasive - click on reload button after coffee break

Owners

These are the relevant bodies for SOSIG and simply appear as owners in the model.

ESRC, eLib, JISC, DGXIII, University, Department, users

Environment

People, information overload, software constraints, funding, funding outlook.

Root Definitions

A university owned and maintained system that selects and catalogues subject specialist Internet resources on the basis of quality and relevance, allowing structured access by a range of users in research and education in the belief that such filtering provides an essential added value to the inadequately structured data available on the Internet.

A university owned and maintained system that introduces a range of users in research and education to the Internet as a potential source of relevant high quality information, allowing them to explore and develop discovery strategies which can be used in subsequent exploration in the belief that efficient and critical use of the Internet requires appropriate training.

An academic institution owned and maintained system that builds a publicly accessible catalogue of subject specialist Internet resources by the application of a predefined set of quality selection criteria.

Conceptual modelling

These three root definitions provided different perspectives of emphasis for a particular subject gateway. They did not imply three different structures. We attempted to move to a general model, independent of SOSIG, which would allow for all the above root definitions.

The model went through three iterations before an agreed generalised model emerged, capable of incorporating all the above root definitions. These earlier iterations, the pre- and post-test models, are detailed in Appendix X.

We looked at the logically necessary processes and components for each of the three root definitions at a depth which seemed to capture the system. We did not concentrate on establishing a hierarchical structuring of the processes, but preferred to work with the whole picture, recognising that for clarity and explanation we might want later to add such a decomposition.

A conceptual model was developed which represented graphically the activities logically necessary to achieve the transformation described in the primary root definition(s). The model was checked to ensure that it conformed to the following requirements:

represents exactly the activities (transformations expressed as verbs) required to achieve the goals of the organisation
meets the criteria for being a system
is capable of being decomposed hierarchically, containing 5-7 activities at first-resolution level
has all components connected (except for monitoring and control units)
has an ongoing purpose, i.e. effects a transformation
has provision for measure(s) of performance
has a decision-making or control process
consists of components (which are themselves systems)
exists as part of a wider system, or environment, with which it interacts
has bounded decision-making processes
has resources for its own use
has an expectation of continuity
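
Two of the requirements above are structural and can be checked mechanically: the number of first-resolution activities and the connectedness of components. The following is a minimal sketch only, assuming the model is recorded as a set of activity labels plus a list of links between them. The function name, the activity labels and the links are hypothetical illustrations and are not taken from the DESIRE model itself.

```python
from collections import deque


def check_model(activities, links, exempt=frozenset()):
    """Return a list of structural problems found in a conceptual model.

    activities: set of activity labels at the first-resolution level
    links: iterable of (a, b) pairs, treated as undirected connections
    exempt: activities (e.g. monitoring and control) allowed to be unconnected
    """
    problems = []
    if not 5 <= len(activities) <= 7:
        problems.append(
            f"expected 5-7 first-level activities, found {len(activities)}"
        )

    # Build an undirected adjacency map over the non-exempt activities.
    # Links that touch an exempt activity are ignored (an assumption of this sketch).
    required = set(activities) - set(exempt)
    neighbours = {a: set() for a in required}
    for a, b in links:
        if a in required and b in required:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Breadth-first search from an arbitrary activity to find what is reachable.
    if required:
        start = min(required)
        seen = {start}
        queue = deque([start])
        while queue:
            for nxt in neighbours[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        unreachable = sorted(required - seen)
        if unreachable:
            problems.append("unconnected activities: " + ", ".join(unreachable))
    return problems


# Hypothetical example model: five connected activities plus a monitoring unit.
example_activities = {"identify", "select", "catalogue", "publicise", "train", "monitor"}
example_links = [
    ("identify", "select"),
    ("select", "catalogue"),
    ("catalogue", "publicise"),
    ("publicise", "train"),
]
print(check_model(example_activities, example_links, exempt={"monitor"}))
```

The remaining requirements (purpose, measures of performance, resources, continuity) are matters of judgement and are not something a check of this kind could establish.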

Pre-test model

See Appendix II for a more detailed outline of Soft Systems Methodology and a definition of terms. The pre-test model is available at: <URL:http://www.ukoln.ac.uk/metadata/DESIRE/quality/images/mdlv1_4.gif>

Field testing the model

An important stage in the evolution of both the list and the model was the testing, which had two main aims:

To collect test data that could be used to improve the model and the list
To get an evaluation of the practical value of the model and list in the field

Both products were subjected to testing and were modified in the light of the results.

The testing was undertaken by the three organisations involved in building subject gateways as part of the DESIRE project: SOSIG, KB and EELS. In addition, Biz/ed (an eLib funded subject gateway) took part in the testing because of its close working proximity to the SOSIG project.

A copy of the pictorial model and a set of the selection criteria were sent to each organisation. They were asked to study the model and the associated textual descriptions and compare the processes indicated by the model against their own service. Disparities between the systems model and the real world could indicate problems and/or where improvements could be made.

Methods for comparison that were used:

general discussion and observation: first impressions of disparities
question generation: the model was used to generate a series of focused questions: Does the activity exist? Would it be useful? etc.
testing in practice: comparing what happens on a day to day basis in carrying out activities (resource description and cataloguing)
model overlay: comparing the conceptual model to the model implied by the organisation.

A similar procedure was used to test the quality criteria. The participants were asked to work through the list of criteria and mark the criteria that they found relevant for use in their service. See Appendix IX for a list of the personnel involved in the testing.

Test results

The final model takes into account some of the results of the testing that was carried out in the field. (NB some changes - notably those relating to hierarchical structuring of the model - will be incorporated with the final version of this report.) It is available as: <URL:http://www.ukoln.ac.uk/metadata/DESIRE/quality/images/mdlv1_5.gif>.

General impressions

Well received
The model was considered to be useful in general, particularly for the initial specification and design stages of a subject gateway, in the absence of any framework. (EELS, SOSIG)
Mapped well retrospectively onto a new service (Biz/ed)
It was seen to be useful in providing a framework for the decomposition of more complex problems (KB)
It will be used in providing a framework for further developments of the NBW service (KB)
Time consuming to understand

Process mapping

High degree of mapping
Model indicated areas which would be of use
Noted differences (restricted access), training users
There were some noted changes and additions to processes
Agreeing 'select resource criteria' (Activity B) to be used by distributed subject editors is difficult (EELS)

Feasible or desirable changes to the model

Differentiate different data types (lists of criteria, on line help, resources) in a clear way on the model
Provide a hierarchical overview - for clarity, display and reproduction
Generate simpler model for smaller subject gateways (non-ROADS)
It should be emphasised that the model is a conceptual, not organisational one
There should be a more detailed examination of where quality criteria might be stated (i.e. criteria for the quality of training materials)
Add data record exchange between subject gateways
Activity R (publicise) should feedback into the potential users' group (KB)
Failed search or browse should result in suggestions to the user (KB)
Failed browse should result in suggestions to use related categories in classification scheme (KB)
Failed search - related keywords (KB)
Ensure that user feedback results in further development of training material (KB)
Users should be added to the group selecting potential resources (KB)
Differentiate those activities which are automated and those which are human (KB)
Use colours more to differentiate between parts of the model (KB)
Provide a start and finish as it's not immediately obvious how to read the model
Ensure that it can accommodate the distributed nature of some activities
Include and elaborate the process by which cataloguers, TIPs and Volunteers are trained
More help and guidance through the model in the absence of personal one to one explanation

Most of these changes were subsequently incorporated graphically into the model of Appendix X.

The subject gateway model - conclusions

The generalised model that has been developed from this study aims to provide the basis for a useful and comprehensive reference tool for Internet subject gateways. Used in conjunction with the lists of quality criteria it will underpin the specification and development of subject gateways within DESIRE. Additionally the model attempts to provide a conceptual framework with which new, existing and emerging subject gateways and related selective gateways might evolve. It is not intended to be prescriptive and is not a system specification or design, but rather a means by which such specifications could be tested and developed.

Each subject gateway could make effective further use of the model if it were to:

define roles associated with each activity
define the 5 E's criteria (see Appendix II) for each activity
incorporate any additional necessary activities to fit into its principal aims
decompose the activities into subsystems
make explicit the associated control criteria (scope policy, content criteria etc.)
formalise the review and monitoring processes associated with each activity

The model itself should evolve in use as the specification and development of a subject gateway proceed - it is not intended to be a static model. It is primarily a visual means to capture the complex functioning of a subject gateway in all its aspects. It has proved to be potentially useful in the evaluations carried out with a limited number of services. It should be used subsequently in the implementation phases of DESIRE to provide a means of structured evaluation of the service, establishing criteria for its effective and efficient performance. It will be produced in a revised format as one component of the tools and methods at the end of the project.

Selection criteria

Background to the development of the list of selection criteria

An initial review of the selection process currently being used by subject gateways revealed two key findings:

the selection is usually done by subject specialists (academics and librarians).
many of the services have not formally developed or published any definitive selection criteria.

These findings reflect the premise on which these services are based - that human judgement is the critical factor if only resources of the highest quality are to be selected. If detailed and definitive criteria could be established then expert systems could be developed to do the job, but this has not happened. The implication is that the evaluation of information resources is a very complex process best carried out by subject specialists whose judgements are likely to involve detailed and complex mental processes. It is necessary to draw out and formalise the tacit knowledge which is currently used in an unexamined way, particularly if the resource selection process is increasingly distributed as subject gateways enlarge and expand their information gathering activities.

The fact that selection is done intuitively and is based on human knowledge, experience and judgement raises the question as to what criteria this intuitive process involves. This study aimed to gather as many of these criteria as possible, from a wide variety of sources, with a view to making as many of these criteria as possible explicit. The list of criteria aimed to be comprehensive so that the benefit of this expertise could be shared as a tool for all subject gateways to use.

Generating a comprehensive list of selection criteria

A systematic review of selection criteria for Internet resources was conducted. The initial aim was to capture all the selection criteria and quality attributes either currently being used by Internet services, or recently mentioned in the literature. Four main information sources were used in the review:

Subject based selective services on the Internet (See Appendix IV)
Other selective services on the Internet (See Appendix V)
Related literature (See Appendix VI)
User surveys from the subject based services (See Appendix VII)

The aim was to produce a list of all criteria and attributes found. Comprehensive coverage was the main aim. The trawl was systematic, and at this stage all criteria were included in the terms that they were found in the sources, regardless of duplication or apparent value.

Pre-test list

Over 250 criteria were collected from the initial trawl. This 'raw data' then went through the following processes:

Duplicates were removed
The language used was standardised. The criteria and attributes were phrased as a question. The question format was chosen because it was the most common format found during the review, and because it reflects the evaluative nature of the selection process.
A qualitative analysis took place. The research team grouped the criteria thematically making use of shared attributes and the quality model which was developed in parallel to this process.
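
The first two steps above (removing duplicates and phrasing each criterion as a question) can be sketched in code. This is an illustrative sketch only: the function names and the sample criteria are hypothetical. The thematic grouping was a qualitative judgement by the research team, so the sketch simply takes the category labels as given rather than trying to derive them.

```python
def normalise(text):
    """Lower-case, trim and collapse whitespace so near-identical wordings compare equal."""
    return " ".join(text.lower().strip(" ?.").split())


def as_question(text):
    """Phrase a criterion as a question ending in a single question mark."""
    text = text.strip().rstrip("?.")
    return text[0].upper() + text[1:] + "?"


def clean_criteria(raw):
    """Deduplicate and standardise raw criteria.

    raw: iterable of (category, wording) pairs, with the category assigned by a person
    Returns {category: [questions in first-seen order]}.
    """
    seen = set()
    grouped = {}
    for category, wording in raw:
        key = normalise(wording)
        if not key or key in seen:
            continue  # skip blanks and duplicates
        seen.add(key)
        grouped.setdefault(category, []).append(as_question(wording))
    return grouped


# Hypothetical sample wordings, not taken from the actual trawl.
raw = [
    ("content", "Is the information accurate?"),
    ("content", "is the information  accurate"),
    ("form", "does the resource load quickly"),
]
print(clean_criteria(raw))
```

Exact-match deduplication of this kind would only catch trivial variants; criteria that say the same thing in different words would still need to be merged by hand.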

A second list was created by the research team and was to be tested by some subject gateways, after which any modifications necessary would be made.

At this stage the list had been categorised into sections which aimed to reflect the different types of selection criteria, and the selection process itself. These categories were designed in parallel with the quality model. This list was then sent off to be tested by subject gateways (see Section 3.3).

Field-testing the selection criteria

An important stage in the evolution of both the list and the model was the testing, which had two main aims:

To collect test data that could be used to improve the model and the list
To get an evaluation of the practical value of the model and list in the field

Both products were subjected to testing and were modified in the light of the results.

Post-test list

The final list was created in the light of the test results.

The list of quality selection criteria was well received as a tool for Internet subject gateways. All the testers gave a rating of 4 or 5 on a five-point scale (where 5 was 'very useful' and 1 was 'not at all useful'). This result implied that drastic modifications were not required. However, the ratings for individual selection criteria were used to make some minor modifications:

The number of items in the list was reviewed
The order of the list was changed
New criteria were added to the list

Only one of the criteria was not used by any of the services. This was the Special Needs criterion in the Scope section. None of the services said their users had any special needs that would affect the resources that were selected (e.g. disabled users requiring large print or audio resources). In general the test results corroborated the idea that different services use different selection criteria, since a variety of differing criteria were used by the different services.

It was therefore decided that none of the items should be removed from the list, as they might be appropriate for some services. However, the order of the criteria was altered, with the criteria used most commonly by the services doing the testing given a higher priority within each section (i.e. moved up the list-order).
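
The reordering described above amounts to a stable sort within each section, keyed on how many testing services reported using each criterion. The sketch below illustrates the idea under assumed data structures: the function name, the criterion labels and the service labels are hypothetical, and the usage counts are invented for illustration rather than taken from the test results.

```python
def reorder_by_usage(sections, usage):
    """Move the most commonly used criteria up the list within each section.

    sections: {section name: [criteria in their current order]}
    usage: {criterion: set of services that reported using it}
    No criterion is removed; criteria with equal usage keep their original order.
    """
    return {
        section: sorted(
            criteria,
            key=lambda criterion: len(usage.get(criterion, ())),
            reverse=True,
        )
        for section, criteria in sections.items()
    }


# Hypothetical data: three criteria in one section, three testing services A, B and C.
sections = {"scope": ["special needs", "subject coverage", "language"]}
usage = {
    "subject coverage": {"A", "B", "C"},
    "language": {"B"},
}
print(reorder_by_usage(sections, usage))
```

A criterion that no service used stays in the list but sinks to the bottom of its section, which matches the decision not to remove any items.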

The testers' comments in answer to the open questions gave a consistent picture of the relative importance of the different categories of criteria. The scope criteria and content criteria tended to carry the most weight in the selection process of the majority of the services. One service said the collection management criteria also carried most weight. The process and form criteria tended to carry the least weight.

One new criterion was added to the list following a suggestion made by one of the testers. It was suggested that 'complementary value in a shrinking acquisitions budget' should be used as a criterion. This was added to the collection management criteria, as a valuable addition. The complementary value of a resource in relation to traditional information resources available in libraries could conceivably affect the value of an Internet resource in the eyes of the users.

The test results were encouraging, in that they supported the idea that the list could be a useful reference tool for Internet subject gateways. The comprehensiveness and adaptability of the list were well received. It was acknowledged that the list would need to be tailored to meet the needs of individual services to be of practical use:

'I think it is useful to start out with this very comprehensive list, and choosing your priorities, work it down to something workable, and maybe from time to time reconsider your priorities by turning to the list once again.' (A comment from the National Library of the Netherlands).

The list of quality selection criteria: a reference tool for Internet subject gateways

This list of quality selection criteria aims to be a useful reference tool for Internet subject gateways. Its strength lies in the fact that it is:

comprehensive
adaptable
organised according to the process of selection

The list takes into account the fact that different quality criteria will be needed for different services, since 'quality' should be closely related to 'user satisfaction'. Different services will be aimed at different users, and so what constitutes a quality resource will vary across services. The list aims to offer Internet services:

A generic framework in which to consider the quality selection process and quality selection criteria.
A comprehensive list of possible criteria which individual services can draw on to create or refine their own specific selection criteria

Selection criteria: a framework

As indicated in the quality model, selection is a process which involves careful consideration of a number of factors, all of which will affect the definition of a quality resource for the service. The key factors in the selection process of an Internet subject gateway are generic: the users, the information resources, and the service itself. The framework of the list takes all of these factors into account, by suggesting five main types of quality selection criteria:

Scope Criteria: (Considering the Users)
Content Criteria: (Evaluating the Content)
Form Criteria: (Evaluating the Medium)
Process Criteria: (Evaluating the System)
Collection Management Criteria: (Considering the Service)

A 'quality resource' will therefore be defined with the specific service and its users in mind, as well as the nature of the information resources. The quality selection criteria for a specific service can be created by using this framework. Within each of the five areas the criteria most appropriate for the service should be decided, defined and continually reviewed. The framework also helps to structure the actual process of selection:

Scope criteria will be defined at the inception of the service and will be the 'first filter' through which potential resources pass.
Content, form and process criteria need only be applied to resources that fall within the scope. These criteria involve an evaluation of the resource itself.
Collection management criteria will take account of the coverage of the current collection, and may cause the other criteria to be changed or modified as the collection grows.
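
The staged process above can be pictured as a sequence of filters. The following sketch is one possible reading, not a description of how any gateway implements selection: it reduces each criterion to a yes/no check, whereas in practice the judgement is made by a subject specialist. The class name, field names and sample checks are all hypothetical, and the feedback from collection management into the other criteria is not modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A check takes a resource description and returns True if the criterion is met.
Check = Callable[[dict], bool]


@dataclass
class SelectionScheme:
    scope: List[Check] = field(default_factory=list)
    evaluation: List[Check] = field(default_factory=list)  # content, form and process
    collection: List[Check] = field(default_factory=list)

    def assess(self, resource: dict) -> str:
        # Scope is the first filter: later criteria are not applied if it fails.
        if not all(check(resource) for check in self.scope):
            return "out of scope"
        # Content, form and process criteria evaluate the resource itself.
        if not all(check(resource) for check in self.evaluation):
            return "rejected on evaluation"
        # Collection management criteria consider the current collection.
        if not all(check(resource) for check in self.collection):
            return "rejected on collection grounds"
        return "selected"


# Hypothetical scheme for an imaginary social science gateway.
scheme = SelectionScheme(
    scope=[lambda r: r["subject"] == "social science"],
    evaluation=[lambda r: r["has_named_author"]],
    collection=[lambda r: not r["already_catalogued"]],
)
print(scheme.assess({"subject": "chemistry"}))
print(scheme.assess({
    "subject": "social science",
    "has_named_author": True,
    "already_catalogued": False,
}))
```

Because scope is checked first, an out-of-scope resource never needs the more expensive evaluation of its content, form or process, which is the point of treating scope as the first filter.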

The framework accounts for the different stages at which decisions about quality need to be made. Like traditional library collections, Internet collections involve selection, maintenance and de-selection. This framework, in conjunction with the quality model, suggests that services need to apply quality selection criteria to resources at all three of these stages, and that many resources will need to be evaluated more than once, if the integrity of the collection is to be maintained.

Selection criteria: a comprehensive list to draw upon

All of the selection criteria found in the 'state of the art' trawl have been included in the list, and are organised according to the framework described above. Individual services can use the list as a reference tool, to select the criteria that are appropriate for the service, in the knowledge that in doing so they will be drawing on a wealth of practice, experience and knowledge in this field.

By using the framework, and drawing on the list, the definition of a 'quality resource' will be determined by the users and the aims of each service as well as by the nature of the resource. The list can be tailored for use by any selective service. The five main categories of criteria are generic, as they are based on the process of resource selection required to run any service. Each service would need to select from the list the criteria that are appropriate given its own particular user group and service aims.

Conclusion

The two tools which are detailed here will firstly be used in the specification and development of the Cataloguing demonstrators for the next phase of DESIRE. They will be further developed over the life of the project in parallel with the demonstration phase and be used to structure the subsequent evaluation. It is recognised that, having produced tools which are accepted as being of value to existing and emerging subject gateways, further effort will be required to make these generally useful and accessible to new subject gateways which will emerge using the other tools and methods developed during the life of DESIRE.

Page maintained by: UKOLN Metadata Group,
Last updated: 2-Apr-1998