Lorcan Dempsey
UK Office for Library and Information Networking
University of Bath
Bath BA2 7AY
February 2000
This is a preprint version of an article of the same name appearing in Online Information Review, 24(1), 2000, 8-23. Please refer to the print version in any citation.
This special issue is about a set of services which have been known, but not universally, as subject gateways, or as subject-based information gateways. For the moment, we can say that in the usage envisaged here such terms typically refer to a network resource discovery service which provides database(s) of Internet resource descriptions with a specific subject focus and created according to specific selection and quality criteria. Examples are SOSIG, EDNA, and EELS. A fuller discussion of the characteristics and range of such services can be found in the companion article by Traugott Koch and readers are referred there for the further detail which won't be found here. There is also some discussion of issues, which I will draw on below, in the report of the first Imesh Workshop (Dempsey, Gardner, Day and van der Werf, 1999), which brought together the creators of such services to discuss shared concerns.
'Subject gateway' as a term was popularised in the UK Electronic Libraries Programme (eLib), and it has been given currency by initiatives which have been influenced by the eLib gateways. It tends to be used by services in the research, educational or cultural domains, which sometimes have a significant R&D or project-based focus. In several cases, subject gateway activity has been conceived and carried forward as a part of academic or research information infrastructure initiatives, as has happened in the UK and more recently in Denmark for example. The R&D aspect means that there is a considerable pool of literature on subject services, some of it with an evangelical tinge. It also means that, in some cases, there are relatively high levels of interaction between developers in different countries, as services have been developed within collaborative projects, or by people who have been concerned to keep in touch with centres of activity. Finally, it means that a major issue for many of these services is how to transition into sustainable, persistent components of the emerging information landscape.
In this article, a first section presents a historical perspective; a second examines some more general issues as the subject services develop. In the first section, I take a perspective influenced by UK experiences. I hope this has some intrinsic interest, but that it is also more generally interesting as a consideration of some representative policy and service questions which have ultimately led to the setting up of the Resource Discovery Network (RDN). The RDN is a new service, funded within the academic and research community in the UK, which brings together under a common business, technical and service framework a range of subject gateways and other services. I hope also that the frankly retrospective perspective of the first section complements the treatment of current developments available in other contributions to this special issue.
Finally, an important scope note. This article, as with the rest of this issue, takes as its starting point a particular set of services which have sufficient commonality to allow a general discussion about them - they are described in Traugott Koch's article. Other services are beginning to appear which have similar characteristics, some from within the commercial publishing sector, for example, but these are treated here only in relation to subject gateway interests. This is not to say anything about the usefulness of those services, merely to note a boundary on this discussion.
The subject gateways emerged in response to the challenge of 'resource discovery' in a rapidly developing Internet environment in the early and mid-nineties. It might be useful to briefly contextualise the emergence of the subject gateways within a rather schematic account of Internet development. I have elsewhere suggested that we can identify four very approximate phases, or emphases, in the growth of Internet and information infrastructure which have emerged successively but whose characteristic user and service orientations continue to exist modified within later phases (Dempsey, 1993; Dempsey, 1994):
The subject gateways can be seen to be very much part of an academic and research information infrastructure emphasis, still funded in many cases as part of educational or research provision. In some cases, their developers also owe something, at least in ethos, to the collaborative, precompetitive aspects of the community emphasis. Their historical trajectory has very much coincided with the rapid rise of public information infrastructure, and several trends are apparent while they have been developing:
Flowing from this discussion, we can notice two general factors of growing importance. First, as institutions develop in this shared network space, we are seeing the accelerated 'disembedding' of communication and publishing practice from its previous settings and its rearticulation in a network environment. Look at the reshaping of the scholarly publishing system, as publishers, new aggregator services, agents and users are brought into a new relation in a network environment, and as new preprint and other approaches emerge. Or consider how people answer questions. In the print world, somebody might look to a dictionary, to the Yellow Pages, to an almanac, to one of several agencies to find something out. Increasingly, we look to search engines and portal services which bring together services in this shared space. There is in fact massive growth in services which support user discovery of relevant opportunities, and which support provider disclosure of relevant offerings. This is across the whole range of network activity: in consumer and business activity, as well as in educational, cultural or government sectors. There are 'portals' everywhere.
And second, it is important to note the increased emphasis on learning. If politics really is "increasingly about coming to terms with our own expanding capacity to generate unsettling change" (Leadbeater, 1999), then the programmatic support for learning activity will continue. This is a public response to rapid social and economic change, and is generating significant public and private investment in network services which touch on the interests of the gateways. For example, in the UK alone we can point to the University for Industry, the National Grid for Learning, the People's Network, and BBC Online, who, variously, will be providing resource discovery and disclosure services. We can also identify a range of commercial services with different target audiences for 'learning' products. Increasingly, it is recognised that users will be well-served by services which select, evaluate, and describe potential resources of interest; this is not merely an academic or research concern.
These factors in turn create an intriguing environment in which the gateways are considering their future: apparently at once rich in opportunity, as the value of the type of services provided by the gateways is recognised, and risk, as others look at working in this area.
The Electronic Libraries Programme (eLib) was an initiative of the Joint Information Systems Committee (JISC) of the UK Higher Education Funding Councils. Its first phase saw the setting up of projects during 1995 in the following programmatic areas: electronic document and article delivery; digitisation (of backruns of journals); on-demand publishing; training and awareness; and access to network resources (ANR). The subject gateways were funded as part of the ANR area. A variety of other initiatives was progressed in concert with eLib. These included the funding of an Arts and Humanities Data Service, the commissioning of work on retrospective conversion needs, and the funding for wider use of the union catalogue of the Consortium of University Research Libraries, COPAC. These activities were in turn part of a larger, developing JISC information agenda which included the setting up of data centres (which managed access to licensed bibliographic and other data sets), advisory and communication services, and other shared services. Together, this growing portfolio of activity was a result of a belief in the utility of shared action in the construction of academic information infrastructure. In the same way as the network infrastructure was centrally procured and managed, it was felt that a significant part of shared information resource and supporting services could be so managed. (See Law, 1994, for a discussion of the data centres in the context of wider academic information infrastructure.)
A call for expressions of interest to develop eLib was released by the JISC in August, 1994 (JISC, 1994), and the programme was extended in subsequent calls (eLib). The programme ambition and scope have been quite influential, and it has been discussed in detail elsewhere (Rusbridge, 1998). And, as noted above, it is interesting that subject gateway discussion in some other countries has also been related to academic or research information infrastructure, however centrally planned or executed that has been (for example, in The Netherlands, Denmark, Finland, Australia).
The call characterises the aim of the ANR strand as "to consider funding through the JISC to encourage the development of networking navigation tools and the growth of local subject based tools and information servers". An annex describes the aim in more depth. It suggests that "The main outcome will be to raise awareness of networked information resources, to explore the issues associated with running large scale services, and to ensure community involvement in developments at national and international levels." It furthermore suggests that the intention was not just to fund R&D which would lead to the production of demonstrator services: "It is therefore intended that a series of centrally funded initiatives should be taken with the aim of creating a national infrastructure capable of generating significantly more widespread use of networked information resources." Several areas were highlighted which might deliver such an aim including the setting up of a national centre which would act as a "national entry point and registration agency", and advise on network information issues; the establishment of subject services "in order to test the problems of scale associated with offering a community wide service"; disseminating information about resource discovery systems and providing advice; international collaboration and standards activity; and finally the development of guidelines and standards.
Following some consultation, bids to provide subject services were invited. This led to the funding and establishment of the eLib subject gateways. These were SOSIG (Social Science Information Gateway) (which slightly pre-dated eLib, as discussed below), EEVL (Edinburgh Engineering Virtual Library), OMNI (Organising Medical Networked Information), History, ADAM (Art, Design, Architecture and Media information gateway), and Biz/Ed. The ROADS (Resource Organisation and Discovery in Subject-based services) project, which provided systems support for the gateways, was also supported here. Together with some more content-oriented projects these formed the Access to Network Resources strand of eLib (ANR).
The national centre, and some of the other activity, was not directly taken forward, but, as described below, these themes have been taken up by later developments.
The topics of the call, as described above, leant heavily on the recommendations of the ANIR report which had been commissioned the previous year (ANIR). The Access to Networked Information Resources (ANIR) working group was established to advise the Information Systems and Services Sub-committee of JISC on sensible approaches to networked information, recognising the potential importance to the conduct of research and learning of emerging network information services. Looking at the call above alongside the ANIR report one can identify some particular concerns:
In practice some of these concerns have been partly addressed. However, the impact and speed of development was not foreseen. It is notable, for example, that the Follett report, the document that released the eLib funding, makes no mention of the Web (Follett, 1994). And some proposed developments -- the manual, central registration of network services in a 'national entry point', for example -- were quickly being overtaken by events, and seem to belong to an earlier stage of development, as suggested above, characterised by a much more sparsely populated network information environment.
However, some of this early thinking is evident in the current range of JISC services (JISC). For example, there are now national mirroring and caching services, which plan to work with the Resource Discovery Network to optimise the use of available bandwidth. There is a range of advisory services, including, for example, TASI (Technical Advisory Service for Images), based at the University of Bristol, and the Interoperability Focus and Web Focus, based at UKOLN, University of Bath, which work to influence practice and inform policy in their respective areas. The JISC supports relevant consensus-making activity (for example, the World Wide Web Consortium, The Digital Object Identifier Foundation, and the Instructional Management System).
From our vantage point, where the web is the de facto network use environment, and where the network has the reach described above, it is sometimes difficult to imagine the nature of discussions which led to the formation of the gateways.
The flavour of the period is well captured by Nicky Ferguson in his description of the emergence of SOSIG, the first gateway (Ferguson, 1995). He briefly reviews the development of the network use environment between 1992 and 1995, based on his experiences supporting the use of information resources by social scientists. "The original NISS Gateway and bulletin board services were very exciting windows on to the world. BUBL, the Bulletin Board for Libraries, was a frontier-breaking service which I always showed in my practical workshops as an example of what was possible with vision and imagination." It was a world in which several bespoke bulletin board and information management systems were in use, alongside a range of ftp sites, wais servers and listservs. In UK higher education, and elsewhere, there was an interesting flurry of activity around so-called campus wide information systems, and some discussion about their purpose and standardisation. There were even proposals at one stage about developing a specific Campus Wide Information System Protocol (CWISP) in the absence of a generally acceptable, usable alternative (Work, 1993). However, for all this activity, it remained an environment whose potential seemed clear, but which absorbed too much effort on the part of the general user and yielded too little in return to make it more than the tool of the persistent, the curious, or the specialised user.
Gopher changed this: it had a major impact, allowing the rapid construction of user-oriented, browsable services. However, for an interim period, it remained one approach among several. No single access protocol allowed users to reach all resources of interest: a number of different, if increasingly interconnected, resource spaces existed, defined by access protocol: Gopher, HTTP, WAIS, ftp and so on. Typically, each resource space developed associated search services (archie for searching ftpspace; Veronica for searching Gopherspace; and so on). Mosaic appeared in the Summer of 1993, heralding the web environment we now recognise, with a profusion of providers and users. Against this background, some features of resource discovery design deserve comment.
It was soon recognised that it was not feasible to browse through highly populated resource spaces, even where hierarchical structure or other organising principles were deployed. Structure began to be introduced which divided up the resource space; for example, it became common to organise resources by access method, by geographical area or by subject area. In the UK, for example, BUBL began organising resources from 1991 along subject lines. It introduced a subject tree structure across all subjects in September 1993 when it adopted Gopher (Nicholson, 1999). In some countries there were 'National Entry Points', a concept introduced within the European Gopher community, responsible for presenting organised, comprehensive access to national resources, typically using several 'trees', or lists, organised in these ways. There was much discussion of 'subject trees', the organisation of resources into subject-based hierarchies depending on what they were presumed to be 'about'. Different approaches were taken, but they usually relied on some broad categorisation, maybe based on a library classification scheme (several UK services used Universal Decimal Classification, partly based on the example of work at Lund (Ardoe and Koch, 1993), which influenced BUBL and later other services). This 'bookshop' approach was useful in a browsing context but was subject to all the disadvantages of hierarchical and linear systems. In any case, 'subject', however defined, is one attribute only of potential interest to the user. Such organisation ignores other discriminating attributes - type in several dimensions for example (medium, function, cost, etc.).
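The limitation of a single subject hierarchy can be sketched in a few lines of Python. In this hypothetical example each resource is filed at one node of a subject tree but also carries other attributes; browsing the tree reaches it by subject only, while a filter over the attributes can also discriminate by medium or cost. The records, field names and function names are invented for illustration and do not describe any particular gateway.

```python
# Hypothetical records: one subject path each, plus other descriptive attributes.
RESOURCES = [
    {"title": "Survey data archive", "subject": ("Social sciences", "Sociology"),
     "medium": "dataset", "cost": "free"},
    {"title": "Sociology lecture images", "subject": ("Social sciences", "Sociology"),
     "medium": "image", "cost": "subscription"},
    {"title": "Bridge design handbook", "subject": ("Engineering", "Civil"),
     "medium": "text", "cost": "free"},
]

def browse(resources, path):
    """Subject-tree browsing: return titles filed at or below the given path."""
    path = tuple(path)
    return [r["title"] for r in resources if r["subject"][:len(path)] == path]

def select(resources, **criteria):
    """Attribute filtering: return titles matching every given attribute value."""
    return [r["title"] for r in resources
            if all(r.get(key) == value for key, value in criteria.items())]

# Browsing finds everything under a subject heading, whatever its medium or cost.
print(browse(RESOURCES, ("Social sciences",)))
# Filtering on other attributes cuts across the subject tree.
print(select(RESOURCES, medium="dataset", cost="free"))
```

The subject tree answers only the question "what is filed under this heading?"; the second function shows the kind of question that requires fuller resource description.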
For these reasons, searching approaches soon emerged to complement browsing, and were increasingly important as routes into large resource spaces. However, there were also problems. In the case of early search systems, the chances of retrieving relevant materials were lessened by the terse, and often non-descriptive, text from which indexes were created (e.g. file names in Archie, menu items in Veronica). With web-based search engines, despite growing refinement, one is often overwhelmed by results, or prey to well known retrieval problems. One often has to use a resource, retrieve a file, or connect to a database to discover whether it is of interest or not.
These issues prompted some investigation of richer resource descriptions, delivered in database-driven services or as resource guides. This in turn raised some consideration of formats for the description of Internet resources.
Libraries, of course, have a long tradition of cataloguing and have developed elaborate rules and structures. A variety of initiatives emerged which worked through some of the issues involved in applying these techniques to network resources. Some of this early work was documented in Caplan (1993) and in Lynch (1993). This type of work has been pursued in various quarters.
A second area of activity was monitored by the IAFA (Internet Anonymous ftp Archive) Group of the IETF, who produced recommendations for the description of resources on anonymous ftp archives. A number of objects were identified ('user', 'organisation', 'siteinfo', 'document', 'image', and so on), and templates consisting of multiple attribute-value pairs defined for each. An experimental service based on the templates was set up by Bunyip Information Services (Weider, 1994). They were also used by the early, interesting, ALIWEB initiative (Koster, [1994]), and in the Dutch InfoServices Project (where they were also augmented to form library catalogue records) (van der Werf, 1994). The IAFA templates were taken up by the eLib gateways -- where they were used in association with the WHOIS++ protocol in ROADS servers (ROADS; Kirriemuir et al, 1998).
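As a rough illustration of the attribute-value structure described above, the following Python sketch parses a template-style record into a dictionary. The attribute names (Template-Type, Title, URI, Description, Keywords), the sample record, the continuation-line convention and the function name parse_template are all assumptions chosen for illustration; they are not taken from the IAFA recommendations or from the ROADS software.

```python
def parse_template(text):
    """Parse 'Attribute: value' lines into a dict.

    Assumption for this sketch: a line starting with whitespace
    continues the value of the previous attribute.
    """
    record = {}
    last = None
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line[0] in " \t" and last is not None:
            record[last] += " " + line.strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not attribute-value pairs
        last = name.strip()
        record[last] = value.strip()
    return record

# Hypothetical record; the attribute names are illustrative only.
SAMPLE = """\
Template-Type: DOCUMENT
Title: An example working paper
URI: http://www.example.org/papers/example.html
Description: A short description, long enough to judge relevance
 without retrieving the resource itself.
Keywords: example, illustration
"""

record = parse_template(SAMPLE)
print(record["Template-Type"], "-", record["Title"])
```

A flat set of named attributes like this is simple to create and to index, which is the property that made such templates attractive for gateway records.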
Of course there were other initiatives. In the UK, NISS developed a format for use within its services - it began a database service in April 1995 (Zedlewski, 1999). Some schemes for resource description were developed for use in X.500 services (Barker et al, 1994). And so on.
The IAFA templates met several needs. They supported services which contain full enough descriptions to allow a user to assess the potential utility or interest of a resource without actually having to retrieve it or connect to it, but not so full or complex as to require considerable cost and very specialist staff to create. In this way, they represented an advance over the rather terse data available in search services, but stopped short of the fuller descriptive and documentation formats in use in curatorial domains. For this type of application, what is required is a description which is simple to create yet full enough for effective retrieval and relevance judgement. This implies a description which falls between the terseness of the crawlers and the fullness of a research library catalogue record. For these reasons, the IAFA templates represented a pragmatic choice for ROADS, and there was a realistic expectation that the experience of the gateways could influence the development of the format (Dempsey, 1996). For some time the IAFA templates were seen as the most favoured open approach to Internet resource description. The Dublin Core is now a candidate to support this type of simple resource description, but it was not available at the time the gateways were being set up.
There is now a wide variety of metadata initiatives to support resource discovery, and other operations, across many different sectors (Dempsey and Heery, 1998). It is recognised that in an indefinitely large resource space, effective management of networked information will increasingly rely on effective management of metadata.
In a print environment one can make certain predictions about relevance based on 'brand' - a known title or publisher for example. Such cues were not well-established in a network environment. At the same time, people look for resources that relate to their interests, that improve the quality of their work or leisure, that save them time. Nicky Ferguson quotes a typical reaction to some of his workshops: "It's all very well to look at satellite and weather forecasts, but is there material out there that is relevant to my work?". These issues point to the value of a managed collection of resources, managed to ensure a level of quality, collected to ensure a level of relevance.
The gateways emerged from this engagement between an emerging service model and a policy aspiration to create a network use environment which effectively supported research and learning. They responded to the need to provide managed collections of resources, supported by effective resource description and subject access, available for browse and search access. In effect, they aimed to give shape and definition to an information space in a particular subject area. They aimed to save the time of their users, to connect them to resources which supported their learning, teaching and research interests, and to make sure that information about useful resources was effectively disclosed.
The subject gateways began operations in 1995, launching services at different times. However, as noted above, SOSIG had been established in advance of eLib funding. It was supported by the Economic and Social Research Council who since the Summer of 1992 had been funding Nicky Ferguson to assist UK social scientists in the use of networked information (Ferguson, 1995). It went live in July 1994 with descriptions for about 300 Internet resources (Hiom, 1999), and in many ways provided the model that some of the others followed. ADAM, EEVL and OMNI followed, as did History. Biz/Ed received funding to enhance its existing service with gateway activity.
Although the services were conceived along the same pattern, each had its particularities and emphases. For example, SOSIG benefited from being a part of a wider set of research and service activities at the Institute for Learning and Research Technology, and developed considerable training and outreach activities. EEVL developed a set of complementary services to support the engineering community, and developed a focus on 'professional' needs. History emphasised the resources and services of the Institute of Historical Research which hosted it. ADAM devoted considerable time to the consideration of standards and terminology issues.
Although programmatically aligned, there was no compulsion on the gateways to develop uniformly, either in terms of the services offered, or in terms of technical standards. In fact, all except EEVL used the ROADS software, although ADAM switched to a system provided by System Simulation at a later date. EEVL developed its own system, but was able to serve up data through WHOIS++. However, there was broad similarity of data content, with some difference of emphasis. Services had their own cataloguing formats, although there was some mutual influence and some general ROADS cataloguing guidelines were produced (which have since been adapted for use within the RDN) (Day, 1999) (Chapman, Day and Hiom, 1998). Each developed its own subject approach, specific to its community interests.
The gateways have been well described in the literature and several make regular contributions to the Ariadne magazine (Ariadne). Some are described elsewhere in this issue.
By general estimation, the eLib subject gateways were seen as a positive innovation (see for example Green, 1997). Although no independent, cross gateway, evaluative work was done, anecdotal evidence suggested that they were useful to particular user communities, that their subject focus was welcomed, that the quality controlled approach was valuable. It was also recognised that they had been quite influential as a service model, and that initiatives elsewhere seemed to be following their lead. However, there were also concerns that there was not a critical mass of descriptions (whatever that might be) in particular subject areas, that some subjects were not covered at all, that they were expensive to maintain, that there was redundancy of effort arising from their lack of coordination, and that they still operated in a project-oriented mode. One published study, based on a small sample of academic users in two universities, suggested that the gateways were very positively used by some respondents, but that the majority of academics were unaware of them (Mackie and Burton, 1999). From a funding perspective, there was also a concern that to fund a set of gateways which covered the full subject spectrum to desired levels would consume an unacceptably large part of available budgets.
Accordingly, it was decided to build on the perceived successes of the gateways, but to bring them together in a new federated structure, the Resource Discovery Network (RDN). A contract to run the RDN was awarded to King's College London, with UKOLN at the University of Bath as a consortial partner with a special responsibility for interoperability issues. The RDN was established as a network organisation comprising a centre and a set of organisations called 'hubs'. The Centre has responsibility for promoting, sustaining, and developing the Network in a federated form; supporting hubs in developing their collections; maintaining framework policies to ensure quality, consistency, and interoperability across the Network; and presenting gateway collections in combination in order to exploit their intrinsic interdisciplinary and cross-sectoral value. The framework is being developed to include policies for PR, interoperability, collection development, partnerships, as well as business planning activity. The latter is especially important, as the RDN has an explicit mandate to reduce the funding burden on the JISC through partnership and commercial engagement.
The hubs have been established around faculty-level subjects. Currently, the following hubs exist: EMC (engineering, maths, computing), BIOME (biomedical sciences), SOSIG (social sciences, business, law), Psigate (physical sciences), Humbul (humanities). EMC includes EEVL and is led by Heriot Watt University. BIOME is led by the University of Nottingham and embraces OMNI. SOSIG continues to be led by the University of Bristol, with responsibility for business, social sciences, and law. Psigate is a new service provided by CALIM, the Consortium of Academic Libraries in Manchester. A consultancy is currently in process to advise about provision in the Creative Arts and Industries, and other services may emerge in due course.
The faculty level subjects were chosen with a view to potential for partnership, sustainability, and growth, while preserving legacy investment. The hubs were selected around lead organisations in order to lever the domain-specific and technical expertise in established initiatives, organisations, and services. The hub model was chosen so as to create a sustainable structure which could develop into new subject areas and which could concentrate expertise and effort, creating scale economies. Each hub may provide a range of services, including one or more 'subject gateways'.
The RDN represents a new way for the JISC to provide services, and it is addressing significant challenges. It has to create a framework within which previously autonomous services recognise value in federation, a value that must go beyond the initial funding imperative that has brought them together. This will involve creating a collective identity while preserving distinctiveness; developing interoperability and shared services across independently developed platforms; identifying fruitful patterns of cooperation and sustainable partnerships with other groupings; developing a business framework which reduces the funding burden on the JISC; identifying and sustaining appropriate service roles in a complex ecology and economy; developing a critical mass of descriptions across a range of subject areas, either through original creation or by other means.
At the same time, the hubs are being asked to provide additional services which broker access to distributed network resources as part of the UK higher education Distributed National Electronic Resource, a policy aspiration to create more richly interconnected information and learning environments in support of learning, teaching and research. Interestingly, in the context of the DNER there is also an explicit desire to work more closely with other sectors, which may create useful linkages for the RDN services.
Having explored the emergence of the RDN in some detail, I turn in this section to some more general issues which are being considered in relation to the future development of subject gateways and their relationship to other services.
The Imesh workshop discussed business issues (Dempsey, Gardner, Day and van der Werf, 1999). It was clear that there was a variety of funding patterns in operation across the gateways. However, reliance on 'soft money' - project or research funding which is temporary, unpredictable or fragile - was common. This created issues for long term planning, collaboration, and service development. Some gateways had commercial partners, some were part of a wider service, and some stood alone. There was a general interest in business planning, in sharing 'success stories', and in securing appropriate institutional bases for activities. It was also recognised that surprisingly little was known about the value and cost of these information services. Gateways differ widely in aims and scope, but a better understanding of the costs and benefits of gateways would enable improvements in future decision making.
The RDN is engaged in a business planning operation to identify ways in which the RDN hubs might work with partners, engage in selective commercial activity, and identify further forms of support. It is too early yet to say what the outcomes of that activity will be.
We can characterise some current public sector resource description activities rather schematically as follows:
Although the term 'subject gateway' has not been used to describe it, we can also point to some commercial 'Internet resource description' activity. Several organisations providing research and learning products have invested in such data creation to complement existing 'bibliographic resource description' activity. There is also a wide variety of other services directed at consumers, learners and professional users (e.g. niche portal activity in professional domains such as engineering). These have different business models. We can identify two approaches, though there are more.
This is not the place to consider the wider issues of sustainability, even if I were qualified to do so, but it is useful to note that the gateways exist in an uncertain business environment, and that, for them, some of this uncertainty relates to their institutional immaturity.
Most gateways currently exist in a public funding context. They are a part of that wider public investment that historically has supported libraries, museums and galleries, public service broadcasting, and education. In terms of values, many would want to continue in that context, working with commercial partners where beneficial to share the cost of record creation. However, they are relatively new services. They have not been 'institutionalised' in the sense of developing a well-understood and valued position alongside other institutions, themselves now coevolving, as discussed above, with the emerging network. In other words, relationships with professional or scholarly publishing, or with library or museum services, or with public service broadcasting, are still there to be formed. A part of the challenge, and opportunity, is to find a way of articulating with those developments, themselves being rearticulated in a network environment, especially as a part of that rearticulation is an engagement with the issues involved in brokering access to heterogeneous information and learning resources.
And, of course, alongside services developed within research, learning, cultural and government domains, is the growing range of broadly 'consumer-oriented' or 'professionally-oriented' services which are proliferating as the web becomes an integral part of social and business affairs. One example is the emerging 'niche portal', which provides community-building and information services to particular communities of users. Several companies which supply bibliographic services are also developing resource discovery products to complement existing services. There is some commercial facilitation of what one might call consumer-to-consumer models, as for example in the OpenDirectory initiative. Some of these initiatives are now looking to the gateways as potential partners in data creation, and interesting discussions will be had about licensing and business relationships.
A third issue in this area relates to collaboration among the gateways, to achieve some scale economies, increased coverage, and shared expertise and contact. I return to this issue below.
So, one can identify several scenarios in which the gateways might have a more assured future. However, they indicate directions to be taken rather than positions achieved, and directions whose implications need to be worked through in practice and experiment. Resource discovery services will be an integral part of the fabric of network use and it will be interesting to see how and whether the gateways become institutionalised within this complex ecology and economy.
Patrick Wilson describes the construction of a bibliography as follows: "What a complete bibliographical job requires you to do is to search files, to select items from those files for inclusion in your bibliography, a process that requires you to do more or less analysis, then to describe the items, and to organise them for use." (Wilson, 1998) This is similar to what is involved in the construction of a gateway. And, indeed, if one were to look for precedents for the gateways, then catalogue, abstracting and indexing service, or bibliography come to mind, and each is similar and different in different ways. An important difference, however, is that in each of these cases there is a mature resource type in question, and the different tools partition the resource space in well-understood ways. For example, an abstracting and indexing service selects, analyses, describes and organises journal articles. Another type of guide will select, analyse, describe and organise particular types of organisation (universities, companies, etc).
Initially, the subject gateways searched for, selected, analysed, described and organised web sites. One might have seen these as a resource type alongside the journal article or the book (or the television programme, or the record); it was reasonable to consider a digital resource as yet another coarse resource type at this level. However, as noted above, we are now seeing a major rearticulation of traditional resource types in a network environment. Of course, the gateways have adapted and typically acknowledge a range of resource types in their descriptions. But what is the role of the gateway when the web is a pervasive social, research and business medium, home to the full reach of intellectual product?
Take some simple examples. Will the gateways provide direct access to individual preprints? To the full range of government reports, and so on? Will they become guides to organisations and associations? Will they become reference resources, as various traditional reference resources come online? Will they deal with resources in all languages? How will they cope with the hidden web, the range of subscription or pay per view services? Will they begin to describe library, museum and archive holdings as approaches based on collection description become more common? They already cope with these types of questions, but collection development policy questions will become more difficult. We already see the combination of resource description databases and selective web indexes, and other strategies will be needed. These will include the ability to broker access to other resources as discussed in the next section, to select between mirrors based on various criteria, to better express relationships between resources, and so on.
This in turn meshes interestingly with the business issues just discussed. The gateways need to position themselves in terms of the service offered, which in turn says something about the range of potential partners they might have.
We have recently seen the emergence of 'portals', which could be loosely characterised as aggregations of services oriented towards particular categories of user need. Typically, a resource discovery service is at the heart of a portal. In this view, several of the 'gateways' have developed into 'portals'. For example, EEVL has developed a range of services to complement its original gateway service (understanding gateway in the sense used in this issue) (Macleod, Kerr and Guyon, 1998). Services include access to complementary databases, web indexes of sites included in the gateway, mirrors of commonly used resources, and so on. Other 'portal' enhancements might include personalisation, current awareness, and other community-building services being developed in some of the professionally-oriented 'niche portal' or other Internet services.
A particularly interesting development was noted above, as gateways begin to broker access to distributed resources. Currently, the gateways provide a 'discovery' service: they allow you to discover resources of potential interest and link to them in a web environment. Typically, the user leaves the gateway environment at that stage. In other words, the gateway sees the network as a range of individual websites available for human inspection. A next step is to provide broker services which not only allow resources to be discovered, but which interact with those services on behalf of users, mitigating some of the drudgery of interacting with multiple, heterogeneous resources, and consolidating content from different sources.
Take for example a library catalogue. This will be available through a website; it may also be available as a Z39.50 service. Currently, the gateway drops you at the front door of the catalogue website; it does not query the catalogue on your behalf. Within the context of the UK higher education Distributed National Electronic Resource, the RDN gateways have been asked to provide an additional layer of service. They have been asked to consider brokering access to network services using a variety of protocols. In this context a 'broker' is a service which provides a unified user view of heterogeneous other network services. The broker service interacts with these other services through protocols (e.g. Z39.50, LDAP, WHOIS++) which return structured data for reuse by the broker service. Two important types of reuse are clear: reformatting or processing as part of an integrated user environment (so for example records from various sources -- say subject gateway, A&I service and a catalogue -- are merged, sorted and presented in categorised folders at the user interface), and exchange of data between services to automate supply and demand chains (for example passing a description of an article retrieved in a search service to another service which locates the article in a particular document store, or which drops it into an interlibrary loan request, and so on).
In this view, the 'subject gateway' or resource catalogue, is one component in a network of communicating services which may be assembled to meet particular business and user needs. There has been some preliminary work characterising the types of service such brokers might provide in an RDN environment (Powell, 2000).
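The first kind of reuse described above, in which records from several services are merged, sorted and presented in categorised folders, can be illustrated with a minimal sketch. It is an illustration only and does not describe any RDN implementation: the record fields, the function names and the in-memory stub sources are all assumptions, and a working broker would replace each stub with a client for a protocol such as Z39.50, LDAP or WHOIS++.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Record:
    """A structured description returned by one service (hypothetical schema)."""
    title: str
    identifier: str  # e.g. a URL or a catalogue record number
    source: str      # name of the service that supplied the record
    category: str    # label used to choose a folder at the user interface


# A source is any callable that takes a query string and returns records.
# In a real broker this would wrap a protocol client; here it is a stub.
Source = Callable[[str], Iterable[Record]]


def make_stub_source(records: List[Record]) -> Source:
    """Build an in-memory stand-in for a remote service."""
    def search(query: str) -> List[Record]:
        return [r for r in records if query.lower() in r.title.lower()]
    return search


def broker_search(query: str, sources: List[Source]) -> Dict[str, List[Record]]:
    """Query every source, drop duplicate identifiers, and folder the results."""
    folders: Dict[str, List[Record]] = defaultdict(list)
    seen = set()
    for search in sources:
        for record in search(query):
            if record.identifier in seen:
                continue  # keep the first copy of a record seen in two sources
            seen.add(record.identifier)
            folders[record.category].append(record)
    # Sort folders by name and records by title for a stable presentation.
    return {
        category: sorted(records, key=lambda r: r.title.lower())
        for category, records in sorted(folders.items())
    }


# Three invented sources standing in for a gateway, an A&I service and a catalogue.
gateway = make_stub_source([
    Record("Engineering societies directory", "http://example.org/soc", "gateway", "web site"),
    Record("Civil engineering preprints", "http://example.org/pre", "gateway", "web site"),
])
abstracts = make_stub_source([
    Record("Fatigue in engineering alloys", "ai:0001", "a&i", "article"),
])
catalogue = make_stub_source([
    Record("Engineering mathematics", "cat:42", "catalogue", "book"),
    Record("Engineering societies directory", "http://example.org/soc", "catalogue", "web site"),
])

folders = broker_search("engineering", [gateway, abstracts, catalogue])
for category, records in folders.items():
    print(category, [r.title for r in records])
```

The sketch treats every service as a function from a query to structured records, so the merging logic does not need to know which protocol lies behind a given source. Matching duplicates on a shared identifier is a deliberate simplification; records for the same resource held by different services will often need looser matching than this.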
The first Imesh workshop (Dempsey, Gardner, Day and van der Werf, 1999) discussed collaboration between the gateways. In principle there is considerable support for collaboration, though in practice several difficulties were realised.
At one level collaboration relates to 'business' strategies, as strategic dependencies rely on a view of directions. However, at another level, where services are individually and collectively exploring their options on a number of fronts, opportunistic or tactical alliances and agreements may be very important. There was agreement that collaboration would benefit from facilitating structures, but at this stage gateways were reluctant to incur the overhead any very formal structures would involve. Typically, collaboration might develop on a bilateral basis or between small groups of initiatives. Several experiments are now underway, and discussions are emerging between national initiatives to identify where fruitful collaboration may be possible, in terms of extending both the breadth and the depth of coverage of any particular service.
It should be noted that different views are taken within initiatives. In some cases, there is an aspiration to be comprehensive within a particular country, and then within subject. In other cases, not so. It also raises interesting questions about the extent to which subject areas are susceptible to international treatment. Is mathematics, for example, more 'international' than politics?
An issue of importance in any collaborative arrangement will be that of branding. Gateways aspire to be high quality 'brands'. Successful collaborative arrangements are likely to be those which do not compromise the brand value of individual initiatives.
The interest in collaboration, and the emergence of broker services, highlight the importance of interoperability. The cost of sharing records, or of licensing their use by another provider, for example, is related to the amount of processing that needs to be done on them. If the RDN collectively wished to license its data for use by a commercial aggregator, then there would be advantage in consistency of data. Similarly, the ease with which gateway services can be combined with others in new services depends on interoperability at various points. For these reasons, interoperability is increasingly seen not just in technical terms but strategically, as a way of maximising the investment in existing services.
Powell (1999) outlines the current RDN interoperability framework concentrating on search and retrieve protocols, metadata formats, and approaches to cataloguing, subject, and resource type. Iannella (1999) is a more discursive account of issues in a subject gateway environment.
The gateways operate in a fluid environment where agreed approaches do not exist for many of their requirements. Nor are there off-the-shelf solutions which provide the range of services they may wish to incorporate. There is also considerable R&D activity, managed through various channels (see for example the Desire project - Worsfold and Hiom, 1998). This raises some general questions. How does one mesh research activity with production services without constraining the former or holding the latter hostage to uncertain timetables, or inexplicit or changing development paths? Does one develop software in-house, or rely on an external package, whether open source or commercial? These are general issues, but of especial moment in this area.
At the same time, there is a range of development issues which relate to wider issues of network information management and access. A long list could be elaborated, which could include among other things: the relationship between subject gateway, caching and mirroring strategies; personalisation; collaborative filtering; ratings; selective dissemination of information services; integration of catalogue and harvesting approaches; thesaurus linking; shared high level subject or resource type lists; and so on. The gateways also need to address matters such as authentication and user profiling, where they must choose between local implementation and shared approaches. This in turn means weighing real, if local, service improvements now against waiting an indeterminate period for a shared approach to be developed.
And finally in this section, a brief thought about an area which has not been close to subject gateway concerns. Historically, libraries, archives and museums have described the intellectual record, and have selectively curated and preserved it for future generations. The passage into a digital environment poses radically new questions for them, as they grapple with a resource which is fragile, fugitive, and fluid, in which the fixity which joined form and content is dissolved. They are faced with a considerable challenge if they are to continue to support their values (Dempsey, 2000).
Consider one case. Libraries, archives and museums are important sources of local history information. They collect and manage a significant resource. However, what of the last few years? The websites of local businesses, clubs, and associations are not entering the intellectual record. Or consider the emergence of a government presence on the web. Or the first steps of newspapers, media companies and others.
There are some initiatives which are looking at capturing 'snapshots' of the web, or some nationally-based parts of it. One might also consider whether the developing subject gateways, selectively, provide an avenue through which one might consider tracking the emergence of particular communities of interest.
From their early beginnings, the gateways are now adapting within a richer economy and ecology than those in which they were originally formed. We are seeing the more mature gateways develop a range of services of which the Internet resource catalogue is only one. We are seeing such catalogues figure as parts of richer collections of services. We are seeing a range of parallel and alternative services emerging. We are seeing, albeit in nascent forms, the emergence of brokerage services based on distributed communicating components. Within this complex environment, the issue for the gateways is how they will articulate their own development with the wider rearticulation of services we are seeing, and whether they can move to some sustainable funding source in that process. In that context, it will be interesting to see if the gateways retain some collective institutional identity or whether they become absorbed into, variously, for example, a national learning service, a professional portal service from a commercial publisher, a library service, and so on.
It will be interesting to see what happens with the current RDN model where national research or academic resources support the gateways, but where there is an expectation that partnership and commercial engagement will reduce the funding burden. Here it will be interesting to see whether the perceived value of the gateways translates into the ability to raise other revenues. It will also be interesting to see whether conditions support commercial entrants in this arena.
The gateways have pioneered a service model which will continue to develop in coming years. The contributions in this issue are testament to the vitality of those who are working on them, and to the innovation and creativity of the work which is supporting them. I began by discussing the path towards our current network environment, which increasingly underpins so many of our activities. I have discussed the critical importance of discovery and disclosure services to effective working in a shared network space; the developing rearticulation of publishing, education, and communication; and the emergence of a learning agenda supported by several significant programmes of activity. I have discussed how the gateways might position themselves in this environment; it is also worth remembering how much these other activities can learn from the experiences of the gateways.
Lorcan Dempsey is the director of UKOLN at the University of Bath, and co-Director of the Resource Discovery Network. UKOLN is supported by the Library and Information Commission, JISC, and the University of Bath. The RDN is funded by the JISC. Thanks are due to Derek Law for commenting on the section about the policy background to the eLib gateways, and to Traugott Koch for inviting the contribution. Thanks also to Ray Lester and to Nicky Ferguson for some specific discussion. Any views expressed are those of the author alone.
(Agre, 1998) Phil Agre. Yesterday's tomorrow. Times Literary Supplement, 3 July 1998, pages 3-4.
(ANIR, 1994) Report on the Working Group on Access to Networked Information Resources ISSC(94)21. [commissioned Autumn 1993; submitted Spring 1994]. (Reprinted in Journal of Information Networking, 2(3), 1995, p. 223-235).
(ANR) Access to network resources - eLib projects. <URL:http://www.ukoln.ac.uk/services/elib/projects/>
(Ardoe and Koch, 1993) Anders Ardoe and Traugott Koch. Wide-area information server (WAIS) as the hub of an electronic library service at Lund University. In: Opportunity 2000: understanding and serving users in an electronic library: 15th International Essen Symposium, 12 Oct - 15 Oct 1992. Essen: Essen University Library, 1993.
(Ariadne) The Ariadne Magazine. <URL:http://www.ariadne.ac.uk/>
(Barker et al, 1994) Paul Barker, Thomas Johannsen and Colin Robbins. A survey of current and possible future uses of X.500 directory services. Journal of Information Networking, 1(3), 1994.
(Caplan, 1993) Priscilla Caplan. Cataloging Internet resources. The public-access computer systems review, 4(2), 1993, 61-66.
(Chapman, Day and Hiom, 1998) Ann Chapman, Michael Day and Debra Hiom. Cataloguing practice and Internet subject-based information gateways. Ariadne, 18, 1998. <URL:http://www.ariadne.ac.uk/issue18/metadata/>
(CORC) Information about CORC available from <URL:http://www.oclc.org/oclc/corc/index.htm>
(Day, 1999) Michael Day. ROADS cataloguing guidelines. <URL:http://www.ukoln.ac.uk/metadata/roads/cataloguing/cataloguing-rules.html> (version consulted: last updated 16 June 1999)
(Dempsey, 1993) Lorcan Dempsey. Research networks and academic information services: towards an academic information infrastructure. Journal of Information Networking, 1(1), 1993, 1-27.
(Dempsey, 1994) Lorcan Dempsey. Network resource discovery: a European library perspective. A paper presented to the British Library R&D Dept, 4 August 1994. (reprinted in: Neil Smith ed. Libraries, networks and Europe: a European networking study. LIR reports; 101. London: British Library R&D Department, 1994.) Also available at <URL:http://www.ub2.lu.se/UB2proj/LIS_collection/lorcan.html>.
(Dempsey, 1996) Lorcan Dempsey. ROADS to Desire: some UK and other European metadata and resource discovery projects. D-Lib Magazine, July/August, 1996. <URL:http://www.ukoln.ac.uk/mirrored/lis-journals/dlib/dlib/dlib/july96/07dempsey.html>
(Dempsey, 2000) Lorcan Dempsey. Scientific, Industrial, and Cultural Heritage: a shared approach: a research framework for digital libraries, museums and archives. Ariadne, Issue 22, January 2000. <URL:http://www.ariadne.ac.uk/issue22/dempsey/intro.html> (Slightly amended version of a report of the same title prepared for the European Commission's Information Society Directorate General in the context of Fifth Framework objectives, November 1999)
(Dempsey and Heery, 1998) Lorcan Dempsey and Rachel Heery. Metadata: a current view of practice and issues. Journal of Documentation, 54(2), 1998, 145-172.
(Dempsey, Gardner, Day and van der Werf, 1999) Lorcan Dempsey, Tracy Gardner, Michael Day and Titia van der Werf. International Information Gateway Collaboration: Report of the First IMesh Framework Workshop. D-Lib Magazine, 5(12), December 1999. <URL:http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/december99/12dempsey.html>
(DESIRE) Information and reports can be found on the website <URL:http://www.desire.org/>
(DNER) Committee on Electronic Information - Content Working Group. An integrated information environment for higher education: developing the distributed, national, electronic resource (DNER). December 1997. <URL:http://www.jisc.ac.uk/cei/dner_colpol.html>
(eLib) Electronic Libraries Programme website. <URL:http://www.ukoln.ac.uk/services/elib/>
(Ferguson, 1995) Nicky Ferguson. Subject-based services: origins and futures. In: Lorcan Dempsey, Derek Law and Ian Mowat eds. Networking and the future of libraries 2: managing the intellectual record: an international conference held at the University of Bath, 19-21 April 1995. London: Library Association Publishing, 1995. p. 131-135.
(Follett, 1994) Report of the Joint Funding Council's Libraries Review Group, chair Professor Sir Brian Follett. <URL:http://www.ukoln.ac.uk/services/papers/follett/report/>
(Green, 1997) Andrew Green. Towards the digital library: how relevant is eLib to practitioners? Journal of Academic Librarianship, 3, 1997. p 39-48.
(Hiom, 1994) The Social Science Information Gateway. Journal of Information Networking, 2(2), 1994. p.136-139.
(Hiom, 1999) Personal communication from Debra Hiom, December 5, 1999.
(Iannella, 1999) Renato Iannella. Technical review of RDNC subject gateway services. <URL:http://www.rdn.ac.uk/publications/studies/technical-review/>
(JISC) The range of JISC services can be seen from the JISC informational web pages <URL:http://www.jisc.ac.uk/>.
(JISC, 1994) JISC Circular 4.94: Follett Implementation Group on Information Technology: framework for progressing the initiative. 3 August 1994. <URL:http://www.ukoln.ac.uk/services/elib/papers/circulars/4-94/>
(Kirriemuir et al, 1998) Kirriemuir, John, Dan Brickley, Martin Hamilton, Jon Knight, Susan Welsh, Cross-Searching Subject Gateways: The Query Routing and Forward Knowledge Approach. D-Lib Magazine, January 1998. <URL:http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/january98/01kirriemuir.html>
(Koster, 1994) Martijn Koster. Aliweb - Archie-like indexing in the web. 28 March 1994. <URL:http://info.webcrawler.com/mak/projects/aliweb/paper-www94/paper.html> (also available in Computer networks and ISDN systems, 27(2), p.175-182.)
(Law, 1994) Derek Law. The development of a national policy for dataset provision in the UK: a historical perspective. Journal of Information Networking, 1(2), 1994, p.103-116.
(Leadbeater, 1999) It's not just the economy, stupid. New Statesman, September 1999, Special supplement: Knowledge is power!, p.iv-vi.
(Lynch, 1993) Clifford A Lynch. A framework for identifying, locating and describing networked information resources (draft for discussion at March-April 1993 IETF meeting). 23 March 1993. <URL:ftp://ftp.cni.org/CNI/wg.docs/architecture/lynch.overview.txt>
(Mackie and Burton, 1999) Morag Mackie and Paul Burton. The use and effectiveness of the eLib subject gateways: a preliminary investigation. Program, 33(4), 1999, 327-337.
(Macleod, Kerr and Guyon, 1998) Roddy Macleod, Linda Kerr and Agnes Guyon. The EEVL approach to providing a subject based information gateway for engineers. Program, 31(3), 1998, 205-233.
(Nicholson, 1999) Personal communication from Dennis Nicholson, December 6 1999.
(NLA, 1999) National Library of Australia. A national framework for the development of Australian subject gateways. First draft, July 1999. <URL:http://www.nla.gov.au/initiatives/sg/>
(Obraczka, 1993) Katia Obraczka, Peter B. Danzig and Shih-Hao Li. Internet resource discovery services. Computer, September 1993, p.8-22.
(Powell, 1999) Andy Powell. Interoperability framework. <URL:http://www.rdn.ac.uk/publications/interoperability/framework/> (version consulted dated 1 October 1999)
(Powell, 2000) An MIA view of DNER portals (work in progress). <URL:http://www.rdn.ac.uk/publications/mia/> (version consulted: version 1.6)
(Quarterman, 1990) John Quarterman. The matrix. Bedford, MA: Digital Press, 1990.
(RDN) Resource Discovery Network website. <URL:http://www.rdn.ac.uk/>
(ROADS) ROADS (Resource Organisation and Discovery in Subject-based services) software, documentation and links to other ROADS material available from <URL:http://www.roads.lut.ac.uk/index.html>
(Rusbridge, 1998) Chris Rusbridge. Towards the hybrid library. D-Lib Magazine, July/August 1998. <URL:http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/july98/rusbridge/07rusbridge.html>
(Schwartz et al, 1992) Michael F. Schwartz, Alan Emtage, Brewster Kahle, and B Clifford Neumann. A comparison of resource discovery approaches. Computing Systems, 5(4), 1992.
(Thorne, 1999) Mike Thorne (ed). Universities in the future. [London]: Office of Science and Technology, Department of Trade and Industry, 1999.
(Vlib) Gerard Manning. About the Virtual Library. <URL:http://www.vlib.org/AboutVL.html> (version consulted: "last modified Sep 15, 1999")
(van der Werf, 1994) Titia van der Werf. InfoServices: cooperation between the National Research Network Service and the National Library in The Netherlands. Journal of Information Networking, 2(1), 1994.
(Weider, 1994) Chris Weider. The Internet anonymous ftp archives templates: towards an Internet resource location system. Journal of Information Networking, 1(3), 1994.
(Wilson, 1998) Patrick Wilson. Patrick Wilson: a bibliographer among the cataloguers. In: Portraits in cataloging and classification: theorists, educators and practitioners of the late Twentieth Century. Carolynne Myall and Ruth C. Carpenter (eds). Binghamton, NY: The Haworth Press, 1998. p. 305-316.
(Work, 1993) Colin K. Work. The future development of Campus-Wide Information Systems: towards the virtual campus. Journal of Information Networking, 1(1), 1993, p. 41-52.
(Worsfold and Hiom, 1998) Emma Worsfold and Debra Hiom. The DESIRE Project - promoting and facilitating Web usage among Europe's research community. New Review of Information Networking, 4, 1998, p. 105-126.
(Zedlewski, 1999) Personal communication from Ed Zedlewski, December 7, 1999.