Issue 2 : October 2004

The Work and Vision of Work Package 1: Digital Library Architecture

Hans Schek and Can Türker together with representatives of the groups introduce the work, research and vision of the constituent groups within the DELOS Work Package 1: Digital Library Architecture (ARCH).

Introduction

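The feature article for this issue of the DELOS Newsletter is devoted to the first DELOS work package (WP1). Our goal is to introduce briefly the groups that are active in this work package. Each group has arranged its information into the following areas: an introduction to the group, its vision of a future digital library architecture (DLA), its research relevant to WP1, and its main contributions to WP1.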

To give an overview, the groups and their particular interests are as follows:

ETH Zurich, Switzerland (WP1 Leader):
Hyperdatabase Technology: The Basis of Future Digital Library Infrastructure
CNR-ISTI Pisa, Italy:
Meeting the Demands of Virtual Organizations
FhG/IPSI Darmstadt, Germany:
Grid-based Infrastructures the Basis of Future Virtual Digital Libraries
MPII Saarbrücken, Germany:
Achieving Service-quality Collaborative Searching in a Peer-to-Peer Environment
Masarykova Universita v Brne, Czech Republic:
Working Towards a Uniform Computational Framework
OFFIS Oldenburg, Germany:
Super Peer Networks: Providing Scalability and Autonomy
Technical University of Crete, Greece:
Focusing on Service-oriented Architectures
UKOLN, University of Bath, UK:
Open Standards Key to the Creation of Long-Term Viable Applications and Resources
UMIT Innsbruck, Austria:
Designing an Infrastructure for Highly Networked Information in a Pervasive Computing Environment
University of Athens, Greece:
Information Access Methods: Respecting the Autonomy of Differing Digital Libraries
Università degli Studi di Milano, Italy:
Providing Ubiquitous Access through Personalized and Content-aware Presentation
Università degli Studi di Padua, Italy:
Flexibility of Implementation to Reflect Differing Architectural Paradigms

However, before we move to the individual groups, let us very briefly recall the objectives of this work package: they are to evaluate both conceptually and experimentally the impact of recent computing technologies on digital library architectures. These new directions can be summarized as:

  1. Web services and service-oriented architectures
  2. Grid middleware
  3. Peer-to-peer data management

A thorough evaluation of existing approaches will reveal the advantages and disadvantages of these strategies.

ETH Zurich:
Hyperdatabase Technology: The Basis of Future Digital Library Infrastructure

Introduction to the Group

The ETH database research group is headed by Prof. Hans-Jörg Schek. The group currently consists of two senior researchers and six doctoral students. Its research activities aim to realize the vision of a hyperdatabase as the key infrastructure for developing and managing future information systems. At its interface, a hyperdatabase supports component and service definition, specification of transactional processes encompassing multiple service invocations, service publication and subscription. A hyperdatabase performs metadata management, component and service discovery and tracking, scheduling, routing, and optimization of service requests, monitoring, flexible failure treatment, availability and scalability. In particular, a hyperdatabase provides an effective and efficient infrastructure to manage and retrieve documents over large multimedia repositories. Due to the high cost of capturing the content of multimedia documents, the infrastructure is able to make use of a large number of machines running various types of components to store and analyze documents, to extract features from them, cluster them, and to maintain indexes over those features. For more information about the group and its research, see http://www.dbs.ethz.ch.

Vision of a Future DLA

In our vision, digital library users will be able to gain access to a myriad of forms of knowledge from anywhere and at any time and in an efficient and user-friendly fashion. To realize this vision, a highly scalable, customizable and adaptive infrastructure is needed. For future digital libraries we visualize an infrastructure that is based on hyperdatabase technology. Such an infrastructure combines techniques from peer-to-peer data management, grid computing middleware, and service-oriented architectures.

Peer-to-peer networks allow for loosely coupled integration of digital library services and the sharing of information such as recommendations and annotations. Grid computing middleware supports the dynamic allocation and deployment of complex and computationally intensive digital library services such as the extraction of features from multimedia documents to support content-based similarity search. A service-oriented architecture provides common mechanisms to describe the semantics and usage of digital library services. Furthermore, it supports mechanisms to combine services into workflow processes for sophisticated search and maintenance of dependencies.

As depicted in Figure 1, the digital library architecture envisaged consists of a grid of peers which provide various kinds of digital library services such as storage, extraction or retrieval services. These digital library services can be combined into processes. High scalability is achieved by executing the processes in a completely distributed, peer-to-peer fashion. To that end, metadata about processes, services, and the load of the peers is distributed and replicated over the grid. This is performed by a small hyperdatabase (HDB) layer atop each peer. This layer also takes care of peer-to-peer navigation and execution of processes. Figure 1 depicts the execution of the process "Insert Image".


Figure 1: Digital library architecture based on a hyperdatabase infrastructure
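
As a rough illustration of the peer-to-peer process execution described above, the following Python sketch replicates service metadata to every peer and lets each peer hand the next step of a process to the least-loaded provider it knows of. It is not taken from OSIRIS: the class `Peer`, the functions `replicate_metadata` and `run_process`, the peer names and the three steps assumed for an "Insert Image"-style process are all hypothetical, and load is read directly from the peer objects rather than from replicated, approximate load metadata as a real system would use.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """A grid peer with a thin, hypothetical HDB layer on top."""
    name: str
    services: set          # service types offered, e.g. {"storage"}
    load: int = 0          # number of service calls handled so far
    # Local replica of the metadata: service type -> peers offering it.
    directory: dict = field(default_factory=dict)


def replicate_metadata(peers):
    """Copy service metadata to every peer so routing needs no central node."""
    for peer in peers:
        peer.directory = {}
        for other in peers:
            for service in other.services:
                peer.directory.setdefault(service, []).append(other)


def run_process(start, steps):
    """Execute a process step by step, handing control from peer to peer."""
    current, trace = start, []
    for service in steps:
        # The current peer consults only its own replica and picks the
        # least-loaded provider of the next service.
        target = min(current.directory[service], key=lambda p: p.load)
        target.load += 1
        trace.append((service, target.name))
        current = target   # the chosen peer routes the following step
    return trace


peers = [
    Peer("p1", {"storage"}),
    Peer("p2", {"extraction"}, load=3),
    Peer("p3", {"extraction", "indexing"}),
]
replicate_metadata(peers)
# Assumed steps of an "Insert Image"-style process.
trace = run_process(peers[0], ["storage", "extraction", "indexing"])
print(trace)
```

Because every peer holds its own copy of the directory, no step of the process has to pass through a central coordinator, which is the property the architecture relies on for scalability.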

Research Relevant to WP1

Our research on hyperdatabases is strongly related to the objectives of WP1. With the implementation of our OSIRIS hyperdatabase prototype, we hope to show how a synthesis of concepts and techniques from database systems, process management systems, service-oriented architectures, grid and peer-to-peer computing can work together. The results obtained from many experiments with OSIRIS show the considerable scalability potential of hyperdatabase technology. Moreover, OSIRIS demonstrated its usefulness for advanced search and co-ordination of multimedia documents. While in the past we have developed sophisticated metadata replication and peer-to-peer process execution techniques in the hyperdatabase framework, current research focuses on concepts and mechanisms for the transactional execution of processes which did not involve any dedicated grid components. Such transactional executions will even be provided in the context of mobile grid peers and grid partitioning.

Main Contributions to WP1

As project management lead for this work package, the ETH database research group co-ordinates the tasks in WP1. Furthermore, it contributes in particular to the digital library architecture which supports a peer-to-peer execution of composite services under transactional guarantees and in the presence of dynamically changing combinations of the grid peers. Using the experience gained from the implementation and investigation of the OSIRIS hyperdatabase prototype, the ETH database research group contributes to the WP1 survey on service-oriented architectures, peer-to-peer systems and grid infrastructures. This is done in close collaboration with the DELOS partner UMIT. Furthermore, the ETH database research group leads the survey on synchronization techniques in E-health applications. Currently, the following members of the ETH database group are involved in WP1 activities: Hans-Jörg Schek, Sören Balko, Michael Mlivoncic, Christoph Schuler, Hao Shao, and Can Türker.

ISTI-CNR:
Meeting the Demands of Virtual Organizations

Introduction to the Group

The ISTI-CNR group belongs to the Multimedia Networked Information Access Laboratory, one of the fourteen research laboratories of the Istituto di Scienza e Tecnologia dell'Informazione, A. Faedo - CNR, located in Pisa, Italy. Currently, this laboratory, led by Dr. Costantino Thanos, comprises thirty members including both permanent and temporary staff. This group has been conducting research on Digital Libraries (DLs) since 1996. This research has mainly focused on the design of generic digital library systems able to satisfy the needs of many different application frameworks. Particular attention has been dedicated to the introduction of innovative services on digital objects and to the study of flexible DL architectures capable of supporting different requirements in terms of handled content, functionality, policies, distribution, availability, etc.

Vision of a Future DLA

Our research has always been driven by the aim of satisfying concrete user requirements. Recently, we have been facing radically new user expectations with respect to digital libraries. A large part of the demand for DLs comes from "virtual organizations", i.e., organizations composed of remotely distributed individuals, not usually highly skilled in computer science, who often work together for a limited period to achieve a common goal. These individuals need DLs to support temporary activities such as projects, exhibitions, courses, etc. These users demand easy, cheap and quick DL development models. In order to satisfy this demand we have started to explore the introduction of generic, customizable and highly dynamic software environments capable of providing, in addition to the functionality of a library, all the necessary management functions required to maintain the library, e.g. to preserve the content and services, and to guarantee the quality of the entire DL service, e.g. to support availability, performance and scalability. These environments must also support the sharing of resources in order to maximize re-use and to decrease the costs of creating a DL. Note that in the currently emerging context, the notion of sharing is not confined to "content", as has been the case until now. It also spans applications, software platforms, storage and computing elements. Sharing these resources must then, of necessity, be highly controlled. Most resource providers will usually open access to their resources only when the technology is sufficiently mature to guarantee that the resources shared are used solely according to the policies established by their owners.

The creation of these new DL environments calls for appropriate architectural frameworks. We are currently working on the definition of an innovative distributed service-oriented architectural framework composed of three main elements:

  1. A technical infrastructure, which provides all the necessary functionalities for supporting basic capabilities, like dynamic allocation and sharing of resources, transparent distribution, security, interoperability, quality of service, activation and de-activation of DLs, etc.
  2. A set of services that implement the typical digital library functionality
  3. A number of application specific services, possibly supplied by third parties, which provide access to shared repositories of content and application specific tools by following the standard rules imposed by the technical infrastructure

In principle this architectural framework has the capacity to support multiple dynamically created customised virtual views of the underlying resources. A specific DL can thus be defined as one capable of providing such views and can therefore be created and destroyed dynamically on demand with limited effort in a short period of time.

Research Relevant to WP1

Over recent years our research has mainly focused on the design of a distributed, dynamically configurable, service-oriented architecture for Digital Library Management Systems (DLMSs). As an outcome of this research we have designed and implemented one of these systems, OpenDLib (http://www.opendlib.com), which is now fully operational. OpenDLib has been used for building a number of DLs:

The development of a system with this kind of architecture required greater implementation effort than that of a centralized one since a number of services devoted to the co-ordination, management and optimal allocation of the different service instances also had to be provided. However, the experience acquired so far in building the different DLs has validated our choice as we have been able to satisfy quite easily a large number of application-specific requirements that could not otherwise have been met.

By exploiting the outcomes of this activity, we are now working on the definition of a more general architectural framework as described above. We are confident that the new architectural approaches that have emerged since we started the design of the OpenDLib system, e.g. Web services, P2P, Grids, now provide features that simplify the implementation of a distributed DLMS architecture and offer a number of new opportunities to implement novel user functionalities and enhance the quality of the overall system. In particular, we are now exploring, together with other DELOS partners, the use of the Web service paradigm on a Grid infrastructure as a basis for building a DL environment with the desired characteristics.

Main Contributions to WP1

We contribute to the WP1 activities by bringing to this work package the results of our past experience in building and experimenting with a DLMS with a distributed service-oriented architecture and by reporting our new research on Grid-enabled DL environments. In particular, we are leading the preparation of a survey on current aspects and tools of Grid technology that can potentially be exploited for the construction of DL environments. We are also working on the identification of the gap between the functionality provided by some of the best known Grid middleware and the functionality required by a technical infrastructure capable of supporting the on-demand creation of transient DLs. The ISTI-CNR researchers participating in this specific work package activity are: Henri Avancini, Leonardo Candela, Donatella Castelli, Pasquale Pagano and Manuele Simi.

FhG/IPSI:
Grid-based Infrastructures the Basis of Future Virtual Digital Libraries

Introduction to the Group

The Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. (FhG), the leading organization of institutes of applied research and development in Europe, is a link between science and industry, i.e. between research and the application of its results. It was founded in Munich in 1949 as a registered non-profit association. The Fraunhofer-Gesellschaft is an autonomous body with a decentralized organizational structure, which currently maintains 58 research institutes and a patent office in locations throughout Germany. A staff of approximately 13,000, the majority of whom are qualified scientists and engineers, works with an annual research budget of about one thousand million (1,000,000,000) Euros.

The Fraunhofer Institute IPSI (Integrated Publication and Information System Institute) focuses its research and development work on software applications for co-operative work, publication and information, innovation support, and lifelong learning in real and virtual environments. Our research areas comprise knowledge management and e-commerce, systems for individual or group learning, security in media and document management, digital libraries and e-Science, information systems, database-supported publication tools, distributed publication environments for the common maintenance of extensive data, and services for mobile communication. These activities also cover the fields of planning and installing modern working environments, i.e., building elements and furniture equipped with high-quality information technology.

Vision of a Future DLA

The IPSI group sees the future role of a digital library and of digital library services as an important part of the information and knowledge environments which support E-Science and other innovative processes. According to IPSI's understanding, the digital library is currently undergoing a transition from a statically integrated system to a dynamic federation of services. This transition is inspired by new trends in technology which include developments in technologies like Web services and Grid infrastructures as well as by the success of new paradigms like Peer-to-Peer Networking and Service-oriented Architectures. The transition is driven by DL "market" needs. This includes a requirement for

Such new decentralized and service-oriented architectures for digital libraries make the library functionality available in a more cost-effective and tailored way and thus open up new application areas for digital libraries.

In essence, the creation of virtual digital libraries on the basis of Grid-based infrastructures, support for the integration of metadata, personalization services, semantic annotation and the on-demand availability of information collections and extraction services will make digital libraries more useful and attractive to a wider clientele.


Figure 2: The Infrastructure to Support Virtual Digital Libraries

Research Relevant to WP1

IPSI is part of two key European projects related to the WP1 objectives: Bricks and DILIGENT. Both projects are related to the next generation of digital library infrastructures. In these projects the IPSI group develops concepts and services for the architecture, metadata and content management, metadata integration, personalization in distributed architectures, manual and automatic annotation, distributed data management in the Grid and decentralized data management as well as content and services security.

Main Contributions to WP1

In WP1 IPSI participated in the discussion on potential architectures for the future digital library with a special focus on XML-based decentralized metadata management, personalization approaches that also work in heterogeneous dynamic environments and the systematic support and exploitation of annotations in such environments. This work was summarized in a publication for the WP1 Workshop in June 2004.

Max-Planck-Institut für Informatik Saarbrücken:
Achieving Service-quality Collaborative Searching in a Peer-to-Peer Environment

Introduction to the Group

The research group Databases and Information Systems is headed by Prof. Dr. Gerhard Weikum and located at the Max-Planck-Institut für Informatik in Saarbrücken, Germany. The group is being built up and currently consists of about 10 researchers. The overriding long-term goal is to develop rigorous service-quality guarantees for Internet-based information systems, comprising provably correct behaviour, predictably acceptable response times, very high availability with failure masking, and satisfactory result quality for various kinds of information search.

The current focus is on the following topics:

Vision of a Future DLA

We are addressing the problem of collaborative search across a large number of digital libraries and query routing strategies in a peer-to-peer (P2P) environment. Both digital libraries and users are equally regarded as peers and, thus, as part of the P2P network. Our system provides a versatile platform for a scalable search engine combining local index structures of autonomous peers with a global directory based on a distributed hash table (DHT) as an overlay network.

Research Relevant to WP1

The peer-to-peer (P2P) approach, which has become popular in the context of file-sharing systems such as Gnutella or KaZaA, permits the handling of huge amounts of data in a distributed way. In such a system, all peers are equal and all of the functionality is shared among all peers so that there is no single point of failure and the load is balanced across a large number of peers. These characteristics offer potential benefits for building a powerful search engine in terms of scalability, resilience to failures, and high dynamics. In addition, a P2P search engine can potentially benefit from the intellectual input of a large user community; for example, prior usage statistics, personal bookmarks or implicit feedback derived from user logs and click streams.

Our framework combines closely studied search strategies with new aspects of P2P routing strategies. In our field of digital libraries, a peer can either be a library itself or a user that wants to benefit from the huge amount of data in the network. Each peer is a priori autonomous and has its own local search engine with a crawler and a corresponding local index. Peers share their local indexes (or specific fragments of local indexes) by posting the meta-information into the P2P network, thus effectively forming a large global, but completely decentralized directory. In our approach, this directory is maintained as a distributed hash table (DHT). A query posed by a user is first executed on the user's own peer, but can be forwarded to other peers for better result quality. Collaborative search strategies use the global directory to identify peers that are most likely to hold relevant results. The query is then forwarded to an appropriately selected subset of these peers, and the local results obtained from there are merged by the query initiator.
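
The routing idea in the paragraph above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the group's prototype: the `Directory` class stands in for a distributed hash table by hashing each term to one of a fixed number of in-memory nodes, and the peer-selection score (the sum of per-term document counts) and all peer names are hypothetical choices made for the example.

```python
import hashlib
from collections import defaultdict


class Directory:
    """Toy stand-in for a DHT: each term is hashed to one directory node."""

    def __init__(self, n_nodes):
        self.nodes = [defaultdict(dict) for _ in range(n_nodes)]

    def _node_for(self, term):
        digest = hashlib.sha1(term.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def post(self, peer, term, doc_count):
        """A peer publishes how many of its local documents contain the term."""
        self._node_for(term)[term][peer] = doc_count

    def lookup(self, term):
        return self._node_for(term).get(term, {})


def select_peers(directory, query_terms, k):
    """Rank peers by summed per-term document counts and keep the top k."""
    scores = defaultdict(int)
    for term in query_terms:
        for peer, count in directory.lookup(term).items():
            scores[peer] += count
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked[:k]


def merge_results(result_lists, top_n):
    """Merge per-peer (document, score) lists into one ranking."""
    merged = [hit for hits in result_lists for hit in hits]
    merged.sort(key=lambda hit: (-hit[1], hit[0]))
    return merged[:top_n]


directory = Directory(n_nodes=4)
directory.post("library_a", "grid", 40)
directory.post("library_b", "grid", 5)
directory.post("library_b", "metadata", 30)
directory.post("user_c", "metadata", 2)

chosen = select_peers(directory, ["grid", "metadata"], k=2)
merged = merge_results([[("d1", 0.9), ("d2", 0.4)], [("d3", 0.7)]], top_n=2)
print(chosen, merged)
```

Only compact per-term summaries travel to the directory; the documents and the full local indexes stay with the autonomous peers, and the query initiator contacts just the selected subset.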

Main Contributions to WP1

Our prototype system is described in:

Our current and future work covers the auto-generation of web services (for deep-web sources), automatic query mapping and P2P exploiting "collaborative intellectual input" (bookmarks, ontologies, query logs). We are also working on personalized ontologies based on long-term relevance feedback. An architecture of a scalable, self-organizing DL federation with intelligent and efficient searching represents, we feel, a valuable contribution to the DELOS Network of Excellence.

Masarykova universita v Brne:
Working Towards a Uniform Computational Framework

Introduction to the Group

The group consists of one professor, three associate professors, and five PhD students. The main research focus of the group is on advanced data processing methods. We are mainly interested in indexing techniques for the new types of data, for example multimedia data, as well as the new forms of data, such as the XML format.

Vision of a Future DLA

In our vision, the future architecture of digital libraries should respond to the ever-growing need for data processing software systems that abstract over a programmable infrastructure which, while provided locally, is distributed on a world-wide scale. We could visualize this architecture as a global computer that consists of the networked integration of individual computing units with potentially different computing, storage and networking capabilities. Each unit exposes a common interface that is co-ordinated so as to provide a global paradigm. As such, the global computer provides an abstraction that co-ordinates a potentially wide range of units, with the advantage of providing a uniform computational framework for particular sets of digital library applications.

Research Relevant to WP1

Recently, our effort has focused on developing a distributed storage structure for similarity searching in metric spaces that would scale up with constant or moderately increasing search times. In this respect, our proposal, called the Distributed Generalized Hyperplane Tree (GHT*), can be seen as a Scalable and Distributed Data Structure (SDDS) which uses the P2P paradigm for communication in a Grid-like computing infrastructure. We can achieve the desired effect for arbitrary metric data by linearly increasing the number of network nodes (whole computers), where each of them can act as a client and some of them can also operate as servers. A client inserts metric objects and issues queries, but there is not a specific (centralized) node to be accessed for all (insertion or search) operations. At the same time, insertion of an object, even the one causing a node split, does not require immediate update propagation to all network nodes. A certain degree of data replication can be tolerated. Each server provides some storage space for objects and also has a capacity to compute distances between pairs of objects. A server can send objects to other peer servers and can also allocate a new server.

The parallel search time for the similarity range and nearest neighbour queries in the GHT* becomes practically constant for arbitrary data volumes - the larger the dataset the greater the potential for inter-query parallelism. The GHT* has no hot spots - all clients and servers use as precise an addressing scheme as possible and they all incrementally learn from mis-addressing. Finally, updates are performed locally and a node split should never require the sending of multiple messages to many clients or servers.
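
The partitioning principle behind a generalized hyperplane tree can be shown with a single split. The Python sketch below is a simplified illustration, not the GHT* itself: it has one node with two pivots and two buckets (which in a distributed setting would sit on different servers), uses absolute difference on numbers as the metric, and omits node splits, replication and message passing. The class name `HyperplaneNode` and the sample data are assumptions made for the example.

```python
def distance(a, b):
    """Any metric works; absolute difference keeps the example small."""
    return abs(a - b)


class HyperplaneNode:
    """One generalized-hyperplane split: two pivots, two buckets."""

    def __init__(self, pivot_left, pivot_right):
        self.pivots = (pivot_left, pivot_right)
        self.buckets = ([], [])

    def insert(self, obj):
        # An object goes to the bucket of its closer pivot (ties go left).
        d_left = distance(obj, self.pivots[0])
        d_right = distance(obj, self.pivots[1])
        self.buckets[0 if d_left <= d_right else 1].append(obj)

    def range_search(self, query, radius):
        """Return (hits, visited bucket indexes) for a range query."""
        d_left = distance(query, self.pivots[0])
        d_right = distance(query, self.pivots[1])
        visited = []
        # By the triangle inequality a bucket can be skipped when the
        # query ball lies entirely on the other side of the hyperplane.
        if d_left - radius <= d_right + radius:
            visited.append(0)
        if d_right - radius < d_left + radius:
            visited.append(1)
        hits = [
            obj
            for side in visited
            for obj in self.buckets[side]
            if distance(query, obj) <= radius
        ]
        return hits, visited


node = HyperplaneNode(pivot_left=10, pivot_right=50)
for value in [3, 12, 25, 31, 48, 60]:
    node.insert(value)

near_pivot = node.range_search(query=5, radius=4)
near_border = node.range_search(query=29, radius=3)
print(near_pivot, near_border)
```

A query far from the dividing hyperplane touches only one bucket, while a query close to it must visit both; in a distributed structure those two visits can proceed in parallel on different servers.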

Main Contributions to WP1

A member of the group acted as a PC member of the Sixth International Workshop on Digital Library Architectures. We have co-operated closely with ISTI-CNR Pisa. As a result of this co-operation, we have published the following papers:

The following group members are involved in WP1 activities: Pavel Zezula, Michal Batko and Vlastislav Dohnal.

OFFIS:
Super Peer Networks: Providing Scalability and Autonomy

Introduction to the Group

Two of the five research divisions of OFFIS are involved in research on digital libraries. These are the division on Business Information and Knowledge Management and the division on Multimedia and Internet Information Services.

Vision of a Future DLA

The network depicted in Figure 3 represents a first step towards capitalizing on the advantages of peer-to-peer technology for digital libraries. For personal or project reference libraries, most of the traffic will remain within sub-areas of the network where co-workers co-operate intensely. For specialized collections which focus on special topics or special media types, queries can be routed directly to selected collections or even library experts without flooding the entire network. Precision and query performance can hence be improved. Additionally, a self-organization of collections and libraries is possible. Scalability and administrative autonomy are also ensured.

Research Relevant to WP1

Our research focuses on super peer networks. Figure 3 depicts a hierarchical super peer network for digital libraries. Users are able to search for artefacts and offer artefacts independently. They are therefore supplied with person peers. On the next organizational level, the artefacts are grouped within collections managed by collection peers. Collection peers offer functionality relating to collection organization, such as the provision of a common classification scheme. A digital library can combine a number of different collections and is associated with a digital library peer. A digital library peer supports the integration of different collections, for example by offering merging services for different classification schemes. Furthermore, it manages access to the digital library artefacts, for example by ensuring a certain mode of payment. Person peers and collection peers can also exist independently of a superordinate peer and offer artefacts autonomously.

Peers are organized in disjoint clusters. Super peers route the messages along clusters to the destination cluster. Within the clusters the messages move through the hierarchical structure of the peers. The hierarchical peer structures in combination with their super peers form a hierarchical super peer network. The super peers hold a common metadata index of available artefacts which are distributed over the different organizational units or peer types, respectively. They are able to answer simple queries. Detailed queries additionally pass through the hierarchical structure of the peers. The exchange of the artefacts located takes place directly from peer to peer.

Super peer networks have some advantages over pure peer-to-peer networks. They combine the efficiency of the centralized client-server model with the autonomy, load balancing, and robustness of distributed search. They also take advantage of the heterogeneity of capabilities across peers.

The most important benefits of the approach discussed here are scalability and administrative autonomy. A super peer can route messages independently within its cluster. Similarly, digital library and collection peers can route the messages to subordinated peers using their own strategy. Queries to selected organizational units do not flood the entire network but can be routed directly.


Figure 3: A hierarchical super-peer network for distributed artefacts
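
The routing behaviour described above can be sketched as follows. This is a minimal Python illustration under stated assumptions rather than the OFFIS implementation: the classes `HierarchyPeer` and `SuperPeer`, the functions `build_index` and `route`, and all peer and artefact names are hypothetical, and the common metadata index is modelled as a plain dictionary shared by all super peers.

```python
class HierarchyPeer:
    """A digital library, collection or person peer inside one cluster."""

    def __init__(self, name, artefacts=(), children=()):
        self.name = name
        self.artefacts = set(artefacts)
        self.children = list(children)

    def all_artefacts(self):
        found = set(self.artefacts)
        for child in self.children:
            found |= child.all_artefacts()
        return found

    def find(self, artefact, path=()):
        """Walk down the hierarchy; return the path to the holder or None."""
        path = path + (self.name,)
        if artefact in self.artefacts:
            return path
        for child in self.children:
            found = child.find(artefact, path)
            if found:
                return found
        return None


class SuperPeer:
    """Entry point of one cluster."""

    def __init__(self, name, root):
        self.name = name
        self.root = root


def build_index(super_peers):
    """Common metadata index: artefact -> super peer of the holding cluster."""
    return {a: sp for sp in super_peers for a in sp.root.all_artefacts()}


def route(index, entry, artefact):
    """Route a query from the entry super peer to the peer holding the artefact."""
    target = index.get(artefact)
    if target is None:
        return None
    hops = [entry.name] if entry is not target else []
    return hops + [target.name] + list(target.root.find(artefact))


alice = HierarchyPeer("alice", {"thesis.pdf"})
maps = HierarchyPeer("maps_collection", {"atlas.png"}, [alice])
library = HierarchyPeer("city_library", children=[maps])
bob = HierarchyPeer("bob", {"notes.txt"})  # a person peer with no superordinate peer

sp1, sp2 = SuperPeer("sp1", library), SuperPeer("sp2", bob)
index = build_index([sp1, sp2])

to_alice = route(index, sp2, "thesis.pdf")
to_bob = route(index, sp1, "notes.txt")
print(to_alice, to_bob)
```

The query travels between clusters only via the super peers and then descends one hierarchy, so no other cluster is contacted; the artefact itself would afterwards be exchanged directly between the two peers involved.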

The hierarchical super peer network supports the flexibility and self-organization of widely distributed, loosely coupled and autonomous digital library systems. The architecture allows for searching over collections of arbitrary artefacts, such as traditional documents, on-line books, digital images, and videos, which is a basic service requirement for digital libraries. Beyond this, the network also enables library users to store, administer and classify their own artefacts. Therefore, it supports scenarios like the construction of personal or group reference libraries and collaborative authoring.

In the new Probado Project, tools for locating, storing, and releasing non-textual, multimedia documents automatically will be developed. Current digital libraries do not support such documents adequately because they usually assume documents to be purely textual in content.

Main Contributions to WP1

A research assistant from OFFIS visited UMIT Innsbruck from 20 September to 1 October 2004. The purpose of the visit was to exchange knowledge about peer-to-peer architectures and to investigate the use of structured/hierarchical super-peer networks [1] [3] within the medical sector in order to solve the availability problem for distributed patient records. A patient record can be regarded as a specific kind of digital library artefact. These artefacts are typically distributed over several institutions and yet cannot be assumed to be available anywhere and everywhere to the doctor who needs them. A joint paper [2] describes a first approach regarding the use of super-peer networks to solve the described problems. To gain a deeper insight, however, additional research is needed.

Related Publications:

  1. Bischofs, Ludger; Hasselbring, Wilhelm; Schlegelmilch, Jürgen; Steffens, Ulrike: A Hierarchical Super Peer Network for Distributed Artefacts. In: Pre-proceedings of the Sixth Thematic Workshop of the EU Network of Excellence DELOS. S. Margherita di Pula (Cagliari), Italy, 2004, pp. 105-114.
  2. Bischofs, Ludger; Hasselbring, Wilhelm; Niemann, Heiko; Schuldt, Heiko; Wurz, Manfred: Verteilte Architekturen zur intra- und inter-institutionellen Integration von Patientendaten. In: Tagungsband der 49. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS 2004). September 2004.
  3. Bischofs, Ludger; Hasselbring, Wilhelm: A Hierarchical Super Peer Network for Distributed Software Development. In: Proceedings of the Workshop on Cooperative Support for Distributed Software Engineering Processes (CSSE 2004). Linz, Austria, September 2004.

Technical University of Crete:
Focusing on Service-oriented Architectures

Introduction to the Group

The Laboratory of Distributed Multimedia Information Systems and Applications of the Department of Electronic and Computer Engineering (ECE) of the Technical University of Crete (TUC/MUSIC) is a centre of research, development and education in the technological fields of Information Systems and their applications in the Information and Knowledge Society. In particular the Laboratory operates in the fields of Technologies, Architectures and Web Services Systems over the Internet, Databases and Knowledge Bases, Information Retrieval, Digital Libraries, Geographic Information Systems, Human Computer Interaction, Multimedia Management Systems, Digital TV Systems and Applications in e-Commerce, Tourism, Culture and e-Learning. TUC/MUSIC has been active since 1990 and has participated in more than 40 European and national projects as well as in many Networks of Excellence of the European Union.

The director of TUC/MUSIC is Prof. Stavros Christodoulakis. The research staff consists of six permanent staff members and about twenty postgraduate (Masters and PhD) students.

Vision of a Future DLA

Architectures could be Peer-to-Peer or pure Grid (with centralized co-ordination). The emphasis in the Peer-to-Peer approach is more on service provision by independent organizations and the composition of the independent services offered by DLs within a given network. These DLs may involve other smaller DLs (by SMEs) or even personal DLs. The emphasis in grid architectures with centralized knowledge of resources (CPU power, disks, network, etc) is more on dynamic workload balancing for large DLs or DLs which manage very large multimedia objects requiring real-time interaction and synchronization.

Research Relevant to WP1

Our current research focuses on Peer environments and service-oriented architectures. Service description according to MDA standards, service differentiation, service search and service synthesis are the current research topics being pursued.

In the future we are also planning to expand our research activities in the area of Grid architectures. Relevant research we have done in the past includes parallel streaming of audiovisual data from servers to support communities of users.

Main Contributions to WP1

Preliminary work on architectures for peer-to-peer computing and service-oriented environments was published at the cluster workshop in Sardinia. The work continues.

UKOLN:
Open Standards Key to the Creation of Long-Term Viable Applications and Resources

Introduction to the Group

UKOLN, based at the University of Bath, is a UK centre of expertise in digital information management, providing advice, support and services to the library, information, education and cultural heritage communities. UKOLN seeks to use its expertise to influence policy and inform best practice, to promote community-building and consensus-making by actively raising awareness, to advance knowledge through research and development, to build innovative systems and services based on Web technologies and to act as an agent for knowledge transfer.

The organisation is sub-divided into four main areas of activity, formally entitled Policy and Advice; Research and Development; Distributed Systems and Services; and Resources and Administration. The four groups offer a complementary set of skills adding up to a wide-ranging coverage across all the related disciplines of Digital Library management.

UKOLN is currently active in, among others, the new Digital Curation Centre (DCC), where it is one of the four key partners; the maintenance of the Resource Discovery Network (RDN); the JISC Information Environment; and the provision of technical support and consultancy to large-scale nationwide projects. UKOLN holds JISC focus posts in Collection Description, QA (Quality Assurance), Web (including the UK HE seat on the W3 consortium) and Interoperability, and is instrumental in many library and e-learning initiatives.

UKOLN is principally funded through the UK’s JISC and MLA organisations. Further information is available on our website http://www.ukoln.ac.uk/.

Vision of a Future DLA

UKOLN shares the vision of a digital library offering open access at any time, from any place, to all stored digital resources.

To achieve this, UKOLN is committed to the development and uptake of open standards and to overseeing the application of these standards in new architectures and network middleware. UKOLN sees the support and use of open standards as a key to creating long-term, viable applications and resources.

UKOLN understands that a future DLA must be built on functional, useful and agreed international metadata standards. UKOLN will be integral to the development and acceptance of well structured and globally accepted metadata definitions and schemas and will support the advancement and uptake of new and existing standards upon which the future DLA will be built.

UKOLN recognises that a future DLA will require new technologies to support distributed searching across multiple targets. UKOLN will continue its research and development work in this area through its existing commitments within the JISC Information Environment and will work with its partners to bring about the creation of exciting new developments in the application of cross-searching and data harvesting to the emerging peer-to-peer and GRID environments.

Research Relevant to WP1

UKOLN has a long history of achievement in the field of metadata for description and interchange. Currently, UKOLN is significantly involved in the DCMI (Dublin Core Metadata Initiative) with Andy Powell sitting on the usage committee and Pete Johnston working in the collection description area. Peter Dowdell, Monica Duke and Greg Tourte are currently engaged in implementation work integrating schemas into existing and new applications such as the RDN (Resource Discovery Network), EnrichUK and the JISC IESR (Information Environment Schema Registry). UKOLN will continue to use its expertise to develop schemas further, to propose and direct new research into the area and to use its reach to publicise developments in the field.

It also continues with the development of distributed information systems incorporating XML-based record sharing and harvesting, news dissemination through RSS, distributed searching and web services.

UKOLN has an important role as a promoter in the development and maintenance of open technical standards and good practice. UKOLN has been the custodian of the NOF-digitise technical standards and guidelines and is a partner in the new Digital Curation Centre, contributing to the establishment of a suitable standards framework in this important new nationwide initiative. Brian Kelly holds the JISC Web Focus post and is extensively involved in the promulgation and dissemination of open standards and good practice across the UK Higher and Further Education sectors.

UKOLN continues to maintain its extensive publication and event schedule as part of a key role in keeping its community informed and up to date with developments in the digital library sphere. UKOLN continues to publish the Web magazine ‘Ariadne’ and has recently successfully held ECDL 2004 at Bath, bringing together the principal players on the DL stage.

Main Contributions to WP1

UKOLN will hold a workshop during 2005 to initiate a discussion of standards frameworks suitable for underpinning the architecture of a DLA. This workshop will provide an open forum that will lead to the identification of all protocols and standards that will be relevant to the future DLA. Furthermore, we will encourage the introduction of new research and development in this area, especially in the rapidly evolving fields of P2P and GRID, where we expect DELOS as a whole will be instrumental in the advancement and codification of new technical standards to support these new technologies.

The first outcome of this workshop will be a healthy discussion and opening-up of the issues that confront us in the establishment of a DLA. UKOLN will lead and marshal these discussions, and will subsequently publish a draft document that will be our first guiding reference to the agreed standards and protocols upon which the DLA will be built.

UMIT: University for Health Sciences, Medical Informatics and Technology:
Designing an Infrastructure for Highly Networked Information in a Pervasive Computing Environment

Introduction to the Group

The University for Health Sciences, Medical Informatics and Technology (UMIT) is located in Innsbruck, Tyrol, Austria and was founded in 2001. UMIT's role is to explore the potential and practical applications of information and communication technologies, to contribute to high-quality, efficient health care which satisfactorily serves both the individual and society, and to contribute to progress in medical and health sciences research. Two research groups at UMIT actively participate in DELOS: the Institute of Information Systems (IIS) http://ii.umit.at/, headed by Prof. H.-J. Schek, and the unit for Software and Information Engineering (ISE) http://ise.umit.at/, headed by Prof. H. Schuldt.

IIS activities address the infrastructure for highly networked information in a pervasive computing environment - the "infrastructure of the information space". The vision of the group is a new infrastructure called "hyperdatabase" which is particularly appropriate to e-Health applications. In short, a hyperdatabase can be characterized as a synthesis of database technology, peer-to-peer computing, and a grid infrastructure. The activities of ISE support the vision of building reliable and dependable process-based systems, i.e., systems their users can count on. In most domains, and especially in digital libraries, specialized and well-engineered applications and databases are already in place and offer dedicated services providing access to information. In addition to process support and dependability, the infrastructure has to support automatic adaptation to changing environments. This requires the combination of aspects of workflow management, transactional process support, and grid infrastructures.

For more information about UMIT, see http://www.umit.at.

Vision of a Future DLA

A Digital Library does not exclusively focus on the management of static data, information, and knowledge that can be accessed at any time from any place, but increasingly also has to consider information that is dynamically modified and/or continuously generated. Examples of such continuously generated information are sensor data streams, which have to be processed and stored in an eHealth Digital Library for health monitoring applications, i.e., applications where elderly people or patients with chronic ailments are equipped with wearable devices and non-intrusive sensors to monitor and record their condition permanently. Therefore, the more general consideration of a Digital Library as Dynamic Ubiquitous Knowledge Environment (DUKE) must also have consequences for future Digital Library Architectures. Future Digital Libraries have to support a service-oriented architecture in order to make use of existing services and to combine services of the peers which are hosting services and information resources. Moreover, in order to achieve a high degree of scalability (i.e., to scale with the number of resources, services, service providers and applications in the information space), aspects of grid computing are needed.

Figure 4: Future DLA based on hyperdatabase technology for composite services

In our vision, the future DLA follows a hyperdatabase architecture (see Figure 4) which supports composite services and processes. A small hyperdatabase layer is installed on each peer in the information space. By applying clever replication management, meta information on:

  1. the applications (composite services and processes) to be executed,
  2. the providers of other services, and
  3. their load

is distributed among these hyperdatabase layers.

This allows for decentralized, peer-to-peer execution. Another important feature of future DLs is the potential to define sophisticated failure handling strategies within composite services and to be able to validate their correctness and derive quality characteristics during the design stage. This is particularly important for composite services and processes in eHealth DLs where correct functioning is critical.
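As a minimal sketch of the idea above, assuming a simplified metadata layout that is not taken from the hyperdatabase prototype itself, the following Python fragment shows how a peer holding a local replica of the three kinds of meta information could choose the provider for each step of a composite service without consulting a central coordinator. The names `LocalMetadata`, `choose_provider` and `plan_route` are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass
class LocalMetadata:
    """Replica of the meta information held by one peer (hypothetical layout)."""
    process_steps: dict[str, list[str]]  # process name -> ordered service names
    providers: dict[str, list[str]]      # service name -> peers offering it
    load: dict[str, float]               # peer -> last known load, 0.0 (idle) to 1.0 (busy)


def choose_provider(meta: LocalMetadata, service: str) -> str:
    """Pick the least loaded known provider of a service using local data only."""
    candidates = meta.providers.get(service, [])
    if not candidates:
        raise LookupError(f"no known provider for service {service!r}")
    # A peer with unknown load is treated as fully busy (an assumption of this sketch).
    return min(candidates, key=lambda peer: meta.load.get(peer, 1.0))


def plan_route(meta: LocalMetadata, process: str) -> list[tuple[str, str]]:
    """Map every step of a composite service to the provider chosen locally."""
    return [(step, choose_provider(meta, step)) for step in meta.process_steps[process]]


meta = LocalMetadata(
    process_steps={"search": ["extract", "index"]},
    providers={"extract": ["peerA", "peerB"], "index": ["peerC"]},
    load={"peerA": 0.8, "peerB": 0.2, "peerC": 0.5},
)
route = plan_route(meta, "search")
```

Because every decision reads only the local replica, the quality of the routing in such a scheme depends on how fresh the replicated load information is, which is where the replication management mentioned above comes in.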

Research Relevant to WP1

Research by IIS and ISE at UMIT focuses on the above-mentioned vision to build a highly dependable, scalable, and adaptable infrastructure for process-based applications and its application in health care. In several research projects, particular aspects of this infrastructure are being considered. One project seeks to combine features of a grid infrastructure with hyperdatabase functionality [1]. While the hyperdatabase controls the execution of composite services and processes, the additional grid features make it possible to split up the invocation of a service dynamically into a set of calls that can be issued in parallel. By taking into account the providers and services that are currently available, this allows us to distribute complex service requests dynamically between several providers and to make efficient use of their resources. This is particularly important when different medical databases and repositories need to be queried with given time constraints. Another focus of current research is on the identification, definition, provision, and combination of building blocks that support efficient searching over multimedia patient records [2]. While patient information is not usually stored in a centralized system but rather remains under the control of the treating physician or hospital, the capacity to query these resources efficiently is essential in order to build a virtual electronic health record of a patient. In the area of health monitoring, we have extended a hyperdatabase prototype which supports the combination of stream operators and which controls the execution of stream processes, i.e., processes that are continuously fed with different sensor data streams. A major result is a highly reliable hyperdatabase infrastructure that combines stream operators and (web) services within the same application [3], [4].
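The dynamic splitting of a service invocation into parallel calls, described above, can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function names `partition` and `invoke_in_parallel` are hypothetical, the `call` argument stands in for whatever remote invocation mechanism is actually used, and threads from the standard library replace a real grid scheduler.

```python
from concurrent.futures import ThreadPoolExecutor
from math import ceil


def partition(items, n):
    """Split items into at most n contiguous chunks of near-equal size."""
    if not items:
        return []
    size = ceil(len(items) / n)
    return [items[i:i + size] for i in range(0, len(items), size)]


def invoke_in_parallel(items, providers, call):
    """Distribute one request over the currently available providers.

    `call(provider, chunk)` is a placeholder for the remote service call and
    must return one result per item of the chunk, in order.
    """
    if not providers:
        raise ValueError("no provider is currently available")
    chunks = partition(items, len(providers))
    if not chunks:
        return []
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        futures = [pool.submit(call, p, c) for p, c in zip(providers, chunks)]
        results = []
        for future in futures:  # collected in submission order
            results.extend(future.result())
    return results


def fake_call(provider, chunk):
    # Stand-in for querying a medical database; only tags each item.
    return [f"{provider}:{item}" for item in chunk]


answers = invoke_in_parallel(["q1", "q2", "q3"], ["dbA", "dbB"], fake_call)
```

Collecting the futures in submission order keeps the merged result aligned with the original request, at the cost of waiting for the slowest provider.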

Main Contributions to WP1

A major contribution to WP1 is the architecture of an infrastructure for the execution of composite services in a peer-to-peer style following the notion of a Digital Library as Dynamic Ubiquitous Knowledge Environment (DUKE). This work is being addressed in close collaboration with another DELOS WP1 partner [5], the database research group of ETH Zürich. The experience gained with this infrastructure and its application in health care will also contribute to the WP1 survey on service-oriented architectures, peer-to-peer systems and grid infrastructures.

In a co-operative effort with OFFIS (Prof. W. Hasselbring's group), we are addressing the problem of availability in medical patient records by applying clever replication management over a peer-to-peer network. Preliminary results of this co-operation [6] will be reinforced by extending such collaboration, especially by means of exchanging PhD students. UMIT is co-ordinating a survey on service-oriented architectures and in particular on their potential for the construction of next-generation DL environments.

Related Publications:

  1. M. Wurz, G. Brettlecker, H. Schuldt: Data Stream Management and Digital Library Processes on Top of a Hyperdatabase and Grid Infrastructure. In: Pre-Proceedings of the 6th Thematic Workshop of the EU Network of Excellence DELOS: Digital Library Architectures - Peer-to-Peer, Grid, and Service-Orientation (DLA 2004), pages 37-48, Cagliari, Italy, June 2004, Edizioni Progetto Padova.
  2. M. Springmann, H-J. Schek, H. Schuldt: Kombination von Bausteinen zur ähnlichkeitsbasierten Suche in elektronischen Multimedia-Patientenakten. To appear in: Tagungsband der 49. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS 2004), Innsbruck, Austria, September 2004. In German.
  3. G. Brettlecker, H. Schuldt, R. Schatz: Hyperdatabases for Peer-to-Peer Data Stream Processing. In: Proceedings of the 2nd International Conference on Web Services (ICWS'2004), pages 358-366, San Diego, CA, USA, July 2004, IEEE Computer Society.
  4. G. Brettlecker, H.-J. Schek, H. Schuldt: Information Management Infrastructure for Telemonitoring in Healthcare. To appear in: Tagungsband der 49. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS 2004), Innsbruck, Austria, September 2004.
  5. C. Schuler, R. Weber, H. Schuldt, H.-J. Schek: Scalable Peer-to-Peer Process Management - The OSIRIS Approach. In: Proceedings of the 2nd International Conference on Web Services (ICWS'2004), pages 26-34, San Diego, CA, USA, July 2004, IEEE Computer Society.
  6. L. Bischofs, W. Hasselbring, H. Niemann, H. Schuldt, M. Wurz: Verteilte Architekturen zur intra- und inter-institutionellen Integration von Patientendaten. To appear in: Tagungsband der 49. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS 2004), Innsbruck, Austria, September 2004. In German.

University of Athens:
Information Access Methods: Respecting the Autonomy of Differing Digital Libraries

Introduction to the Group

The University of Athens (UoA) group participating in DELOS is led by Professor Yannis Ioannidis and has twelve members, all coming from the Department of Informatics and Telecommunications. The group's research focuses on several aspects of Digital Libraries, including Digital Library Architectures (DLAs). Specifically, it studies information access in distributed DL architectures, especially those based on the P2P or Grid paradigm, with particular attention to distributed information access and query personalization. Advanced distributed information access methods are critical, as the forthcoming large-scale DL networks will have to provide prompt responses to end-user queries and traditional approaches do not scale up. Query personalization, the group's other focus, enables DLs to provide end-users with exactly the data they need by automatically tailoring the behaviour of the searching facilities to the users' preferences.

Vision of a Future DLA

As the volume of available information increases, the size of future Digital Libraries is most likely to lead to the adoption of large-scale distributed architectures, such as those developed for federated databases or those based on the GRID or the P2P paradigm. Independent of any progress made in the hardware available, distributed architectures are the only solution to scalability problems; additionally, they provide single points of (homogeneous) access to information held by multiple institutions.

Figure 5: Interacting Digital Libraries

An important prerequisite to the success of distributed DLAs is respecting the autonomy of participating DL systems. Search and browse access methods should be developed for an environment where participating heterogeneous nodes or DLs designate the exact hardware resources and information content (data and metadata) that are available to inquiring nodes, while the latter make no a priori assumptions on the capabilities of the former. Each individual DL may be of a co-operative inclination to provide information for the benefit of all or may be in the competitive business of selling information. Either way, DLs should be allowed to interact freely with each other in unspecified patterns, each time making decisions based on the needs of the current query or processing request.

Incidentally, availability of large-scale distributed DL systems will increase the volume of information that can be browsed and searched. Taming the potential information overload will be helped considerably by the adoption of effective personalization techniques.

Research Relevant to WP1

Within WP1, UoA has focused its research on developing information access methods suitable for distributed DLAs in the direction mentioned above, i.e., where DL autonomy is respected at all times. Such an environment poses significant challenges to query processing and optimization, as it results in a lack of knowledge about any particular node with respect to the information it can produce and its characteristics, e.g., cost of production or quality of produced results. Potential inter-node competition also creates difficulties, as it results in potentially inconsistent behaviour of the nodes at different times. UoA envisages query processing and optimization of a form that resembles a commodity-trading negotiation framework, where, in this case, information is the object being traded between independent DLs. In a recent paper (EDBT Conference, 2004), such a framework was demonstrated to respect node autonomy, to handle node heterogeneity, and to offer effective query processing.
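To make the trading analogy concrete, the following Python sketch models a single round of negotiation in which a buying node asks autonomous selling nodes for offers on a query and accepts the best affordable one. This is a deliberately reduced illustration and not the framework of the cited paper: the `Offer` fields, the linear utility function and the `cost_weight` parameter are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Offer:
    seller: str
    cost: float     # price the seller asks for answering the query
    quality: float  # seller's own estimate of answer quality, 0.0 to 1.0


# A seller is autonomous: it may return an offer or decline with None,
# and the buyer assumes nothing about how that decision is made.
Seller = Callable[[str], Optional[Offer]]


def negotiate(query: str, sellers: list[Seller], max_cost: float,
              cost_weight: float = 0.1) -> Optional[Offer]:
    """Collect offers for a query and return the one with the highest utility."""
    offers = [offer for seller in sellers if (offer := seller(query)) is not None]
    affordable = [offer for offer in offers if offer.cost <= max_cost]
    if not affordable:
        return None
    # Hypothetical utility: quality minus a weighted cost penalty.
    return max(affordable, key=lambda o: o.quality - cost_weight * o.cost)


sellers = [
    lambda q: Offer("dlA", cost=5.0, quality=0.9),
    lambda q: Offer("dlB", cost=2.0, quality=0.7),
    lambda q: None,                                  # declines to answer
    lambda q: Offer("dlD", cost=20.0, quality=1.0),  # too expensive for this buyer
]
best = negotiate("herbal illustrations", sellers, max_cost=10.0)
```

A fuller treatment would allow several rounds of counter-offers and would let a seller subcontract parts of a query to other nodes; the single round shown here only fixes the vocabulary of buyers, sellers and offers.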

Main Contributions to WP1

The UoA members involved in DELOS WP1 activities are primarily Professor Yannis Ioannidis and PhD Candidate Fragkiskos Pentaris. The team has participated in the 6th DELOS Thematic Workshop in Sardinia (June 2004), where a paper was presented on the overall framework for autonomous query optimization and execution in a distributed DL system. The paper recognized queries and query answers as commodities, modelled query optimization as a trading negotiation process, and outlined several aspects of this approach which required further investigation.

Università degli Studi di Milano:
Providing Ubiquitous Access through Personalized and Content-aware Presentation

Introduction to the Group

The Università degli Studi di Milano (UNIMI) is one of the largest Italian universities, with 60,000 students enrolled and several schools (Mathematical, Physical, and Natural Sciences, Humanities, Law, Political Sciences, Medicine, Pharmacy, Agriculture). The UNIMI team involved in DELOS consists of people from the Database & Security Group (DB&Sec) and a research group from the Department of Computer Science and Communication (DICo).

The DB&Sec group consists of 8 members and 10 PhD students. Research is focused on the following areas:

The DB&Sec group is affiliated with the Center for Education and Research in Information Assurance and Security (CERIAS) of Purdue University, Indiana, USA.

DICo has about 35 faculty members and supports 4 different degrees (Computer Science, Digital Communication, Informatics for Telecommunications, Science and Technology for Musical Communication), 2 master's degrees, and a PhD programme. Its main research areas are concerned with:

Vision of a Future DLA

A future DLA should take advantage of current advances in system architectures and networks, which have produced architectural paradigms such as grid-computing systems, wireless grid systems and P2P that provide virtually unlimited computing and storage capabilities. The emphasis should be on providing ubiquitous access to the DLA while supporting personalized and context-aware content presentation. Collaborative learning and discovery processes should also be supported. Finally, models and mechanisms for security, privacy and IPR should be part of any such solution.

Research Relevant to WP1

Digital library-related activities at the UNIMI currently include the investigation of techniques supporting query formulation and data presentation from virtual reality (VR) environments. Thus, VR and database techniques are being integrated. Such research is carried out in the framework of the DHX Project (project IST-2001-33476). In addition, techniques and tools are being developed to generate multimedia presentations automatically, based on constraint languages. A third area of activity concerns security issues for database systems and advanced data management systems. Among the various research directions pursued, the most relevant are:

Main Contributions to WP1

We have started to develop a discretionary access control system for resources in a grid architecture. The system is based on the XACML (eXtensible Access Control Markup Language) standard and exploits the notion of virtual community to group together grid nodes adopting the same policies. We have developed scheduling algorithms which, given a request by a computation for access to resources, allow the system to determine the available resources by taking into account the access control policies. We are currently developing a preliminary prototype of our system in order to compare its performance with that of the Condor system. Finally, we are investigating an approach that would allow users submitting computations on a grid to specify security requirements. This approach is tailored to computations organized according to workflow systems.
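The interplay of virtual communities and policy-aware scheduling described above can be sketched in a few lines of Python. The sketch makes strong simplifying assumptions: a policy is reduced to a set of permitted roles rather than a real XACML policy, no XACML engine is involved, and the names `Community`, `Node` and `eligible_nodes` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Community:
    """Virtual community: a group of grid nodes sharing one policy."""
    name: str
    allowed_roles: frozenset  # stand-in for a real XACML policy


@dataclass(frozen=True)
class Node:
    name: str
    community: Community
    free_cpus: int


def eligible_nodes(nodes, user_roles, cpus_needed):
    """Return the names of nodes that may and can run the computation.

    The policy is evaluated once per community, not once per node, which is
    the benefit of grouping nodes that adopt the same policies.
    """
    decisions = {}  # community name -> access permitted?
    result = []
    for node in nodes:
        community = node.community
        if community.name not in decisions:
            decisions[community.name] = bool(community.allowed_roles & user_roles)
        if decisions[community.name] and node.free_cpus >= cpus_needed:
            result.append(node.name)
    return result


physics = Community("physics", frozenset({"physicist", "admin"}))
library = Community("library", frozenset({"librarian"}))
nodes = [
    Node("n1", physics, free_cpus=8),
    Node("n2", physics, free_cpus=1),
    Node("n3", library, free_cpus=16),
]
usable = eligible_nodes(nodes, user_roles={"physicist"}, cpus_needed=4)
```

A scheduler would then place the computation on one of the returned nodes; how that final choice is made is outside the scope of this sketch.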

Related publications:

E. Bertino, P. Mazzoleni, B. Crispo, S. Sivasubramanian, E. Ferrari, "Towards Supporting Fine-Grained Access Control for Grid Resources" to appear in Proceedings of IEEE 10th International Workshop on Trends in Distributed Computing Systems - FTDCS 2004, Suzhou, China, May 26-28, 2004.

E. Bertino, B. Crispo, P. Mazzoleni, "Support Multi-Dimensional Trustworthiness for Grid Workflow" submitted to the DELOS Workshop on "Digital Library Architectures: Peer-to-Peer, Grid, and Service-Orientation", 2004.

Università degli Studi di Padova:
Flexibility of Implementation to Reflect Differing Architectural Paradigms

Introduction to the Group

The Information Management Systems (IMS) Research Group (http://www.dei.unipd.it/) is one of the research groups within the Department of Information Engineering of the University of Padua, Italy (http://www.unipd.it). The IMS group, led by Professor Maristella Agosti, has a strong programme of research, based on both theory and experiment, aimed at producing new multimedia information management and retrieval tools. The group, which began work more than fifteen years ago, currently numbers ten members, all from the Department of Information Engineering but with teaching commitments in different faculties of the University.

The group has a good research history in the area of information retrieval, database management and digital libraries research. Present interests include aspects of:

The group plays a role in both the national and international research communities. In the past the group was active in the network of excellence IDOMENEUS, and the working group Mira, as well as projects such as JUKEBOX and EUROIEMASTER.

Members of the group have also been extensively involved in organising conferences, workshops and summer schools in the area of digital libraries and information retrieval. Some relevant examples are:

Recently the IMS research group has co-operated in the organization of SPIRE 2004, the Eleventh Symposium on String Processing and Information Retrieval, which took place in the Department as part of Dialogues 2004, and two DELOS workshops, the Sixth Thematic Workshop of the EU Network of Excellence DELOS on Digital Library Architectures, in Sardinia in June 2004, and the Workshop on the Evaluation of Digital Libraries, again held in the Department in October 2004.

Vision of a Future DLA

The architecture of a digital library allows and supports the functionalities that the digital library systems designed on it are going to make available to the different categories of users. This means that the architecture strongly influences the effective capabilities and functionalities of digital library systems that can be based on it. The architecture of future digital libraries needs to be designed and implemented in a way that easily supports the evolution of information access and management services that are offered to end-users. This means that an architecture of this type needs to cope with different user requirements and must allow the addition of new functionalities to a digital library based on it without the need to re-design the underlying architecture.

Another important aspect is the flexibility of such an architecture which must permit its implementation according to different architectural paradigms, such as Web Services (WS) or Peer-to-Peer (P2P).

Research Relevant to WP1

In the past the IMS research group has participated in relevant European and national projects contributing to the design of advanced architectures for digital archives and libraries. Most relevant projects were:

The most relevant projects currently under development and to which the IMS research group is contributing are the ECD and the IPSA projects, briefly introduced below.

The group participates in Action 1 of the ECD (Enhanced Content Delivery) Project, a national research project supported by both the Italian National Research Council (CNR) and the Italian Ministry of Education, University and Research (MIUR). The project is addressing the development of methodological tools and technologies for the delivery of enhanced contents to end-users. The main objective of the IMS group's research in this project is the design and development of a prototype for an Annotation Service (AS) for Digital Libraries. The service is going to deal with the different and relevant aspects of annotations, such as creation, management, access and retrieval of both manual and automatically created annotations.

The IPSA Project, launched at the University of Padua in 2002, is relevant to DELOS WP1 activities since it aims to design and construct a digital library of drawings and illustrations of historical documents, where the digital library in this instance is to serve researchers of the art and history of scientific illustration. An analysis of user requirements has been carried out and a methodology for accessing a digital herbal has been developed; the development of a prototype system, also named IPSA, is now under way. IPSA is a Web application that is based on a three-tier architecture. The system is based on software distributed under open source licence; the application has been developed on a Debian GNU/Linux platform.

Related Publications (in chronological order):
