The IMPACT Project
Improving Access to Text
IMPACT is a European project that aims to speed up the process and enhance the quality of mass digitisation in Europe. The IMPACT research programme will significantly improve digital access to historical printed text through the development and use of innovative Optical Character Recognition software and linguistic technologies.
IMPACT will also build capacity in mass digitisation across Europe. The twenty-six partners (eleven libraries, thirteen research institutes or universities, and two private sector companies) collectively constitute a Centre of Competence that will share best practice and expertise with the cultural heritage communities in Europe.
Project news
Authoritative and up-to-date IMPACT news is available from:
Project background
Improving Access to Text (IMPACT) is a EUR 11.5 million research project funded by the European Union as part of the Seventh Framework Programme (FP7), and led by the National Library of the Netherlands (Koninklijke Bibliotheek). This four-year "Large-scale Integrating Project" started with 15 partners, including national and research libraries, universities and industrial partners.
The project started work in January 2008 and its sub-projects are focused on three main activity areas:
- Text recognition - research into improving the digitisation and OCR workflow, including: the enhancement of images to maximise document segmentation and OCR; improved segmentation methodologies; adaptive OCR; the integration of language models into OCR; and exploring novel techniques for OCR processing
- Enhancement and enrichment - research into improvements that can be made after OCR, including: tools to support the collaborative correction of OCR output; the definition and development of historical lexica for English, Dutch and German; and the enhancement of the XML output files that combine OCR results with technical and layout metadata
- Operational context/Capacity building - exploring the integration of IMPACT tools into their wider digitisation context, including: a technical framework for integrating all IMPACT tools and enabling their take-up by second-phase partners; an evaluation framework for IMPACT tools; a requirements forum; and the development of a range of externally-facing resources, such as documentation on digitisation workflows and IMPACT tools, a helpdesk, and a series of training events
UKOLN's main role in the project is to work on the externally-facing parts of IMPACT, primarily by helping to produce and disseminate documentation (best practice guides, briefings, case studies, etc.) on text digitisation frameworks and IMPACT tools, together with a series of training events.
Project partners
There were originally fifteen full partners in IMPACT (seven libraries, six research institutes and two private sector companies):
Library partners
- Koninklijke Bibliotheek (KB), National Library of the Netherlands - project co-ordinator
- Bayerische Staatsbibliothek (BSB), Bavarian State Library
- Bibliothèque nationale de France (BnF), National Library of France
- British Library (BL) - Operational Context and Capacity Building sub-project leader
- Deutsche Nationalbibliothek (DNB), German National Library
- Niedersächsische Staats- und Universitätsbibliothek Göttingen, Goettingen State and University Library
- Österreichische Nationalbibliothek (ONB), Austrian National Library - Enhancement and Enrichment sub-project leader
University and research partners
- ABBYY Production LLC
- University of Bath, UKOLN
- IBM Israel, IBM Haifa Research Laboratory
- Universität Innsbruck, Universitäts- und Landesbibliothek Tyrol, Abteilung für Digitalisierung und elektronische Archivierung, Department for Digitisation and Digital Preservation - Text Recognition sub-project leader
- Institut voor Nederlandse Lexicologie (INL), Institute for Dutch Lexicology, Leiden
- Ludwig-Maximilians-Universität München (LMU), Centrum für Informations- und Sprachverarbeitung (CIS)
- National Centre for Scientific Research "Demokritos" (NCSR), Athens
- University of Salford, School of Computing, Science & Engineering, Pattern Recognition and Image Analysis Laboratory (PRImA)
From 2010, an additional 11 partners have joined the project, primarily to help test the IMPACT tools and their extensibility to new language groups. They are a mixture of libraries and research groups and are based in Bulgaria, the Czech Republic, France, Poland, Slovenia and Spain.
- Bulgarian Academy of Sciences, Institute for Parallel Processing, Sofia
- SS. Cyril and Methodius National Library (NLB), Sofia
- Univerzita Karlova v Praze (Charles University in Prague), Institute of the Český národní korpus (Czech National Corpus)
- Národní knihovna České republiky (NKC), National Library of the Czech Republic, Prague
- CNRS and Nancy-Université, Analyse et Traitement Informatique de la Langue Française (ATILF), Nancy
- Poznańskie Centrum Superkomputerowo-Sieciowe (PSNC), Poznań Supercomputing and Networking Center
- Uniwersytet Warszawski
- Institut "Jožef Stefan", Department of Knowledge Technologies, Ljubljana
- Narodna in univerzitetna knjižnica (NUL), National and University Library, Ljubljana
- Fundación Biblioteca Virtual Miguel de Cervantes and Universidad de Alicante
- Biblioteca Nacional de España (BNE), Madrid
UKOLN's contribution to the project
UKOLN is the leader of two work packages in IMPACT:
- CB1: Learning resource toolbox
- CB4: Training
UKOLN publications and presentations
- IMPACT Metadata Best Practice Guide [paper]
  October 2010. Please send comments to the IMPACT LinkedIn Group: http://www.linkedin.com/groups?mostPopular=&gid=130648
  Pilot version (PDF) available at: http://www.impact-project.eu/uploads/media/IMPACT-metadata-bpg-pilot-1.pdf
- The Improving Access to Text (IMPACT) project and other European initiatives [presentation]
  September 2009. JISC Workshop: OCR for the Mass Digitisation of Textual Materials, University of Bath, 24 September 2009
- IMPACT Conference: Optical Character Recognition in Mass Digitisation [conference report]
  April 2009. Ariadne 59
UKOLN staff working on the IMPACT Project
Ed Bremner
Research Officer
E-mail: e.bremner@ukoln.ac.uk
Marieke Guy
Research Officer (part time)
E-mail: m.guy@ukoln.ac.uk
Michael Day
R&D Team Leader
E-mail: m.day@ukoln.ac.uk
IMPACT is funded as part of the European Union's Seventh Framework Programme. It is managed by the Cultural Heritage and Technology Enhanced Learning unit of the European Commission's Information Society and Media Directorate General (DG INFSO).