Background
Digitisation is a production process. Large numbers of analogue items, such as
documents, images, audio and video recordings, are captured and transformed into
the digital masters that a project will subsequently work with. Understanding the
many variables and tasks in this process - for example the method of capturing
digital images in a collection (scanning or digital photography) and the conversion
processes performed (resizing, decreasing bit depth, converting file formats, etc.)
- is vital if the results are to remain consistent and reliable.
By documenting the workflow of digitisation, a life history can be built up for
each digitised item. This history records the decisions that were made, makes
problems easier to track, helps to maintain consistency and gives users
confidence in the quality of your work.
What to Record
Workflow documentation should enable us to tell what the current status of an
item is, and how it has reached that point. To do this the documentation needs
to include important details about each stage in the digitisation process, and its outcome.
- What action was performed at a specific stage? Identify the
action performed. For example, resizing an image.
- Why was the action performed? Establish the reason that a
change was made. For example, a photograph was resized to meet pre-agreed image standards.
- When was the action performed? Indicate the specific date
the action was performed. This will enable project development to be tracked through the system.
- How was the action performed? Ascertain the method used to
perform the action. A description may include the application in use,
the machine ID, or the operating system.
- Who performed the action? Identify the individual responsible
for the action. This enables actions to be traced and similar problems
in related data to be identified.
By recording the answers to these five questions at each stage of the digitisation
process, the progress of each item can be tracked, providing a detailed breakdown
of its history. This is particularly useful for tracking errors and locating
similar problems in other items. The actual digitisation of an item is clearly
the key point in the workflow, and therefore formal capture metadata (metadata
about the actual digitisation of the item) is particularly important.
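As a minimal sketch of how the five questions could be held for each stage, the Python below models one stage as a record and an item's life history as a list of such records. The class name, field names, function name and sample values are all hypothetical and are not taken from any of the metadata standards discussed in this document.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class WorkflowStep:
    """One stage in an item's digitisation history (hypothetical structure)."""
    what: str   # the action performed, e.g. "resize"
    why: str    # the reason the action was performed
    when: date  # the date the action was performed
    how: str    # the method: application, machine ID, operating system
    who: str    # the individual responsible for the action


def steps_by(history, who):
    """Return the steps in a history that were performed by one person."""
    return [step for step in history if step.who == who]


# Invented sample history for a single digitised photograph.
history = [
    WorkflowStep("scan", "create the digital master", date(2005, 3, 1),
                 "flatbed scanner, machine ID SCAN-02", "A. Smith"),
    WorkflowStep("resize", "meet pre-agreed image standards", date(2005, 3, 2),
                 "image editor on workstation WS-07", "B. Jones"),
]

# The last recorded step gives the item's current status.
current_status = history[-1].what
```

Filtering a history by person, date or method in this way is one route to locating similar problems in other items once an error has been traced to a particular stage.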
Where to Record the Information
Where possible, select an existing schema with a binding to XML:
- TEI (Text Encoding Initiative) and EAD (Encoded Archival Description)
for textual documents.
- NISO Z39.87 for digital still images.
- SMIL (Synchronized Multimedia Integration Language), MPEG-7 or the
Library of Congress' METS A/V extension for audio/video.
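The sketch below illustrates the general idea of writing one workflow stage out as XML with Python's standard library. The element names (step, what, why, when, how, who) are invented for illustration and do not belong to TEI, EAD, NISO Z39.87, SMIL, MPEG-7 or METS; a real project would map these fields onto the elements defined by whichever schema it selects.

```python
import xml.etree.ElementTree as ET


def step_to_xml(what, why, when, how, who):
    """Build an XML element for one workflow stage.

    The element names are hypothetical and are not drawn from any
    published metadata schema.
    """
    step = ET.Element("step")
    fields = (("what", what), ("why", why), ("when", when),
              ("how", how), ("who", who))
    for name, value in fields:
        ET.SubElement(step, name).text = value
    return step


# Invented sample values for a single stage.
element = step_to_xml("resize", "meet pre-agreed image standards",
                      "2005-03-02", "image editor on workstation WS-07",
                      "B. Jones")
xml_text = ET.tostring(element, encoding="unicode")
```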
Quality Assurance
Apply QA techniques to check your XML documents for errors:
- Validate the XML against your schema, or at least check that it is
well-formed using an XML parser
- Check that free text entries follow local rules and style guidelines
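A rough sketch of both checks in Python follows. It assumes a hypothetical local rule that dates are written as YYYY-MM-DD inside an element named when; both the rule and the element name are illustrative only. Note that the standard library parser used here checks well-formedness but does not validate against a schema, so full validation would need a validating parser.

```python
import re
import xml.etree.ElementTree as ET

# Assumed local style rule: dates are written as YYYY-MM-DD.
DATE_RULE = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_workflow_xml(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        # The document is not well-formed, so no further checks are possible.
        return [f"not well-formed: {error}"]

    problems = []
    # "when" is a hypothetical element name holding the date of an action.
    for when in root.iter("when"):
        value = (when.text or "").strip()
        if not DATE_RULE.fullmatch(value):
            problems.append(f"date does not follow local rule: {value!r}")
    return problems
```

For example, check_workflow_xml("<step><when>2 March 2005</when></step>") reports the date as breaking the assumed rule, while a document with an unclosed element is reported as not well-formed.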
Further Information
- Encoded Archival Description,
<http://www.loc.gov/ead/>
- A Metadata Primer,
<http://www.cmswatch.com/Features/TopicWatch/FeaturedTopic/?feature_id=85>
- Dublin Core Metadata Initiative,
<http://dublincore.org/>
- MARC Standards,
<http://www.loc.gov/marc/>
- MPEG-7 Standard,
<http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm>
- Synchronized Multimedia,
<http://www.w3.org/AudioVideo/>
- TEI Consortium,
<http://www.tei-c.org/>
- Three SGML metadata formats: TEI, EAD, and CIMI,
<http://hosted.ukoln.ac.uk/biblink/wp1/sgml/tei.rtf>
- Z39.87: Technical metadata for still digital images,
<http://www.niso.org/standards/resources/Z39_87_trial_use.pdf>