This page is for printing out all of the briefing papers. Note that some of the internal links may not work.
FAQs are available in the following areas:
Much valuable information is available on the TASI site.
Suitable resolutions for digital master files for various media types are discussed in the HEDS Matrix, and the JIDI Feasibility Study contains a useful table of minimum baseline resolutions for each type of original material.
A detailed discussion of resolution, binary and bit depth can be found on TASI's Web pages and a good basic guide to colour capture can also be found on the EPIcentre Web pages.
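As a rough illustration of how resolution and bit depth determine the size of an uncompressed master file (a sketch of the arithmetic only; the figures are illustrative, not a recommendation for any particular material type):

// Uncompressed image size = width (pixels) x height (pixels) x bytes per pixel.
// Example: an A4 page (210 x 297 mm) scanned at 300 dpi in 24-bit colour.
var dpi = 300;
var width = Math.round(8.27 * dpi);   // 210 mm is about 8.27 inches
var height = Math.round(11.69 * dpi); // 297 mm is about 11.69 inches
var bytesPerPixel = 3;                // 24-bit colour = 3 bytes per pixel
var totalBytes = width * height * bytesPerPixel;
// about 26 million bytes, i.e. roughly 25 megabytes uncompressed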
TIFF is the best image format to use for storage. TIFF is recommended over JPEG because JPEG is an inherently lossy compression technique: whenever an image file is converted to JPEG format, some detail is lost. However, as you have noticed, the changes that occur are very subtle at high "quality" settings of JPEG compression. You say that you cannot "see" any difference: can I suggest that you try the following?
1) Open your two specimen files, JPEG and TIFF, side by side.
2) Blow up to maximum zoom the same area on both images. Select a portion of the image with a good range of colours: an edge of the object, for example.
3) Select the Eyedropper tool (keystroke "I" for shortcut) and make sure the "Color" floating tool bar is open.
4) Right click on any part of either image and select "Point Sample" (or adjust this setting in Eyedropper options on the "Options" floating tool bar).
5) Now left click on a pixel in the TIFF image. In the "Color" tool bar you should see the colour value of the pixel you have selected. Note this value.
6) Left click on the same pixel in the JPEG image. Note the displayed colour value.
You should observe a general slight difference in the colour values at any specific point in the image. It is indeed very difficult to "see" this difference with the eye, but I hope that this numerical demonstration will prove to you that the two images are not identical. The JPEG compression routine does not store the discrete value of each pixel in the image: it stores a mathematical function that is used to re-generate the colour values, and this process results in approximate values for many of the pixels in the image.
Note also that TIFF files can be stored with LZW compression enabled, reducing the size of the file dramatically. LZW compression does not result in any change to the values of any pixels in the image, so is suitable for archiving and preservation purposes.
MPEG
MPEG is a set of international standards for audio and video compression. MPEG-4 is the newest MPEG standard, and is designed for delivery of interactive multimedia across networks. As such, it is more than a single codec, and includes specifications for audio, video and interactivity. Windows Media encodes and decodes MPEG4. Quicktime and Realplayer are working on versions which will do the same.
MPEG produces high quality video which can be streamed over networks. Quicktime and Realmedia use the MPEG standards to improve the quality of their files for delivery on the Web.
It is possible to store audio or video in an MPEG format, and to play an MPEG file. This would be NOF's preferred solution, as proper MPEG files are open, non-proprietary, and should be readable by most audio and video player programs and plug-ins. Many/most current web browsers have the capability to play MPEG-1 video without any extra plug-ins.
RealPlayer, Windows Media Player et al support a variety of audio and video formats, including MPEG, and a range of proprietary formats such as AVI.
Digital Audio
Standards for storage and playout may differ. Commonly an archive/library would wish to store (preserve) the highest quality possible - meaning uncompressed - but would deliver using a standard and data rate appropriate to user requirements.
Electronic delivery could then involve compression.
Software which delivers at multiple data rates, according to an Internet user's connection, is now available from Real and Quicktime, amongst others, but the 'master copy' should ordinarily be the 'best available', which would usually mean uncompressed, linear PCM with a sampling rate and quantisation appropriate to the bandwidth and dynamic range of the material. This form of audio is typically held in .WAV files (though there are over 100 registered forms of coded audio possible within WAV, including highly compressed forms).
Within European broadcasting, 16-bit quantisation and 48 kHz sampling are the EBU (European Broadcasting Union) recommendation for broadcast quality audio. The EBU has gone a step further and added metadata to WAV, to add information critical to broadcasting and broadcast archives, forming the "Broadcast Wave Format" standard: BWF.
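As a rough guide to what this implies for storage (a sketch of the arithmetic only):

// Data rate of uncompressed linear PCM = sampling rate x bits per sample x channels.
var samplingRate = 48000;  // 48 kHz, the EBU recommendation
var bitsPerSample = 16;    // 16-bit quantisation
var channels = 2;          // stereo
var bitsPerSecond = samplingRate * bitsPerSample * channels;  // 1,536,000 bit/s
var bytesPerMinute = (bitsPerSecond / 8) * 60;
// about 11.5 megabytes per minute of stereo audio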
The actual transfer of analogue material to digital format, especially in bulk or for unique items, is not simple. For European radio archives, standardisation and guidance are being developed within the EBU Panel 'Future Radio Archives'.
In their public service role, the BBC would be pleased to offer advice to libraries / archives requiring help - providing it is for non-commercial purposes.
A few years ago VRML (Virtual Reality Modeling Language) was thought to be the emerging standard for virtual reality. However, VRML failed to gain wide-scale market acceptance and is now evolving: its successor, X3D, will make use of XML to provide the syntax for 3D worlds. The development of X3D is being coordinated by the Web3D Consortium.
A range of browser plugins to render X3D worlds is available; see the Web3D Consortium web site for details.
The requirement that an alternative format must be provided if a plug-in is required is intended primarily for accessibility purposes, and to ensure that an open format is available if a project makes use of a proprietary format which requires a plugin. In the case of 3D visualisation it is recognised that a textual equivalent will probably not be appropriate. Since X3D is an open standard which is currently accessible primarily through browser plugins, the use of these plugins is acceptable.
RealVideo
It is suggested that all projects work from a format from which material can easily be re-digitised, e.g. DV or DVCam for video. Media, particularly video, will need to be re-digitised for delivery as technology advances. An equally important issue is copyright: make sure that all footage is covered by "blood chits" which hand over all the rights to the projects.
Although the use of Realvideo is not particularly recommended, we accept that in some cases the use of proprietary or non-standard formats may be the most appropriate solution. However, where proprietary standards are used, the project must explore a migration strategy that will enable a transition to open standards to be made in the future.
With regard to Real, if you do use it you should check that the stringent conditions which encoding with Real implies are suitable for both the project and the programme.
IPR and copyright is a very complex area and unfortunately there is no "one-size-fits-all" solution to these issues. Every resource or collection of resources may have its own IPR problems that will need to be solved before a digitisation project can go ahead. However, as it is an issue of such importance when working in a networked environment, a number of excellent resources have been produced to guide you through the process of clearing resources for use.
Copying to CD might be useful for short-term backups, but on its own it isn't a very sustainable *long-term* preservation strategy. The reason for this is that the dyes used by recordable CDs (CD-R) tend to break down over time.
For more information, see section 1.1.6 in Ross and Gow (1999). The same authors wrote in their executive summary that they felt the stability of CD-R was over-rated and that, "far from being a secure medium it is unstable and prone to degradation under all but the best storage conditions." Best practice would be to keep an additional copy on some magnetic media. For more details see: Ross, S. and Gow, A., 1999, Digital archaeology: rescuing neglected and damaged data resources. London: South Bank University, Library Information Technology Centre, February.
In practice, preservation is about managing information content over time. It is not enough just to make backups, but to create (at the time of digitisation) well-documented digital master files. Copies of these files should be stored on more than one media type, and (ideally) in more than one geographical location. These files should be used to derive other files used for user access (which may be in different formats) and would be the versions used for later format migration or for the repackaging of information content. If the files are images, the 'master' file format should be uncompressed, e.g. something like TIFF.
This is not to denigrate making backups in any way. Any service will need to generate these to facilitate its recovery in the case of disaster.
Creating a full 'digital master' with associated metadata will be a complex (and therefore expensive) task that should be done once only and at the time that the resource is being digitised. All equipment needs (or the choice of a digitisation bureau) should be considered with the creation of such digital masters in mind.
Projects will also need to decide where these digital masters should be kept for the duration of the project itself and where backup copies of them (and maybe other parts of the service) should be stored. Thought could be given to subscribing to a third-party storage service. An example is the National Data Repository at the University of London Computer Centre (ULCC).
The various service providers of the Arts and Humanities Data Service (AHDS) will also provide a long-term storage service for digital resources. They have also published various guides to good practice.
There are, in fact, quite a few relevant standards. Firstly, item-level descriptions can be based on the Dublin Core, in line with developing e-government and UfI metadata standards. In a Dublin Core context, the specifics of using DCMES for images were discussed at DC-3, the Image Metadata Workshop held in Dublin, Ohio in September 1996. This workshop resulted in the addition of two new elements to the original thirteen and made some changes to element descriptions.
There is some useful information on DC and other image metadata formats in section 4 of the VADS/TASI guide to creating digital resources in the AHDS Guides to Good Practice series. This paper mentions standards like the CIMI DTD, MARC, the CIDOC standards, etc. as well as more specialised standards like the Visual Resources Association (VRA) Core Record.
There is information on more specialised administrative and structural metadata in the Making of America II project's final report.
A shorter list of elements with a primary focus on preservation is available from the RLG Working Group site.
There may be some useful background information in the Metadata for images conference paper by Michael Day.
If your resources are of value, it is probably advisable to give consideration to watermarking and fingerprinting the digital material you produce, as a safety mechanism. If your project aims to make your resources available for non-profit, educational use, your online images are of comparatively low quality, and you have a copyright statement on your site, then watermarking will probably not be necessary. If you feel that your money could be used more constructively in another way then you are free to spend it there.
There are a number of different standards available such as the IMS Learning Resource Metadata Model and IEEE Learning Object Metadata (LOM). However both these place a significant overhead on the metadata creator; a LOM record could take an hour or more to complete in extreme cases, for example.
An alternative might be to use Dublin Core with the extensions proposed by the Education Working Group (DCEd) of the DCMI. They have proposed an "Audience" element, and suggest adopting "InteractivityType", "InteractivityLevel", and "TypicalLearningTime" elements from the IEEE LOM standard.
More general information on educational metadata is available in the two recent SCHEMAS Metadata Watch reports.
Also see the UK's Metadata for Education Group and note that the UK Government Information Age Champions group are working on a metadata schema that is likely to use Dublin Core.
A good learning and teaching package needs to be suitable for its users, should be engaging, should require active participation by the user, and should generate a desire in the user to learn and to develop. It should also be sufficiently flexible to allow different modes of use, and should appeal to a wide variety of users at very different life-stages.
NOF have a very useful section in their programme manual on 'Creating Online Learning Materials' that is definitely worth a look.
There are a number of different standards that you may need to use. A full list is available from the UKOLN metadata site. Dublin Core tends to be the key standard for the discovery of electronic resources.
The Dublin Core is a metadata element set intended to facilitate discovery of electronic resources. Originally conceived for author-generated description of Web resources, it has attracted the attention of formal resource description communities such as museums, libraries, government agencies and commercial organizations.
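As a minimal illustration (the values are invented for the example), a Dublin Core description can be embedded in an HTML page using <meta> elements, following the convention described in RFC 2731:

<meta name="DC.Title" content="Photograph of Leeds Town Hall, c.1905">
<meta name="DC.Creator" content="Photographer unknown">
<meta name="DC.Date" content="1905">
<meta name="DC.Format" content="image/jpeg">
<meta name="DC.Rights" content="Copyright statement for the resource">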
The key characteristics of the Dublin Core are:
UKOLN provides two Dublin Core tools for use:
More information about other UKOLN software tools and other software tools for handling metadata in various formats is available from the UKOLN site.
The RLG Working Group suggests using 16 elements to capture crucial information about a digital file. Their elements are fairly 'lightweight' and would probably be adequate, assuming that some descriptive metadata (e.g. DCMES) is also available. The set is a bit old now, and it might be worth looking at METS or the more detailed set of elements which can be found in the draft NISO Technical Metadata for Digital Still Images standard.
You could also have a look at the OCLC/RLG Preservation Metadata Working Group which has published an overview (chiefly of the OAIS model, and the specifications developed by Cedars, NEDLIB and NLA) and recommendations for 'Content Information' and a forthcoming one on 'Preservation Description Information'. Or there is OCLC's own preservation metadata set.
The terms "dynamic Web pages" and "dynamic Web sites" can be used in a number of senses, so it is important to clarify the meaning.
Movement on a Web page (example 1) may be useful in some cases. However, for accessibility purposes, the end user should be able to switch off scrolling text or moving images.
Access to search facilities, backend databases and legacy systems (example 2) is desirable on many Web sites.
Web sites which can be personalised for the end user (example 3) may be desirable in some cases.
Web sites which can be personalised for the end user's client environment (example 4) may be desirable. However users should not be disenfranchised if they have an unusual client environment.
Dynamic Web sites (example 5) may be desirable in some cases. However users should not be disenfranchised if their browser does not support ECMAscript, or if ECMAscript is disabled (e.g. for security purposes).
If you are considering developing a dynamic Web site you should consider the performance implications and the effect on caching. Will the performance of your service deteriorate if, for example, server-side scripting is used, or will the performance on the end user's PC deteriorate if it is not powerful enough to support Java or ECMAscript? In addition, will pages on the Web site fail to be cached, and what effect will this have on performance for the end user?
You should also consider how easy it will be to cite and bookmark dynamic resources. Will resources have a meaningful URL which is easy to remember? Will bookmarked URLs return the same resource at a future date?
Cascading Style Sheets (CSS) are a way to separate presentation and content.
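A minimal example of the separation (the file name is illustrative): the HTML pages carry only structure, while a single style sheet that every page links to controls the appearance, so a presentational change is made in one place only.

<!-- in each HTML page: structure only -->
<link rel="stylesheet" type="text/css" href="house-style.css">
<h1>Collection highlights</h1>

/* in house-style.css: presentation only */
h1 { font-family: Verdana, sans-serif; color: #336699; }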
Proprietary file formats (such as the Microsoft Word format, Adobe PDF, Macromedia Flash, etc.) are owned by a company or organisation. The company is at liberty to make changes to the format or to change the licence conditions governing use of the format (including software to create files and software to view files). Use of proprietary formats leaves the user hostage to fortune: for example, the owner of a popular and widely-used format may increase the costs of its software (or introduce charges for viewing software).
Open formats are not owned by an individual company - instead they are maintained by a national or international body which is independent of individual companies.
Many standards bodies exist. Within the Web community important standards organisations include the World Wide Web Consortium (W3C), the Internet Engineering Task Force (IETF), ECMA (European Computer Manufacturers Association), ISO, etc. These standards organisations have different cultures, working practices and coverage. W3C, for example, is a consortium of member organisations (who pay $5,000 to $50,000 per year to be members). W3C seeks to develop consensus amongst its members on the development of core Web standards. The IETF, in contrast with the W3C, is open to individuals. ISO probably has the most bureaucratic structure, but can develop robust standards. The bodies have different approaches to defining standards: ISO, for example, solicits comments from member organisations (national standards bodies) whereas W3C solicits comments from member organisations and from the general public.
You should not confuse "open standards" with "open source". Open source software means that the source code of the software is available for you to modify and the software is available for free. This is in contrast to licensed (or proprietary) software in which the source of the software is not normally available. Both open source and proprietary software can be used to create and view open standard formats.
The most important guidelines are the W3C WAI guidelines. See <http://www.w3.org/WAI/> for further information.
What file formats you use depends on whether your site is static or dynamic. The usual formats for Web pages include HTML, ASP and PHP. The usual formats for images are GIF, JPEG and SVG. The key is to check how accessible the formats you are using are.
Flash is popular authoring software developed by Macromedia and is used to create vector graphics-based animations. A Flash solution is fairly easy to implement on a Web site, but the site will then be usable only by modern browsers which have a Flash plugin.
The problem is that Flash is a proprietary solution owned by Macromedia, which means it may not be accessible and will probably not work on non-standard devices, such as a digital TV.
As with any proprietary solution there are dangers in adopting it: there is no guarantee that readers will remain free in the long term, readers (and authoring tools) may only be available on popular platforms, and the future of the format would be uncertain if the company went out of business, was taken over, etc. As noted above, the owner of a proprietary format is at liberty to change the format or its licence conditions, leaving the user hostage to fortune.
It is also worth noting that indexing software, etc. often cannot index proprietary formats, so they can act as a barrier to resource discovery. There may also be accessibility considerations for users of old or specialist browsers.
The general advice is that where the job can be done effectively using non-proprietary solutions, and avoiding plug-ins, this should be done. If there is a compelling case for making use of proprietary formats or formats that require the user to have a plug-in then you could use the format as long as an alternative was available.
If you *require* the functionality provided by Flash, you will need to be aware of the longer term dangers of adopting it. You should ensure that you have a migration strategy so that you can move to more open standards, once they become more widely deployed.
It is important that Web services are accessible to a wide range of browsers and hardware devices (e.g. Personal Digital Assistants (PDAs) as well as PCs). Web services should be usable by browsers that support W3C recommendations such as HTML, Cascading Style Sheets (CSS) and the Document Object Model (DOM).
Note:
Netscape 4.x browsers (e.g. Netscape 4.07, 4.08, 4.71, etc.) have very poor support for CSS and we are aware of the difficulties this can pose. However, just as the difficulties posed by Netscape 4.x differ amongst the projects affected, so do the potential responses to the problem; hence there is no definitive one-size-fits-all answer, though one widely used technique is sketched below.
Action steps:
(You might consider publishing a list of browser/OS combinations that you have successfully tested in an "about" section of your pages.)
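One widely used response (a sketch, not the only possible answer) exploits the fact that Netscape 4.x ignores the CSS @import rule: a simple style sheet that Netscape 4.x renders acceptably is served via <link>, while the more advanced rules are hidden behind @import and applied only by more capable browsers. The file names here are illustrative.

<!-- basic.css: simple rules that Netscape 4.x handles acceptably -->
<link rel="stylesheet" type="text/css" href="basic.css">
<!-- advanced.css: ignored by Netscape 4.x, applied by more capable browsers -->
<style type="text/css">
@import url("advanced.css");
</style>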
What database you use depends very much on your individual requirements. Here are some notes to bear in mind.
Most people already have some experience of MS Access - it is very easy to work with, and it comes with MS Office. You can either use Access files directly or draw data from SQL Server using OLE DB, and you can upsize an Access database to SQL Server. Programming is in Visual Basic for Applications (VBA), which is very similar to VB. MS Access was designed as a database system for small-scale office use. It was not designed for use as a database server, although it can be used in this mode for simple applications. Although Access may be capable of handling the sorts of query volume you suggest, at least in the short term, you do need to consider scalability (SQL Server scales better), Web site integration (SQL Server *probably* integrates better) and enterprise access to the data (SQL Server will better enable intranet access to the data, etc.). Data structures are unlikely to be affected by a move to SQL Server. The general rule seems to be that if you plan on more than 25 concurrent connections then go for SQL Server instead of Access, which cannot support that many.
SQL Server is client-server technology and is more scalable for busier/larger sites. You can connect to it remotely using ODBC, and MS Access has an option to link to its tables, so Access can be used as a front end. It is fairly easy to use and is supported by the Microsoft Visual Studio programming suite (you can work on it using VB). SQL Server integrates with a Web site better, and data can be output in XML. However, it is more expensive, and there is more to learn than with Access.
High-powered software; runs on mixed platforms; more complex to use.
MySQL is free, and a combination of Perl or PHP and MySQL is pretty straightforward. However, the user interfaces are not as good.
The use of client-side scripting (including javascript and DHTML techniques) is acceptable; however, please take note of the following points:
1) The site must still be accessible to browsers which are not scriptable. Use <noscript> tags and "sniffer" routines to determine the client capabilities and provide content-equivalent pages to non-scriptable browsers.
2) Thoroughly test your pages for functionality under a wide range of browser / platform configurations.
A few general comments on client-side scripting:
There Is More Than One Javascript...
Javascript is supported in various flavours on browsers according to version and manufacturer. "Javascript" is the name given to the scripting language developed by Netscape, and it has also come to mean the generic client-side scripting language. JScript is the Microsoft equivalent. ECMAscript is the published open standard, to which proprietary flavours of the language _should_ adhere. Note however that ECMAscript defines the underlying structure of the language but not specific issues, which are addressed by the Document Object Model (DOM). The DOM too has a standardised structure, defined by the W3C.
Internet Explorer also supports VBscript, based on Visual Basic. This will not work in other browsers.
( ECMA = European Computer Manufacturers Association )
USE "language" Attributes in <script>
Tags:
The version of javascript can be specified in the SCRIPT tag.
This can be useful for branching code, particularly when used in
conjunction with the "src" attribute'.
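For example (a sketch; the file names are illustrative), a browser that does not recognise the stated language version will skip the block, so enhanced code can be targeted at later versions:

<script language="JavaScript" src="basic.js"></script>
<!-- only browsers supporting JavaScript 1.2 will load this file -->
<script language="JavaScript1.2" src="enhanced.js"></script>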
DHTML: What Is It?
DHTML (Dynamic HTML) is a marketing term which denotes the use of a client-side scripting language (normally JavaScript/ECMAScript) to manipulate HTML and CSS properties.
Use of DHTML is compatible with the open standards developed by W3C, and so its use is OK from a standards point of view.
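As a trivial illustration of the idea (a sketch using the W3C DOM method document.getElementById, which is supported in recent browsers but not in older ones), a script changes a CSS property of an element in response to a user action:

<p id="notice">Click the button to highlight this paragraph.</p>
<input type="button" value="Highlight"
  onclick="document.getElementById('notice').style.backgroundColor = 'yellow';">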
<noscript> Tags:
Use <noscript> tags where appropriate to deliver equivalent content to browsers which do not support javascript or where script processing has been disabled.
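For example (the link target is illustrative):

<script type="text/javascript">
// script-generated navigation goes here
</script>
<noscript>
<p><a href="sitemap.html">Browse the site map instead.</a></p>
</noscript>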
Accessibility:
Regardless of how javascript is used on the page, the overriding principle for NOF projects is that the page must still be accessible to non-javascript enabled browsers.
Future Proofing:
As new versions of browsers are released and new ways of accessing web resources come into existence, there is a need for projects to test their content against these new interfaces. This is true for all web content, but the situation is more acute when pages include script. Projects should build into their plans a provision for undertaking these checks and for making corrections to HTML and script as necessary.
Script Block Content Shielding:
Commenting within <script> tags prevents inclusion of the script by browsers that do not understand the <script> tag. Wrapping the text in HTML comment tags prevents these browsers from displaying raw javascript on the page. Admittedly this is highly unlikely now; however, it is good defensive programming practice.
<script>
<!--
...javascript goes here
// -->
</script>
Browser Detection Aka Sniffing:
It is superior to test for the component of the DOM that you wish to use, rather than parsing the navigator.userAgent property. For example, to create a rollover script for images, you would code:
if (document.images) {
  // image loading / swapping routine
}
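A fuller sketch of the same principle (the image and page file names are illustrative; the routine degrades silently in browsers without an images collection):

<script type="text/javascript">
var overImage, outImage;
if (document.images) {
  // pre-load the two states of the button
  overImage = new Image();
  overImage.src = "button-over.gif";
  outImage = new Image();
  outImage.src = "button-out.gif";
}
function swap(name, img) {
  // only attempt the swap if the browser supports document.images
  if (document.images) {
    document.images[name].src = img.src;
  }
}
</script>
<a href="page.html"
  onmouseover="swap('button', overImage)"
  onmouseout="swap('button', outImage)">
<img name="button" src="button-out.gif" alt="Go to page"></a>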
Proportion Of Javascript-Enabled Browsers:
Figures for the numbers of javascript-enabled browsers are difficult to interpret. These figures can be derived in two ways: by making assumptions based on the userAgent, or by placing a testing script on a web page and recording the results. Measuring from the userAgent value can be done from the server access log, but it takes no account of the settings of the browser, so the assumption that, say, IE 5.5 is javascript-enabled is not always true. The Nimda virus outbreak last year highlights this: many people were advised to turn off scripting in their internet settings to protect themselves from Nimda.
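The second method can be as simple as a pair of tiny 'beacon' images (a sketch; the image name and query strings are illustrative): comparing the counts of the two requests in the server access log gives the proportion directly.

<script type="text/javascript">
// requested only when javascript actually runs
document.write('<img src="counter.gif?js=yes" width="1" height="1" alt="">');
</script>
<noscript>
<img src="counter.gif?js=no" width="1" height="1" alt="">
</noscript>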
The term Content Management System (CMS) is usually used to describe a database which organises and provides access to digital assets, from text and images to digital graphics, animation, sound and video. This type of product is relatively new and there are a few CMS available as off-the-shelf packages. CMS range from very basic databases to sophisticated tailor-made applications and can be used to carry out a wide range of tasks, such as holding digital content, holding information about digital content, publishing online and publishing on-the-fly.
Is a database sufficient?
The CMS provides mechanisms to support asset management, internal and external linking, validation, access control and other functionality. Typically, a CMS is built on an underlying database technology.
Content Management Systems range from very basic databases, to sophisticated tailor-made applications. They facilitate easier tracking of different parts of a Web site, enabling, for example, staff to easily see where changes have been made recently and - perhaps - where they might need to make changes (a 'News' page that hasn't been edited for 6 months?). They also ease the handling of routine updating/modifying of pages, where you want to change a logo or text on every page, for example.
A CMS can also simplify internal workflow processes and can ensure that you are working with a single master copy of each digital asset.
However there are other approaches which may be usable, such as making use of server-side scripting to manage resources.
Solutions may include:
To summarise, then, the issue to be aware of is the difficulty of maintaining resources in formats such as HTML. Using flat files with a CMS and/or database is a way of addressing this management issue. Whilst it is not an explicit requirement that projects manage their resources with a CMS and/or a database, if such tools are not used, the project must show how it intends to facilitate good management of its digital assets.
It is not necessary (or always desirable) to purchase and test Web pages against every combination of hardware device, browser version etc. Instead you should check that your Web pages are compliant with the version of HTML / XML / CSS that you use.
The testing should be carried out using an HTML / XML / CSS validator rather than by relying on checking how pages look in a browser. A variety of validators are available.
In addition to these (and other) Web-based validators, many authoring tools will have their own validators.
There are a number of validators available for WAP phones. These may be bundled in with WAP / WML authoring tools.
There may be similar tools available for PDAs. However, if the PDA supports HTML, you will be able to use an HTML validator.
Note that there are a number of WAP emulators available. These can be used to test out WAP sites. However, it cannot be guaranteed that a site which works correctly in an emulator will work in the device itself.
Once your site has been released it is useful to carry out some Web site performance monitoring. This can be done using either externally-hosted Web services or a purchased statistics service.
Each computer programming language has its own coding conventions. However there are a number of general points that you can follow to ensure that your code is well organised so that it can be easily understood by others. Have a look at the briefing paper for further information.
A software product should only be released after it has gone through a proper process of development, testing and bug fixing. Testing looks at areas such as performance, stability and error handling by setting up test scenarios under controlled conditions and assessing the results.
Before commencing testing it is useful to have a test plan which gives the scope of the testing, details of the testing environment (hardware/software) and the test tools to be used. Testers will also have to decide on answers to specific questions for each test case, such as: what is being tested? How are results documented? How are fixes implemented? How are problems tracked? QA Focus will be looking mainly at automated testing, which allows testers to reuse code and scripts and to standardise the testing process. We will also be considering the documentation that is useful for this type of testing, such as logs, bug tracking reports, weekly status reports and test scripts. We recognise that there are limits to testing; no program can be tested completely. However, the key is to test for what is important. We will be providing documentation on testing methodologies which projects should consider using.
As part of the testing procedure it is desirable to provide a range of inputs to the software, in order to ensure that the software can handle unusual input data correctly. It will also be necessary to check the outputs of the software. This is particularly important if the software outputs should comply with an open standard. It will be necessary not only to ensure that the output template complies with standards, but also that data included in the output template complies with standards (for example special characters such as & will need to be escaped if the output format is HTML).
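For example, a small routine along these lines (a sketch in javascript, the scripting language discussed earlier in this document) ensures that arbitrary data merged into an HTML template cannot break the markup:

// Escape characters with special meaning in HTML before inserting
// data into an output template.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")  // ampersands must be escaped first
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}
// e.g. escapeHtml('Smith & Jones <Ltd>') gives 'Smith &amp; Jones &lt;Ltd&gt;'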
A testing suite is an environment in which to test your software.
Deploying your resources into a service is one of the most important aspects of your project. You need to think through where you would like your resources deposited, and then contact the organisation, which will give you further details.