Guidelines For Setting Up Project Web Sites
This document provides advice and guidance for projects wishing to set up a Web site to support their project.
This is a draft version of the document. The final version will be used to provide advice to a variety of funding programmes which are responsible for funding projects which will provide Web services.
This document complements the Standards and Guidelines document. This document is concerned only with the provision of Web services.
Feedback on this version of the document is welcome. Please send comments to the author at <B.Kelly@ukoln.ac.uk>.
This document is intended for project managers and Web site developers for projects which are funded by JISC programmes, such as JISC 5/99. The document also provides advice for programme coordinators.
Before setting up your project Web site it is important to clarify its purpose. Does the Web site aim to deliver the project’s service or will it provide information about the project? Is it intended to disseminate information about the project to end users or is it intended to support communications between project partners, steering and advisory groups, funders, etc.?
Answers to these questions are needed at the planning stage to help plan the structure of the Web site and to identify the potential costs, management support, technical resources, staffing numbers and skill levels required.
What domain name will your project Web site have? A project Web site which has its own domain name (as several eLib projects have, such as www.sosig.ac.uk and www.headline.ac.uk) will be easier to remember and is more easily indexed by search engines. If you have the skills available to manage your own domain name and Web server there should also be advantages in managing the site. However, especially for smaller projects, projects with an institutional focus or projects which do not have a Web focus, it may not be possible or desirable for a project to have its own domain name.
If your project Web site is hosted on an institutional or departmental server there are several options available. Projects are often set up within a deep hierarchical structure (e.g. www.foo.ac.uk/depts/library/projects/jisc/bar/). Such URLs can be difficult to remember and are prone to errors when typing. In addition many search engines are believed not to index deeply within a Web site.
Another alternative is to make use of the ~ convention (e.g. www.foo.ac.uk/~bar/). Although this approach can provide a short URL, novice users may find the ~ (tilde) key difficult to find. Also users may regard Web sites whose URLs contain ~ as personal home pages rather than quality services.
Your project Web site should aim to be consistent in its URL naming. This should include use of case (are all URLs in lower case or is a mixture of case allowed) and delimiters between words (use of _ or -).
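A naming convention is easier to keep to if it can be checked automatically. The following is a minimal sketch in Python, assuming (purely for illustration) that the chosen convention is lower-case URLs with hyphens rather than underscores between words; the function name and the example URLs are hypothetical.

```python
from urllib.parse import urlparse


def naming_problems(url):
    """Return a list of naming-convention problems found in the path of one URL.

    Assumed convention for this sketch: lower case only, hyphens between words.
    """
    path = urlparse(url).path
    problems = []
    if path != path.lower():
        problems.append("mixed case")
    if "_" in path:
        problems.append("underscore used as delimiter")
    return problems


urls = [
    "http://www.foo.ac.uk/bar/project-plan.html",
    "http://www.foo.ac.uk/bar/Project_Plan.html",
]
for url in urls:
    print(url, naming_problems(url))
```

A project which prefers underscores, or which allows mixed case, would change the two tests accordingly; the point is only that the rule is written down once and applied to every URL.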
What is the expected life time of the project Web site? The information contained on a project Web site is likely to be valuable even after the project has finished. You should try to ensure that the Web site will continue to exist after the project funding has finished and that bookmarks and published URLs will continue to function.
Resources on your project Web site should be located within a project directory (e.g. if the project directory is www.foo.ac.uk/bar/ this should be the project entry point and not www.foo.ac.uk/bar.html). The use of a directory to define the Web site is needed in order for Web sites to be indexed, checked, audited, mirrored, etc. without the tools processing other (non-project) areas of the Web site.
You may set up a Web site once your project has been approved. You should note that once a Web site has been set up it may be indexed by search engines or linked to from other Web sites. Since the index and links may persist after the official project Web site has been launched it may be desirable to manage the dissemination of the pre-release Web site. For example, you may wish to provide text on the Web site informing users of its status. You should also consider use of the robots.txt file or equivalent <META> tags in HTML pages to prevent robots from indexing the site until it is launched [1].
Once your project Web site is officially launched you will wish to promote it, especially if the Web site is intended to provide a dissemination role for the project.
You should make use of a robots.txt file to ensure that quality areas of your Web site are indexed: for example, you may wish to exclude draft documents or personal pages from being indexed by search engines such as AltaVista.
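The two uses of robots.txt described above (keeping robots out of the whole site before launch, and excluding only selected areas afterwards) can be sketched as follows. The directory names /bar/drafts/ and /bar/staff/ and the robot name ExampleBot are assumptions made for illustration. The Python standard library's robots.txt parser is used here only to confirm how a well-behaved robot would interpret each file.

```python
from urllib.robotparser import RobotFileParser

# Before launch: ask all robots to stay out of the whole site.
PRE_LAUNCH = """\
User-agent: *
Disallow: /
"""

# After launch: exclude only draft documents and personal pages
# (hypothetical directory names).
POST_LAUNCH = """\
User-agent: *
Disallow: /bar/drafts/
Disallow: /bar/staff/
"""


def can_index(robots_txt, url, agent="ExampleBot"):
    """Report whether a robot obeying robots_txt may fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(can_index(PRE_LAUNCH, "http://www.foo.ac.uk/bar/"))
print(can_index(POST_LAUNCH, "http://www.foo.ac.uk/bar/index.html"))
print(can_index(POST_LAUNCH, "http://www.foo.ac.uk/bar/drafts/plan.html"))
```

Note that robots.txt is a request rather than an access control: it only affects robots which choose to obey it, and the file must be placed at the top level of the Web server.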
You are advised to be pro-active in submitting your Web site to key search engines (e.g. AltaVista) and directory services (e.g. Yahoo). You should consider use of submission software and services (see [2]).
You should be aware of the difficulties which search engine software can have in indexing your Web site if you provide frames-interfaces (especially if you do not provide a no-frames alternative for accessing the full content of your Web site), if you use “splash screens” or if you use proprietary file formats, such as Flash.
The use of metadata on key areas of your Web site is desirable. Metadata (information about a Web resource, such as the author, keywords, brief description, etc.) may be used by search engines such as AltaVista and also by local search engines.
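In HTML pages, metadata of this kind is commonly embedded as <meta> elements in the head of the page. The following minimal sketch in Python generates such elements from a description, a list of keywords and an author; the function name and the sample values are hypothetical, and which fields a given search engine actually uses is not something this sketch can guarantee.

```python
from html import escape


def meta_tags(description, keywords, author):
    """Build HTML <meta> elements for the head of a page."""
    fields = [
        ("description", description),
        ("keywords", ", ".join(keywords)),
        ("author", author),
    ]
    return "\n".join(
        '<meta name="%s" content="%s">' % (name, escape(value, quote=True))
        for name, value in fields
    )


print(meta_tags(
    "Guidance & support for the Bar project",
    ["JISC", "digital libraries"],
    "A. N. Author",
))
```

Generating the elements from one place (a template or a script) helps to keep the metadata consistent across the key pages of the site.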
During the lifetime of the project there are likely to be several deliverables and items of news which you would like your users to be aware of. You should consider providing a news area on your Web site. You may wish to use an automated notification service such as Netmind [3] so that users will receive an automated message when the news page is updated. You should also consider use of RSS [4] to allow your news items to be automatically syndicated to remote Web sites.
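A news feed is a small XML file listing the title and address of each news item. The following is a minimal sketch in Python, assuming the commonly used RSS 2.0 layout (a channel with a title, link and description, followed by items); the function name, site details and news items are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_feed(site_title, site_url, news_items):
    """Return an RSS document (as a string) for a list of (title, link) pairs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = "News from " + site_title
    for title, link in news_items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
    return ET.tostring(rss, encoding="unicode")


feed = build_feed(
    "Bar Project",
    "http://www.foo.ac.uk/bar/",
    [
        ("User requirements report published", "http://www.foo.ac.uk/bar/news/report.html"),
        ("Project workshop announced", "http://www.foo.ac.uk/bar/news/workshop.html"),
    ],
)
print(feed)
```

If the news page and the feed are generated from the same list of items, the two cannot drift out of step.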
For further information on Web site promotion see [5].
Before setting up the Web site it is useful to have an idea of the content which will be provided. This is needed in order to assist with the design of the Web site (particularly the navigational elements on the Web site). In addition knowledge of the likely content will help in structuring the content and defining the underlying directory structure (or database design, if your Web site makes use of a database system).
It may be useful to produce a diagram showing the key areas of content and how the content will be grouped. Try to think about how the Web site will be structured in a couple of years’ time. The Web site should be designed so that it will not have to be re-organised as the site grows, as this can be very time-consuming and can lead to many broken links.
You can make use of file formats based on open standards or choose to use proprietary file formats. Open standards (such as HTML and XML) are developed by consortia (such as W3C) and are designed to be platform and application independent. There will often be freely available tools to create and view formats based on open standards. Proprietary formats (such as Macromedia Flash and Adobe PDF) are owned by commercial companies and may only work on limited platforms. Use of proprietary formats may require licensed products, and even when their use is free, there can be no guarantee that this will continue.
An example of the dangers of use of proprietary formats can be seen with the GIF format. GIF is well-established on the Web for providing images. Unfortunately the GIF format makes use of a patented compression technique which is owned by Unisys. The licensing conditions require that software used to create GIF images has been licensed by Unisys.
JISC-funded projects providing Web sites should aim to make use of open file formats.
You will probably make extensive use of HTML on your Web site. In order to maximise the range of browsers which can access your Web site you are advised to ensure that your Web site conforms to HTML standards (currently HTML 4.0 or the newer XHTML 1.0 [6] standard) and that you avoid use of proprietary extensions.
HTML or XHTML should be used to describe the main structural elements on your Web site. Cascading Style Sheets (CSS) [7] should be used to define the appearance of the elements on a browser. Separation of the structure of resources on the Web site from their appearance will enable the appearance to be more easily changed and will ensure that resources can be accessed by a variety of devices (digital TV, PDAs, etc.).
Although HTML/XHTML and CSS are currently the recommended formats for Web sites, unfortunately many older browsers fail to support CSS adequately. Until standards-compliant browsers are widely deployed you are advised to consider use of “safe” CSS features which can be used with all browsers and which degrade gracefully (see [8]).
Validation of HTML/XHTML and CSS resources will help to detect errors. Your Web pages may not be displayed correctly or function correctly in all browsers if they contain errors. You should ensure that you systematically check Web pages using validation tools, which may be built into the authoring tool, may be independent applications or may be provided on the Web, such as W3C’s validation services [9] [10].
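Part of this checking can be automated locally. Because XHTML is an XML format, a page with unclosed or wrongly nested elements can be detected with any XML parser. The following minimal sketch in Python does this; it tests well-formedness only and is not a substitute for full validation against the HTML or XHTML standard, and the function name and sample pages are hypothetical.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xhtml_text):
    """Return None if the text parses as XML, otherwise a description of the first error."""
    try:
        ET.fromstring(xhtml_text)
    except ET.ParseError as err:
        return str(err)
    return None


good = "<html><head><title>Bar</title></head><body><p>Hello</p></body></html>"
bad = "<html><head><title>Bar</title></head><body><p>Hello</body></html>"

print(well_formedness_error(good))
print(well_formedness_error(bad))
```

A check of this kind can be run over every page before it is published, with a full validation service used for the pages which pass.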
Your Web site is likely to contain a variety of images, such as navigational icons, photographs, flow-charts and organisational diagrams, etc. What graphical file formats should you use on your Web site (acknowledging that a richer master file format may be used for creating images)?
The GIF format is well-established on the Web. However the licensing conditions for use of GIF require that software used to create GIF images has been licensed [11]. You should ensure that the graphical software you use is licensed to create GIF images.
The JPEG format is also well-established on the Web. JPEG is ideal for photographs. Fortunately there are no licensing restrictions on the use of JPEG tools.
The PNG format [12] was developed as a patent-free
replacement for GIF. Unfortunately PNG is not universally supported by Web
browsers. This has inhibited widespread deployment of PNG. However as new
browsers become more widely deployed, use of
PNG may become more widespread.
Browser plugin technologies allow a range of other file formats to be provided on the Web, such as Macromedia Flash and Adobe PDF. It should be noted that such formats are proprietary and there is no guarantee that plugins will continue to be free. There may also be accessibility considerations for plugin technologies: the content may not be accessible to speaking browsers, Web TVs, etc.
A number of approaches to creating HTML documents can be taken. Experienced HTML authors may make use of text editors to create HTML markup manually. However this approach can be time-consuming and is prone to errors. Many HTML authors prefer to make use of dedicated HTML authoring tools such as FrontPage, HoTMetaL or Dreamweaver.
If you have large numbers of documents in proprietary formats, such as a word processing format, you may wish to make use of a conversion tool. Some dedicated HTML authoring tools allow formats such as MS Word to be imported and converted, although the quality of the conversion may be poor. Dedicated conversion tools may do a better job, or enable large numbers of documents to be converted in bulk.
Another way of providing access to documents which use a proprietary file format is to use “on-the-fly” conversion software on the server (i.e. files are dynamically converted by software typically running on the Web server).
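The idea of on-the-fly conversion can be sketched as follows: the master file is kept in its original format and is converted to HTML each time it is requested, so that no separate HTML copy has to be maintained. In this minimal Python sketch a plain text file stands in for the proprietary format, since converting a real word processing format would need a dedicated converter; the function names are hypothetical.

```python
from html import escape
from pathlib import Path


def text_to_html(title, text):
    """Convert plain text to a simple HTML page; blank lines separate paragraphs."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    body = "\n".join("<p>%s</p>" % escape(p) for p in paragraphs)
    return (
        "<html><head><title>%s</title></head>\n<body>\n%s\n</body></html>"
        % (escape(title), body)
    )


def convert_on_request(source_path):
    """Read the master file and convert it now, so that edits to the
    master file appear the next time the page is requested."""
    source = Path(source_path)
    return text_to_html(source.stem, source.read_text(encoding="utf-8"))
```

The cost of this approach is that the conversion is repeated for every request, so busy sites may prefer to convert in bulk or to cache the converted pages.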
Another alternative is to make use of a content management system. A content management system may be regarded as a database which manages the content of a Web site.
Content management systems normally provide sophisticated
management facilities such as reuse of resources, automated removal of
expired resources, personalised interfaces, etc. Content management systems may
provide a dedicated data entry system in which knowledge of HTML is not required.
Content management systems should also provide support for new file formats which may supersede or extend HTML (e.g. provide support for WML to provide access to users of mobile phones).
As the underlying Web technologies and file formats are constantly being developed it is important to keep up-to-date with developments in order to be in a position to exploit new developments in a timely manner. Important developments to be aware of include XML, XHTML and XSLT.
XML, the Extensible Markup Language, will act as the basis for new file formats [13]. It enables richly structured resources to be described in an open and extensible format. XML is already widely used in many large-scale commercial Web sites and will grow in importance as new browsers become available which will provide native support for XML. In addition to XML itself, there are many related developments (e.g. XLink, XPointer and XSLT) which will enhance the functionality of services based on XML.
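What “richly structured” means in practice is that software can pick out individual pieces of information from an XML document by name, rather than having to guess at them from the layout of a page. The following minimal sketch in Python illustrates this; the element and attribute names (project, title, funder, deliverable, due) are invented for this example and do not belong to any standard format.

```python
import xml.etree.ElementTree as ET

# A hypothetical description of a project, marked up in XML.
PROJECT_XML = """\
<project>
  <title>Bar</title>
  <funder>JISC</funder>
  <deliverable due="2001-03">User requirements report</deliverable>
  <deliverable due="2001-09">Pilot service</deliverable>
</project>
"""

root = ET.fromstring(PROJECT_XML)
title = root.findtext("title")
deliverables = [(d.get("due"), d.text) for d in root.findall("deliverable")]

print(title)
for due, name in deliverables:
    print(due, name)
```

The same document could equally be processed by a different program on a different platform, which is the sense in which the format is open and extensible.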
XHTML [14] is an XML version of HTML. It is designed to provide the benefits of XML (i.e. structured, reusable documents) while allowing resources to be accessed by existing browsers.
XSLT [15] is a transformation language which allows XML resources (such as XHTML pages) to be transformed into other formats, such as WML pages for use by mobile phones.
RDF, the Resource Description Framework, is an XML application which provides a framework for metadata applications.
Web sites are under constant development. They will grow in size as new resources are
added. They are also likely to provide
new functionality such as interactive feedback, personalised interfaces,
e-commerce, WAP services, etc. As we have
seen, Web sites are also likely to make use of new or enhanced file
formats. What approaches should Web
Management Teams take in order to maintain existing services as well as deploy
new ones?
The use of Content Management Systems (CMS) may be needed. Unlike an HTML authoring tool, a CMS not only provides management of Web resources, it may also provide management of the application logic and make use of technologies such as XML. Although CMSs may be expensive, they may have an important role to play in the development of the next generation of Web services.
A well-designed Web site will be quick and easy to use and will reflect positively on the organisation. A poorly designed Web site is likely to be difficult to use and will give a poor impression of the organisation. When designing your Web site think about the following issues:
Who designs? Who will be responsible for designing the Web site? Will it be done in-house, or by an external designer? What skills do they have? Are the design skills relevant to a Web site?
The design brief. It is important to produce a thorough design brief and a methodology for approving the proposed designs.
Technologies. What technologies will be used to implement the design? Will the use of technologies such as Shockwave or Flash be acceptable? Are the technologies backwards compatible?
Accessibility. Is the design accessible to people with disabilities or users of older browsers or specialist devices?
The navigational aids on a Web site should be part of the overall site design. It is desirable that consistent navigational aids are available throughout the Web site. They should enable users to quickly access the key areas of the Web site such as the “home page”, a search facility or site map and help information or frequently asked questions.
A search facility is essential for most Web sites. A wide range of search engines are available, many free-of-charge. If you do not have the technical expertise to install a search facility you can make use of an externally-hosted search engine. For further information see [16].
Your Web site’s 404 error page (the message which is displayed when a user selects a Web page which does not exist) can play an important role in helping users navigate. A well-designed 404 page will provide access to a search facility or a site map. For further information see [17].
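A helpful 404 page of the kind described above can be sketched as follows. This minimal Python example is written as a WSGI application, which is an assumption made for illustration: most Web servers provide their own way of configuring a custom error page. The paths /bar/sitemap.html and /bar/search are hypothetical.

```python
from html import escape

# Hypothetical content of the site: path -> page.
PAGES = {"/bar/": "<html><body><h1>Bar project</h1></body></html>"}


def not_found_page(path):
    """Build a 404 page which offers the user a site map and a search form."""
    return (
        "<html><head><title>Page not found</title></head><body>\n"
        "<h1>Page not found</h1>\n"
        "<p>Nothing was found at %s.</p>\n"
        '<p>Try the <a href="/bar/sitemap.html">site map</a> or search the site:</p>\n'
        '<form action="/bar/search" method="get">'
        '<input name="q"><input type="submit" value="Search"></form>\n'
        "</body></html>" % escape(path)
    )


def app(environ, start_response):
    """Serve known pages; for anything else return the helpful page with a 404 status."""
    path = environ.get("PATH_INFO", "/")
    if path in PAGES:
        status, page = "200 OK", PAGES[path]
    else:
        status, page = "404 Not Found", not_found_page(path)
    start_response(status, [("Content-Type", "text/html; charset=utf-8")])
    return [page.encode("utf-8")]
```

It is important that the page is still sent with the 404 status code, as in the sketch, so that link checkers and search engine robots can tell that the resource is missing.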
In order for a Web site to continue to provide a quality service after it has been launched it will be necessary to maintain the service.
The content on the Web site will need maintaining to ensure that it remains up-to-date and relevant. The maintenance process can be assisted by the inclusion of contact details or by clearly defining the person or group with responsibility for the information content. User feedback mechanisms, such as email links or Web forms, can help to encourage users to report inaccuracies.
Broken links on a Web site are always irritating. You should ensure that you provide systematic link checking. This should cover both internal links to resources within your Web site and links to external resources.
Although there are many link checking tools available you
should bear in mind that broken links can be caused not only by use of the <A>
and <IMG>
elements to link to resources and images, but also by technologies, such as
style sheets, forms, etc.
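To illustrate the point, the following minimal sketch in Python collects the addresses referred to by a page from several kinds of element, not only <A> and <IMG>. The table of elements and attributes is an assumption chosen for this example and is not exhaustive; it also does not look inside style sheets or scripts.

```python
from html.parser import HTMLParser

# Element name -> attribute which holds an address (an illustrative selection).
LINK_ATTRIBUTES = {
    "a": "href",
    "img": "src",
    "link": "href",    # e.g. external style sheets
    "form": "action",
    "frame": "src",
    "script": "src",
}


class LinkCollector(HTMLParser):
    """Collect (element, address) pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        wanted = LINK_ATTRIBUTES.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.links.append((tag, value))


def collect_links(html_text):
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


SAMPLE = (
    '<html><head><link rel="stylesheet" href="style.css"></head><body>'
    '<A HREF="news.html">News</A><IMG SRC="logo.gif">'
    '<form action="/bar/search"></form></body></html>'
)
print(collect_links(SAMPLE))
```

Each address collected in this way would then need to be requested in turn to find out whether it is broken.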
Although many link checkers will only check for broken <A> and <IMG> elements, you can check for other broken links by analysing your Web server’s log file. The error log file (which may be a separate file) will give more complete information on errors.
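Analysing the log file for missing resources can be sketched as follows. This minimal Python example assumes that the server writes its access log in the widely used Common Log Format, in which the request appears in double quotes and is followed by the status code; the sample log lines are invented.

```python
import re
from collections import Counter

# Matches the quoted request and the status code which follows it.
REQUEST = re.compile(r'"[A-Z]+ (\S+)[^"]*" (\d{3})')


def missing_resources(log_lines):
    """Return (path, count) pairs for requests which gave a 404 status, most frequent first."""
    missing = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if match and match.group(2) == "404":
            missing[match.group(1)] += 1
    return missing.most_common()


LOG = [
    'host1.foo.ac.uk - - [13/Nov/2000:10:00:00 +0000] "GET /bar/ HTTP/1.0" 200 2048',
    'host1.foo.ac.uk - - [13/Nov/2000:10:00:01 +0000] "GET /bar/style.css HTTP/1.0" 404 210',
    'host2.example.org - - [13/Nov/2000:10:05:00 +0000] "GET /bar/style.css HTTP/1.0" 404 210',
    'host2.example.org - - [13/Nov/2000:10:05:02 +0000] "GET /bar/old-news.html HTTP/1.0" 404 210',
]
print(missing_resources(LOG))
```

In this invented example the missing style sheet is found even though no <A> or <IMG> element refers to it.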
You should ensure that you have procedures for monitoring the availability of your Web service. If you do not have procedures available locally you may wish to make use of remote services such as WatchMyServer [18] and InternetSeer [19].
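A simple local procedure is to request the entry point of the Web site at regular intervals and to record whether a response is received. The following is a minimal sketch in Python; the function name is hypothetical, and the opener parameter exists only so that the function can be tried out without a network connection.

```python
import urllib.error
import urllib.request


def is_available(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL can be fetched successfully within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts and HTTP error statuses.
        return False
```

Such a check could be run from a scheduled job, for example as is_available("http://www.foo.ac.uk/bar/"), with the results logged. A check run from the same machine or network as the Web server will not detect network failures, which is one reason for also considering a remote monitoring service.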
You may be expected to provide performance indicators for your Web site by a funding body or management group. You may wish to record performance indicators for your own use, to help with future planning for growth of the service.
Web statistics can provide a useful performance indicator, although they should be treated with caution. A hit is recorded for every file requested from the server, so a single page containing several images generates several hits, whereas a page impression counts only the request for the page itself. For example a substantial growth in the number of hits on your Web site may simply indicate a redesign of your Web site with greater numbers of images or that your Web site is being accessed regularly by robot software rather than users. Information on the number of page impressions or user sessions is probably better than number of hits, but again this can be misleading. For example growth in the number of page impressions and user sessions may be the result of large numbers of users finding your Web site using a search engine and leaving the Web site after reading one page and deciding it is not relevant.
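The difference between these measures can be seen by counting them from a server log. The following minimal sketch in Python assumes a log in Common Log Format and assumes that pages can be recognised by addresses ending in .html, .htm or /; the number of distinct visiting hosts is used only as a crude stand-in for user sessions, which properly require the timing of requests to be taken into account. The sample log lines are invented.

```python
import re

# Captures the requesting host, the requested path and the status code.
LINE = re.compile(r'^(\S+) .*?"[A-Z]+ (\S+)[^"]*" (\d{3})')
PAGE_SUFFIXES = (".html", ".htm", "/")


def summarise(log_lines):
    """Count hits, page impressions and distinct visiting hosts in a log."""
    hits = 0
    page_impressions = 0
    hosts = set()
    for line in log_lines:
        match = LINE.match(line)
        if not match:
            continue
        host, path, status = match.groups()
        hits += 1  # every requested file counts as a hit
        if status == "200" and path.endswith(PAGE_SUFFIXES):
            page_impressions += 1  # only successfully delivered pages
            hosts.add(host)
    return {
        "hits": hits,
        "page_impressions": page_impressions,
        "visiting_hosts": len(hosts),
    }


LOG = [
    'host1.foo.ac.uk - - [13/Nov/2000:10:00:00 +0000] "GET /bar/ HTTP/1.0" 200 2048',
    'host1.foo.ac.uk - - [13/Nov/2000:10:00:01 +0000] "GET /bar/logo.gif HTTP/1.0" 200 512',
    'host1.foo.ac.uk - - [13/Nov/2000:10:00:01 +0000] "GET /bar/nav.gif HTTP/1.0" 200 300',
    'host2.example.org - - [13/Nov/2000:10:05:00 +0000] "GET /bar/news.html HTTP/1.0" 200 1024',
]
print(summarise(LOG))
```

In this invented log, one visit to a page with two images accounts for three of the four hits but only one of the two page impressions.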
Other performance indicators are available, such as the number of links to your Web site, the coverage of your Web site by search engines, Web server uptime, user feedback, etc. For further information see [20].
You may find it useful to carry out periodic auditing of your Web site. This can help in spotting errors, evaluating the accessibility of the Web site, evaluating the success of your dissemination, etc.
For further information on approaches to monitoring and auditing Web sites see the survey of eLib project Web sites [21].
About This Document
This is version 0.1 of the document. It was last updated on 13 November 2000. This version was distributed at the JISC 5/99 project day held in Manchester on 16/17th November 2000.
[1] Robot Exclusion Protocol
http://info.webcrawler.com/mak/projects/robots/exclusion.html
[2] Submitting to Search Engines Using "Scrub The Web", Exploit Interactive, issue 6
http://www.exploit-lib.org/issue6/software-used/
[3] Netmind
http://www.netmind.com/
[4] Rich Site Summary Resources, UKOLN
http://www.ukoln.ac.uk/metadata/resources/rss/
[5] Promoting Your Project Web Site, Exploit Interactive, issue 4
http://www.exploit-lib.org/issue4/promotion/
[6] Hypertext Markup Language, W3C
http://www.w3.org/MarkUp/
[7] CSS, W3C
http://www.w3.org/Style/CSS/
[8] CSS Support Table, RichInStyle
http://richinstyle.com/bugs/table.html
[9] W3C HTML Validation Service, W3C
http://validator.w3.org/
[10] W3C CSS Validation Service, W3C
http://jigsaw.w3.org/css-validator/
[11] Burn All GIFs Home Page
http://burnallgifs.org/
[12] PNG
http://www.w3c.org/Graphics/PNG
[13] The XML FAQ
http://www.ucc.ie/xml/
[14] XHTML, W3C
http://www.w3.org/TR/xhtml1
[15] Extensible Stylesheet Language, W3C
http://www.w3.org/Style/XSL/
[16] UK University Search Engines, Ariadne, issue 21
http://www.ariadne.ac.uk/issue21/webwatch/
[17] 404s: What’s Missing?, Ariadne, issue 20
http://www.ariadne.ac.uk/issue20/404/
[18] WatchMyServer
http://www.watchmyserver.com/
[19] InternetSeer
http://www.internetseer.com/
[20] Performance Indicators For Web Sites, Exploit Interactive, issue 5
http://www.exploit-lib.org/issue5/indicators/
[21] WebWatching eLib Project Web Sites, Ariadne, issue 26 (to be published in Dec 2000)
http://www.ariadne.ac.uk/issue26/web-watch/
Acknowledgments
UKOLN is funded by Resource: The Council for Museums, Archives and Libraries, the Joint Information Systems Committee of the Higher and Further Education Funding Councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the University of Bath where it is based.