Challenges
Why it's not very straightforward
"Just a quick phone call to ask you if you could set up something to archive the University Web site, it should be pretty straight-forward for someone with your technical know-how."
It is only a matter of time before someone in "Corporate Communications", the "Freedom of Information Office" or some similar department comes to you with this sort of request. How would you react to it (or how have you reacted already)?
Many acres of virtual text have been penned on the subject of Web archiving (a fair proportion of them no longer available because the sites no longer exist :-). One of the major problems, which is well illustrated by the Wikipedia article on the subject, is that most authors have concentrated almost entirely on "How?" to do it and on the (technical) difficulties that arise.
The speaker will argue that "How?" is the least of your problems. What is your institutional web site for, and what purpose is archiving it supposed to serve? To put it another way, the questions "What?", "Why?", "When?" and "Where?" come well before deciding if the "Who?" is you, or trying to determine "How?".
As usual, Currall asks awkward questions and never seems to provide any useful answers, just turning seemingly simple problems into complex, issue-strewn minefields. He hasn't written the talk yet, but you can be sure that it will raise some very fundamental issues and give you something serious to think about and discuss. Aside from manufacturing Shakespearean quotes, he will probably quote from the most-read book in the English language, although you might feel the need to check that he isn't just making it up!
On the other hand ... you could just leave it all to the Wayback Machine at the Internet Archive.
Dynamic Content
A particular page, as viewed, may have existed only fleetingly and for just one browser
No single Tool or Approach
Behaviours
Using a variety of scripting tools and techniques
Content Undifferentiated
Makes for very difficult appraisal
Same URL, Different Content
A URL does not define identity
Server-side Scripting
Same Content, Different URL
Content gets moved around as sites are reorganised
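The two identity problems above ("Same URL, Different Content" and "Same Content, Different URL") can be made concrete with a small sketch. It assumes you already hold a list of captures as (URL, body) pairs; the function names and the example.ac.uk URLs are hypothetical, and a SHA-256 digest of the raw bytes is only one naive stand-in for "the same content" (a dynamic page that embeds a timestamp would never match itself).

```python
import hashlib
from collections import defaultdict


def fingerprint(body: bytes) -> str:
    """Return a SHA-256 hex digest of the raw bytes of a capture."""
    return hashlib.sha256(body).hexdigest()


def urls_by_content(captures):
    """Group URLs by content digest: same content, different URL."""
    groups = defaultdict(set)
    for url, body in captures:
        groups[fingerprint(body)].add(url)
    return {digest: sorted(urls) for digest, urls in groups.items()}


def urls_with_changed_content(captures):
    """List URLs captured with more than one digest: same URL, different content."""
    digests = defaultdict(set)
    for url, body in captures:
        digests[url].add(fingerprint(body))
    return sorted(url for url, seen in digests.items() if len(seen) > 1)


# Hypothetical captures of an institutional site, before and after a reorganisation.
captures = [
    ("http://www.example.ac.uk/prospectus.html", b"<h1>Prospectus 2008</h1>"),
    ("http://www.example.ac.uk/study/prospectus/", b"<h1>Prospectus 2008</h1>"),
    ("http://www.example.ac.uk/prospectus.html", b"<h1>Prospectus 2009</h1>"),
]
moved = [urls for urls in urls_by_content(captures).values() if len(urls) > 1]
changed = urls_with_changed_content(captures)
```

Even this toy shows why neither the URL nor the bytes alone can serve as the identity of an archived page: the first capture is "the same" as the second by content and "the same" as the third by address.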
Rapidly Changing Technologies
An institutional web site is not delivered by the same technologies over time
Web 2.0
Links
Links may be important to the meaning of a page and may be 'out-of-scope' of the web site hosting the page
Pages not Bounded
Objects made up of many components: images (and other objects), scripts, stylesheets
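To illustrate the last two points (links that fall outside the site, and pages made up of many components), here is a minimal sketch using only the Python standard library. The class name, the helper function and the example.ac.uk host are hypothetical, and treating "in scope" as "same hostname" is a simplifying assumption; deciding the real boundary is an appraisal question, not a parsing one.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageInventory(HTMLParser):
    """Collect the component objects and outgoing links of one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.components = []  # images, scripts, stylesheets
        self.links = []       # <a href="..."> targets

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.components.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.components.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))


def inventory(html, base_url, site_host):
    """Return components plus links split by whether they stay on site_host."""
    parser = PageInventory(base_url)
    parser.feed(html)
    parser.close()
    in_scope = [u for u in parser.links if urlparse(u).hostname == site_host]
    out_of_scope = [u for u in parser.links if urlparse(u).hostname != site_host]
    return {
        "components": parser.components,
        "in_scope_links": in_scope,
        "out_of_scope_links": out_of_scope,
    }


# Hypothetical page on a hypothetical institutional site.
page = (
    '<html><head><link rel="stylesheet" href="/css/site.css">'
    '<script src="js/menu.js"></script></head>'
    '<body><img src="/img/crest.png">'
    '<a href="/about/">About</a>'
    '<a href="http://www.archive.org/">Internet Archive</a></body></html>'
)
result = inventory(page, "http://www.example.ac.uk/index.html", "www.example.ac.uk")
```

A static inventory like this misses anything that scripts fetch at run time, which is one more reason the page has no fixed boundary.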