Metadata: UKOLN Software Tools |
Here are some UKOLN software tools for handling metadata in various formats. UKOLN also maintain lists of:
HTML-sum.pl | |
URL |
http://www.ukoln.ac.uk/metadata/software-tools/HTML-sum.pl
|
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl 5.0, LWP (HTML::Parser) |
Description | Summarise HTML page to produce SOIF record |
Keywords | HTML, SOIF, summarisation, ROADS, DESIRE |
Language | perl |
Usage | HTML-sum.pl [-u URL] file |
Comments |
This script is primarily intended as a replacement for the
HTML summariser that is supplied with the Harvest suite of tools. It could
also be used on its own to summarise local HTML files
or in combination with, say, lynx to summarise remote pages.
The '-u URL' argument causes HTML-sum.pl to generate a full SOIF record
including an opening '@FILE { URL' and closing '}'.
Here's a simple shell script that uses lynx, HTML-sum.pl and soif2metadc to
produce a Dublin Core description of a remote resource embedded in
HTML META tags:
#!/bin/sh
lynx -source $1 > /tmp/$$
HTML-sum.pl -u $1 /tmp/$$ | soif2metadc
rm /tmp/$$
|
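The effect of the '-u URL' argument can be pictured with a small shell sketch. It is not taken from HTML-sum.pl: the function name soif_wrap and the "name<TAB>value" input format are assumptions made for illustration. The attribute layout (name, byte size of the value in braces, colon, tab, value) follows the SOIF format used by Harvest.

```shell
#!/bin/sh
# Hypothetical helper (not part of HTML-sum.pl): wrap attribute lines
# in a complete SOIF record for the given URL.
# Input on stdin: one "name<TAB>value" pair per line (assumed format).
soif_wrap() {
    url=$1
    tab=$(printf '\t')
    # Opening line of the record: template type and URL.
    printf '@FILE { %s\n' "$url"
    while IFS="$tab" read -r name value; do
        # SOIF attributes carry the byte length of the value in braces.
        size=$(( $(printf '%s' "$value" | wc -c) ))
        printf '%s{%d}:\t%s\n' "$name" "$size" "$value"
    done
    # Closing brace ends the record.
    printf '}\n'
}

printf 'title\tA sample page\n' | soif_wrap http://www.example.org/intro.html
```

Without the wrapper lines, the output would be a bare list of attributes, which is presumably what Harvest expects from a summariser that it calls itself.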
soif2metadc | |
URL |
http://www.ukoln.ac.uk/metadata/software-tools/tools/soif2metadc
|
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl 5.0, LWP (HTML::Entities), soif.pl (from Harvest) |
Description | Convert SOIF record to Dublin Core embedded in HTML META tags |
Keywords | SOIF, Dublin Core, HTML, META tags |
Language | perl |
Usage | soif2metadc |
Comments |
Reads SOIF record from STDIN and writes HTML to STDOUT. See
HTML-sum.pl for a simple example of use.
Can also function as an Apache server-side include (SSI) script to embed DC META tags into resources on the fly. (This script may also work as an SSI script for other Web servers, but this has not been tested.) The script looks for a SOIF record describing the current page in a file with the same name as the HTML file but with a .soif suffix, i.e. the SOIF record for intro.html is in intro.html.soif. The Apache syntax for calling the script is:
<!--#exec cmd="/opt/bin/soif2metadc" -->
So, the <head> section of intro.html may look like this:
<html>
<head>
<title>A sample page</title>
<!--#exec cmd="/opt/bin/soif2metadc" -->
</head>
<body>
...
|
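The file naming rule described above can be sketched in shell. The function name soif_for_page is hypothetical, and how soif2metadc itself learns the path of the current page is not stated here, so the sketch simply takes the path as an argument.

```shell
#!/bin/sh
# Hypothetical helper (not taken from soif2metadc): given the path of an
# HTML page, print the path of the SOIF record that describes it, i.e.
# the same name plus a .soif suffix. Returns non-zero if no such file exists.
soif_for_page() {
    record="$1.soif"
    [ -f "$record" ] || return 1
    printf '%s\n' "$record"
}

# Example call: report a missing record instead of failing silently.
soif_for_page intro.html || echo "no SOIF record for intro.html"
```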
wfsend | |
URL |
http://www.ukoln.ac.uk/metadata/software-tools/tools/wfsend
|
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl 5.0, MIME-tools (MIME::Entity), LWP (MIME::(QuotedPrint, Base64)), MailTools (1.06 or higher) |
Description | Send a MIME encoded Warwick Framework container |
Keywords | Warwick Framework, MIME |
Language | perl |
Usage | wfsend [-d] -t to -s subject [[-c|-a|-r] file ... ] ... |
Comments |
This script can be used to send some fairly simple Warwick
Framework containers by wrapping them up in MIME e-mail messages.
The -c, -r and -a arguments indicate the start of a
container, where -c indicates that the packages in the container
are not directly related to each other (though they may each describe the
same resource in some way), -r is used to indicate that the
container holds the resource (object) and its metadata
and -a that the packages
are alternatives (i.e. any one of the packages can be used by the
recipient).
This script looks for /etc/mime.types, /usr/local/lib/mime.types and ~/.mime.types in order to assign MIME types to files based on their extensions. The -d option turns on debugging - the MIME message is written to STDOUT instead of being sent by e-mail. Here are some examples of use:
|
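The extension-based MIME type lookup that wfsend performs against the mime.types files can be illustrated with a short shell sketch. This is not the wfsend source: the function name mime_type_for, the "first file that lists the extension wins" rule, and the application/octet-stream fallback are all assumptions. The only thing relied on is the usual mime.types layout of a type followed by its extensions.

```shell
#!/bin/sh
# Hypothetical helper (not taken from wfsend): print the MIME type for a
# file name by searching mime.types-style files in the order given.
# Each line of such a file is "type/subtype ext1 ext2 ...".
# Assumption: the first file that lists the extension wins.
mime_type_for() {
    file=$1
    shift
    ext=${file##*.}
    for types in "$@"; do
        # Skip files that are missing or unreadable.
        [ -r "$types" ] || continue
        found=$(awk -v ext="$ext" '
            /^#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == ext) { print $1; exit } }
        ' "$types")
        if [ -n "$found" ]; then
            printf '%s\n' "$found"
            return 0
        fi
    done
    # Fallback when nothing matches (an assumed default).
    printf 'application/octet-stream\n'
}

mime_type_for report.html /etc/mime.types /usr/local/lib/mime.types "$HOME/.mime.types"
```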
gendc | |
URL |
http://www.ukoln.ac.uk/metadata/software-tools/tools/gendc
|
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl |
Description | Simple script to create embedded Dublin Core interactively |
Keywords | Dublin Core, HTML |
Language | perl |
Usage | gendc |
Comments | Simple script that asks questions about a resource in order to generate Dublin Core description (in embedded HTML META tag format). Can be configured to know about arbitrary qualifiers. Answers to questions are remembered and are given as defaults next time the command is used. |
ROADSHarvester | |
URL | http://www.ukoln.ac.uk/metadata/software-tools/tools/roads/roadsharvester-v1a1.tar.Z |
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl 5.0, ROADS, Harvest v1.4pl2, Perl MD5 package |
Description | This package provides a combine-harvester for the ROADS software, adding the following functionality.
1) Automatic generation of metadata to 'pump-prime' ROADS records as part of the process of manually creating resource descriptions. Records created in this way can currently be based on either the DOCUMENT or the SERVICE template type.
2) Web robot based bulk-harvesting of records into a ROADS database based on the URLs listed in another ROADS database. Typically, all the records created in this way will be based on the DOCUMENT template type. |
Keywords | ROADS, harvesting, robot, Harvest, metadata |
Language | perl |
Usage | See the README file. |
Comments |
roads2gils.pl | |
URL |
http://www.ukoln.ac.uk/metadata/software-tools/tools/roads2gils.pl
|
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl 5.0 |
Description | Converts ROADS records to an SGML-like GILS record suitable for loading into Zebra. |
Keywords | ROADS, IAFA, GILS, Zebra |
Language | perl |
Usage | roads2gils.pl |
Comments |
DC-dot | |
URL | http://www.ukoln.ac.uk/metadata/software-tools/tools/dcdot/dcdot1.0.tar.Z |
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl 5.0, Libwww-perl, soif.pl, Jon Knight's MARC module |
Description | A Perl CGI script for generating HTML Dublin Core META tags. |
Keywords | Dublin Core, DC, editor, Warwick Framework, USMARC, SOIF, TEI, GILS, XML, ROADS, IAFA, DESIRE |
Language | perl |
Usage | |
Comments | See the installation instructions, <URL:http://www.ukoln.ac.uk/metadata/software-tools/tools/dcdot/INSTALL>. |
Java DC-dot | |
URL | http://www.ukoln.ac.uk/metadata/metadata-tools/tools/dcdot/java/ |
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Java 1.1 |
Description | A tool for creating Dublin Core metadata |
Keywords | Dublin Core, DC, editor, SOIF, XML |
Language | Java |
Usage | java ukoln.metadata.DCdot |
Comments | <URL:http://www.ukoln.ac.uk/metadata/software-tools/tools/dcdot/java/README> |
soif2nwi | |
URL |
http://www.ukoln.ac.uk/metadata/software-tools/tools/soif2nwi/soif2nwi
|
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl 5.0, soif.pl (from Harvest), Perl MD5 package |
Description |
A perl script for converting SOIF records into NWI records
suitable for loading into Zebra. This script is intended to run as a
Harvest 'post-summarising' script by adding something like:
Post-Summarizing: lib/nwirules
into the Harvest gatherer.cf config file. You'll need two other files and a following wind to get this stuff to work. The files both live in the gatherer's 'lib' directory and are called nwirules and nwipostproc.pl. |
Keywords | Harvest, SOIF, NWI, GILS, Zebra, DESIRE |
Language | perl |
Usage | See above. |
Comments |
Patch to add MCF support to ROADS v1 addsl.pl | |
URL |
http://www.ukoln.ac.uk/metadata/software-tools/tools/addsl.pl.patch
|
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl 5.0 |
Description | This is a patch to addsl.pl (v1b2pl9), adding a -M switch which causes it to generate an MCF file as well as HTML pages. NOTE: this generates an *old* format MCF file! Needs updating to generate newer versions of MCF (??) but might be of interest. The MCF is sent to a file called alphalist.mcf in your HTML directory (~/htdocs/) by default. |
Keywords | MCF, ROADS |
Language | perl |
Usage | |
Comments |
To get the HotSauce plugin to work you need to get your Web server to type
.mcf files as 'image/vasa', e.g. add
AddType .mcf image/vasa 8bit 1.0
to your CERN httpd config file. |
roads2metadc.pl | |
URL |
http://www.ukoln.ac.uk/metadata/software-tools/tools/roads2metadc.pl
|
Author | Tracy Gardner, Andy Powell |
Publisher | UKOLN |
Requirements | Perl 5.0, ROADS version 2 |
Description | Output a ROADS DUBLINCORE record as HTML META tags or as RDF. Primarily intended to be used as an SSI script. |
Keywords | ROADS, Dublin Core, DC, HTML, META, RDF |
Language | perl |
Usage |
For Apache, embed something like
|
Comments |
ls2cdf | |
URL |
http://www.ukoln.ac.uk/metadata/software-tools/tools/ls2cdf
|
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl 5.0 |
Description | Produce a Channel Definition Format (CDF) file based on a simple list of file names. |
Keywords | CDF, Channel Definition Format |
Language | perl |
Usage |
ls2cdf [-t title] [-a abstract] [-u URL] [-p period]
The ls2cdf command should be combined with the find command to build a CDF file for a set of Web resources. As an example of use, the following commands will build a 'channel' listing the UKOLN metadata Web pages that have been modified in the last 5 days:
find /opt/web/content/www.ukoln.ac.uk/metadata -mtime -5 \
  -name \*.html -print | ls2cdf -t "UKOLN Metadata News" \
  -a "UKOLN Metadata Web pages modified in the last 5 days" \
  -u http://www.ukoln.ac.uk/metadata/
|
Comments | The source to ls2cdf will need some local modification to change the set of regular expressions near the top of the file that map filenames to URLs. |
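The kind of filename-to-URL mapping that needs local adjustment can be illustrated in shell. This is a sketch of the idea only, not the expressions found in ls2cdf: the function name file_to_url is hypothetical, the document root is the one used in the find example, and the rule that turns a trailing index.html into a bare directory URL is an added assumption.

```shell
#!/bin/sh
# Hypothetical illustration (not the ls2cdf source): map local file names
# read from stdin to URLs using regular expressions.
file_to_url() {
    # Rule 1: replace the local document root with the site URL.
    # Rule 2 (assumed): refer to index.html by its directory URL.
    sed -e 's|^/opt/web/content/www\.ukoln\.ac\.uk/|http://www.ukoln.ac.uk/|' \
        -e 's|/index\.html$|/|'
}

printf '%s\n' /opt/web/content/www.ukoln.ac.uk/metadata/index.html | file_to_url
```

A site with a different document root would change the first expression to match its own layout.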
hdlres | |
URL |
http://www.ukoln.ac.uk/metadata/software-tools/tools/hdlres.c
|
Author | Andy Powell |
Publisher | UKOLN |
Requirements | CNRI Handle Client Library |
Description | A very simple Handle resolution client. |
Keywords | Handle, DOI, Digital Object Identifier, URN, N2L |
Language | C |
Usage |
hdlres |
Comments | Intended for use with the N2L CGI-based URN resolver. |
N2L | |
URL |
http://www.ukoln.ac.uk/metadata/software-tools/tools/N2L
|
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl, hdlres |
Description | CGI-based DOI/Handle/IETF/ISBN URN resolver. Not quite RFC-2169 compliant. Resolves DOI namespace using hdlres.c. Resolves IETF namespace using code from draft-ietf-urn-ietf-04.txt. Resolves ISBN namespace using www.amazon.co.uk. |
Keywords | URN, N2L, Name to Location, resolver, Handle, DOI, IETF, ISBN |
Language | perl |
Usage | |
Comments | Probably needs installing as nph-N2L |
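A resolver that handles several URN namespaces first has to split the URN and decide which namespace it belongs to. The sketch below shows that step only; it is not the N2L source and performs no resolution. The function name urn_split and the literal namespace identifiers doi, ietf and isbn are assumptions based on the description above. The "urn:" prefix and the namespace identifier are treated as case-insensitive.

```shell
#!/bin/sh
# Hypothetical sketch of namespace dispatch (not the N2L source).
# Prints the namespace identifier and the namespace-specific string of a
# URN, or fails for anything outside the three namespaces listed above.
urn_split() {
    urn=$1
    # Require the form urn:<namespace>:<string>.
    case $urn in
        [Uu][Rr][Nn]:*:*) ;;
        *) return 1 ;;
    esac
    rest=${urn#*:}                                      # drop the leading "urn:"
    nid=$(printf '%s' "${rest%%:*}" | tr 'A-Z' 'a-z')   # namespace, lower-cased
    nss=${rest#*:}                                      # namespace-specific string
    case $nid in
        doi|ietf|isbn) printf '%s %s\n' "$nid" "$nss" ;;
        *) return 1 ;;
    esac
}

urn_split urn:isbn:0451450523
```

A real resolver would follow the split with a per-namespace lookup, which is the part the description assigns to hdlres, the IETF draft code and the ISBN lookup.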
bibcheck.cgi | |
URL |
http://www.ukoln.ac.uk/metadata/software-tools/tools/bibcheck.cgi
|
Author | Ian Peacock |
Publisher | UKOLN |
Requirements | Perl |
Description | A CGI-based tool for computing a BIBLINK.Checksum, a message digest (checksum) for Web pages. |
Keywords | MD5, message digest, checksum, BIBLINK |
Language | perl |
Usage | |
Comments |
DC-Datamodel | |
URL | http://www.ukoln.ac.uk/metadata/software-tools/tools/dcdm/dcdm.tar.Z |
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl 5.0 |
Description | A set of Perl modules offering an object oriented implementation of the DC-datamodel as defined by "Guidance on expressing the Dublin Core within the Resource Description Framework (RDF)". |
Keywords | DC, datamodel, Perl, RDF, classes, object-oriented |
Language | perl |
Usage | |
Comments |
DC-assist | |
URL | http://www.ukoln.ac.uk/metadata/software-tools/tools/dcassist/dcassist.tar.Z |
Author | Andy Powell |
Publisher | UKOLN |
Requirements | Perl 5.0 |
Description | A Perl CGI script that generates a JavaScript metadata help utility. |
Keywords | DC, Perl |
Language | perl |
Usage | |
Comments |