Creating a Search from Scan Results

By Janifer Gatenby, Geac Computers

Version 3, August 6th 1999

The requirement

Scan returns results that consist of terms with complementary data, representing rows from an ordered list. The results can be presented to an end user, enabling him or her to browse forward and optionally backwards, then select a line for further information or processing. When a line is selected, the system may format a search request that may return one or more records associated with the term. Typical examples are scans (browses) on AUTHOR, SUBJECT and TITLE.

To construct a follow-on search, the origin reuses the Use attributes that were used for the scan request and takes the search term or terms from the term field of the scan response.
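The construction just described can be sketched as follows. This is a minimal illustration only: the class and function names are hypothetical and are not part of Z39.50, and the Use attribute value 1003 (Author in Bib-1) is an assumed choice for an author scan.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ScanRequest:
    database: str
    # (attribute type, attribute value) pairs; type 1 is the Use attribute
    attributes: Tuple[Tuple[int, int], ...]


@dataclass(frozen=True)
class ScanEntry:
    term: str
    occurrences: int


@dataclass(frozen=True)
class SearchRequest:
    database: str
    attributes: Tuple[Tuple[int, int], ...]
    term: str


def follow_on_search(scan: ScanRequest, entry: ScanEntry) -> SearchRequest:
    """Reuse the scan's attributes and take the search term from the scan entry."""
    return SearchRequest(scan.database, scan.attributes, entry.term)


# Hypothetical author scan; 1003 is assumed here as the Use value for Author.
scan = ScanRequest(database="bib.file", attributes=((1, 1003),))
entry = ScanEntry(term="DUPUIS FRANCOISE", occurrences=3)
search = follow_on_search(scan, entry)
print(search)
```

Nothing in this search identifies which scan line was selected beyond the term text itself, which is the weakness the following sections address.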

Database Models

The technique described above may be employed to construct a search for various database models.

There are the following possible models:

 

Running a follow-on search with the same Use attributes as the scan request and the TERM from the scan response may not be the most efficient approach for the first two cases above, where there are database links that could be employed. The problem with using TERM for the follow-on search is that the resulting search may not be precise enough, for a number of reasons. Firstly, the term may have been truncated and so lack significant words that matter for precision. Secondly, the target may not support position attributes such as first in field, or the structure attribute phrase, so the search is constructed imprecisely and can retrieve unexpected records even when a single, seemingly unique line has been selected from a scan. Thirdly, the term may not be unique. This can happen where the term has to be repeated because the display terms differ. Example:

TERM DUPUIS FRANCOISE

DISPLAY TERM Dupuis, Françoise

TERM DUPUIS FRANCOISE

DISPLAY TERM Dupuis, Franžçoise

TERM DUPUIS FRANCOISE

DISPLAY TERM Dupuis, Francoise

 

What is required is a means of using database links, where they exist, to improve the precision of the follow-on search.

 

The Proposal

The proposal is to include this retrieval information in otherTermInfo as an external carried in externallyDefinedInfo. The external would be called DirectTermAccess and would comprise the following:

Data element          Comment

ServerName            Optional. If omitted, assumed to be the same as for the SCAN.

ServerAddress         Optional. If omitted, assumed to be the same as for the SCAN.

ServerPort            Optional. If omitted, assumed to be the same as for the SCAN.

DatabaseName          Mandatory if any of the above three elements is present. Otherwise optional; if omitted, assumed to be the same as for the SCAN.

AttributesPlusTerm    Optional if DatabaseName is present; if missing, the returned term can be used safely in the other database. Mandatory if DatabaseName is not present.

OccurrenceCount       Optional. Indicates the occurrence count of the term in the database to be searched. For example, it gives the bibliographic occurrence count of an authority TERM.
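The presence rules above can be expressed as a small validation routine. The sketch below is an illustration under assumptions, not a definition of the external: the Python class layout, the field names and the `validate` method are hypothetical, and only the optional/mandatory rules themselves come from the table.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AttributesPlusTerm:
    attribute_set: str                       # attribute set OID
    attributes: Tuple[Tuple[int, int], ...]  # (attribute type, value) pairs
    term: str


@dataclass(frozen=True)
class DirectTermAccess:
    server_name: Optional[str] = None
    server_address: Optional[str] = None
    server_port: Optional[int] = None
    database_name: Optional[str] = None
    attributes_plus_term: Optional[AttributesPlusTerm] = None
    occurrence_count: Optional[int] = None

    def validate(self) -> None:
        """Raise ValueError if the presence rules from the table are violated."""
        has_server = any(
            value is not None
            for value in (self.server_name, self.server_address, self.server_port)
        )
        if has_server and self.database_name is None:
            raise ValueError(
                "DatabaseName is mandatory when a server element is present"
            )
        if self.database_name is None and self.attributes_plus_term is None:
            raise ValueError(
                "AttributesPlusTerm is mandatory when DatabaseName is absent"
            )


# Same server as the SCAN, different database, explicit attributes and term.
example = DirectTermAccess(
    database_name="bib.file",
    attributes_plus_term=AttributesPlusTerm("1.2.840.10003.3.1", ((1, 12),), "4544"),
    occurrence_count=3,
)
example.validate()  # passes: DatabaseName and AttributesPlusTerm are both present
```

An entry that names a server but no database, or that omits both DatabaseName and AttributesPlusTerm, is rejected by `validate`.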

 

 

Example:

 

An authority scan (e.g. author or subject) performed on an index of an authority database (auth.file) produces a scan entry with a term occurrence of 1. There are actually 3 bibliographic records associated with this authority record. In otherTermInfo of the scan response, there is one entry containing the identifier of the authority record (4544).

serverName            omitted

serverAddress         omitted

serverPort            omitted

databaseName          bib.file

AttributesPlusTerm

    attributeSet      1.2.840.10003.3.1 (Bib-1)

    attributeType     1 (Use attribute)

    attributeValue    12 (Local number)

    term              4544 (local number of the term)

occurrenceCount       3
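One way an origin might turn the example above into a follow-on search is sketched below. The function name, the dictionary layout, the host name, the port and the scanned term are all assumptions made for illustration; only the bib.file database, the Bib-1 attribute set, Use value 12, term 4544 and the occurrence count of 3 come from the example.

```python
BIB1 = "1.2.840.10003.3.1"


def resolve_follow_on(scan: dict, scan_term: str, dta: dict) -> dict:
    """Fill omitted DirectTermAccess elements from the SCAN context."""
    # Without AttributesPlusTerm, fall back to the scan's attributes and term.
    apt = dta.get("attributesPlusTerm") or {
        "attributeSet": scan["attributeSet"],
        "attributes": scan["attributes"],
        "term": scan_term,
    }
    return {
        "serverName": dta.get("serverName", scan["serverName"]),
        "serverAddress": dta.get("serverAddress", scan["serverAddress"]),
        "serverPort": dta.get("serverPort", scan["serverPort"]),
        "databaseName": dta.get("databaseName", scan["databaseName"]),
        "attributeSet": apt["attributeSet"],
        "attributes": apt["attributes"],
        "term": apt["term"],
        "expectedHits": dta.get("occurrenceCount"),
    }


# Hypothetical context of the authority scan (server details are invented).
scan = {
    "serverName": "example-target",
    "serverAddress": "z3950.example.org",
    "serverPort": 210,
    "databaseName": "auth.file",
    "attributeSet": BIB1,
    "attributes": [(1, 1003)],
}

# DirectTermAccess content from the example; server elements are omitted.
dta = {
    "databaseName": "bib.file",
    "attributesPlusTerm": {
        "attributeSet": BIB1,
        "attributes": [(1, 12)],
        "term": "4544",
    },
    "occurrenceCount": 3,
}

search = resolve_follow_on(scan, "DUPUIS FRANCOISE", dta)
print(search["databaseName"], search["attributes"], search["term"])
```

The resolved search goes to bib.file on the same server as the scan and looks up local number 4544 rather than the term text, so the selected line is identified exactly.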

 

The target may define its own attribute values for internal numbers, particularly if it needs to distinguish between local bibliographic and authority numbers. As the target is able to supply these in the scan response, the origin does not need to know them in advance. Therefore internally defined attributes do not pose a problem for interoperability.

The authority file may be located in a separate database from the bibliographic file or it may be in the same database. For the purposes of retrieval, the authority file should be regarded as a separate database even where it is not. The origin needs to know the names of both databases.

Where an authority file is linked to a bibliographic file as per database models 2 and 3, it is possible to:

 

 

 

 

 

 

 

Version   Date       Author            Description

1         9.04.99    Janifer Gatenby

2         25.07.99   Janifer Gatenby   Change other term info / url to alternative term / attributes plus term

3         6.08.99    Janifer Gatenby   Change from alternative term to other term info with Direct Term Access as an external