
RDF Thesaurus Specification (draft)

Abstract:

Conceptual relationships for encoding thesauri, classification systems and organised metadata collections, and a proposal for encoding a core set of thesaurus relationships using an RDF Schema.

Authors

Version History

Previous version: 2000-06-06 (DESIRE II report)
This version: 2001-01-24

Latest version URI: http://ilrt.org/discovery/2001/01/rdf-thes/

Status

This document presents a specification for an XML/RDF representation of thesauri. We expect this work to evolve through feedback from other implementors. Our deployment experience to date suggests that the current proposal is both practical and useful; however, a number of open issues remain to be resolved. The proposal may change significantly as we attempt to resolve these issues; implementors should be aware of this.

Note (2001-01-24): the document is currently being revised; the version online at this address may change as we re-organise the structure and content of the proposal (danbri).

Comments and feedback should be directed to the publicly archived rdfthes-dev mailing list (rdfthes-dev@egroups.com).

1. Introduction

This paper proposes an RDF representation of various conceptual relationships typical of controlled vocabularies such as thesauri, classification systems and organised metadata collections. The aim is to explore the use of RDF as a common formalism for representing a variety of different thesauri and classification systems within the same overall framework. By doing so, we expect to leverage generic RDF facilities (such as query and storage software components), and also to have a basis for mapping between subject classifications expressed using these various vocabularies.

The approach taken here is to divide the problem into two stages. Firstly we define a simple core RDF representation of concepts such as 'broader term' and 'narrower term' typically used in classification and thesaurus systems. Then we extend this with a range of more semantically meaningful relationships expressed in terms of classes of objects. Many vocabulary systems have a tacit or unarticulated semantic model obscured behind relatively uninformative relationships such as 'broader' and 'narrower'. It is usually impossible to mechanically derive a richer set of relationships from a system based around these vague, generic relation types. General hierarchical relationships are frequently used to indicate one of several actual relationships. The relationships 'is a', 'has instantiation', and 'has part', for example, might all be encoded using the less informative 'narrower' relation.
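The loss of information described above can be illustrated with a minimal Python sketch. It is not part of the proposal: the mapping, the function name and the example terms are all hypothetical, chosen only to show that once richer relationships are collapsed into a generic one, the original distinction cannot be recovered from the collapsed data.

```python
# Hypothetical mapping from richer relationships to the generic relation.
GENERIC = {
    "has instantiation": "narrower",
    "has part": "narrower",
}


def collapse(triples):
    """Replace each rich relation by its generic counterpart."""
    return [(subject, GENERIC[relation], obj) for subject, relation, obj in triples]


# Made-up example statements using the richer relationships.
rich = [
    ("dog", "has instantiation", "Fido"),
    ("car", "has part", "engine"),
]

flat = collapse(rich)

# Both statements now carry the same relation, so nothing in `flat`
# says which one was an instantiation and which one was a part.
distinct_relations = {relation for _, relation, _ in flat}
```

Going in the other direction would require knowledge that the collapsed data no longer contains, which is why the proposal treats the two stages separately.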

The simpler 'core' relations are best thought of as being relationships between named concepts or terms, rather than as relations between real world (or abstract) entities. In other words, while we might say that "Fido is a dog" using a rich, semantic relationship, we would say that "the-term-Fido has-broader-term the-term-dog". The vague 'broader term' relation in this case subsumes the more informative 'is a' relation. The proposal in this document separates out these two approaches since it is crucial to remain unambiguous about when a node ('resource') in an RDF data model represents a named concept or term rather than some less abstract entity, i.e. the "thing in itself".

A further reason for creating two distinct RDF representations for these vocabulary systems is that RDF itself includes some common core vocabulary elements which have some overlap in functionality with the semantic modelling facilities required to transform simple flat vocabulary systems into richer knowledge bases. In particular, the RDF specifications define notions of 'Class', 'Property', 'subClassOf', 'type', 'domain' and 'range', which may be applicable to the task described above. By first addressing the need to find a simple RDF representation for the broader/narrower/preferred relationships, i.e. those simple relations which make sense in the context of terms/concepts rather than semantically modelled entities, we should be able to make some initial progress without having to solve the entire problem of 'knowledge modelling in RDF'.

The next section walks through the desired set of relationships; candidates for the simple core RDF vocabulary are marked with the note '(For inclusion in the simple core vocabulary)'. The following section sketches how a machine-processable RDF representation of the simple term-oriented concepts of a thesaurus might look, and finally, we give a machine-readable RDF Schema for the simple vocabulary.

In the definitions given below, the term Category is used when the relationship applies to classification systems, Term when the relationship applies to thesauri, and Document when the relationship can be used with individual documents.

2. Conceptual relationships for encoding thesauri, classification systems and organised metadata collections

A HIERARCHICAL RELATIONSHIPS

Label: BroaderTerm

Term: Broader term

Member of: A

Definition: Term one level up in a hierarchy, without specification of the type of hierarchical relationship

Label: NarrowerTerm

Term: Narrower term

Member of: A

Definition: Term one level down in a hierarchy, without specification of the type of hierarchical relationship

(For inclusion in the simple core vocabulary)

A1 Generic relationships

Label: IsA

Term: is a (instance of)

Member of: A1

Definition: Term/Category is an instance of a term/category one level up in the hierarchy

Label: HasInstantiation

Term: has instantiation

Member of: A1

Definition: Term/Category has an instantiation one level below in the hierarchy

A2 Whole-part relationships

Label: IsPartOf

Term: is part of

Member of: A2

Definition: Document/Category/Term represents an (unspecified) part of a document/category/term one level up in the hierarchy

Label: HasPart

Term: has part

Member of: A2

Definition: Document/Category/Term has an (unspecified) part one level below in the hierarchy

Label: IsSpatialPartOf

Term: is spatial part of

Member of: A2

Definition: Term/Category represents a spatial/geographical part of a term/category one level up in the hierarchy

Label: HasSpatialPart

Term: has spatial part

Member of: A2

Definition: Term/Category has a spatial/geographical subterm/subcategory one level below in the hierarchy

Label: IsConceptuallyPartOf

Term: is conceptually part of

Member of: A2

Definition: Term/Category is a subconcept of a term/category one level up in the hierarchy

Label: HasConceptualPart

Term: has conceptual part

Member of: A2

Definition: Term/Category has a subterm/subcategory one level below in the hierarchy

Label: IsCollectionMemberOf

Term: is collection member of

Member of: A2

Definition: Document/Category/Term is a member of a group of documents/categories/terms

Label: HasCollectionMember

Term: has collection member

Member of: A2

Definition: Group of documents/categories/terms has the document/category/term pointed to as a member

B EQUIVALENCE RELATIONSHIPS

B1 Single directional equivalence

Label: Use

Term: use, see

Member of: B1

Definition: The term/category pointed to should be preferred

Label: UsedFor

Term: used for

Member of: B1

Definition: The term/category pointed to is the non-preferred term/category

(For inclusion in the simple core vocabulary)

Label: IsVersionOf

Term: is version of

Member of: B1

Definition: The document/category/term pointed to is a version of another document/category/term

Label: HasVersion

Term: has version

Member of: B1

Definition: The document/category/term has another version

B2 Bi-directional equivalence

Label: IsSynonymOf

Term: is synonym of

Member of: B2

Definition: The term is a synonym of the one pointed to

Label: IsFormatOf

Term: is format of

Member of: B2

Definition: The document is a result of a format transformation of the one pointed to

C ASSOCIATIVE RELATIONSHIPS

Label: RelatedTerm

Term: related term, see also, similar to

Member of: C

Definition: The document/category/term pointed to is related (in an unspecified way)

(For inclusion in the simple core vocabulary)

Label: IsReferencedBy

Term: is referenced by

Member of: C

Definition: The document is referenced by the document pointed to

Label: References

Term: references

Member of: C

Definition: The document is referencing the document pointed to

Label: IsRequiredBy

Term: is required by

Member of: C

Definition: The document/object is required by/dependent on the document pointed to

Label: Requires

Term: requires

Member of: C

Definition: The document requires/is dependent on the document/object pointed to

Label: IsBasedOn

Term: is based on

Member of: C

Definition: The document/term is based on the document/term/object pointed to

Label: IsBasisFor

Term: is basis for

Member of: C

Definition: The document/term/object is the basis for the document/term pointed to

Label: IsDerivedFrom

Term: is derived from

Member of: C

Definition: The document/term is derived from the document/term pointed to

Label: HasDerivate

Term: has derivate

Member of: C

Definition: The document/term has the derivate pointed to

Label: IsTranslatedFrom

Term: is translated from

Member of: C

Definition: The document/term is translated from the document/term pointed to

Label: HasTranslation

Term: has translation

Member of: C

Definition: The document/term has the translation pointed to

Label: IsInterpretationOf

Term: is interpretation of

Member of: C

Definition: The document is a (creative, artistic) interpretation of the document/object pointed to

Label: HasInterpretation

Term: has interpretation

Member of: C

Definition: The document has the (creative, artistic) interpretation pointed to

Label: IsMappedTo

Term: is mapped to

Member of: C

Definition: The document/category/term is mapped to the document/category/term pointed to

Label: HasMapping

Term: has mapping

Member of: C

Definition: The document/category/term has this document/category/term mapped to it

Label: IsLinkedFrom

Term: is linked from

Member of: C

Definition: The document/category/term is linked to from the document/category/term pointed to

Label: HasLinkTo

Term: has link to

Member of: C

Definition: The document/category/term is linking to the document/category/term pointed to

Label: IsSameLevelNeighbour

Term: is same level neighbour

Member of: C

Definition: The document/category/term is a neighbour on the same level of an organisational structure to the document/category/term pointed to

Label: IsTopologicalNearestNeighbour

Term: is topological nearest neighbour

Member of: C

Definition: The document/category/term is a topologically nearest neighbour in an organisational structure to the document/category/term pointed to

3. RDF core vocabulary

The relationships marked above for inclusion in the simple core vocabulary are candidates for a simple core set of relationships for a thesaurus. They are:

· BroaderTerm

· NarrowerTerm

· Use

· UsedFor

· RelatedTerm

This terminology is taken from ISO 2788: Guidelines for the establishment and development of monolingual thesauri (International Organization for Standardisation, 1986).

The terminology deals with the terms themselves, that is, the lexical representation of concepts. For the creation of an RDF schema for storing structured vocabularies, we decided to differentiate between the lexical representation of a concept and the concept itself. It was felt that the unique resource should be the concept, each concept resource being indicated by one or more term resources. Thus the RDF resource used to represent cats would be indicated by a term whose value was the word "cats". This is represented by the graph in figure 3 below.

[Image: diagrammatic representation of schema]

Figure 3. RDF graph representation of the concept representing a cat (concept 5)

In figure 3, concept_5 represents the concept of cats. Its indicator is a term (term_7) whose value is the text string "cats". Another term indicating the concept might have the value "chats".
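The separation between a concept and the terms that indicate it can be sketched in Python as follows. This is only an illustration of the data model under stated assumptions: the class and attribute names are hypothetical, and the identifier term_8 for the second term is made up, since the text above does not name it.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A lexical form; `value` holds the text string, e.g. "cats"."""
    uri: str
    value: str


@dataclass
class Concept:
    """The unique resource; it is indicated by one or more terms."""
    uri: str
    indicators: list = field(default_factory=list)


concept_5 = Concept("concept_5")
concept_5.indicators.append(Term("term_7", "cats"))
# term_8 is a made-up identifier for the second term mentioned in the text.
concept_5.indicators.append(Term("term_8", "chats"))

labels = [term.value for term in concept_5.indicators]
```

Relationships such as 'broader' would then link Concept objects to each other, never Term objects.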

As a result of the above approach, the RDF schema refers to relationships between concepts rather than between terms, and this is reflected in the vocabulary used below, e.g. broaderConcept rather than broaderTerm.

Whilst the relationships 'broader', 'narrower' and 'related' are still meaningful when considering concepts rather than terms, the relationships 'use' and 'used for' refer only to terms. This is because 'use' and 'used for' indicate which particular term has been chosen to represent the relevant concept when indexing some resource. For the core RDF vocabulary, then, these relationships have instead been represented by a property of the term resources. This property is called 'termUsage', and it takes the value 'preferred' or 'nonPreferred'.

The second issue considered was that, since broaderTerm and narrowerTerm are inverses of each other, i.e.

A narrowerTerm B implies

B broaderTerm A,

utilising both relationships when storing or transferring the vocabulary data would be inefficient. We therefore decided to create a relationship 'broaderConcept' for the RDF Schema but not 'narrowerConcept', as this is implied; it being the responsibility of any application using the data to deduce the opposite relationship and present it to the user.
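An application might deduce the implied narrower relationship along the following lines. This is a minimal Python sketch, not a prescribed implementation: the function name and the concept names are hypothetical, and the stored data is assumed to be a list of (concept, broaderConcept) pairs.

```python
from collections import defaultdict


def narrower_index(broader_pairs):
    """Map each concept to the set of concepts that name it as broaderConcept."""
    narrower = defaultdict(set)
    for concept, broader in broader_pairs:
        narrower[broader].add(concept)
    return dict(narrower)


# Stored (concept, broaderConcept) pairs; the concept names are made up.
stored = [("cats", "mammals"), ("dogs", "mammals"), ("mammals", "animals")]

index = narrower_index(stored)
```

Here index["mammals"] contains both "cats" and "dogs", although only broaderConcept statements were stored.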

The second relationship between concepts chosen for the schema was 'relatedConcept'. This relationship is symmetric (bi-directional), and hence if the relationship

A relatedConcept B exists, then it is implied that

B relatedConcept A is also true.

Hence we only add one of the two possible pairs to the datastore.
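Because only one direction is stored, a lookup has to check both orderings. The following Python sketch shows one way to do this; the function name and the example pair are hypothetical, and the datastore is assumed to be a simple set of pairs.

```python
def are_related(stored_pairs, a, b):
    """True if relatedConcept holds between a and b in either stored order."""
    return (a, b) in stored_pairs or (b, a) in stored_pairs


# Only one ordering of the pair is held in the (made-up) datastore.
stored = {("friends", "interpersonal attraction")}

forward = are_related(stored, "friends", "interpersonal attraction")
backward = are_related(stored, "interpersonal attraction", "friends")
```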

A further attribute often used within thesauri is 'top term'. This indicates a term that is at the top of a hierarchy within the thesaurus. Since this is a property that may be deduced by an application from the lack of a broaderConcept property for that concept, this attribute is also left out of the schema.
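The deduction of top terms can be sketched in Python as below. This is an illustration only: the function name and the concept names are hypothetical, and the data is assumed to be available as a list of concepts plus a list of (concept, broaderConcept) pairs.

```python
def top_concepts(concepts, broader_pairs):
    """Return the concepts that have no broaderConcept property."""
    has_broader = {concept for concept, _ in broader_pairs}
    return sorted(c for c in concepts if c not in has_broader)


# Made-up concepts and (concept, broaderConcept) pairs.
concepts = ["animals", "mammals", "cats", "dogs"]
broader_pairs = [("cats", "mammals"), ("dogs", "mammals"), ("mammals", "animals")]

tops = top_concepts(concepts, broader_pairs)
```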

broaderConcept and relatedConcept were therefore selected as the only two core relationships between concepts that would be required for a basic RDF vocabulary schema. Other properties are required, however, to allow the encoding of thesauri, taking into account the recommendations of ISO 2788 (International Organization for Standardisation, 1986) and general thesaurus usage. These are listed in the next section, which describes the proposed RDF thesaurus schema.

4. RDF Thesaurus Schema

4.1. Resource Description Framework Schemas

The Resource Description Framework (RDF) is a W3C (World Wide Web Consortium, 2000) recommendation for representing structured data on the Web. RDF, like both the Web and thesaurus systems, is based around a strategy of managing information as a collection of links between uniquely named entities. RDF's Web-based information model uses the term 'resource' to refer to the entities that it models, and provides an application-neutral framework within which various kinds of entities and relationships can be described. A general introduction to RDF is beyond the scope of this document. The W3C home page for RDF (Swick, 2000) lists a number of introductory tutorials as well as the RDF specifications.

In this document we describe the application of RDF to the description of thesaurus-like data structures. Specifically, we show how the RDF data model can represent a Web of inter-related concepts and terms from one or more thesauri. To do this, we define a simple RDF vocabulary that uses Web identifiers (Uniform Resource Identifiers) to name some relationships and resource types useful for the description of concepts and terms in a thesaurus. It should be noted that we do not here attempt to model the richer semantic relationships that hold between the entities denoted by such concepts, although RDF itself can also be used to represent this kind of information.

4.2. Proposal for an RDF Thesaurus Schema

The XML/RDF thesaurus schema is set out in Appendix A. An example set of XML/RDF thesaurus data is given in Appendix B.

As described above, the schema consists of two main resources: Concept and Term. Concept resources are related by the properties 'broaderConcept' and 'relatedConcept'. Concepts have a property 'indicator' which points to one or more Term resources. The actual text string of each Term resource is given by its rdf:value property.

As noted above, the Term resources have an optional property called 'termUsage', which can be used with those thesauri that have non-preferred terms linked to preferred terms through the 'use'/'used for' relationships. The value of termUsage must be either 'preferred' or 'nonPreferred'; the schema in Appendix A defines these two values as instances of the class TermUsageValue.

A second Term property is 'lang', which can be used to indicate the language of the term; thus a single concept can be 'indicated' by both preferred and non-preferred terms, and by terms from different languages (there is likely to be one preferred term for each language). The thesaurus schema therefore provides a mechanism for storing multilingual thesauri. If an English term and a German term both 'indicate' the same concept resource, it is implied that the two terms are either equivalent, or at least are treated as such for indexing purposes.
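An application could use 'lang' and 'termUsage' together to pick the label shown for a concept in a given language. The Python sketch below assumes the terms indicating one concept have already been loaded into dictionaries; the function name, the dictionary keys and the example term values are hypothetical.

```python
def preferred_label(terms, lang):
    """Return the preferred term value for a language, or None if there is none."""
    for term in terms:
        if term["lang"] == lang and term.get("termUsage") == "preferred":
            return term["value"]
    return None


# Terms indicating a single made-up concept.
terms = [
    {"value": "cats", "lang": "en", "termUsage": "preferred"},
    {"value": "felines", "lang": "en", "termUsage": "nonPreferred"},
    {"value": "Katzen", "lang": "de", "termUsage": "preferred"},
]

english = preferred_label(terms, "en")
german = preferred_label(terms, "de")
```

Since termUsage is optional, the sketch treats a term without it as not preferred; a real application might choose a different fallback.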

It may be considered necessary to recognise relationships between terms of different languages other than 'exactly equivalent', such as recognising that the equivalent term is broader in meaning, or where a single term in one language can be represented by two or more terms in another. In such a case, separate sets of concepts could be used for the different languages, with a new set of properties devised to indicate the different types of relationship between them, rather than using the 'lang' property.

There are two further optional properties that are permissible for Concept resources: 'scope' and 'conceptCode'.

· The value of the scope property is a resource called 'ScopeNote', which also has a 'lang' property, and whose value is an optional scope note for the term. A scope note is defined in ISO 2788 as "a note attached to a term to indicate its meaning within an indexing language" (International Organization for Standardisation, 1986).

· The property 'conceptCode' can be used for any code that is assigned to the preferred terms in a systematic thesaurus. In ISO 2788, the property 'address code' is defined as a code which links terms in an alphabetical index to their location in the systematic section. They "should have obvious filing values … may consist simply of running numbers … or may comprise a system of hierarchically expressive notation" (International Organization for Standardisation, 1986). Such a code will be unique for each concept in an RDF version of the thesaurus and might perhaps be useful in providing a language neutral method for indexing documents. In other thesauri, there may be non-unique codes, such as notations that associate the terms to broad subject categories, and such codes could also be held as values of the conceptCode attribute. Any unique code associated with the preferred terms in a thesaurus could also be usefully incorporated into the URI of the Concept resources, as this would be an aid in future management of the data (for instance for updates to the database). However, the conceptCode property has also been provided as a means of storing such information if required.

Appendix A: RDF/XML Thesaurus Schema

<rdf:RDF xml:lang="en"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#">

    <rdfs:Class rdf:ID="Concept">
       <rdfs:comment>
         A unique concept defined within a thesaurus. Instances
         use the rdfs:isDefinedBy property with a vocabulary
         namespace as its value, to indicate the vocabulary to
         which the concept belongs.
       </rdfs:comment>
       <rdfs:subClassOf
         rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Resource"/>
    </rdfs:Class>

    <rdfs:Class rdf:ID="Term">
       <rdfs:comment>
          Instances of this class represent the written forms of
          Concepts. The string is given by the rdf:value of Term.
       </rdfs:comment>
       <rdfs:subClassOf
          rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Resource"/>
    </rdfs:Class>

    <rdfs:Class rdf:ID="ScopeNote">
       <rdfs:comment>
          The value of this optional resource is a scope note:
          a note attached to a term to indicate its meaning within
          an indexing language
       </rdfs:comment>
       <rdfs:subClassOf
          rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Resource"/>
    </rdfs:Class>

    <rdfs:Class rdf:ID="TermUsageValue">
       <rdfs:comment>
         The value of the property: termUsage. It can take one of two
         values: 'preferred' or 'nonPreferred'.
       </rdfs:comment>
       <rdfs:subClassOf
         rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Resource"/>
    </rdfs:Class>

    <rdf:Property rdf:ID="broaderConcept">
        <rdfs:comment>
          This schema does not define a property 'narrowerConcept',
          but applications can assume the existence of a property
          narrowerConcept such that if:
          {broaderConcept,ConceptA,ConceptB}, then
          {narrowerConcept,ConceptB,ConceptA} is true.
        </rdfs:comment>
        <rdfs:domain rdf:resource="#Concept"/>
        <rdfs:range rdf:resource="#Concept"/>
    </rdf:Property>

    <rdf:Property rdf:ID="relatedConcept">
        <rdfs:comment>
          The relatedConcept property is symmetric, such that if:
          {relatedConcept,ConceptA,ConceptB}, then
          {relatedConcept,ConceptB,ConceptA} is true.
        </rdfs:comment>
        <rdfs:domain rdf:resource="#Concept"/>
        <rdfs:range rdf:resource="#Concept"/>
    </rdf:Property>

    <rdf:Property rdf:ID="indicator">
        <rdfs:comment>
          A mandatory property of a Concept whose value is
          the Term instance representing a written form of the
          Concept. A Concept may have as an indicator more than
          one Term. A Term may only be an indicator of one
          Concept.
        </rdfs:comment>
        <rdfs:domain rdf:resource="#Concept"/>
        <rdfs:range rdf:resource="#Term"/>
    </rdf:Property>

    <rdf:Property rdf:ID="conceptCode">
        <rdfs:comment>
          An optional property for any code assigned to the
          thesaurus concepts.
        </rdfs:comment>
        <rdfs:domain rdf:resource="#Concept"/>
        <rdfs:range
          rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Literal"/>
    </rdf:Property>

    <rdf:Property rdf:ID="scope">
        <rdfs:comment>
          This optional property has as its value an instance of
          the resource ScopeNote.
        </rdfs:comment>
        <rdfs:domain rdf:resource="#Concept"/>
        <rdfs:range rdf:resource="#ScopeNote"/>
    </rdf:Property>

    <rdf:Property rdf:ID="lang">
        <rdfs:comment>
          Optional property that can be used to give the language
          of a Term instance. The codes from "ISO 639:1988,
          Code for the representation of names of languages" should
          be used as the values for this property.
        </rdfs:comment>
        <rdfs:domain rdf:resource="#Term"/>
        <rdfs:range
          rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Literal"/>
    </rdf:Property>

    <rdf:Property rdf:ID="termUsage">
        <rdfs:comment>
          This optional property indicates whether the Term
          instance is the 'preferred' or 'nonPreferred' textual
          expression of the Concept instance that is 'indicated'
          by the Term, for a given language.
        </rdfs:comment>
        <rdfs:domain rdf:resource="#Term"/>
        <rdfs:range rdf:resource="#TermUsageValue"/>
    </rdf:Property>

    <rdf:Description rdf:ID="preferred">
      <rdf:type rdf:resource="#TermUsageValue"/>
    </rdf:Description>

    <rdf:Description rdf:ID="nonPreferred">
      <rdf:type rdf:resource="#TermUsageValue"/>
    </rdf:Description>

</rdf:RDF>

Appendix B: Sample Thesaurus Metadata Expressed Using the RDF/XML Thesaurus Schema

The example below shows the relationships between three concepts, whose term values are: 'Interpersonal Attraction', 'Interpersonal Relations', and 'Friends'. A graph representation of the RDF follows the XML representation (excluding the scopeNote property).

<web:RDF xml:lang="en"
   xmlns:thes="http://snowball.ilrt.bris.ac.uk/~pldab/rdf-dot/Thes/Thes.xrdf#"
   xmlns:web="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:rdfs="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#">

<web:Description about="http://sosig.ac.uk/hasset/terms/TID_3">
   <web:type resource="http://snowball.ilrt.bris.ac.uk/~pldab/rdf-dot/Thes/Thes.xrdf#Term"/>
   <thes:lang>en</thes:lang>
   <web:value>Interpersonal Attraction</web:value>
   <thes:termUsage web:resource="http://snowball.ilrt.bris.ac.uk/~pldab/rdf-dot/Thes/Thes.xrdf#preferred"/>
</web:Description>

<web:Description about="http://sosig.ac.uk/hasset/concepts/CID_6">
   <web:type resource="http://snowball.ilrt.bris.ac.uk/~pldab/rdf-dot/Thes/Thes.xrdf#Concept"/>
   <rdfs:isDefinedBy web:resource="http://sosig.ac.uk/hasset/concepts/"/>
   <thes:indicator web:resource="http://sosig.ac.uk/hasset/terms/TID_3"/>
   <thes:conceptCode>768</thes:conceptCode>
   <thes:broaderConcept>
      <web:Description about="http://sosig.ac.uk/hasset/concepts/CID_8">
         <rdfs:isDefinedBy web:resource="http://sosig.ac.uk/hasset/concepts/"/>
         <thes:indicator web:resource="http://sosig.ac.uk/hasset/terms/TID_15"/>
         <thes:conceptCode>769</thes:conceptCode>
      </web:Description>
   </thes:broaderConcept>
   <thes:relatedConcept web:resource="http://sosig.ac.uk/hasset/concepts/CID_15"/>
</web:Description>

<web:Description about="http://sosig.ac.uk/hasset/concepts/CID_15">
   <web:type resource="http://snowball.ilrt.bris.ac.uk/~pldab/rdf-dot/Thes/Thes.xrdf#Concept"/>
   <rdfs:isDefinedBy web:resource="http://sosig.ac.uk/hasset/concepts/"/>
   <thes:indicator web:resource="http://sosig.ac.uk/hasset/terms/TID_21"/>
   <thes:conceptCode>780</thes:conceptCode>
   <thes:scope web:resource="http://sosig.ac.uk/hasset/scopenotes/SN_12"/>
</web:Description>

<web:Description about="http://sosig.ac.uk/hasset/terms/TID_15">
   <web:type resource="http://snowball.ilrt.bris.ac.uk/~pldab/rdf-dot/Thes/Thes.xrdf#Term"/>
   <thes:lang>en</thes:lang>
   <web:value>Interpersonal Relations</web:value>
   <thes:termUsage web:resource="http://snowball.ilrt.bris.ac.uk/~pldab/rdf-dot/Thes/Thes.xrdf#preferred"/>
</web:Description>

<web:Description about="http://sosig.ac.uk/hasset/terms/TID_21">
   <web:type resource="http://snowball.ilrt.bris.ac.uk/~pldab/rdf-dot/Thes/Thes.xrdf#Term"/>
   <thes:lang>en</thes:lang>
   <web:value>Friends</web:value>
   <thes:termUsage web:resource="http://snowball.ilrt.bris.ac.uk/~pldab/rdf-dot/Thes/Thes.xrdf#preferred"/>
</web:Description>

<web:Description about="http://sosig.ac.uk/hasset/scopenotes/SN_12">
   <web:type resource="http://snowball.ilrt.bris.ac.uk/~pldab/rdf-dot/Thes/Thes.xrdf#ScopeNote"/>
   <thes:lang>en</thes:lang>
   <web:value>To be used only for platonic relationships</web:value>
</web:Description>

</web:RDF>

[Image: graph representation of the HASSET data]
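As a rough illustration of how an application might read such data, the Python sketch below uses the standard library's xml.etree.ElementTree module to map each concept to the text of the terms that indicate it. It is a simplification under stated assumptions, not an RDF parser: the helper names are hypothetical, the embedded sample is an abridged fragment modelled on the data above, and descriptions nested inside a property such as broaderConcept are ignored.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
THES = "http://snowball.ilrt.bris.ac.uk/~pldab/rdf-dot/Thes/Thes.xrdf#"

# Abridged fragment modelled on the sample data (one term, one concept).
SAMPLE = f"""<web:RDF xmlns:web="{RDF}" xmlns:thes="{THES}">
<web:Description about="http://sosig.ac.uk/hasset/terms/TID_3">
   <web:type resource="{THES}Term"/>
   <thes:lang>en</thes:lang>
   <web:value>Interpersonal Attraction</web:value>
</web:Description>
<web:Description about="http://sosig.ac.uk/hasset/concepts/CID_6">
   <web:type resource="{THES}Concept"/>
   <thes:indicator web:resource="http://sosig.ac.uk/hasset/terms/TID_3"/>
</web:Description>
</web:RDF>"""


def attr(element, name):
    """Read an RDF attribute written with or without a namespace prefix."""
    return element.get(f"{{{RDF}}}{name}") or element.get(name)


def concept_labels(xml_text):
    """Map each concept URI to the values of the terms that indicate it."""
    root = ET.fromstring(xml_text)
    term_values, indicators = {}, {}
    # Only top-level descriptions are examined in this sketch.
    for desc in root.findall(f"{{{RDF}}}Description"):
        uri = attr(desc, "about")
        value = desc.find(f"{{{RDF}}}value")
        if value is not None:
            term_values[uri] = value.text
        for ind in desc.findall(f"{{{THES}}}indicator"):
            indicators.setdefault(uri, []).append(attr(ind, "resource"))
    return {
        concept: [term_values[t] for t in term_uris if t in term_values]
        for concept, term_uris in indicators.items()
    }


labels = concept_labels(SAMPLE)
```

A production system would more likely use a full RDF parser and query the resulting graph, which would also handle nested descriptions and the other RDF/XML abbreviations.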

Appendix C: Open Issues

The following issues require further attention. Future versions of this document might attempt this (but then again, they might not...)

(1) URIs for 'nodes' in the thesaurus graph.
Should we remain agnostic about whether the nodes (resources) are assigned public, well known URI identifiers, or should we explicitly articulate alternative identification strategies for use with (so-called) anonymous resources?
(2) XML Schemata
We should write some XML schemas to support various message formats using our proposed thesaurus information model. For applications that prefer predictable, highly constrained data structures to open-ended, extensible ones, this would be particularly useful. We could use W3C XML Schema and/or something like Schematron, which might be more useful for mixed-namespace specifications, e.g. RSS newsfeed taxonomy data.
(3) Implementation overview
An appendix describing the ILRT implementation would be useful, particularly if we showed use of the thesaurus data via RDF query and RDF graph API interfaces.
(4) Analysis: thesaurus vs class hierarchy
Any modeling environment that has the notion of a class/type hierarchy presents a challenge for thesaurus apps: when to use the 'built in' hierarchy versus when to layer on top using application-specific machinery ('broader-term' versus 'rdfs:subClassOf'). More guidance for implementors is needed on this topic. Use WordNet to illustrate this.

References:

International Organization for Standardisation. 1986. ISO 2788: Guidelines for the establishment and development of monolingual thesauri, 2nd ed., Geneva: ISO.

Swick, Ralph et al. (Accessed June 2000). W3C Resource Description Framework. http://www.w3.org/RDF/

World Wide Web Consortium. (Accessed June 2000). W3C – The World Wide Web Consortium. http://www.w3.org/