In pursuit of interoperability: Can we standardize mapping types?


Page 1: In pursuit of interoperability: Can we standardize mapping types?

In pursuit of interoperability: Can we standardize mapping types?

Stella G Dextre Clarke
Project Leader, ISO NP 25964

Page 2: In pursuit of interoperability: Can we standardize mapping types?

Overview

Compare mapping types used in some well-known projects: MACS; CrissCross; RENARDUS; KoMoHe
  and in Doerr’s well-cited paper on Semantic problems of thesaurus mapping
  and in 3 standards: BS 8723-4, SKOS and the forthcoming ISO 25964-2
Ask how feasible it is to achieve standardization

Page 3: In pursuit of interoperability: Can we standardize mapping types?

MACS Project

Context: enabling multilingual access to collections indexed with different vocabularies
Vocabularies are all subject heading schemes
All mappings are considered equivalence
Equivalence can be simple or compound
Two types of compound equivalence:
  Heading A = Heading B OR Heading C
  Heading A = Heading B AND Heading C
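As a rough illustration of the two compound forms, the sketch below stores a mapping as a source heading, a list of target headings and an operator, then renders the target side as a Boolean search expression. The `HeadingMapping` class and the `to_query` function are hypothetical names chosen for this example; they are not taken from MACS.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HeadingMapping:
    """Hypothetical record for one equivalence mapping between headings."""
    source: str
    targets: List[str]
    operator: str = "OR"  # "OR" or "AND"; unused when there is a single target


def to_query(mapping: HeadingMapping) -> str:
    """Render the target side of a mapping as a Boolean search expression."""
    if mapping.operator not in ("AND", "OR"):
        raise ValueError(f"unknown operator: {mapping.operator}")
    quoted = [f'"{t}"' for t in mapping.targets]
    if len(quoted) == 1:
        return quoted[0]  # simple equivalence: one heading maps to one heading
    return "(" + f" {mapping.operator} ".join(quoted) + ")"


simple = HeadingMapping("Heading A", ["Heading B"])
either = HeadingMapping("Heading A", ["Heading B", "Heading C"], "OR")
both = HeadingMapping("Heading A", ["Heading B", "Heading C"], "AND")

print(to_query(simple))
print(to_query(either))
print(to_query(both))
```

Keeping the operator on the mapping record, rather than in the search code, means the same record can drive either a search expansion or a display of the mapping.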

Page 4: In pursuit of interoperability: Can we standardize mapping types?

CrissCross Project

Context: improving access to vocabularies and heterogeneously indexed collections (in one natural language)
One-way mappings
From a subject headings scheme to a classification scheme
Many mappings from one keyword
“Degrees of determinacy” rather than distinct mapping types – D1, D2, D3, D4

Page 5: In pursuit of interoperability: Can we standardize mapping types?

RENARDUS Project

Context: search/browse across gateways using different classification schemes
One-way mappings, from DDC to local schemes
Five mapping types:
  fully equivalent
  broader or narrower equivalent
  major or minor overlap

Page 6: In pursuit of interoperability: Can we standardize mapping types?

GESIS/KoMoHe

Context: distributed search across systems using 25 different vocabularies (thesauri and classification schemes)
(Separate) mappings in both directions
Three basic mapping types:
  Equivalence
  Hierarchical
  Associative
Also there is an explicit “null relationship”
Any mapping can be one-to-one or one-to-many
Every mapping can have a “relevance rating” of high, medium or low.
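A minimal sketch of how one mapping record with these features might be held in code. All class names, field names and sample terms below are assumptions made for illustration; the slides do not describe KoMoHe's actual data format. Each record is directional, reflecting the separate mappings in each direction.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Optional, Tuple


class MappingType(Enum):
    EQUIVALENCE = "equivalence"
    HIERARCHICAL = "hierarchical"
    ASSOCIATIVE = "associative"
    NULL = "null"  # explicit statement that no suitable target term exists


class Relevance(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Mapping:
    source_term: str
    mapping_type: MappingType
    target_terms: Tuple[str, ...] = ()      # one target or many
    relevance: Optional[Relevance] = None   # the rating is optional


def targets_for(term: str, mappings: Iterable[Mapping],
                minimum: Relevance = Relevance.MEDIUM) -> List[str]:
    """Collect target terms for `term`, skipping null and low-rated mappings."""
    found: List[str] = []
    for m in mappings:
        if m.source_term != term or m.mapping_type is MappingType.NULL:
            continue
        if m.relevance is not None and m.relevance.value < minimum.value:
            continue  # rated below the threshold; unrated mappings pass
        found.extend(m.target_terms)
    return found


# Invented sample data, for illustration only.
SAMPLE = [
    Mapping("Journals", MappingType.EQUIVALENCE, ("Periodicals",), Relevance.HIGH),
    Mapping("Journals", MappingType.ASSOCIATIVE, ("Magazines", "Newspapers"), Relevance.LOW),
    Mapping("Lawns", MappingType.NULL),
]

print(targets_for("Journals", SAMPLE))
print(targets_for("Journals", SAMPLE, Relevance.LOW))
print(targets_for("Lawns", SAMPLE))
```

Treating an unrated mapping as acceptable is a design choice of this sketch; an application could equally reject unrated mappings.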

Page 7: In pursuit of interoperability: Can we standardize mapping types?

Doerr’s findings (see http://journals.tdl.org/jodi/article/view/31/32)

Context: query transformation is assumed to be the main application of mappings
All the vocabularies discussed are thesauri, applied to documents and/or museum collections
Basic types of mapping are:
  exact equivalence
  inexact equivalence
  broader equivalence
  narrower equivalence
Exact, broader and narrower equivalence can be simple or compound
Compound equivalence means a Boolean expression of target terms using AND, OR or NOT (but in practice no examples are given using NOT).
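To make the query-transformation idea concrete, here is a small sketch that treats a compound equivalence as a nested Boolean expression and evaluates it against a toy inverted index. The tuple encoding, the `evaluate` function, the index contents and the document numbers are all assumptions for illustration. Since the paper gives no NOT examples, the NOT case shown is invented.

```python
from typing import Dict, Set, Tuple, Union

# A target expression is either a term (str) or a tuple:
#   ("AND", left, right), ("OR", left, right) or ("NOT", operand)
Expr = Union[str, Tuple]


def evaluate(expr: Expr, index: Dict[str, Set[int]], universe: Set[int]) -> Set[int]:
    """Return the set of document ids matching a Boolean target expression."""
    if isinstance(expr, str):
        return set(index.get(expr, set()))
    op = expr[0]
    if op == "AND":
        return evaluate(expr[1], index, universe) & evaluate(expr[2], index, universe)
    if op == "OR":
        return evaluate(expr[1], index, universe) | evaluate(expr[2], index, universe)
    if op == "NOT":
        return universe - evaluate(expr[1], index, universe)
    raise ValueError(f"unknown operator: {op}")


# Toy target collection: term -> ids of documents indexed with it.
INDEX = {"rivers": {1, 2}, "canals": {2, 3}, "lakes": {4}}
UNIVERSE = {1, 2, 3, 4, 5}

# Source-vocabulary term -> expression over target-vocabulary terms.
TRANSFORM = {"inland waterways": ("OR", "rivers", "canals")}

print(evaluate(TRANSFORM["inland waterways"], INDEX, UNIVERSE))
print(evaluate(("AND", "rivers", "canals"), INDEX, UNIVERSE))
print(evaluate(("AND", "rivers", ("NOT", "canals")), INDEX, UNIVERSE))
```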

Page 8: In pursuit of interoperability: Can we standardize mapping types?

BS 8723-4

Provides for mapping search terms or index terms
Emphasis on thesauri, although other vocabulary types are taken into account
Basic mapping types: equivalence; hierarchical; associative
  Hierarchical subdivides into broader/narrower
  Equivalence subdivides into simple/compound
Degrees of equivalence (such as exact, inexact, partial) are discussed but not formalised as distinct types other than those described above.

Page 9: In pursuit of interoperability: Can we standardize mapping types?

SKOS (Simple Knowledge Organization System) data model

Context is sharing/linking KOSs via the Web
SKOS development began with thesauri, but has extended to classification schemes, subject heading schemes, etc.
Basic mapping “properties” (skos:mappingRelation):
  skos:closeMatch (symmetric)
  skos:exactMatch (symmetric, transitive)
  skos:relatedMatch (symmetric)
  skos:broadMatch (inverse of skos:narrowMatch)
  skos:narrowMatch (inverse of skos:broadMatch)
No provision for compound mappings
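The symmetric, transitive and inverse characteristics listed above determine which additional mappings follow from the ones actually asserted. The sketch below derives them over plain tuples; it uses no RDF library, and the `entail` function and the `ex:` identifiers are hypothetical.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)

SYMMETRIC = {"skos:closeMatch", "skos:exactMatch", "skos:relatedMatch"}
TRANSITIVE = {"skos:exactMatch"}
INVERSE = {
    "skos:broadMatch": "skos:narrowMatch",
    "skos:narrowMatch": "skos:broadMatch",
}


def entail(triples: Set[Triple]) -> Set[Triple]:
    """Add the mappings implied by symmetry, transitivity and inverses."""
    closed = set(triples)
    while True:
        new: Set[Triple] = set()
        for s, p, o in closed:
            if p in SYMMETRIC:
                new.add((o, p, s))
            if p in INVERSE:
                new.add((o, INVERSE[p], s))
            if p in TRANSITIVE:
                for s2, p2, o2 in closed:
                    # Self-matches are also implied, but are left out
                    # here as uninformative.
                    if p2 == p and s2 == o and s != o2:
                        new.add((s, p, o2))
        if new <= closed:  # nothing further to add
            return closed
        closed |= new


asserted = {
    ("ex:a1", "skos:exactMatch", "ex:b1"),
    ("ex:b1", "skos:exactMatch", "ex:c1"),
    ("ex:a2", "skos:broadMatch", "ex:b2"),
}
inferred = entail(asserted)
for triple in sorted(inferred - asserted):
    print(triple)
```

Transitivity is the characteristic to watch: a chain of exactMatch links across several vocabularies implies matches that nobody reviewed directly.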

Page 10: In pursuit of interoperability: Can we standardize mapping types?

ISO 25964-2 (still in draft)

A revision of ISO 2788 and ISO 5964 as well as BS 8723
Provides for mapping search terms or index terms
Emphasis on thesauri, although other vocabulary types are taken into account
Basic mapping types:
  Equivalence
  Hierarchical
  Associative

“Inexact” can apply to any mapping, but most commonly to equivalence

Page 11: In pursuit of interoperability: Can we standardize mapping types?

ISO 25964-2 (still in draft)

A revision of ISO 2788 and ISO 5964 as well as BS 8723
Provides for mapping search terms or index terms
Emphasis on thesauri, although other vocabulary types are taken into account
Basic mapping types:
  Equivalence: Laptop computers EQ Notebook computers
  Hierarchical: Roads NM Streets; Streets BM Roads
  Associative: Journals RM Magazines

“Inexact” can apply to any mapping, but most commonly to equivalence

Horticulture ~EQ Gardening

Page 12: In pursuit of interoperability: Can we standardize mapping types?

ISO 25964-2 mapping types

Basic mapping types:
  Equivalence
  Hierarchical
  Associative

“Inexact” can apply to any mapping, but most commonly to equivalence

Page 13: In pursuit of interoperability: Can we standardize mapping types?

ISO 25964-2 mapping types in more detail

Basic mapping types:
  Equivalence
    Simple
    Compound
      Intersecting compound equivalence
      Cumulative compound equivalence
  Hierarchical
    Broader
    Narrower
  Associative
“Inexact” can apply to any mapping, but most commonly to equivalence, including compound equivalence

Page 14: In pursuit of interoperability: Can we standardize mapping types?

ISO 25964-2 equivalence mappings in more detail

Simple
  Laptop computers EQ Notebook computers
Compound
  Intersecting compound equivalence
    Women executives EQ Women + Executives
  Cumulative compound equivalence
    Inland waterways EQ rivers | canals

Page 15: In pursuit of interoperability: Can we standardize mapping types?

Intersecting versus cumulative equivalence

Women executives EQ Women + Executives

Inland waterways EQ rivers | canals

[Diagrams: one labelled “executives”, “women” and “women executives”; one labelled “canals”, “inland waterways” and “rivers”]

Page 16: In pursuit of interoperability: Can we standardize mapping types?

Some key messages re compound equivalence

If you use mappings for conversion of index terms, you implement intersecting equivalents quite differently from cumulative equivalents.

With simple equivalence (exact or inexact) and with hierarchical or associative mappings, two-way conversions are usually OK; but compound equivalence typically works in one direction only.
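One way to picture the difference is the sketch below, which converts a record's index terms from a source vocabulary to a target vocabulary. The conversion policy is an assumption of this example, not a rule quoted from the standard: an intersecting equivalent assigns all of its target terms, while a cumulative equivalent is sent for human review because the record alone does not say which target applies. The mapping table and function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical mapping table: source term -> (kind, target terms)
MAPPINGS: Dict[str, Tuple[str, List[str]]] = {
    "Laptop computers": ("simple", ["Notebook computers"]),
    "Women executives": ("intersecting", ["Women", "Executives"]),
    "Inland waterways": ("cumulative", ["rivers", "canals"]),
}


def convert_index_terms(terms: Iterable[str]) -> Tuple[Set[str], List[str]]:
    """Return (converted target terms, source terms needing human review)."""
    converted: Set[str] = set()
    review: List[str] = []
    for term in terms:
        kind, targets = MAPPINGS.get(term, ("unmapped", []))
        if kind in ("simple", "intersecting"):
            converted.update(targets)  # assign every target term
        else:
            review.append(term)        # cumulative or unmapped: no safe choice
    return converted, review


def reverse_intersecting(terms: Set[str]) -> Set[str]:
    """Going backwards: propose a source term only if all its targets are present.

    Even then the result is a guess, since a record indexed with both
    "Women" and "Executives" need not be about women executives.
    """
    return {
        source
        for source, (kind, targets) in MAPPINGS.items()
        if kind == "intersecting" and set(targets) <= terms
    }


converted, review = convert_index_terms(
    ["Laptop computers", "Women executives", "Inland waterways"]
)
print(converted, review)
print(reverse_intersecting({"Women"}))
print(reverse_intersecting({"Women", "Executives"}))
```

The reverse function shows why compound equivalence tends to work in one direction only: a single target term is never enough to recover the compound source term.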

Page 17: In pursuit of interoperability: Can we standardize mapping types?

Inexact: another complication for equivalence mappings

Simple
  Laptop computers EQ Notebook computers
Compound
  Intersecting compound equivalence
    Women executives EQ Women + Executives
  Cumulative compound equivalence
    Inland waterways EQ rivers | canals
Inexact simple equivalence
  Lawns ~EQ Turf
Inexact compound equivalence
  Women executives ~EQ Females + Managers
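The notation in these examples (EQ, BM, NM, RM, the “~” prefix for inexact, “+” for intersecting and “|” for cumulative) is regular enough to read mechanically. The parser below is a sketch that covers only the forms shown on the slides; `ParsedMapping` and `parse_mapping` are hypothetical names, and the draft standard may allow forms this does not handle.

```python
import re
from typing import NamedTuple, Tuple


class ParsedMapping(NamedTuple):
    source: str
    relation: str              # "EQ", "BM", "NM" or "RM"
    inexact: bool              # True when the relation carries a "~" prefix
    combinator: str            # "+" intersecting, "|" cumulative, "" simple
    targets: Tuple[str, ...]


_PATTERN = re.compile(
    r"^(?P<source>.+?)\s+(?P<tilde>~?)(?P<rel>EQ|BM|NM|RM)\s+(?P<targets>.+)$"
)


def parse_mapping(text: str) -> ParsedMapping:
    """Parse one mapping statement such as 'Lawns ~EQ Turf'."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a mapping statement: {text!r}")
    target_text = match.group("targets")
    has_plus, has_bar = "+" in target_text, "|" in target_text
    if has_plus and has_bar:
        raise ValueError("mixed '+' and '|' is not handled in this sketch")
    combinator = "+" if has_plus else "|" if has_bar else ""
    if combinator:
        targets = tuple(t.strip() for t in target_text.split(combinator))
    else:
        targets = (target_text.strip(),)
    return ParsedMapping(
        source=match.group("source"),
        relation=match.group("rel"),
        inexact=match.group("tilde") == "~",
        combinator=combinator,
        targets=targets,
    )


for line in [
    "Laptop computers EQ Notebook computers",
    "Inland waterways EQ rivers | canals",
    "Women executives ~EQ Females + Managers",
]:
    print(parse_mapping(line))
```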

Page 18: In pursuit of interoperability: Can we standardize mapping types?

Major/minor overlap: yet another complication

Found useful in Renardus project
Is there a parallel with the KoMoHe “relevancy rating”?
Earlier versions of SKOS allowed “majorMatch” and “minorMatch”; these were subsequently deprecated
It would apply to inexact equivalence; maybe also to hierarchical and associative mappings?
How would you judge it in cases of compound equivalence?
A recent draft of ISO 25964 admits major/minor as an optional attribute of inexact equivalence, in the context of a particular application.

Page 19: In pursuit of interoperability: Can we standardize mapping types?

Now we come to the crunch: Can we standardize these mapping types?

We can certainly write them in a standards document, but can we make them stick? Will real users implement them according to the guidance rules in the standard?

Page 20: In pursuit of interoperability: Can we standardize mapping types?

To make a standard stick:

Keep it simple
Address a real need
Adopt rules that are already broadly accepted in the user community
Keep it within the implementation range of available software
Make the standard available easily and free – or at least at a low price
Commit to lifelong maintenance

Page 21: In pursuit of interoperability: Can we standardize mapping types?

Want a copy of ISO 25964-2?

A draft is due to appear in January 2011, “ISO DIS 25964-2”, with the hope of attracting comments from potential users

The official way to get it is through your national standards body (e.g. DIN)

Distribution policies vary from one country to another; last time round we found a way to make the draft available online free of charge and free of passwords, on the BSI site.

Send me an email and I’ll alert you when the DIS is released. [email protected]

Page 22: In pursuit of interoperability: Can we standardize mapping types?

References (abbreviated)

MACS: Landry, Patrice. Multilingual subject access: the linking approach of MACS. Cataloging & Classification Quarterly. 2004; 37(3/4):177-191

CrissCross: http://linux2.fbi.fh-koeln.de/crisscross/swd-ddc-mapping_en.html

RENARDUS: http://www.mpdl.mpg.de/staff/tkoch/publ/preifla-final.html

KoMoHe: http://www.gesis.org/en/research/programs-and-projects/knowledge-technologies/project-overview/komohe/

Doerr: http://journals.tdl.org/jodi/article/view/31/32

SKOS: http://www.w3.org/TR/skos-reference/

BS 8723-4:2007 Structured vocabularies for information retrieval - Guide - Interoperability between vocabularies. British Standards Institution

ISO 25964-2 (still in draft). Thesauri and interoperability with other vocabularies – Part 2: Interoperability with other vocabularies