M.L.Zeng @ ISSAI, Helsinki,2007 4


Transcript of M.L.Zeng @ ISSAI, Helsinki,2007 4

Introductory Review of Current Knowledge Organization Systems/Structures/Services (KOS)

Marcia Lei Zeng
Second International Seminar on Subject Access to Information, Helsinki, Finland, 29-30 November 2007

Purpose of this talk

• Introduce different types of knowledge organization systems/structures/services (KOS)

• Provide a common terminology and background

1. KOS overview (1)

Knowledge organization systems/structures/services (KOS) encompass all types of schemes for organizing information and promoting knowledge management.
– (Gail Hodge, 2000)

1. KOS overview (2)

These systems
• model the underlying semantic structure of a domain, and
• provide semantics, navigation, and translation through labels, definitions, typing, relationships, and properties for concepts.
– (Hill et al. 2002, Koch and Tudhope 2004).

A Taxonomy of KOS

[Figure: KOS types arranged along two axes, Structure and Function. Groups shown: Term Lists; Metadata-like Models; Classification & Categorization; Relationship Models. KOS types shown: Pick lists, Glossaries/Dictionaries, Synonym Rings, Authority Files, Directories, Gazetteers, Subject Headings, Categorization schemes, Taxonomies, Classification schemes, Thesauri, Semantic networks, Ontologies.]

2. Fundamentals of KOS Approaches

• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts

2.1 Eliminating ambiguity

• Ambiguity: terms having the same spelling (homographs) that represent different concepts or meanings

• Ambiguity exists when a given term can be used to represent completely different concepts.

Ambiguity / Homographs

Source: Z39.19-2005, p.25

To eliminate ambiguity (1)

1. Adding a qualifier to a term
-- one of the major methods used by almost every type of KOS, especially lists of subject headings and thesauri.

• e.g., Mercury (automobile)
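
A minimal sketch of how qualifiers keep homographs apart in a controlled vocabulary. Only Mercury (automobile) comes from the slide; the other entries and the helper name `candidates` are illustrative assumptions, not taken from any particular KOS.

```python
# Hypothetical vocabulary: each homograph carries a parenthetical qualifier.
VOCABULARY = [
    "Mercury (automobile)",
    "Mercury (metal)",
    "Mercury (planet)",
    "Mercury (Roman deity)",
]

def candidates(user_term, vocabulary=VOCABULARY):
    """Return every qualified term whose base form matches the user's term."""
    wanted = user_term.strip().lower()
    matches = []
    for entry in vocabulary:
        base = entry.split(" (", 1)[0]  # text before the qualifier
        if base.lower() == wanted:
            matches.append(entry)
    return matches

print(candidates("mercury"))
```

A system built this way can show the user all qualified forms and let them pick the intended concept, rather than guessing which Mercury was meant.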

To eliminate ambiguity (2)

2. Providing a scope note
-- another major method used by almost every type of KOS, especially lists of subject headings, classifications, and thesauri.

Screenshot from MeSH
http://www.nlm.nih.gov/mesh/MBrowser.html
Entry: mercury

To eliminate ambiguity (3)

3. Providing a context for a term

What are these?

• Flying Horse
• King Fisher
• Royal Challenge
• Heineken
• Budweiser
• Miller-Lite
• Bud-Light

Drinks
• Flying Horse
• King Fisher
• Royal Challenge
• Taj Mahal
• Hayward’s 2000
• Heineken
• Corona
• Budweiser
• Miller-Lite
• Bud-Light

Lists (Picklists)
A type of controlled vocabulary included in NISO Z39.19 Standard

• Lists are used to describe aspects of content objects or entities that have a limited number of possibilities.

• Examples include:
– geography (e.g., country, state, city),
– language (e.g., English, French, Swedish),
– format (e.g., text, image, sound), or
– … …

Lists can be used effectively for both browsing and searching.

• In browsing, items are directly accessed when the list of terms is reviewed and one term is selected

Source: http://www.ncbi.nlm.nih.gov/genome/guide/human/resources.shtml

• In searching, a list may be used to access content in a single term search, or the terms from the list may be used to limit a retrieved set by another attribute of interest for the user (one or more terms in the search).
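
As a rough illustration of limiting a retrieved set with a pick-list term, the sketch below filters records by a format chosen from a short list. The records, the `FORMATS` list, and the function name `limit_by_format` are all hypothetical.

```python
# Hypothetical pick list and records, for illustration only.
FORMATS = ["image", "sound", "text"]  # small alphabetical pick list

RECORDS = [
    {"title": "Harbour at dawn", "format": "image"},
    {"title": "Council minutes", "format": "text"},
    {"title": "Oral history interview", "format": "sound"},
    {"title": "Town map", "format": "image"},
]

def limit_by_format(records, chosen):
    """Keep only records whose format equals the chosen pick-list term."""
    if chosen not in FORMATS:
        raise ValueError(f"{chosen!r} is not in the pick list")
    return [r for r in records if r["format"] == chosen]

titles = [r["title"] for r in limit_by_format(RECORDS, "image")]
print(titles)
```

Rejecting any value outside the list is what makes the vocabulary controlled: the interface can only offer, and the search can only accept, the listed terms.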

Source: Google’s advanced search http://www.google.com

pick lists

Waterford County Image Archive
http://www.waterfordcountyimages.org

List - Definition, Purpose, and Uses
• A list (also called a pick list) is a limited set of terms arranged as a simple alphabetical list or in some other logically evident way.
– A list is a series of terms in some sequential order.
– Terms can be ordered alphabetically, chronologically, numerically, etc.

Exercise: Which list is better?

• The defining characteristics of a list are that the terms:
· are all members of the same set or class of items (e.g., countries, products)
· are not overlapping in meaning
· are equal in terms of specificity (granularity)

Typical applications

• Lists are frequently used to display small sets of terms that are to be used for quite narrowly defined purposes such as a web pull-down list or list of menu choices.


2.2 Controlling synonyms or equivalents
• Synonyms: terms with the same or similar meanings
1. True synonyms (unusual)
– mean exactly the same thing and are used in precisely the same context
2. Near synonyms (most common)

1. True Synonyms
• common and technical names
– salt vs. sodium chloride
• changes in usage of terms over time
– electronic calculating machines vs. computers
• in different languages
– eyeglasses, spectacles, glasses
• acronyms
– BBC, British Broadcasting Corporation; MPG, miles per gallon
• variant spellings:
– cancelled, canceled; honor, honour

2. Near Synonyms

• Same stem
– computing, computers, computed, microcomputers, supercomputers

• Overlapping concepts
– medicine, drugs
– fired, laid off
– forest, woods
– arid, dry

• General and specific terms
Coffee
– Double Espresso
– Latte
– Cappuccino
– Short Black
– Macchiato
– Flat White
– etc.

Synonymy

Source: Z39.19-2005, p.25

• Each distinct concept should refer to a unique linguistic form.

• Information or content that is provided to a user should not spread across the system under multiple access points, but should be gathered together in one place.

… …
150 World War, 1939-1945
450 European War, 1939-1945
450 Second World War, 1939-1945
450 World War 2, 1939-1945
450 World War II, 1939-1945
450 World War Two, 1939-1945

Source: FAST: Faceted Application of Subject Terminology
http://fast.oclc.org/

Controlling synonyms: there will only be one term used to represent a given concept or entity.

or:

World War, 1939-1945
UF European War, 1939-1945
UF Second World War, 1939-1945
UF World War 2, 1939-1945
UF World War II, 1939-1945
UF World War Two, 1939-1945

European War, 1939-1945
USE World War, 1939-1945

Second World War, 1939-1945
USE World War, 1939-1945

World War 2, 1939-1945
USE World War, 1939-1945

World War II, 1939-1945
USE World War, 1939-1945

World War Two, 1939-1945
USE World War, 1939-1945

Labels: Authority File (the 150/450 record), Thesaurus (the UF/USE entries)
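
The UF/USE pairs above can be read as a lookup from any entry term to the single preferred term. The following is a minimal sketch of that idea using the World War example; the names `use_index` and `resolve` are assumptions made for illustration, not part of FAST or any thesaurus software.

```python
# Preferred term and its non-preferred (UF) forms, from the example above.
PREFERRED = "World War, 1939-1945"
USED_FOR = [
    "European War, 1939-1945",
    "Second World War, 1939-1945",
    "World War 2, 1939-1945",
    "World War II, 1939-1945",
    "World War Two, 1939-1945",
]

# Build the USE index: every entry term points to the preferred term.
use_index = {variant: PREFERRED for variant in USED_FOR}
use_index[PREFERRED] = PREFERRED

def resolve(term):
    """Return the preferred term for an entry term, or None if unknown."""
    return use_index.get(term)

print(resolve("World War II, 1939-1945"))
```

Because every variant resolves to the same heading, content indexed under that heading is gathered in one place regardless of which form the user types.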

Source: Art and Architecture Thesaurus (AAT)

Source: Medical Subject Headings (MeSH)

Synonym Rings
A type of controlled vocabulary included in NISO Z39.19 Standard

[Figure: a ring linking astronaut, spaceman, cosmonaut, spationaut, taikonaut]

A synonym ring connects a set of words that are defined as equivalent for retrieval.

An example from International SEMATECH.

A search for Silicon would look like this:

Your search was submitted as “SILICON” or “SI”

Synonym Rings are used--
• to expand queries for content objects
– If a user enters any one of these terms as a query to the system, all items are retrieved that contain any of the terms in the cluster.
• in systems where the underlying content objects are left in their unstructured natural language format
– The control is achieved through the interface by drawing together similar terms into these clusters.
• in conjunction with search engines
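
Query expansion with a synonym ring can be sketched in a few lines. The ring below uses the astronaut terms shown earlier in the talk; the documents and the function names `expand` and `search` are hypothetical, and the word matching is deliberately naive.

```python
# One ring: the astronaut terms shown earlier in the talk.
RINGS = [
    {"astronaut", "spaceman", "cosmonaut", "spationaut", "taikonaut"},
]

# Hypothetical unstructured documents.
DOCS = {
    1: "The cosmonaut returned after six months in orbit.",
    2: "A taikonaut performed the first spacewalk of the mission.",
    3: "The telescope captured a distant galaxy.",
}

def expand(term):
    """Return the whole ring containing the term, or just the term itself."""
    term = term.lower()
    for ring in RINGS:
        if term in ring:
            return ring
    return {term}

def search(term):
    """Retrieve ids of documents containing any term in the expanded set."""
    terms = expand(term)
    hits = []
    for doc_id, text in DOCS.items():
        words = {w.strip(".,").lower() for w in text.split()}
        if words & terms:
            hits.append(doc_id)
    return hits

print(search("astronaut"))
```

Note that the documents themselves are untouched: all the control happens at query time, which is why rings suit collections left in natural language.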

Poverty mitigation

Poverty alleviation

Poverty elimination

Poverty reducation

Poverty eradication

Poverty abatement

Poverty prevention

Poverty reduction

Rings can include all kinds of synonyms - true, misspellings, predecessors, abbreviations
Source: Bedford, 2006 ppt.

Exercise

• Find synonyms of this type of object:


2.3 Making explicit semantic relationships – Hierarchical relationships

Birds
. Cardinals
. Doves
. Robins
. Wrens

All specific names of birds are kinds of birds.

Scientific Taxonomy
An example: Leatherback turtle

Phylum: Chordata
Class: Reptilia
Subclass: Anapsida
Order: Testudines
Suborder: Cryptodira
Family: Dermochelyidae
Genus: Dermochelys
Species: Dermochelys coriacea (Leatherback turtle)
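
One simple way to hold such a hierarchy is a mapping from each class to its immediate superordinate class. The sketch below uses the leatherback turtle ranks from the example; the names `BROADER`, `ancestors`, and `is_kind_of` are illustrative assumptions.

```python
# Each class points to its immediate superordinate (broader) class.
BROADER = {
    "Dermochelys coriacea": "Dermochelys",
    "Dermochelys": "Dermochelyidae",
    "Dermochelyidae": "Cryptodira",
    "Cryptodira": "Testudines",
    "Testudines": "Anapsida",
    "Anapsida": "Reptilia",
    "Reptilia": "Chordata",
}

def ancestors(term):
    """Walk upward from a term and list every broader class in order."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain

def is_kind_of(term, candidate):
    """True if candidate is somewhere above term in the hierarchy."""
    return candidate in ancestors(term)

print(ancestors("Dermochelys"))
```

With the relationship made explicit, a search on a broad class such as Reptilia could be widened to everything beneath it, which a flat list cannot support.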

Classifications

superordinate classes (e.g., parents)
. coordinate classes (e.g., siblings)
. . subordinate classes (e.g., children)
. . subordinate classes
. coordinate classes
. coordinate classes
. . subordinate classes

relationship types: generic, instance, and whole-part

2.3 Making explicit semantic relationships – Associative relationships (not hierarchical)

Relationship                      Example
Part / Whole                      Bicycle / Bicycle Wheel
Cause / Effect                    Accident / Injury
Process / Agent                   Velocity measurement / Speedometer
Action / Product                  Writing / Publication
Action / Patient                  Teaching / Student
Concept or Thing / Properties     Steel alloy / Corrosion resistance
Concept or Thing / Origins        Water / Well
Thing or Action / Counter-agent   Pest / Pesticide
Raw material / Product            Grapes / Wine
Action / Property                 Communication / Communication skills
Antonyms                          Single people / Married people


Source: Z39.19-2005, p.29

KOS in Use at World Bank

• Topic Thesaurus (500,000+ English terms, French and Spanish language versions in progress now)
• Topic Classification Scheme (30 top classes, 700+ subtopics, 300+ subsubtopics)
• Business Function Thesaurus (50,000 terms and growing)
• Business Function Classification Scheme (5 business areas, 30 lines of business, 300+ business processes)
• Country-Region classification scheme (6 regions, ca. 200 countries)
• Content Type Classification Scheme (8 content types, 300+ secondary content types – in refinement now)
• Media-Format Classification Scheme
• Country Name Authority Control (synonym, predecessor, successor sources)
• Edition Statements Authority Control
• Publisher Name Authority Control
• Organization Authority Control
• Language Authority Control
• Series Name/Collection Title Authority Control
• Translation Type Authority Control

Source: Bedford, 2007, ASIST

Vision of An Enterprise Advanced Search

[Figure: search interface with callouts for Pick lists, Hierarchical taxonomy, and Synonym Rings]

Source: Revised based on Bedford, 2006 ppt.

[Figure: callouts for Synonym Rings, Thesaurus, and Metadata]

Source: Revised based on Bedford, 2006 ppt.


2.4 Presenting relationships as well as properties of concepts

• Entity types
• Relationship types
• Properties

Semantic networks

organize sets of terms representing concepts, modeled as the nodes in a network of variable relationship types.
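
A semantic network can be approximated as a set of typed edges between concept nodes. The sketch below reuses concept pairs from the associative relationship examples earlier in the talk; the relationship labels (`causes`, `raw_material_of`, and so on) and the function name `related` are assumptions chosen for illustration, not drawn from UMLS or any standard.

```python
from collections import defaultdict

# (source concept, relationship type, target concept); labels are illustrative.
TRIPLES = [
    ("Accident", "causes", "Injury"),
    ("Grapes", "raw_material_of", "Wine"),
    ("Pesticide", "counter_agent_of", "Pest"),
    ("Bicycle Wheel", "part_of", "Bicycle"),
]

# Index outgoing edges by node so each concept lists its typed relationships.
edges = defaultdict(list)
for source, relation, target in TRIPLES:
    edges[source].append((relation, target))

def related(concept, relation=None):
    """Return targets linked from a concept, optionally for one relation type."""
    return [t for r, t in edges.get(concept, []) if relation in (None, r)]

print(related("Grapes"))
print(related("Accident", "causes"))
```

Unlike a thesaurus with a fixed handful of relationship codes, the set of edge types here is open, which is the "variable relationship types" property the definition describes.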


UMLS Semantic Network

135 Semantic Types (link) and 54 Semantic Relation Types (link)

Source: Noy, N. F. and Tu, S.W. (2003).

Ontologies

[Figure: callouts for Classes, attributes, and instances]

The Graph view of relations


A Taxonomy of KOS © 2007 Zeng

[Figure: KOS types plotted by structure (flat, two-dimensions, multiple dimensions) against major functions (eliminating ambiguity; controlling synonyms; establishing relationships: hierarchical; establishing relationships: associative; presenting properties). Groups shown: Term Lists; Metadata-like Models; Classification & Categorization; Relationship Models. KOS types shown: Pick lists, Glossaries/Dictionaries, Synonym Rings, Authority Files, Directories, Gazetteers, Subject Headings, Categorization schemes, Taxonomies, Classification schemes, Thesauri, Semantic networks, Ontologies.]

Networked KOS → NKOS

• KOS are not used in isolation;
• KOS may be used, re-used, and re-purposed in web-based services;
• KOS are used for:
– organizing, indexing, cataloging, and searching, AND
– learning, knowledge modeling, reasoning, etc.
• NKOS need to be machine-processable, machine-understandable
– (more to discuss later today)

References

• Hodge, Gail (2000). Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. Washington, DC: Council on Library and Information Resources.
http://www.clir.org/pubs/reports/pub91/contents.html
http://www.clir.org/pubs/reports/pub91/pub91.pdf

• Hill, Linda, Buchel, Olha, Janee, Greg, and Zeng, Marcia L. (2002). Integration of knowledge organization systems into digital library architectures. In: Mai, Jens-Erik, et al. ed.: Advances of classification research, volume 13, proceedings of the 13th ASIST SIG/CR Workshop, 17 November 2002, Philadelphia PA, pp. 62-68.

• Koch, Traugott and Tudhope, Douglas (2004). User-centred approaches to Networked Knowledge Organization Systems/Services (NKOS): Background.
http://www2.db.dk/nkos-workshop/#Background