The TAO of Topic Maps
An Introduction to Topic Maps, Ontologies, and Published Subjects
Steve Pepper, CEO, Ontopia
Convenor, ISO/IEC JTC 1/SC 34/WG 3
Editor, XML Topic Maps
<[email protected]>
http://www.ontopia.net/
© 2003 Ontopia AS


Convenor of ISO/IEC JTC 1/SC 34/WG 3 (Information Association)
Editor of XML Topic Maps 1.0 specification (XTM)
Editor of Topic Map Constraint Language
Founder and CEO of Ontopia
Ontopia
Norwegian company, headquartered in Oslo
An international standard, approved by the ISO
A form of knowledge representation that is optimized for information management
A formal data model with an XML interchange syntax
An indexing and navigation paradigm for humans
A source of intelligent data for software agents
A technology for exploiting ontologies
Introducing the Topic Map Model
The core concepts of Topic Maps are based on those of the back-of-book index
The same basic concepts have been extended and generalized for use with digital information
Envisage a 2-layer data model consisting of
a set of information resources (below), and
a “knowledge map” (above)
Like the division of a book into content and index
The information resources
can be in any format or notation
can be text, graphics, video, audio, etc.
This is like the content of the book to which the back-of-book index belongs
Topics represent the subjects that the information is about
Like the list of topics that forms a back-of-book index
Associations represent relationships between those subjects
Like “see also” relationships in a back-of-book index
[Figure: the knowledge layer, with topics linked by associations labelled “composed by” and “born in”]
The two layers are linked together
Occurrences are information resources that are pertinent to a given knowledge topic
The links (or locators) are like page numbers in a back-of-book index
Topics + Associations + Occurrences = The TAO of Topic Maps
Topics: a set of knowledge topics for the domain in question (e.g. Puccini, Tosca, Lucca, Madame Butterfly)
Associations: relationships between topics (e.g. “composed by”, “born in”)
Occurrences: information that is relevant in some way to a given knowledge topic
Information resources: a pool of information or data, of any type or format
Cavalleria Rusticana, 71, 203-204
singers, 39-52
Associations: e.g. “Puccini was born in Lucca”
Occurrences: e.g. “http://www.opera.net/puccini/bio.html is a biography of Puccini”
Each of these constructs can be typed
Topic types: “composer”, “city”, “opera”
Association types: “born in”, “composed by”
Occurrence types: “biography”, “street map”, “synopsis”
All such types are also topics (within the same topic map)
“Puccini” is a topic of type “composer” … and “composer” is also a topic
A topic map thus contains its own ontology
(“Ontology” is here defined as the classes of things that exist in the domain…)
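The typed constructs above can be sketched in a few lines of Python. This is only an illustration of the model, not the XTM data model or any Ontopia API: the class names Topic and Association and the helper ontology() are assumptions made for this sketch. The point it shows is that every type is itself a topic, so the ontology can be read straight out of the map.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A topic stands for one subject; its type is itself a topic."""
    name: str
    type: "Topic | None" = None
    # Occurrences are (occurrence type topic, resource locator) pairs.
    occurrences: list = field(default_factory=list)


@dataclass
class Association:
    """A typed relationship between topics."""
    type: Topic
    members: tuple


# Types are ordinary topics in the same topic map.
composer, city, opera = Topic("composer"), Topic("city"), Topic("opera")
born_in, composed_by = Topic("born in"), Topic("composed by")
biography = Topic("biography")

puccini = Topic("Puccini", type=composer)
lucca = Topic("Lucca", type=city)
tosca = Topic("Tosca", type=opera)
puccini.occurrences.append((biography, "http://www.opera.net/puccini/bio.html"))

associations = [
    Association(born_in, (puccini, lucca)),
    Association(composed_by, (tosca, puccini)),
]


def ontology(topics, associations):
    """Collect the names of all topics that are used as types."""
    names = {t.type.name for t in topics if t.type is not None}
    names |= {a.type.name for a in associations}
    names |= {occ_type.name for t in topics for occ_type, _ in t.occurrences}
    return names


print(sorted(ontology([puccini, lucca, tosca], associations)))
```

Because “composer” and “born in” are topics like any other, they could carry names, occurrences and associations of their own.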
Demo of the Omnigator
An Omnivorous Topic Map Navigator
The Omnigator will Eat Anything (provided it’s a topic map!)
Any Ontology: including your own
Just drop your own topic map into the Omnigator directory and away you go!
(as long as it is a “reasonably sensible” topic map)
Online demo: http://www.ontopia.net/omnigator
Make knowledge explicit, by
Expressing the relationships between the subjects that the information is about
Bridge the domains of knowledge and information, by
Describing where to find information about the subjects
Linking information about a common subject across multiple repositories
Transcend simple categories, hierarchies, and taxonomies, by
Applying rich associative structures that capture the complexity of knowledge
Enable implicit knowledge to be made explicit, by
Providing clearly identifiable hooks for attaching implicit knowledge
But there’s more (of course)…
Knowledge is not absolute; it has a contextual aspect
Context sensitivity is handled through the concept of scope
Scope makes it possible to
Cater for the subjectivity of knowledge
Express multiple viewpoints in one knowledge base
Provide personalized views for different groups of users
Track the source of knowledge during merging
(Scopes are defined as sets of topics)
How Scope Works
A topic has “characteristics”: its names and occurrences, and the roles it plays in associations with other topics
Every characteristic is valid within some context (scope), e.g.
the name “Ruotsi” for the topic Sweden in the scope “Finnish”
a certain information occurrence in the scope “technician”
a characteristic asserted in the scope (according to) “Authority X”
Reality is ambiguous and knowledge has a subjective dimension
Scope allows the expression of multiple perspectives in a single Topic Map
Contextual knowledge
Some knowledge is only valid in a certain context, and not valid otherwise
Scope enables the expression of contextual validity
Traceable knowledge aggregation
When the source of knowledge is as important as the knowledge itself:
Scope allows retention of knowledge about the source of knowledge
Personalized knowledge
Scope permits personalization based on personal preferences, skill levels, security clearance, etc.
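Scope-based filtering can be sketched as follows. The matching rule used here (a characteristic is shown when its scope is empty or every topic in its scope is part of the user's context) is an assumption made for this sketch; applications are free to interpret scope differently. Scopes are sets of topics, represented here as plain strings for brevity. The name “Sverige” is added for illustration and is not in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    value: str
    # A scope is a set of topics; an empty scope means "always valid".
    scope: frozenset = frozenset()


def names_in_context(names, context):
    """Return the names whose scope is covered by the given context."""
    return [n.value for n in names if n.scope <= context]


sweden_names = [
    Name("Sweden"),
    Name("Ruotsi", frozenset({"Finnish"})),
    Name("Sverige", frozenset({"Swedish"})),  # illustrative addition
]

print(names_in_context(sweden_names, frozenset({"Finnish"})))
print(names_in_context(sweden_names, frozenset()))
```

The same filter could be applied to occurrences and association roles, which is how one topic map can serve a technician and a casual user with different views.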
Intuitive navigational interfaces for humans
The topic/association layer mirrors the way people think
Powerful semantic queries for applications
A formal underlying data structure
Demo of querying in the Omnigator
Customized views based on individual requirements
Personalization based on scope
Information aggregation “sans frontières”
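To give a feel for what a semantic query over the association layer looks like, here is a minimal sketch in plain Python. It is not the Omnigator's query language; the triple representation and the function name are assumptions for this illustration, and the Mascagni entries are sample data added here.

```python
# Associations as (association type, first player, second player) triples.
associations = [
    ("born in", "Puccini", "Lucca"),
    ("composed by", "Tosca", "Puccini"),
    ("composed by", "Madame Butterfly", "Puccini"),
    ("born in", "Mascagni", "Livorno"),                   # illustrative addition
    ("composed by", "Cavalleria Rusticana", "Mascagni"),  # illustrative addition
]


def operas_by_composers_born_in(city, associations):
    """Which operas were composed by someone born in the given city?"""
    composers = {who for kind, who, where in associations
                 if kind == "born in" and where == city}
    return sorted(work for kind, work, who in associations
                  if kind == "composed by" and who in composers)


print(operas_by_composers_born_in("Lucca", associations))
```

The query joins two association types, which is only possible because the relationships are explicit and typed rather than buried in the content.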
In Topic Maps, every topic represents some subject
The collocation objective requires exactly one topic per subject
When two topic maps are merged, topics that represent the same subject should be merged to a single topic
When two topics are merged, the resulting topic has the union of the characteristics of the two original topics
[Figure: a topic with its names and occurrences, and a second topic (in another topic map) “about” the same subject. Merge the two topics together... and the resulting topic has the union of the original characteristics]
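The merging rule can be sketched as a set union over each kind of characteristic. The Topic class, the merge() function and the example.org locator are hypothetical names chosen for this sketch; deciding that two topics represent the same subject is assumed to have happened already.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    names: set = field(default_factory=set)
    occurrences: set = field(default_factory=set)
    roles: set = field(default_factory=set)  # (association type, other topic)


def merge(a, b):
    """Return a new topic carrying the union of both topics' characteristics."""
    return Topic(
        names=a.names | b.names,
        occurrences=a.occurrences | b.occurrences,
        roles=a.roles | b.roles,
    )


# Two topics about Puccini, taken from two different topic maps.
first = Topic(
    names={"Puccini"},
    occurrences={"http://www.opera.net/puccini/bio.html"},
    roles={("born in", "Lucca")},
)
second = Topic(
    names={"Puccini", "Giacomo Puccini"},
    occurrences={"http://example.org/puccini/works.html"},  # hypothetical locator
    roles={("composed by", "Tosca")},
)

merged = merge(first, second)
print(sorted(merged.names))
```

Using sets means that characteristics present in both topics appear only once in the result.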
Information integration
Information that spans multiple repositories can be merged to provide a unified view of the whole
Knowledge sharing across the organization
Knowledge captured in one part of an organization can be made available to the whole organization
Distributed knowledge management
There is no need to centralize knowledge management in order to make it sharable
Knowledge sharing between organizations
Information and knowledge can be shared without enforcing a common vocabulary
Topic Maps are designed for ease of merging!
Multiple Topic Maps can be created from many different repositories of information ... and then merged to provide a unified view of the whole
Typical Applications:
Integration of hitherto disconnected “islands” of information within an enterprise
Federation of knowledge from multiple sources
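Names are not unambiguous: the same name can denote different subjects (the homonym problem), and the same subject can have different names (the synonym problem)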
Reliable knowledge aggregation is only possible through the use of unique global identifiers
The issue of identification of subjects is crucial
If subjects have unique identifiers, people can be free to use whatever names they like – and machines can still aggregate information
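A small sketch of why identifiers matter for aggregation: two sources describe the same country under different names. Grouping by name keeps the facts apart, while grouping by a shared identifier brings them together. The record layout, the aggregate() helper and the example.org identifier are assumptions made for this illustration.

```python
from collections import defaultdict

# Hypothetical identifier, used only for this sketch.
SWEDEN = "http://example.org/psi/country#se"

records = [
    {"source": "A", "name": "Sweden", "id": SWEDEN, "fact": "capital: Stockholm"},
    {"source": "B", "name": "Ruotsi", "id": SWEDEN, "fact": "currency: krona"},
]


def aggregate(records, key):
    """Group the recorded facts by the chosen key ("name" or "id")."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["fact"])
    return dict(groups)


by_name = aggregate(records, "name")  # two separate groups
by_id = aggregate(records, "id")      # one group holding both facts
print(len(by_name), len(by_id))
```

Each source keeps its preferred name; only the identifier has to be agreed upon.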
The Crucial Concept of Subject Identification
Topics exist in order to allow us to discourse about subjects
It is crucially important to be able to establish exactly which subject a topic represents, i.e. to establish its subject identity
Without the ability to know when applications are talking about the same thing, there can be no interoperability
The most prevalent method of establishing identity in today’s networked environments is to use URIs
[Figure: the computer domain and “reality”]
URIs are the addresses of resources
They work fine when the subject is a resource (e.g. a document)
It exists somewhere within the computer system, has a location, and can therefore be “addressed”
For example, this presentation might be located at http://www.ontopia.net/tutorials/tm-intro.ppt
The address of an addressable subject is sufficient to unambiguously establish the subject’s identity
This is called the subject address
But most subjects are not information resources
Puccini, Lucca, Tosca, Madame Butterfly, love, darkness, French, …
These all exist outside the computer domain and cannot be addressed directly
The identity of non-addressable subjects is established indirectly
Through an information resource (like a definition or a picture) that provides some kind of indication of the subject’s identity to a human
Such a resource is called a subject indicator
A topic may have multiple subject indicators
Because it is a resource, a subject indicator has an address, even though the subject that it is indicating does not
Computers can use the address of the subject indicator to establish identity
Such addresses are called subject identifiers
Subject indicators and subject identifiers are the two sides of the human-computer dichotomy
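The difference between a subject address and a subject identifier can be sketched as a simple identity test. The Topic class, the same_subject() rule and the example.org URI are assumptions for this sketch, not a normative merging algorithm. Note that the same URI means different things in the two roles: as a subject address the topic is about the resource itself, as a subject identifier the topic is about whatever the resource indicates.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    # Set when the subject IS an addressable resource.
    subject_address: "str | None" = None
    # Addresses of subject indicators for a non-addressable subject.
    subject_identifiers: set = field(default_factory=set)


def same_subject(a, b):
    """Two topics match on an equal subject address or a shared identifier."""
    if a.subject_address is not None and a.subject_address == b.subject_address:
        return True
    return bool(a.subject_identifiers & b.subject_identifiers)


PUCCINI_PSI = "http://example.org/psi/puccini"  # hypothetical subject indicator

composer_a = Topic("Puccini", subject_identifiers={PUCCINI_PSI})
composer_b = Topic("Giacomo Puccini", subject_identifiers={PUCCINI_PSI})
# A topic about the indicator page itself, not about the composer.
page = Topic("Page describing Puccini", subject_address=PUCCINI_PSI)

print(same_subject(composer_a, composer_b))
print(same_subject(composer_a, page))
```

The second comparison is deliberately false: a page about Puccini is not Puccini.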
[Figure: in the computer domain, a subject indicator for the subject Puccini: “Giacomo Puccini, Italian composer, b. Lucca 22nd Dec 1858, d. Brussels, 29th Nov 1924. Best known for his operas, of which Tosca is the ...”]
Published Subjects
A subject indicator that has been made available for use outside one particular application is called a published subject indicator (PSI)
Anyone can publish PSI sets
Adoption of PSI sets will be an evolutionary process that will lead to greater and greater interoperability – between topic map applications, between topic maps and RDF, and across the Semantic Web in general
Publishers and users of ontologies may be among the greatest beneficiaries
OASIS technical committees
geolang: http://www.oasis-open.org/committees/geolang/
Based on existing standards (e.g. ISO 639, ISO 3166)
xmlvoc: http://www.oasis-open.org/committees/xmlvoc/
A PSI set for an ontology of XML and related standards
Make your ontologies publicly available!
Define URIs as unique identifiers for the concepts in your ontologies, including the relationship types
e.g. http://psi.fao.org/disease/#bse for Bovine Spongiform Encephalopathy (BSE)
Make sure they resolve to human-readable resources
Guarantee their stability
And enable interoperability and reuse across applications:
Topic Maps, RDF, DAML+OIL, OWL, KIF, XML, etc.
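As an illustration of how a published identifier is used in practice, the sketch below builds an XTM 1.0 topic element that points at the BSE identifier quoted above. It uses only Python's standard xml.etree.ElementTree module; the helper name topic_with_psi and the topic id "bse" are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("", XTM)
ET.register_namespace("xlink", XLINK)


def topic_with_psi(topic_id, name, psi):
    """Build an XTM 1.0 <topic> whose identity is given by a subject indicator."""
    topic = ET.Element(f"{{{XTM}}}topic", {"id": topic_id})
    identity = ET.SubElement(topic, f"{{{XTM}}}subjectIdentity")
    ET.SubElement(identity, f"{{{XTM}}}subjectIndicatorRef",
                  {f"{{{XLINK}}}href": psi})
    base_name = ET.SubElement(topic, f"{{{XTM}}}baseName")
    ET.SubElement(base_name, f"{{{XTM}}}baseNameString").text = name
    return topic


bse = topic_with_psi(
    "bse",
    "Bovine Spongiform Encephalopathy",
    "http://psi.fao.org/disease/#bse",
)
xml_text = ET.tostring(bse, encoding="unicode")
print(xml_text)
```

Any other topic map that references the same URI in its subjectIdentity can be merged with this topic, whatever name it uses locally.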
A formal data structure suitable for data processing
Support for rich semantic queries
High degree of built-in semantics simplifies application development
Published Subjects enable widespread and spontaneous knowledge interchange
International standard interchange syntax
Topic Maps for Human Agents
A way of representing knowledge that corresponds to how humans think about the world
Organized around subjects not resources
Direct support for context sensitivity
A level of built-in semantics that makes the model easy to understand
Distinguishes between names, occurrences and associations
Privileges the class-instance relationship
Typed associations provide a rich and intuitive navigational interface
The Topic Map structure governs the application – and the knowledge
Users navigate intuitively from topic to topic
Having found the appropriate topic, they
immediately see all recorded explicit knowledge
can dip down into information resources to “extract” implicit knowledge
Publisher benefits:
Easier content maintenance (simply update the Topic Map)
Easier link maintenance (links are in separate layer, not in content)
New portals easy to derive from same content
User benefits:
Shorter click-through
Far greater structural consistency means less confusion
Demo of the OperaMap portal: http://www.ontopia.net/operamap
Ontopia web site: http://www.ontopia.net/
We are interested in participating in EU projects
Please contact me for more details