Post on 19-Dec-2015

The World as Database

Barry Smith

University at Buffalo

Institute for Formal Ontology and Medical Information Science,

University of Leipzig

http://ontologist.com

The riddle of representation

two humans, a monkey, and a robot are looking at a piece of cheese; what is common to the representational processes in their visual systems?

Answer:

The cheese, of course

The Technological Background

How the world became part of the World Wide Web

the cheese

Sources

“Motion in Databases: Issues and Possible Solutions”

Ouri Wolfson (University of Illinois)

“Intersection of GI and IT Spatial Databases”

Max J. Egenhofer (University of Maine)

Information Technologies

Global Positioning Systems (GPS)

Digital cameras

Information Technologies

Digital video cameras

Information Technologies

• chemical

• biological

Information Technologies

Microsensors

Location based services

Examples:

Where is the closest gas station? How do I get there?

Track my pet/child/prisoner

Location based services

Wall Street Journal May 8, 2000: Location-based services a killer application for the wireless internet

Strategy Analytics: consumer lbs a $7B market in North America by 2005

Why now? – Proliferation of portable/wearable/wireless devices

Moving Objects Database Technology

Query example: How often is bus #5 late by more than 10 minutes at station 20?

[Figure: three GPS units and a wireless link]
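A minimal sketch of how the bus query might be answered, assuming the database keeps a log of scheduled and actual arrival times. The `Arrival` record, the `count_late` function and the sample log are hypothetical illustrations, not part of any system named in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrival:
    """One logged arrival (hypothetical record layout)."""
    bus: int
    station: int
    scheduled_min: int  # scheduled arrival, minutes after midnight
    actual_min: int     # observed arrival, minutes after midnight


def count_late(arrivals, bus, station, threshold_min=10):
    """Count arrivals of `bus` at `station` late by more than `threshold_min`."""
    return sum(
        1
        for a in arrivals
        if a.bus == bus
        and a.station == station
        and a.actual_min - a.scheduled_min > threshold_min
    )


# Made-up sample data for illustration only.
log = [
    Arrival(5, 20, 480, 495),  # 15 minutes late
    Arrival(5, 20, 540, 545),  # 5 minutes late
    Arrival(5, 20, 600, 611),  # 11 minutes late
    Arrival(7, 20, 480, 500),  # a different bus
]
late = count_late(log, bus=5, station=20)
print(late)
```

This is a query over past states of the moving object, so it needs only stored history and no prediction.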

Moving Objects Database Technology

Trigger example: Send a message when a helicopter is in a given geographic area (trigger)

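The trigger could be sketched as follows, assuming the geographic area is a simple latitude/longitude bounding box and that each position report arrives as an `update` call. The names `Box` and `AreaTrigger` and the coordinates are hypothetical; a real system would likely use polygons and a spatial index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in degrees (an assumed area model)."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat, lon):
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


class AreaTrigger:
    """Calls `on_enter(obj_id)` once each time an object enters the area."""

    def __init__(self, area, on_enter):
        self.area = area
        self.on_enter = on_enter
        self.inside = set()  # ids of objects currently inside the area

    def update(self, obj_id, lat, lon):
        now_inside = self.area.contains(lat, lon)
        if now_inside and obj_id not in self.inside:
            self.inside.add(obj_id)
            self.on_enter(obj_id)  # fire only on the transition into the area
        elif not now_inside:
            self.inside.discard(obj_id)


messages = []
trigger = AreaTrigger(Box(42.8, 43.0, -79.0, -78.7), messages.append)
trigger.update("heli-1", 42.5, -78.9)    # outside: nothing happens
trigger.update("heli-1", 42.9, -78.8)    # enters: message sent
trigger.update("heli-1", 42.95, -78.75)  # still inside: no repeat
print(messages)
```

Tracking which objects are already inside keeps the trigger from firing on every position report.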

Moving Objects Database Technology

Query example: List trucks that will reach their destination within 20 minutes (future query)

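A future query has to extrapolate from the last known state. The sketch below assumes each truck reports its remaining route distance and current speed, and that the speed stays constant; the function name, tuple layout and fleet data are hypothetical.

```python
def trucks_arriving_within(trucks, minutes):
    """Return names of trucks whose estimated arrival is within `minutes`.

    Each truck is a (name, remaining_km, speed_kmh) tuple. The estimate
    assumes constant speed, which is a simplification.
    """
    result = []
    for name, remaining_km, speed_kmh in trucks:
        if speed_kmh <= 0:
            continue  # a stopped truck has no finite estimate
        eta_min = remaining_km / speed_kmh * 60
        if eta_min <= minutes:
            result.append(name)
    return result


# Made-up sample data for illustration only.
fleet = [
    ("T1", 10.0, 60.0),  # about 10 minutes away
    ("T2", 50.0, 80.0),  # about 37.5 minutes away
    ("T3", 5.0, 0.0),    # stopped
    ("T4", 24.0, 90.0),  # about 16 minutes away
]
soon = trucks_arriving_within(fleet, minutes=20)
print(soon)
```

The answer is only as good as the motion model, which is why future queries are treated separately from queries over stored positions.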

Moving Objects Database Technology

Present query:

List taxi cabs within 1 mile of my location

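The present query can be sketched as a distance filter over the latest reported positions. This example assumes positions are latitude/longitude pairs in degrees and uses the standard haversine great-circle formula with a mean Earth radius of about 3958.8 miles; the function names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def taxis_within(taxis, my_lat, my_lon, radius_miles=1.0):
    """Return ids of taxis whose last position is within `radius_miles`."""
    return [
        taxi_id
        for taxi_id, lat, lon in taxis
        if haversine_miles(my_lat, my_lon, lat, lon) <= radius_miles
    ]


# Made-up positions for illustration only.
taxis = [
    ("A", 42.8900, -78.8784),  # a few blocks north
    ("B", 42.9500, -78.8784),  # several miles north
    ("C", 42.8864, -78.8700),  # a few blocks east
]
nearby = taxis_within(taxis, 42.8864, -78.8784)
print(nearby)
```

A linear scan is enough for a sketch; a moving objects database would normally answer this with a spatial index over continuously updated positions.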

PalmPilot context aware

Automatically display the resume of a person I am speaking with

Display the wiring/plumbing behind this wall

Display seismographic charts, maps, graphics, images, concerning a terrain a geologist is viewing

European Media Lab, Heidelberg

Tourism information services

Intelligent, speaking camera plus map display

Display all non-smoking restaurants within walking distance of the castle

Read out a history of the building my camera is pointing to

Mobile e-commerce

Inform a person located at L who needs items of a given sort where he can get them (a) most quickly (b) most cheaply (c) at 2am.

Inform a person walking past a bar of his buddies in the bar

Further Applications

Digital battlefield

Emergency response

Air traffic control

Supply chain management

Mobile workforce management

Dynamic allocation of bandwidth in cellular network

Syntax and Semantics

Traditional Syntactic/Semantic Approach to Information Systems

011011101010001000100010010010010010010001001111001001011011110110111011

String-Arrays vs. Objects

ghjui123

xxxxx xxxxx

Fodor’s Methodological Solipsism

011011101010001000100010010010010010010001001111001001011011110110111011

Humans, Machines, and the Structure of Knowledge

Harry M. Collins

SEHR, 4: 2 (1995)

Knowledge-down-a-wire

Imagine a 5-stone weakling having his brain loaded with the knowledge of a champion tennis player. He goes to serve in his first match -- Wham! -- his arm falls off.

He just doesn't have the bone structure or muscular development to serve that hard.

Sometimes it is the world which knows

I know where the book is

= I know how to find it

I know what the square root of 2489 is

= I know how to calculate it

I know how to recognize the presence of a tiger

= Smell, noise … (in real-world context)

A. Clark, Being There

humans can accomplish much without building detailed, internal models; we rely on

Epistemic action = manipulating Scrabble tiles – using the re-arranged pieces as basis for brain's pattern-completing abilities

writing one large number above another to multiply them with pen on paper

and on

External scaffolding = maps, models, tools, language, culture

we act so as to simplify cognitive tasks by "leaning on" the structures in our environment.

Not all calculations done inside the head

Gibson: the world is not all chaos

the information outside of the head (the environment) is structured in a way that the brain can process

Types of knowledge/ability/skill

1. those that can be transferred simply by passing signals from one brain/computer to another.

2. those that can’t: here the "hardware" is important

(a) abilities/skills contained in the body

(b) abilities/skills contained in the world

From

The Methodological Solipsist Approach to Information Processing

To

The Ecological Approach to Information Processing

… J. J. Gibson

Functioning of Information System intelligible only as part of environment

0110 1110 1010 0010 0010 0010 0100 1001 0010 0100 0100 1111 0010 0101 1011 1101 1011 1011

Ontology

… a branch of philosophy

the science of what is

the science of the kinds and structures of objects, properties, events, processes and relations in reality

Ontology is in many respects comparable to the theories produced by science

… but it is radically more general than these

It can be regarded as a kind of generalized chemistry or zoology

(Aristotle’s ontology grew out of biological classification)

(Russell: Logic is a zoology of facts)

Aristotle

First ontologist

First ontology

(from Porphyry’s Commentary on Aristotle’s Categories)

Linnaean Ontology

Sources for ontological theorizing:

thought experiments

the study of ancient texts

development of formal theories

the results of natural science

now also: working with computers

The existence of computers

and of large databases

allows us to express old philosophical problems in a new light

The problem of the unity of science

The logical positivist solution to this problem addressed a world in which sciences are identified with printed texts

What if sciences are identified with Information Systems?

Each information system

has its own idiosyncratic terms and concepts by means of which it represents the information it receives

How to resolve the incompatibilities which result when information systems (sciences) need to be merged?

The Information System Tower of Babel Problem

Opportunities

Sensor-based information systems

Massively parallel data acquisition

location per second of each person

SIG-INT and HUM-INT

Result: The World Wide Web

Vast amount of heterogeneous data sources

Needs dramatically better support for richly structured ontologies in databases

Ability to query and integrate across different ontologies (Semantic Web)

The term ‘ontology’

came to be used by information scientists to describe the construction of standardized taxonomies designed to make information systems mutually compatible and thus to make data transportable from one information environment to another

An ‘ontology’

is a dictionary of terms formulated in a canonical syntax and with commonly accepted definitions and axioms designed to yield a shared framework for use by different information systems communities

An ontology

is a concise and unambiguous description of the principal, relevant entities of an application domain and of their potential relations to each other

SO FAR

SO GOOD

But how was this idea in fact realized?

How did information systems engineers proceed to build ontologies?

By looking at the world, surely

Well, No

They built ontologies by looking at what people think about the world

(methodological solipsism …)

Quine

For Quineans

Ontology studies, not reality,

but scientific theories

From ontology

… to ontological commitment

Quine:

each natural science has its own preferred repertoire of types of objects to the existence of which it is committed

Quineanism:

ontology is the study of the ontological commitments or presuppositions embodied in the different natural sciences

Quine:

only natural sciences can be taken ontologically seriously

The way to do ontology is exclusively through the investigation of scientific theories

Thus it is reasonable to identify ontology

– the search for answers to the question: what exists? –

with the study of the ontological commitments of natural scientists

All natural sciences are compatible with each other

PROBLEM

The Quinean view of ontology becomes strikingly less defensible

when the ontological commitments of various non-scientists are allowed into the mix

How, ontologically, are we to treat the commitments of

astrologists,

clairvoyants,

believers in voodoo?

How, ontologically, are we to treat the commitments of

patients who believe that their illness is caused by evil spirits or magic spells?

Growth of Quinean ontology outside philosophy:

Psychologists and cognitive anthropologists have sought to elicit the ontological commitments (‘ontologies’, in the plural) of different cultures and groups.

This is not ontology

Not all the things that people believe in are genuine objects of ontological investigation

Only what exists is a genuine object of ontological investigation

Why, then,

do information systems ontologists study people’s beliefs, thoughts, concepts (STRING-ARRAYS)

rather than the objects themselves?

Arguments for Ontology as Conceptual Modeling

Ontology is hard.

Life is short.

Let’s do conceptual modeling instead

programming real ontology into computers is hard

therefore:

we will simplify ontology

and not care about reality at all

Painting the Emperor’s Palace is

h a r d

therefore

we will not try to paint the Palace at all

... we will be satisfied instead with a grainy snapshot of some other building

Ontological engineers

neglect the standard of truth to reality

in favor of other, putatively more practical, standards:

above all programmability

They turn to substitutes:

to models, to conceptualizations, to STRING-ARRAYS

because these are easier to handle

For an information system ontology

there is no reality other than the one created through the system itself, so that the system is, by definition, correct

Only those objects exist which are represented in the system

(constructivism)

Tom Gruber (1995):

‘For AI systems what “exists” is

what can be represented’

Ontological engineering

concerns itself with conceptualizations

It does not care whether these are true of some independently existing reality.

In the world of information systems

there are many surrogate world models

and thus many ontologies

… and all ontologies, both good and bad, are equal

ATTEMPTS TO SOLVE THE TOWER OF BABEL PROBLEM VIA ONTOLOGIES AS “CONCEPTUAL MODELS” HAVE FAILED

Can we do better?

Test Domain:

Medical Terminology

IFOMIS

Institute for Formal Ontology and Medical Information Science

University of Leipzig

Example 1: UMLS

Unified Medical Language System

Taxonomy system maintained by the National Library of Medicine in Bethesda, Maryland

134 semantic types

800,000 concepts

10 million inter-concept relationships

Example 2: SNOMED

Systematized Nomenclature of Medicine

Taxonomy system maintained by the College of American Pathologists

121,000 concepts

340,000 relationships

SNOMED

designed to foster interoperability

to serve as a “common reference point for comparison and aggregation of data throughout the entire healthcare process”

Problems with UMLS and SNOMED

Each is a fusion of several source vocabularies

They were fused without an ontological system being established first

They contain circularities, taxonomic gaps, unnatural ad hoc determinations

… several billion dollars still being wasted in the making of retrospective fixes

Blood

Representation of Blood in UMLS

Blood

Tissue

Entity

Physical Object

Anatomical Structure

Fully Formed Anatomical Structure

An aggregation of similarly specialized cells and the associated intercellular substance.

Tissues are relatively non-localized in comparison to body parts, organs or organ components

Body Substance

Body Fluid

Soft Tissue

Blood as tissue

Representation of Blood in SNOMED

Blood

Liquid Substance

Substance categorized by physical state

Body fluid

Body Substance

Substance

Blood as fluid
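The disagreement between the two representations can be made concrete with a small sketch. The parent links below are an assumption read off the labels on the two slides (the slides do not spell out the exact edges), and the dictionary and function names are hypothetical.

```python
# Assumed is-a links for "Blood", reconstructed from the slide labels.
UMLS_PARENTS = {
    "Blood": ["Tissue"],
    "Tissue": ["Fully Formed Anatomical Structure"],
    "Fully Formed Anatomical Structure": ["Anatomical Structure"],
    "Anatomical Structure": ["Physical Object"],
    "Physical Object": ["Entity"],
}

SNOMED_PARENTS = {
    "Blood": ["Liquid Substance", "Body fluid"],
    "Liquid Substance": ["Substance categorized by physical state"],
    "Substance categorized by physical state": ["Substance"],
    "Body fluid": ["Body Substance"],
    "Body Substance": ["Substance"],
}


def ancestors(parents, term):
    """Collect every term reachable from `term` by following is-a links."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


umls = ancestors(UMLS_PARENTS, "Blood")
snomed = ancestors(SNOMED_PARENTS, "Blood")
shared = umls & snomed
print(sorted(shared))
```

Under these assumed links the two systems place blood under no common named category, so matching labels alone gives a merge procedure nothing to align.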

So what is the ontology of blood?

We cannot solve this problem just by looking at concepts

concept systems may be simply incommensurable

the problem can only be solved

by taking the world itself into account

“golem”

objects are in the world

not all concepts correspond to objects

not all concepts are relevant to ontology

concepts are in the head

problem of ‘merging’ ontologies

“golem”

“phantasy”

Another Example: Statements of Accounts

Company financial statements may be prepared under either the (US) GAAP or the (European) IASC standards.

Under the two standards, cost items are often allocated to different revenue and expenditure categories depending on the tax laws and accounting rules of the countries involved.

Ontology’s job

is to develop an algorithm for the automatic conversion of income statements and balance sheets between the two systems.

Not even this relatively simple problem has been satisfactorily resolved

… why not?

because the two concept systems are simply incommensurable

the problem can only be solved

by taking the world itself into account

How to solve the Tower of Babel Problem?

How to fuse the two mutually incompatible ‘conceptual models’ of revenue ?

By drawing on the results of philosophical work in ontology carried out over the last 2000 years

This implies a view of ontology

not as a theory of concepts

but as a theory of reality

But how is this possible?

How can we get beyond our concepts?

answer: ontology must be maximally opportunistic

it must relate not to beliefs, concepts, syntactic strings but to the world itself

Maximally opportunistic

means:

look at concepts and beliefs critically

and always in the context of a wider view which includes independent ways to access the objects themselves

at different levels of granularity

Ontology must be maximally opportunistic

This means:

don’t just look at beliefs

look at the objects themselves

from every possible direction,

formal and informal

scientific and non-scientific …

Maximally opportunistic

means: look at the same objects at different levels of granularity:

Second step: select out the good conceptualizations

these have a reasonable chance of being integrated together into a single ontological system

• based on tested principles

• robust

• conform to natural science

Ontology

like cartography

must work with maps at different scales

Medical ontologies

at different levels of granularity:

cell ontology

drug ontology *

protein ontology

gene ontology *

anatomical ontology *

epidemiological ontology

Medical ontologies

disease ontology

therapy ontology

pathology ontology *

and also

physician’s ontology

patient’s ontology

There are many compatible map-like partitions

many maps at different scales,

all transparent to the reality beyond

the mistake arises when one supposes

that only one of these partitions is a true map of what exists

Partitions should be cuts through reality

a good medical ontology should NOT be compatible with the conceptualization of disease as:

caused by evil spirits and demons and cured by golems

The End