OntoClean Methodology As presented at AOS Workshop by Aldo Gangemi CNR-IP, Ontology and Conceptual Modelling Group

Credits

CNR - Ontology and Conceptual Modelling Groups

Nicola Guarino, Claudio Masolo, Alessandro Oltramari
{Nicola.Guarino,claudio.masolo,alessandro.oltramari}@ladseb.pd.cnr.it

Aldo Gangemi, Domenico Pisanelli, Geri Steve
{gangemi,pisanelli,steve}@itbm.rm.cnr.it

Vassar College

Chris Welty

Publications are retrievable on-line at the following URLs:

http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html

http://saussure.irmkant.rm.cnr.it/onto/index.html

Some Applications so far

• ON9 ontology library

• UMLS Metathesaurus semantic mining

• Medical terminologies integration

• Integration of Clinical Guidelines Standards

• Ontological upgrading of Wordnet

• Ontological Web Agents

• Product ontologies

• OntoClean Top-Level

• Legal ontologies and norm dynamics

• Portal directories maintenance and subject ontology

• Content standards for the semantic web

Some Projects so far

• GALEN (EU AIM Project, academic/industrial project for a medical terminology server)

• SOLMC (Ontological and Conceptual Modelling Tools, CNR Special Project)

• Arianna Catalog (industrial pilot project to build and maintain portal directories)

• IKF, Intelligent Knowledge Fusion (Eureka Project E!2235, academic/industrial project for information integration)

• IIDEAS, Integration of Industrial Data for Exchange, Access, and Sharing

• IEEE Standard Upper Ontology Study Group

• OntoWeb: Ontology-based information exchange for knowledge management and electronic commerce (EU Network of Excellence)

• OntoWeb SIG on Content Standards

• TICCA (Italian Project, Cognitive technologies for artificial agents)

OntoClean Antecedents

• Guarino and Welty’s theoretical tools for the ontological refinement of taxonomies

• ONIONS techniques for domain ontology development and large-scale terminology integration

OntoClean Components

• Formal Criteria

• Top-Level Ontology

• Ontology of Universals

• Domain-Level Development Guidelines

• Applications

OntoClean Components

• Formal Criteria (brief explanation)

• Top-Level Ontology

• Ontology of Universals

• Domain-Level Development Guidelines

• Applications

Individuals and Concepts

• The term "meta-property" adopted here is based on a fundamental distinction within the domain of discourse:

– individuals or particulars vs.
– concepts or universals

• Meta-level properties induce distinctions among concepts, while object-level properties induce distinctions among individuals

Rigidity

• A property is essential to an individual iff it necessarily holds for that individual

• A property is rigid (+R) iff, necessarily, it is essential to all its instances. A property is non-rigid (-R) iff it is not essential to some of its instances, and anti-rigid (~R) iff it is not essential to any of its instances

• Example: Person (rigid) vs. Student (anti-rigid)
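
The three rigidity definitions can also be written in quantified modal logic. This is only a sketch: the symbols are not on the slide, and it assumes that $\phi$ is the property, $\Box$ and $\Diamond$ are necessity and possibility, and "essential to x" is read as "necessarily holds for x". Non-rigidity is taken here as the plain negation of rigidity.

```latex
\begin{align*}
+R &:\quad \Box\,\forall x\,\bigl(\phi(x) \rightarrow \Box\,\phi(x)\bigr)\\
-R &:\quad \Diamond\,\exists x\,\bigl(\phi(x) \wedge \neg\Box\,\phi(x)\bigr)\\
{\sim}R &:\quad \Box\,\forall x\,\bigl(\phi(x) \rightarrow \neg\Box\,\phi(x)\bigr)
\end{align*}
```

Under this reading every person is necessarily a person, while every student could cease to be a student.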

Identity

• A property carries an identity criterion (+I) iff all its instances can be (re)identified by means of a suitable sameness relation. A property supplies an identity criterion iff it carries one that is not inherited from any subsuming property

• Person vs. Student

Dependence

• An individual x is constantly dependent on y iff, at any time, x can't be present unless y is fully present, and y is not part of x. Ex: Hole/Host

• A property P is constantly dependent (+D) iff, for all its instances, there exists something they are constantly dependent on.

• Here Dependent = Constantly Dependent

Types vs. Roles

• A rigid property that supplies an identity criterion and is not (notionally) dependent is called a type.

• An anti-rigid property that is notionally dependent is called a role. It is a material role if it carries (but does not supply) an identity criterion, and a formal role otherwise.

• Person vs. Student vs. Part
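
The type/role distinction above can be read as a small decision procedure over meta-property assignments. The following is a minimal sketch, not the authors' tooling: the class `MetaProperties`, the function `classify`, and the assignments given to Person, Student and Part are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaProperties:
    rigidity: str            # "+R", "-R" or "~R"
    carries_identity: bool   # +I
    supplies_identity: bool  # identity criterion not inherited from a subsumer
    dependent: bool          # +D


def classify(mp: MetaProperties) -> str:
    """Return 'type', 'material role', 'formal role' or 'other'."""
    if mp.supplies_identity and not mp.carries_identity:
        raise ValueError("supplying an identity criterion implies carrying it")
    # Type: rigid, supplies identity, not dependent.
    if mp.rigidity == "+R" and mp.supplies_identity and not mp.dependent:
        return "type"
    # Role: anti-rigid and dependent; material if it only carries identity.
    if mp.rigidity == "~R" and mp.dependent:
        carries_only = mp.carries_identity and not mp.supplies_identity
        return "material role" if carries_only else "formal role"
    return "other"


# Assumed (illustrative) assignments for the slide's three examples.
PERSON = MetaProperties("+R", True, True, False)
STUDENT = MetaProperties("~R", True, False, True)
PART = MetaProperties("~R", False, False, True)

for name, mp in [("Person", PERSON), ("Student", STUDENT), ("Part", PART)]:
    print(name, "->", classify(mp))
```

Combinations that match neither definition are reported as "other" rather than forced into one of the two kinds.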

Extensionality

• An individual is said to be extensional iff, necessarily, everything that has the same proper parts is identical to it

• A property is extensional (+E) iff, necessarily, all its instances are extensional

• A property is anti-extensional (~E) iff, necessarily, all its instances are non-extensional, so that they can possibly change some parts while keeping their identity

Concreteness

• An individual is concrete iff it has a physical location. A property whose instances are necessarily concrete will be marked with the meta-property +C

• Note that an individual can be concrete without being necessarily real, or actual: Peter Pan is not real but is concrete

• This meta-property is a bit less formal (in the ontological sense) than the previous ones, since it makes an ontological commitment towards the existence of physical (spatial, temporal or spatio-temporal) locations. We see physical locations as primitive qualities that individuals can have

Unity

• An individual is unified by a (suitably constrained) relation R iff it is a mereological sum of entities that are bound together by R. Ex.: the relation "having the same boss" may unify a group of employees in a company

• An individual w is a whole under R iff it is maximally unified by R, in the sense that R is internal to w, and no part of w is linked by R to something that is not part of w

• A property P is said to carry unity (+U) if there is a common unifying relation R such that all the instances of P are essential wholes under R. A property carries anti-unity (~U) if all its instances can possibly be non-wholes. If every instance of P is an essential whole, but there is no unifying relation common to all instances of P, then we mark P with the property *U

Singularity and Plurality

• An individual is a singular whole iff its unifying relation is the transitive closure of the relation "strong connection", like that existing between two 3D regions that have a surface in common. Topological wholes of this kind have a special cognitive relevance, which accounts for the natural language distinction between singular and plural

• A plural individual is a sum of singular wholes that is not itself a singular whole. Plural individuals may be wholes themselves or not. In the former case they will be called collections; in the latter case pluralities

• A piece of coal is a singular whole. A lump of coal is a topological whole, but not a singular whole, since the pieces of coal merely touch each other, with no material connection. It is therefore a plural whole

Applying Formal Properties

• If a property holds necessarily for all the instances of a certain concept, of course its negation cannot hold necessarily for all the instances of a subsumed concept.

• Then, if F is a certain formal property, anti-F cannot subsume F: anti-rigidity cannot subsume rigidity, anti-unity cannot subsume unity, and anti-extensionality cannot subsume extensionality.

• After labeling every concept in a taxonomy with its formal properties, we can easily check its ontological consistency
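
The consistency check described above can be sketched as a walk over a labeled taxonomy. This is a minimal illustration under stated assumptions: the dictionary encoding, the function name `find_violations`, and the toy taxonomy are hypothetical, and only the three constraints named on the slide are checked.

```python
from typing import Dict, List, Set, Tuple

# (anti-F, F) pairs: a concept labeled anti-F must not subsume one labeled F.
INCOMPATIBLE = [("~R", "+R"), ("~U", "+U"), ("~E", "+E")]


def ancestors(concept: str, parents: Dict[str, List[str]]) -> Set[str]:
    """All direct and indirect subsumers of a concept."""
    seen: Set[str] = set()
    stack = list(parents.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def find_violations(
    labels: Dict[str, Set[str]], parents: Dict[str, List[str]]
) -> List[Tuple[str, str, str]]:
    """Return (subsumer, subsumed, reason) for every violated constraint."""
    violations = []
    for child in sorted(parents):
        for anc in sorted(ancestors(child, parents)):
            for anti, pos in INCOMPATIBLE:
                if anti in labels.get(anc, set()) and pos in labels.get(child, set()):
                    violations.append((anc, child, f"{anti} cannot subsume {pos}"))
    return violations


labels = {"Entity": {"+R"}, "Person": {"+R"}, "Student": {"~R"}}
wrong = {"Student": ["Entity"], "Person": ["Student"]}   # Person under Student
right = {"Person": ["Entity"], "Student": ["Person"]}    # Student under Person

print(find_violations(labels, wrong))
print(find_violations(labels, right))
```

Placing Person under Student is flagged, while placing Student under Person passes, since a rigid property may subsume an anti-rigid one.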

OntoClean Components

• Formal Criteria

• Top-Level Ontology

• Ontology of Universals

• Domain-Level Development Guidelines

• Applications

The OntoClean Top-Level Ontology: an Overview

Basic Design Guidelines for the OntoClean Top-Level

• Introduction of ontological categories lying behind Natural Language and Human Commonsense

• Use of formal properties (as general and neutral as possible) to characterize the ontological categories
– Rigidity, Identity, Dependence, Unity, Extensionality, Singularity, Concreteness (see next slide for references)

• Refinement of top-distinctions by further analysis (taking into account philosophy, cognitive sciences, linguistics,…)

IMPORTANT: All top-concepts are considered to be rigid, as they are assumed to reflect essential properties of their instances

Brand New Essential Bibliography

• Guarino and Welty [2001]
– Supporting Ontological Analysis of Taxonomic Relationships (“Data and Knowledge Engineering”, in press)
– Identity and Subsumption (in R. Green, C. Bean and S. Myaeng [eds.], The Semantics of Relationships: an Interdisciplinary Perspective. Kluwer, in press)

• Gangemi, Guarino, Masolo, Oltramari [2001]
– Understanding Top-Level Ontological Distinctions (Proceedings of IJCAI 2001 workshop on Ontologies and Information Sharing)

• Gangemi, Guarino, Oltramari [2001]
– Conceptual Analysis of Lexical Taxonomies: The Case of WordNet Top-Level (Proceedings of FOIS 2001)

(1) The Top-Level: Unique Beginners, Direct Hyponyms, Some Synsets from WordNet 1.6

• Aggregate (~D, ~U)
– Amount of matter (+E): body substance, mixture#1, mass#5
– Arbitrary collections

• Object (~D, *U)
– Extensional Body (+E): universe#1, elementary particle
– Ordinary Object (~E): artifact, land#4, (unitary) collection#1

• Event (+D, +E): phenomenon, act#2, state#4

• Feature (+D, *U, -E)
– Relevant part: edge#3, skin#1, paring#2
– Dependent Region: opening#10, excavation#3

(2) The Top-Level: Unique Beginners, Direct Hyponyms, Some Relevant Synsets from WordNet 1.6

• Abstraction (~C)
– Abstract entity
   • Proposition: conclusion#5, lemma#1
   • Set: union#7, singleton#2
   • ...
– Quality space
   • Color space
   • Shape space
   • …

• Quality (+D,+E,+U)
– Color = chromatic color
– Shape = shape#2
– ...

(1) Aggregate vs. Object

• What distinguishes an object from an aggregate is that the former is an essential whole, namely it has a unity criterion, while the latter is not. For example, John can “make up” a snowman (object) starting from the scattered snow (amount of matter) covering his courtyard, adding a hat, a carrot, two pieces of deadwood, etc. In general, amounts of matter are denoted by mass nouns (you can’t say a snow, a water, ...), while objects are denoted by count nouns (such as a snowman, five glasses of water, and so on).

(2) Aggregate vs. Object

• Arbitrary collections are mere sums of wholes which are not themselves essential wholes (such as the collection of goods in a bazaar). In this sense, they are kinds of aggregate. On the other hand, there are collections which are themselves essential wholes, such as a library. In our top-level these unitary collections are to be conceived as a specialization of the object category.

(3) Aggregate vs. Object

• When an object changes some of its parts, it may or may not keep its identity. If it does, we call it an Ordinary Object (~E); if it does not, an Extensional Object (+E). My car will continue to be the same even if I replace one of its wheels. On the contrary, if I consider the universe and remove a single elementary particle, I no longer have the universe, but a different entity.

• Regarding aggregates, we can say that amounts of matter are clearly +E, while arbitrary collections can be considered as pseudo-extensional (changes in the parts of a member of a collection may be allowed).

Event

• Events occur in time. They are assumed to be dependent (+D) on those objects (~D) that are their participants.
– The penalty kick by Roberto Baggio (main participant)

• Participants are not parts of events. Parts of events can be:
– temporal (the first movement of a symphony)
– spatial (the strings playing within a symphony)

• Parts of events are always essential, which means that events are extensional (+E).

Our taxonomy of events needs to be improved and populated. A comparison with EuroWordNet and SIMPLE may be useful in this sense.

Feature

• Features are “parasitic” (+D) entities, which exist insofar as their host exists. Features may be relevant parts of their host, like a bump in a road, or dependent regions, such as a hole in a piece of cheese, the underneath of a table, or the shadow of a tree (which are not parts of their hosts). All features are essential wholes, but no common unity criterion may exist for all of them (*U). Some features can change parts while keeping their identity, while others cannot: for this reason, we use -E as the common formal property (+E and ~E are both subsumed by -E).

Abstraction

• Abstractions are entities that are not concrete, that is, they do not have a physical location (~C). Quality spaces are the first examples of abstractions: time, geometric space, length, color, are all conceptual spaces, with different topological structure. Terms like red, long, sweet, old, recent etc. correspond to regions in a quality space. We can therefore describe the structure of a quality space with a first-order theory, using topological notions: for instance, we can say that “red is adjacent to brown”. Other examples of Abstraction are propositions, sets, symbols, etc.

Quality (1)

• Qualities are always “qualities of something”: in this sense, they are dependent (+D). Qualities are individual, i.e. each inheres in a unique entity (the color of this rose is red). We call quality-type every homogeneous group of individual qualities, such as color, shape, volume, etc. In the OntoClean top-level, qualities are structured in strict relationship with quality spaces: every quality-type “corresponds” to a quality space in the branch of Abstractions. A region in a quality space corresponds to an individual quality of an entity in our conceptualization of the world.

Quality (2)

• Following our approach, the red of the rose in the figure on the right is represented as located in a certain region of the color-quality-space. In the same way, the spherical shape of the ball below is represented as located in a certain region of the shape-quality-space. In principle, time and space could be treated as qualities too. We are currently studying the ontological commitments and the formal properties concerning these options.

Appendix: An Alternative View of the OntoClean Top-Level (1)

• Since agreement on the meaning of general categories is not always easy, in this short presentation we preferred to first make clear the most relevant top-level concepts, leaving aside the various ways they can be presented in a hierarchy. For example, we could have introduced the OntoClean top-level by considering the general distinction between concrete and abstract entities as the complete partition of “what there is” in the world. Then, within concrete entities we could have distinguished independent from dependent entities. The overall taxonomy would have been like this:

Appendix: An Alternative View of the OntoClean Top-Level (2)

• Quality (+E,+U)
– Color
– Shape
– …

• Abstraction (~C)
– Abstract entity
   • Proposition
   • Set
   • ...
– Quality space
   • Color space
   • Shape space
   • …

• Concrete (+C)
– Independent (~D)
   • Aggregate (~U)
      – Amount of matter (+E)
      – Arbitrary collections
   • Object (*U)
      – Extensional Body (+E)
      – Ordinary Object (~E)
– Dependent (+D)
   • Event (+E)
   • Feature (*U, -E)
      – Relevant part
      – Dependent Region

OntoClean Components

• Formal Criteria

• Top-Level Ontology

• Ontology of Universals (list of main kinds)

• Domain-Level Development Guidelines

• Applications

Which part are you talking about?

• If my liver is part of my digestive system, and that system is part of me, is my liver part of me?

• If my liver is a part of me and I am part of the CNR, is my liver part of the CNR?

• My liver is a component of my digestive system, while I am a member of CNR. No rule for composing component and member relations

• Moreover, I am a body, but I am also a person. A living person depends on a body. Nevertheless, a living person can be a member of CNR, but a body cannot
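
The liver example can be sketched in code by tagging each parthood fact with its relation kind and chaining facts only when the kinds agree. This is a hypothetical illustration: the triple encoding, the relation names `component-of` and `member-of`, and the assumption that each kind is transitive on its own are not taken from the slides.

```python
from typing import Set, Tuple

Fact = Tuple[str, str, str]  # (part, relation kind, whole)


def closure_within_kind(facts: Set[Fact]) -> Set[Fact]:
    """Chain facts transitively, but only across the same relation kind."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for a, kind1, b in list(derived):
            for c, kind2, d in list(derived):
                new_fact = (a, kind1, d)
                if b == c and kind1 == kind2 and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


FACTS: Set[Fact] = {
    ("liver", "component-of", "digestive system"),
    ("digestive system", "component-of", "me"),
    ("me", "member-of", "CNR"),
}

closed = closure_within_kind(FACTS)
print(("liver", "component-of", "me") in closed)
print(any(part == "liver" and whole == "CNR" for part, _, whole in closed))
```

The liver ends up as a component of "me", but nothing relates the liver to the CNR, because no rule composes component-of with member-of.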

Ontology of Universals - Main Relation Kinds -

• Intracategorial
– Mereological (entity, entity)
– Topological (entity, entity)

• Intercategorial
– Localization (region, entity)
– Participation (event, entity)
– Representation (*sign, entity)

• Entrenched axiomatic characterisation

OntoClean Components

• Formal Criteria

• Top-Level Ontology

• Ontology of Universals

• Domain-Level Development Guidelines (hints)

• Applications

Kinds of Terminological Ontology Sources

Catalog of normalized terms, e.g. a list of terms used in the reports from a laboratory: no taxonomy, no axioms, and no glosses

Glossed catalog, e.g. a dictionary: a catalog with glosses.

Thesaurus, e.g. many parts of the UMLS Metathesaurus, GEMET: a hierarchical collection of terms; the hierarchical link is usually polysemous

Taxonomy, e.g. the ICD10: a collection of classes with a partial order induced by inclusion (classification)

Axiomatized taxonomy, e.g. the GALEN Core Model: a taxonomy with axioms

Ontology library, e.g. the Ontolingua repository: a set of axiomatized taxonomies with relations among them. Each element of the library is a module, which can be included in another one. Alternatively, a single concept from a module can be used in another one. Ontology modules can be considered subdivisions of the namespace of a model

Impairments in Traditional Terminologies

• Lack of hierarchies

• Ambiguous hierarchies

• Informality

• Lack of modularity or cyclic taxonomical dependencies between modules

• Polysemy of various sorts

• Uncertain semantics

• Ontological opaqueness

• Lack of a (minimal) set of axioms

• 'Remainder' partitions

• 'Exception' partitions

• Terminological cycles

• Meta-level soup (individuals mixed up with universals or even higher order concepts)

• Low maintenance capabilities

Ontologies: some desiderata

• An explicit taxonomy with subsumption among concepts

• Semantic explicitness of relations

• Rigorous modularity of namespace

• A stratified design of the modules

• Absence of polysemy within a module

• Disjointness of rigid concepts within a module and within the top-level

• A proper interface between the ontology namespace and one or more sets of lexical realizations

• Linguistically meaningful naming policy (cognitive transparency)

• Rich documentation

• Some (minimal) axiomatization to detail the difference among sibling concepts

• Explicit linkage to concepts and relations from generic theories

• Meta-level assignments to distinguish among the formal primitives assigned to concepts

• Languages and implementations that support the previous needs as well as the possibility of collaborative modeling

ONtologic Integration Of Naïve Sources (ONIONS)

[Diagram; recoverable labels: context, general theories, concept, defining elements]

Lexical and Linguistic Analysis

• Morpho-semantic analysis (extraction of semantically meaningful units)

• Functional informational structures (extraction of head/modifier syntagmatic structures)

• Conceptual polysemy treatment (templates for systematic ambiguity resolution)

Conceptual Issues in Ontology Integration

• Ontology integration is – generally speaking – the construction of an ontology C that formally specifies the union of the vocabularies of two other ontologies A and B

• To be sure that A and B can be integrated at some level, C has to commit to both A's and B's conceptualizations. In other words, the intension of the concepts in A and B should be mapped to the intension of C's concepts

• Unfortunately, this cannot be realized using only the conceptual relations specified in A and B for local tasks (for a specific context). The methodological principle adopted here is that generic ontologies reused from the philosophical, linguistic, mathematical, and AI literature must ground the comparison of different intensions. Our approach may be called principled conceptual integration

Ontology Library Architecture

Domain Ontologies (e.g., banana, organic lettuce, rose)

Middle Ontologies (e.g., plant, crop, fishery)

Upper Ontologies (e.g., object, event, part, precedes, shape)

Representation Ontologies (e.g., concept, slot, instance, role, function)

Aspects of integration

• Three aspects of an ontology are taken into account:

• the intended models of the conceptualizations of its vocabulary

• the domain of interest of such models, i.e. the 'topic' of the ontology

• the namespace of the ontology

• The most interesting case is when A and B are supposed to commit to the conceptualization of the same domain of interest or of two overlapping domains. In particular, A and B may be:

The main steps (I)

– 0. Semantically opaque hierarchies and lists are pre-processed in order to create ‘clean’ taxonomies

– 1. All concepts, relations, templates, rules, and axioms from a source ontology are represented in the ONIONS formalisms, currently Loom, Ontolingua, and OKBC

– 2. When available, plain text descriptions are analyzed and axiomatized (text formalization)

– 3. The union of such products is integrated by means of a set of generic ontologies. This is the most characteristic activity in ONIONS, which can be briefly described as follows:

The main steps (II)

– 3.1. For any set of sibling concepts in a taxonomy, the conceptual difference between each of them is inferred, and such difference is formalized by axioms that reuse the relations and concepts already in the library. If no concept is available to represent the difference, new concepts are added to the library

– 3.2. For any set of polysemous senses of a term, different concepts are stated and placed within the library according to their topic and to the available modules. (Polysemy occurs when two concepts with overlapping or disjoint intended models have the same name.)

– 3.3. Often, polysemous senses of a term, as well as different 'alternative' concepts, are metonymically related. For example: process/outcome (as in inflammation), region/object (as in body region), etc. Alternatives must be properly defined by making explicit the relationship between them: e.g. "has-product" for inflammation, "location" for body-region

– 3.4. When stating new concepts, the relations necessary to maintain consistency with the existing concepts are instantiated. If conflicts arise with existing theories, a more general, more comprehensive theory is sought. If this is impracticable, an alternative theory is created

The main steps (III)

– 3.5. Relevant integration cases. Since ONIONS requires the use of generic theories to axiomatize alternative theories, the integration of a concept C from an ontology O is performed by comparing C with the concepts D1, …, Dn already present in the evolving ontology library L, whose ontology set M1, …, Mn contains at least a significant subset of generic ontologies and the set of domain ontologies at that state in the evolution of L. The following cases appear relevant to the methodology:

– 3.5.1. C's name is polysemous in O (internal polysemy). Iterate 3.2 to 3.4

– 3.5.2. C's name is a homonym of the name of a Di. (Homonymy occurs when both the intended models and the domains of two concepts with the same name are disjoint.) Homonyms must be differentiated by modifying the name, or by preventing the homonyms from being included in the same module namespace

– 3.5.3. C's name is a synonym of the name of a Di. (Synonymy is the converse of homonymy and occurs when two concepts with different names have both the same intended model and the same domain.) Synonyms must be preserved, or included in the set of lexical realizations related to the concept

– 3.5.4. C is subsumed by some Di in L, but it has no total mapping on any Dj in L. The gap in L must be filled by adding C as a subconcept of Di

The main steps (IV)

– 3.5.5. C is an intersection between two concepts Di and Dj in L. Solved by distinguishing types and roles, or different defining elements

– 3.5.6. C has an alternative concept Di in L (same domain, but overlapping or disjoint intended models):

• 3.5.6.1. If C metonymically depends on Di, C is properly related to Di

• 3.5.6.2. If C and Di are different viewpoints on the same domain of interest, both concepts are kept; if necessary, they are included in separate modules

• 3.5.6.3. If the intended model of C is finer than Di's, Di is substituted with C

• 3.5.6.4. If the intended model of C is coarser than Di's, C is ignored (but track of it is kept for mapping between sources)
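
The name-related cases (polysemy in 3.2, homonymy in 3.5.2, synonymy in 3.5.3, alternatives in 3.5.6) can be told apart mechanically once intended models and domains are given. The sketch below is illustrative only: it approximates an intended model and a domain by finite sets of labels, which is an assumption of this example, and the `Concept` class, the `compare` function and all sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Concept:
    name: str
    model: FrozenSet[str]   # finite stand-in for the intended model
    domain: FrozenSet[str]  # finite stand-in for the domain of interest


def compare(c: Concept, d: Concept) -> str:
    """Name the integration case that relates concept c to concept d."""
    if c.name == d.name:
        if c.model.isdisjoint(d.model) and c.domain.isdisjoint(d.domain):
            return "homonymy"      # step 3.5.2
        if c.model != d.model:
            return "polysemy"      # steps 3.2 and 3.5.1
        return "same concept"
    if c.model == d.model and c.domain == d.domain:
        return "synonymy"          # step 3.5.3
    if c.domain == d.domain:
        return "alternative"       # step 3.5.6
    return "unrelated"


def concept(name: str, model: set, domain: set) -> Concept:
    return Concept(name, frozenset(model), frozenset(domain))


process = concept("inflammation", {"inflaming"}, {"pathology"})
outcome = concept("inflammation", {"inflamed tissue"}, {"pathology"})
horse = concept("horse", {"h1", "h2"}, {"animals"})
equus = concept("Equus caballus", {"h1", "h2"}, {"animals"})
river_bank = concept("bank", {"riverside"}, {"geography"})
money_bank = concept("bank", {"credit institution"}, {"finance"})

print(compare(process, outcome), compare(horse, equus), compare(river_bank, money_bank))
```

Homonymy is tested before polysemy because, on the definitions given in the steps, it is the special case in which the domains are disjoint as well.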

The main steps (V)

– 4. The library of generic, intermediate, and domain ontologies should be stratified, that is, domain modules should include intermediate modules, which in turn should include generic modules, so that each set of modules can be plugged into or unplugged from its more general set without affecting the coherence of the entire library

– 5. The source ontologies are explicitly mapped to the integrated ontology, in order to allow interoperability. The only admitted mappings are equivalent and coarser equivalent. Formally: for any source ontology SO and an ontology IO that is supposed to result (also) from the integration of SO, for any concept $C_i$ in SO, there is a $D_i$ in IO such that $C_i^{I} = D_i^{I}$ (equivalence of possible interpretations), or there is a disjunctive concept (or $D_i$ $D_j$) in IO such that $C_i^{I} = D_i^{I} \cup D_j^{I}$ (equivalence of possible interpretations to a disjunction of concepts, i.e. to a union of finer concepts)

– 5.1. Partial mappings must have been already resolved through the methodology: if any remain, some step in the integration procedure must be iterated
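
The two admitted mappings in step 5 can be checked mechanically if interpretations are approximated by finite sets. The following is a sketch under that assumption only: the function `find_mapping`, the concept names and the sample extensions are hypothetical, and a real ontology would need a reasoner rather than set comparison.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Optional, Tuple

Interp = FrozenSet[str]  # finite stand-in for a concept's interpretation


def find_mapping(source: Interp, integrated: Dict[str, Interp]) -> Optional[Tuple[str, ...]]:
    """Return the integrated concept(s) a source concept maps to, or None.

    A 1-tuple is an equivalent mapping; a 2-tuple is a coarser equivalent
    mapping (the source equals the union of two finer concepts).
    """
    names = sorted(integrated)
    for name in names:
        if integrated[name] == source:
            return (name,)
    for first, second in combinations(names, 2):
        if integrated[first] | integrated[second] == source:
            return (first, second)
    return None  # only a partial mapping exists: iterate the integration


IO = {
    "Bacterial infection": frozenset({"b1", "b2"}),
    "Viral infection": frozenset({"v1"}),
    "Infection site": frozenset({"s1"}),
}

print(find_mapping(frozenset({"s1"}), IO))
print(find_mapping(frozenset({"b1", "b2", "v1"}), IO))
print(find_mapping(frozenset({"b1"}), IO))
```

A `None` result corresponds to the partial mappings of step 5.1, which the methodology expects to have been resolved earlier.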

Formal Approach

• Ontology Integration Framework (Calvanese, De Giacomo, Lenzerini 2001)

• Description Logic-based

• Mapping relations between local ontologies

• Views either globally or locally

OntoClean Components

• Formal Criteria

• Top-Level Ontology

• Ontology of Universals

• Domain-Level Development Guidelines

• Applications (examples)


The OntoWordnet Loom KB

• We created a Loom KB, containing, for each named concept, its direct super-concept(s), some annotations describing the quasi-synonyms, the gloss and the synset subject assignment, and its original numeric identifier in WordNet; for example:

(defconcept Horse$Equus_Caballus
  :is-primitive Equine$Equid
  :annotations ((subject animals)
                (word |horse|)
                (word |Equus caballus|)
                (documentation "solid-hoofed herbivorous quadruped domesticated since prehistoric times"))
  :identifier |101875414|)
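
As a rough illustration of how entries of this shape could be produced mechanically from synset records, here is a minimal sketch. It is not the tool used to build the OntoWordnet KB; the function `to_defconcept` and its parameters are hypothetical, and the use of `(:and ...)` for several super-concepts is an assumption about how multiple parents would be written.

```python
def to_defconcept(name, parents, subject, words, gloss, identifier):
    """Render one synset record as a Loom-style defconcept form (a sketch)."""
    # One super-concept is written bare; several are conjoined (assumption).
    if len(parents) == 1:
        parent_expr = parents[0]
    else:
        parent_expr = "(:and " + " ".join(parents) + ")"

    # Annotations: subject assignment, quasi-synonyms, then the gloss.
    annotations = ["(subject {})".format(subject)]
    annotations += ["(word |{}|)".format(word) for word in words]
    annotations.append('(documentation "{}")'.format(gloss))

    lines = [
        "(defconcept {}".format(name),
        "  :is-primitive {}".format(parent_expr),
        "  :annotations (" + "\n                ".join(annotations) + ")",
        "  :identifier |{}|)".format(identifier),
    ]
    return "\n".join(lines)


horse = to_defconcept(
    name="Horse$Equus_Caballus",
    parents=["Equine$Equid"],
    subject="animals",
    words=["horse", "Equus caballus"],
    gloss="solid-hoofed herbivorous quadruped domesticated since prehistoric times",
    identifier="101875414",
)
print(horse)
```

The sketch does not escape double quotes inside glosses, which a real exporter would need to handle.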


What can be improved in WordNet

• Top-level not based on formal ontological principles

• Lack of distinction between concepts and individuals

• Lack of distinction between object level and meta level

• Lack of distinction between roles and types

• Taxonomies to be revised

• Multihierarchies

• Few conceptual relations among synsets

• Weak subdivision by subject

• Phrases to be augmented


Concepts vs. individuals

• Under Event, we find Fall (of mankind)

• Under Territorial_Dominion, we find Macao

• Problem > Lack of expressivity

• Solution > We need an “instance-of” relation
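
A minimal sketch of keeping the two relations apart: subsumption links concepts to concepts, while instance-of links individuals to concepts. The class `KnowledgeBase` and the root concept `Top` are hypothetical and used only for illustration; only Macao, Territorial_Dominion, Event, and Fall (of mankind) come from the slide.

```python
class KnowledgeBase:
    """Toy KB that separates subsumption from instance-of (a sketch)."""

    def __init__(self):
        self.superconcepts = {}  # concept -> set of direct super-concepts
        self.memberships = {}    # individual -> set of direct concepts

    def add_subconcept(self, concept, parent):
        self.superconcepts.setdefault(concept, set()).add(parent)
        self.superconcepts.setdefault(parent, set())

    def add_instance(self, individual, concept):
        self.memberships.setdefault(individual, set()).add(concept)
        self.superconcepts.setdefault(concept, set())

    def ancestors(self, concept):
        """All concepts that subsume `concept`, directly or indirectly."""
        found, stack = set(), [concept]
        while stack:
            for parent in self.superconcepts.get(stack.pop(), ()):
                if parent not in found:
                    found.add(parent)
                    stack.append(parent)
        return found

    def is_instance(self, individual, concept):
        """Instance-of is inherited upwards along subsumption."""
        for direct in self.memberships.get(individual, ()):
            if direct == concept or concept in self.ancestors(direct):
                return True
        return False

    def is_concept(self, name):
        return name in self.superconcepts


kb = KnowledgeBase()
kb.add_subconcept("Territorial_Dominion", "Top")  # "Top" is a placeholder root
kb.add_subconcept("Event", "Top")
kb.add_instance("Macao", "Territorial_Dominion")
kb.add_instance("Fall_of_mankind", "Event")
```

With this separation, Macao is an instance of Territorial_Dominion and never appears in the concept taxonomy itself.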


Object-level vs. Meta-level

• The synset Abstraction_1 includes both abstract entities, such as Set, Time, and Space, and abstractions such as Attribute, Relation, Quantity.

• Abstraction_1 is glossed as "a general concept formed by extracting common features from specific examples". Abstraction therefore seems to be intended as a psychological process of generalization. This meaning seems to fit the latter group of terms (Attribute, Relation, Quantity), but not the former, which would be considered abstract under a different notion, namely not being extended in space/time.

• Solution:

• Attribute, Relation, and Quantity are meta-level concepts, while Set, Time, and Space are object-level concepts.


Role vs. Type

• RULE > A role cannot subsume a type

• Person (that we consider as a type) is subsumed by two different concepts, Organism and Causal_Agent.

• Organism can be conceived as a type as well, while Causal_Agent is a formal role (as can be inferred from its subconcepts)

• The first subsumption relationship is therefore admissible, while the second one shows a rigidity violation (roles are not rigid)

• Solution > Maintain meta-property-based views of taxonomy
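
The rule on this slide can be checked mechanically once each concept is tagged as a type or a role. The following is a minimal sketch under that assumption; the function name `rigidity_violations` and the tagging dictionary are hypothetical, and only direct subsumption links are examined.

```python
def rigidity_violations(subsumptions, kinds):
    """Return the (sub, super) links where a role subsumes a type.

    `subsumptions` is a list of (subconcept, superconcept) pairs and
    `kinds` maps each concept to "type" (rigid) or "role" (not rigid).
    """
    return [
        (sub, sup)
        for sub, sup in subsumptions
        if kinds.get(sub) == "type" and kinds.get(sup) == "role"
    ]


# The example from the slide: Person has two super-concepts.
subsumptions = [("Person", "Organism"), ("Person", "Causal_Agent")]
kinds = {"Person": "type", "Organism": "type", "Causal_Agent": "role"}

violations = rigidity_violations(subsumptions, kinds)
```

Here only the link from Person to Causal_Agent is reported, matching the analysis above.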


Taxonomical Ambiguity in WordNet

• Apparently sibling noun phrases are located in distant taxonomy branches

[Figure: fragment of the WordNet noun taxonomy under entity. Nodes shown: content#2, subject matter; communication#2; social relation; relation#1; abstraction#6; subject#1, topic#1, theme#1; subject#2, topic#2, issue#4, matter#2; content#5, mental object; cognition, knowledge; psychological feature; knowledge domain, knowledge base; domain#5, region#5, realm#3; subject#3, discipline#1, field#4; subject#4, content#7, depicted object; entity]

• Solution: dissolve heterogeneous branchings by analysing complex senses (knowledge is a type, communication is a role, social relation is a meta-level concept, etc.)


WordNet’s Top Synsets (UBs and some hyponyms)

[Figure: tree rooted at Top-Synset. Nodes shown: Entity, Something; Object, Physical Object; Abstraction; Psychological Feature; Attribute; Substance, Matter; Natural Object; Life Form, Organism, Being, Living Thing; Artifact; Location; Measure, Quantity, Amount, Quantum; Relation; Space; Feeling; Motivation, Motive, Need; Cognition, Knowledge; Possession; Shape, Form; Act, Human Action, Human Activity; State; Event; Phenomenon; Causal Agent, Cause, Causal Agency; Cell; Food, Nutrient; Person, Individual, Someone, Somebody, Mortal, Human, Soul; Plant, Flora, Plant Life; Animal, Animate Being, Beast, Brute, Creature, Fauna; Time; Social Relation; Communication; Group, Grouping]


Some more examples

• Possession_1 is a role, and it includes both roles and types

• Some hyponyms of Physical_Object are mapped to Feature

• Abstraction_1 is the most heterogeneous Unique Beginner. It contains abstracts (Set_5), quality spaces (Chromatic_Color), qualities (mostly from the synset Attribute) and a hybrid concept (Relation_1) that contains abstracts, other entities, and even meta-level categories

• Psychological_Feature contains both abstract entities (Cognition) and Events (Feeling_1)

• Event_1, Phenomenon_1, State_1, Act are globally mapped to our Event category, although, simply looking at their children, it seems quite hard to make explicit any criteria for maintaining the original distinctions


Cleaned-Up WordNet

• Following OntoClean, we have concentrated first on the so-called backbone taxonomy, which only includes the rigid properties. Formal and material roles have therefore been excluded from this preliminary work

• An extreme heterogeneity appears evident:

• Ex. Entity seems a “catch-all” class (Imaginary_place, Anticipation, Inessential, Location, Object, Causal_Agent, etc.)

• Therefore, we also decided to exclude from the top-level cleaning those synsets sharing very few hyponyms (resembling “orphans”), like some of the above hyponyms of Entity
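
Extracting a backbone taxonomy can be sketched as a filter that keeps only the concepts marked rigid and the subsumption links between them. This is a minimal illustration, not the procedure actually applied to WordNet; the function name `backbone`, the rigidity assignments, and the small taxonomy fragment are assumptions made for the example.

```python
def backbone(parents, rigid):
    """Restrict a taxonomy to rigid concepts.

    `parents` maps each concept to its direct super-concepts and `rigid`
    is the set of concepts judged rigid. Non-rigid concepts (roles) and
    every link that touches them are dropped.
    """
    return {
        concept: {p for p in parents.get(concept, set()) if p in rigid}
        for concept in sorted(rigid)
    }


# Illustrative fragment: Causal_Agent is treated as a role, hence not rigid.
parents = {
    "Person": {"Organism", "Causal_Agent"},
    "Organism": {"Entity"},
    "Causal_Agent": {"Entity"},
    "Entity": set(),
}
rigid = {"Person", "Organism", "Entity"}

core = backbone(parents, rigid)
```

Dropping roles outright is the simplest choice; it does not try to reattach concepts whose only super-concept was a role, so such cases would need manual review.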


Revised Top-Level Table

Top Categories | Covered Synsets | Rejected Hyponyms | Imported Hyponyms

Aggregate: Aggregate_2 !
Amount_of_matter: Substance$Matter*; Bedding_Material, Philosopher's_Stone…; Mass_5, Cement_2, Substance, …
Plurality
Object: Entity$Something*; Anticipation, Causal_Agent, Imaginary_Place, Substance
Physical_Body: Natural_Object*; Dead_Body, Constellation, Stone, Nest…
Ordinary_Object: Physical_Object*; Finding, Catch, Vagabond; arrangement, …
Group
Feature
Relevant_part: part$portion*, fragment; Edge_3, Skin_4, Paring$Parings, …
Plural_Feature
Dependent_Region: Excavation$hole_in_ground, Opening_3, …
Quality: Attribute*; Trait, Ethos, Inheritance…
Time: time_interval$interval*; Eternity, Present, Past, Future, Greenwich_Mean_Time
Color…: chromatic_color
Abstraction
Abstract_Entity: Statement_1, Ownership_1, Cognition, Arrangement_2, ...
Proposition: Proposition_1
Set: set_5
Quality_Space: Attribute*; Trait, Ethos, Inheritance…
Space: space_1; Subspace, …
Time: time_interval$interval*; Eternity, Present, Past, Future, Greenwich_Mean_Time
Color…: chromatic_color
Event: PHENOMENON_1*, STATE_1*, EVENT_1*, ACT*; Cognitive Event


Main Types in the Revised Top-Level

[Figure: taxonomy diagram of the main types. Nodes shown: Entity; Aggregate; Object; Event; Feature; Quality; Abstraction; Amount-Of-Matter; Plurality; Physical-Body; Ordinary-Object; Quality-Space; Information; Set; Action; Process; Achievement]


Ontological Integration Applied to Medical Terminologies

• UMLS Metathesaurus Category Intersections

• UMLS Semantic Network

• SnoMed Nomenclature

• ICD10 Classification

• MeSH Thesaurus


Some UMLS concepts pertaining to the intersection: Amino Acid, Peptide, or Protein & Carbohydrate

• (|hamster oviduct-specific glycoprotein|)

• (|Par j I|)

• (|(Man)6(GlcNAc)2Asn|)

• (|Zn(+2)-IAA|)

• (|collapsing factor|)

• (|BDV 18K glycoprotein|)

• (|SI-gene-associated glycoprotein, Nicotiana|)

• (|FdI allergen|)

• (|sca gene product|)

• (|EPV20 protein|)

• (|lubricin|)

• (|Pluritene|)

• (|Par h 1 allergen|)

• (|Wnt11 gene product|)

• (|I-D-Gal-BSA|)

• (|mannose-bovine serum albumin conjugate|)

• (|acrosome granule lysin|)

• (|sulfatide activator|)

• (|vaccinia virus A34R protein|)

=> More than 118,000 UMLS concepts (25%) are classified under an intersection


Ontological analysis of the intersection (Loom DL syntax)

(defconcept |Amino Acid, Peptide, or Protein & Carbohydrate|
  "834 instances. This conjunct includes two sibling types.
A protein containing a carbohydrate."
  :annotations ((Sugg.Name "carbohydrate-containing-protein")
                (onto-status integrated))
  :is-primitive (:and protein
                      (:some has-component carbohydrate))
  :context :substances)


Medical Morphologies

• Names of anatomical morphologies are often polysemous:

– Both a condition and the function that caused the condition ("inflammation", "ulcer", "fracture", "wound", "hyperplasia")

– Both an object and the function that produced the object ("neoplasm", "hemorrhage")

– Both an object O and the condition created in another object O' by O ("obstruction")

• For example: "the fracture has been caused by a fall" vs. "the fracture is transverse"; "the obstruction occurred in the jejunum" vs. "the obstruction has been removed"

• Conceptual analysis brings out other issues concerning morphologies:

• The dependence between a morphological condition, a function, and the related organ. For example, an "ulcer" (as a condition) of a stomach implies that the stomach embodies an ulceration function (an ulcer as a function)

• The mereological import of morphologies: some characterize a whole organ, some only a part of an organ. For instance, an "ectopic heart" is wholly ectopic, but an "ulcerated stomach" is only partly ulcerated


Morphologies Analyzed

• a quality space ("color", "consistency", "thickness", "size", "number", "shape")

• a situation:

  – a topologically relevant condition:

    – an alteration of connection:

      – that creates a configuration (a new property) in an object ("fracture", "wound")

      – in the holey interior of an object ("obstruction")

      – between several objects ("fusion")

    – an alteration of the boundary between an object's holey interior and the object's complement:

      – creating a configuration in the boundary ("cavitation", "ulcer")

      – producing a substance flow ("hemorrhage", "ulcer")

  – an abnormal placement ("dislocation", "ectopia", "absence")

  – a form alteration condition ("deformity", "hyperplasia", "hypoplasia")

  – a condition involving the alteration of several properties ("inflammation", "eruption")

• an abnormal, foreign object ("mass", "neoplasm", "calculus", "obstruction")


Use OntoClean for all your ontology cleaning needs!