ONTOLOGY EVALUATION AND RANKING USING ONTOQA
By Samir Tartir and I. Budak Arpinar
Department of Industrial Engineering
Park Jihye
WHY “ONTOQA”?
More and more ontologies are being introduced.
It is difficult to find a good ontology related to the user’s work.
Tools are needed for evaluating and ranking ontologies.
OntoQA provides a flexible technique to rank ontologies based on their contents and their relevance to the user’s keywords.
OntoQA is the first approach that evaluates ontologies using their instances as well as their schemas.
CONTENTS
Architecture
Terminology
The Metrics: Schema Metrics, Instance Metrics
Ontology Score Calculation
Experiments and Evaluation
Conclusion
ARCHITECTURE
OntoQA supports three input modes:
1. Input: ontology
  a. OntoQA calculates metric values.
2. Input: ontology and keywords
  a. OntoQA calculates metric values.
  b. Uses WordNet to expand the keywords to include any related keywords that might exist in the ontology.
  c. Uses the metric values to evaluate the overall contents of the ontology and obtain its relevance to the keywords.
3. Input: keywords
  a. OntoQA uses Swoogle to find the RDF and OWL ontologies in the top 20 results returned by Swoogle.
  b. OntoQA then evaluates each of the ontologies.
  c. OntoQA finally displays the list of ontologies ranked by their score.
TERMINOLOGY
Schema
  A set of classes, $C$
  A set of relationships, $P$
  A set of class-ancestor pairs, $H$
Knowledgebase
  A set of instances, $I$
  A class instantiation function, $inst(C_i)$
  A relationship instantiation function, $instr(I_i, I_j)$
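The terminology above can be modelled with two small containers. This is a minimal sketch, not OntoQA's own data model: the names `Schema` and `KnowledgeBase`, the triple layout, and the toy university data are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    classes: set = field(default_factory=set)        # C: class names
    relationships: set = field(default_factory=set)  # P: (source class, name, target class)
    ancestors: set = field(default_factory=set)      # H: (class, ancestor) pairs


@dataclass
class KnowledgeBase:
    instance_class: dict = field(default_factory=dict)  # I: instance name -> its class
    links: set = field(default_factory=set)             # (subject, relationship name, object)

    def inst(self, cls):
        """Class instantiation function: the instances asserted for `cls`."""
        return {i for i, c in self.instance_class.items() if c == cls}

    def instr(self, subject, obj):
        """Relationship instantiation function: relationship names linking two instances."""
        return {name for s, name, o in self.links if s == subject and o == obj}


# Hypothetical toy data, not taken from the slides.
schema = Schema(
    classes={"Person", "Student", "University"},
    relationships={("Student", "attends", "University")},
    ancestors={("Student", "Person")},
)
kb = KnowledgeBase(
    instance_class={"alice": "Student", "uga": "University"},
    links={("alice", "attends", "uga")},
)
print(kb.inst("Student"))
print(kb.instr("alice", "uga"))
```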
METRICS
Two dimensions
Schema
  Ontology design and its potential for rich knowledge representation
Instances
  Placement of instance data and distribution of the data
  Overall Knowledgebase metrics
  Class-specific metrics
  Relationship-specific metrics
SCHEMA METRICS (1)
Relationship Diversity (RD): shows whether the ontology is closer to a taxonomy or has diverse relationships, whichever the user prefers
$RD = \frac{|P|}{|H| + |P|}$
If the RD value is close to 0, most of the relationships are inheritance relationships.
If the RD value is close to 1, most of the relationships are non-inheritance relationships.
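As a rough illustration of the RD formula, the sketch below computes $|P| / (|H| + |P|)$ for two toy schemas. The function name and the example sets are hypothetical; only the formula comes from the slide.

```python
def relationship_diversity(non_inheritance, inheritance):
    """RD = |P| / (|H| + |P|), with P and H given as sets."""
    total = len(non_inheritance) + len(inheritance)
    if total == 0:
        raise ValueError("schema has no relationships")
    return len(non_inheritance) / total


# Hypothetical taxonomy-like schema: one relationship, three inheritance pairs.
taxonomy_p = {("Student", "attends", "University")}
taxonomy_h = {("Student", "Person"), ("Professor", "Person"), ("Person", "Agent")}
rd_taxonomy = relationship_diversity(taxonomy_p, taxonomy_h)

# Hypothetical relationship-rich schema: three relationships, one inheritance pair.
diverse_p = {
    ("Student", "attends", "University"),
    ("Student", "takes", "Course"),
    ("Professor", "teaches", "Course"),
}
diverse_h = {("Student", "Person")}
rd_diverse = relationship_diversity(diverse_p, diverse_h)

print(rd_taxonomy, rd_diverse)
```

The first value sits nearer 0 (mostly inheritance) and the second nearer 1 (mostly non-inheritance), matching the reading given on the slide.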
SCHEMA METRICS (2)
Schema Deepness (SD): distinguishes a shallow ontology from a deep ontology
$SD = \frac{|H|}{|C|}$
If the SD value is low, the ontology is deep and covers a specific domain in a detailed manner.
If the SD value is high, the ontology is shallow and represents a wide range of general knowledge.
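A minimal sketch of the SD ratio, $|H| / |C|$, on a toy class hierarchy. The helper name and data are hypothetical, and the sketch assumes $H$ holds one pair per class and direct parent.

```python
def schema_deepness(ancestor_pairs, classes):
    """SD = |H| / |C|, with H and C given as sets."""
    if not classes:
        raise ValueError("schema has no classes")
    return len(ancestor_pairs) / len(classes)


# Hypothetical hierarchy: Agent <- Person <- {Student, Professor}.
classes = {"Agent", "Person", "Student", "Professor"}
ancestor_pairs = {
    ("Person", "Agent"),
    ("Student", "Person"),
    ("Professor", "Person"),
}
sd = schema_deepness(ancestor_pairs, classes)
print(sd)
```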
INSTANCE METRICS (1) OVERALL KB METRICS
Class Utilization (CU): indicates how the classes defined in the schema are being utilized in the knowledgebase
$CU = \frac{|C'|}{|C|}$
$C'$ is the set of populated classes.
If the CU value is low, the knowledgebase does not have data that exemplifies all the knowledge that exists in the schema.
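The CU ratio can be sketched as follows. The function name and toy data are hypothetical, and the sketch assumes a class counts as populated only when at least one instance is asserted directly for it.

```python
def class_utilization(classes, instance_class):
    """CU = |C'| / |C|, where C' is the set of classes that have instances."""
    if not classes:
        raise ValueError("schema has no classes")
    populated = set(instance_class.values()) & set(classes)
    return len(populated) / len(classes)


# Hypothetical schema with four classes, two of which have instances.
classes = {"Person", "Student", "Professor", "University"}
instance_class = {"alice": "Student", "bob": "Student", "uga": "University"}
cu = class_utilization(classes, instance_class)
print(cu)
```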
INSTANCE METRICS (1) OVERALL KB METRICS
Cohesion (Coh): represents the number of connected components in the KB
$Coh = |CC|$
$CC$ is the set of connected components.
Class Instance Distribution (CID): indicates how instances are spread across the classes of the schema
$CID = StdDev(Inst(C_i))$
CID is the standard deviation in the number of instances per class.
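Both overall metrics can be sketched in a few lines. The function names and data are hypothetical. Two choices here are assumptions the slide does not settle: links are treated as undirected edges and isolated instances count as their own component, and the population standard deviation (`statistics.pstdev`) is used for CID.

```python
from statistics import pstdev


def cohesion(instances, links):
    """Coh: number of connected components among instances (undirected links)."""
    neighbours = {i: set() for i in instances}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    components = 0
    for start in instances:
        if start in seen:
            continue
        components += 1          # found an instance not reached so far
        stack = [start]
        while stack:             # depth-first walk over its component
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbours[node] - seen)
    return components


def class_instance_distribution(instances_per_class):
    """CID: standard deviation of the number of instances per class."""
    return pstdev(instances_per_class)


# Hypothetical KB: a-b-c are linked, d is isolated.
coh = cohesion(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
# Hypothetical per-class instance counts.
cid = class_instance_distribution([3, 1])
print(coh, cid)
```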
INSTANCE METRICS (2) CLASS SPECIFIC METRICS
Class Connectivity (Conn): indicates the centrality of a class
$Conn(C_i) = |NIREL(C_i)|$
$NIREL(C_i)$ is the set of relationships that instances of the class have with instances of other classes.
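One possible reading of $NIREL(C_i)$ is sketched below: it collects relationship instances where one endpoint belongs to the class and the other endpoint belongs to a different class. That reading (counting both incoming and outgoing links), the function name, and the toy data are assumptions, not details confirmed by the slide.

```python
def class_connectivity(cls, instance_class, links):
    """Conn(cls) = |NIREL(cls)|: links between cls instances and other classes."""
    nirel = {
        (s, name, o)
        for s, name, o in links
        if instance_class[s] != instance_class[o]
        and cls in (instance_class[s], instance_class[o])
    }
    return len(nirel)


# Hypothetical KB.
instance_class = {
    "alice": "Student",
    "bob": "Student",
    "uga": "University",
    "cs101": "Course",
}
links = {
    ("alice", "attends", "uga"),
    ("bob", "attends", "uga"),
    ("alice", "takes", "cs101"),
    ("alice", "knows", "bob"),   # same class on both ends, so not counted
}
conn_student = class_connectivity("Student", instance_class, links)
conn_university = class_connectivity("University", instance_class, links)
print(conn_student, conn_university)
```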
INSTANCE METRICS (2) CLASS SPECIFIC METRICS
Class Importance (Imp): indicates what parts of the ontology are considered focal and what parts are on the edge
$Imp(C_i) = \frac{|Inst(C_i)|}{|KB(CI)|}$
The number of instances that belong to the inheritance subtree rooted at $C_i$ in the KB, $Inst(C_i)$, compared to the total number of class instances in the KB, $CI$.
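A small sketch of class importance, counting instances in the inheritance subtree rooted at a class. The helper names and data are hypothetical, and single inheritance (one parent per class) is assumed to keep the example short.

```python
def in_subtree(cls, root, parent_of):
    """True if cls is root or a descendant of root (single inheritance assumed)."""
    while cls is not None:
        if cls == root:
            return True
        cls = parent_of.get(cls)
    return False


def class_importance(root, instance_class, parent_of):
    """Imp(root) = instances in root's subtree / all class instances in the KB."""
    total = len(instance_class)
    if total == 0:
        raise ValueError("knowledgebase has no instances")
    in_tree = sum(1 for c in instance_class.values() if in_subtree(c, root, parent_of))
    return in_tree / total


# Hypothetical hierarchy and instances.
parent_of = {"Student": "Person", "Professor": "Person"}
instance_class = {
    "alice": "Student",
    "bob": "Student",
    "carol": "Professor",
    "uga": "University",
}
imp_person = class_importance("Person", instance_class, parent_of)
imp_student = class_importance("Student", instance_class, parent_of)
print(imp_person, imp_student)
```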
INSTANCE METRICS (2) CLASS SPECIFIC METRICS
Relationship Utilization (RU): reflects how the relationships defined for each class in the schema are being used at the instance level
$RU(C_i) = \frac{|IREL(C_i)|}{|CREL(C_i)|}$
$IREL(C_i)$ is the set of distinct relationships used by instances of a class $C_i$: $IREL(C_i) = \{instr(I_i, I_j) : I_i \in inst(C_i)\}$
$CREL(C_i)$ is the set of relationships a class $C_i$ has with another class $C_j$: $CREL(C_i) = \{P(C_i, C_j)\}$
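The RU ratio can be sketched as below. The function name and data are hypothetical. As an added assumption, relationships are compared by name and the instance-level set is intersected with the schema-level set so the ratio cannot exceed 1.

```python
def relationship_utilization(cls, schema_relationships, instance_class, links):
    """RU(cls) = |IREL(cls)| / |CREL(cls)|, comparing relationships by name."""
    crel = {name for source, name, _ in schema_relationships if source == cls}
    irel = {name for subject, name, _ in links if instance_class[subject] == cls}
    if not crel:
        raise ValueError("class has no relationships in the schema")
    return len(irel & crel) / len(crel)


# Hypothetical schema: Student defines three relationships.
schema_relationships = {
    ("Student", "attends", "University"),
    ("Student", "takes", "Course"),
    ("Student", "advisedBy", "Professor"),
}
instance_class = {
    "alice": "Student",
    "bob": "Student",
    "uga": "University",
    "cs101": "Course",
}
# Only two of the three relationships are used at the instance level.
links = {
    ("alice", "attends", "uga"),
    ("bob", "attends", "uga"),
    ("alice", "takes", "cs101"),
}
ru_student = relationship_utilization("Student", schema_relationships, instance_class, links)
print(ru_student)
```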
INSTANCE METRICS (3) RELATIONSHIP-SPECIFIC METRICS
Relationship Importance (Imp): measures the importance of the current relationship as a percentage
$Imp(R_i) = \frac{|Inst(R_i)|}{|KB(RI)|}$
The number of instances of relationship $R_i$ in the KB, $Inst(R_i)$, compared to the total number of property instances in the KB, $RI$.
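Relationship importance is a simple share of relationship instances, sketched here for every relationship at once. The function name and the toy links are hypothetical.

```python
from collections import Counter


def relationship_importance(links):
    """Imp(R) for each relationship R: its instance count over all relationship instances."""
    counts = Counter(name for _, name, _ in links)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


# Hypothetical relationship instances (subject, relationship name, object).
links = [
    ("alice", "attends", "uga"),
    ("bob", "attends", "uga"),
    ("alice", "takes", "cs101"),
    ("alice", "knows", "bob"),
]
importance = relationship_importance(links)
print(importance)
```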
ONTOLOGY SCORE CALCULATION
Evaluation of an ontology based on the entered keywords
I. The terms entered by the user are extended by adding any related terms.
II. OntoQA determines the classes and relationships whose names contain any term of the extended set of terms.
III. The overall metrics are aggregated to get an overall score for the ontology.
$Score = \sum_i W_i \cdot Metric_i$
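The aggregation step is a weighted sum. The sketch below shows only that step, with made-up metric values and weights; the function name is hypothetical, and treating a metric with no weight as contributing nothing is an assumption.

```python
def ontology_score(metrics, weights):
    """Score = sum of W_i * Metric_i; metrics without a weight contribute nothing."""
    return sum(weights.get(name, 0.0) * value for name, value in metrics.items())


# Hypothetical metric values and user-chosen weights.
metrics = {"RD": 0.5, "SD": 2.0, "CU": 0.25}
weights = {"RD": 1.0, "SD": 1.0, "CU": 2.0}
score = ontology_score(metrics, weights)
print(score)
```

Raising a weight (here CU is weighted twice as heavily) is how a user would express a preference for one aspect of the ontology over another.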
EXPERIMENTS AND EVALUATION
Compare the rankings of OntoQA, OntoRank of Swoogle, and a group of expert users.
OntoRank 1)
Similar to Google’s PageRank approach
Gives preference to popular ontologies
$OntoRank(a) = wPR(a) + \sum_{x \in OTC(a)} wPR(x)$
$wPR(a)$ is a weighted PageRank variation.
1) Finin T., et al. Swoogle: Searching for knowledge on the Semantic Web
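The OntoRank sum can be illustrated with precomputed inputs. This is only a sketch of the formula, not Swoogle's implementation: the slide does not define how $wPR$ or $OTC(a)$ are obtained, so both are passed in as plain dictionaries with made-up values.

```python
def onto_rank(doc, wpr, otc):
    """OntoRank(a) = wPR(a) + sum of wPR(x) for every x in OTC(a)."""
    return wpr[doc] + sum(wpr[x] for x in otc.get(doc, ()))


# Hypothetical weighted-PageRank values and OTC sets.
wpr = {"a": 0.5, "b": 0.25, "c": 0.125}
otc = {"a": {"b", "c"}}
rank_a = onto_rank("a", wpr, otc)
rank_b = onto_rank("b", wpr, otc)
print(rank_a, rank_b)
```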
EXPERIMENTS AND EVALUATION
Problem of OntoRank 1)
If two copies of the same ontology are placed in two different locations and one of these locations is cited more than the other, OntoRank will rank the copy at the popular location higher than the other copy.
OntoQA will give both ontologies the same ranking.
EXPERIMENTS AND EVALUATION
[Figure: rankings from the user group, OntoQA, and Swoogle compared in two panels, "With Balanced Weight" and "With Higher Weight for Schema Size".]
CONCLUSION
OntoQA differs from other approaches in that it is tunable and requires minimal user involvement.
It considers both the schema and the instances of a populated ontology.
REVIEW
The ranking result depends highly on the weights.
It is difficult to decide on proper weights.
The metrics are inconsistent: every metric has its own range, so the “same weight” does not mean the “same preference”.
There are about 10 kinds of metrics, so there are too many possible combinations.
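A small arithmetic sketch of the range problem raised above. All metric values are made up for illustration: RD is bounded by 1, while CID (a standard deviation of instance counts) has no fixed upper bound, so with equal weights the CID term dominates the sum.

```python
def weighted_score(metrics, weights):
    """Weighted sum of metric values, as in the score formula."""
    return sum(weights[name] * metrics[name] for name in metrics)


equal_weights = {"RD": 1.0, "CID": 1.0}

# Hypothetical ontologies: A is far ahead on RD, B is slightly ahead on CID.
ontology_a = {"RD": 0.9, "CID": 10.0}
ontology_b = {"RD": 0.1, "CID": 12.0}

score_a = weighted_score(ontology_a, equal_weights)
score_b = weighted_score(ontology_b, equal_weights)

# Share of A's score that comes from CID alone.
cid_share_a = equal_weights["CID"] * ontology_a["CID"] / score_a
print(score_a, score_b, cid_share_a)
```

B outranks A even though the weights are equal and A is much stronger on RD, because more than 90% of each score comes from the wide-range metric.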