A Semantic Approach to Discovering Schema Mapping
![Page 1: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/1.jpg)
A SEMANTIC APPROACH TO DISCOVERING SCHEMA MAPPING
Yuan An, Alex Borgida, Renee J. Miller, and John Mylopoulos
Presented by: Kristine Monteith
![Page 2: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/2.jpg)
OVERVIEW
Goal of the paper: match schemas using more than simple element correspondences (e.g., can we improve on a naïve mapping?)
![Page 3: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/3.jpg)
OVERVIEW
Approach: derive a conceptual model for the semantics of each table, then match the conceptual model of the source schema to the conceptual model of the target schema
e.g. Can we figure out that a source schema like this:
can match a target schema like this: hasBookSoldAt(aname,sid)
![Page 4: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/4.jpg)
EXAMPLE 1
![Page 5: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/5.jpg)
BASELINE SOLUTION: REFERENTIAL INTEGRITY CONSTRAINTS
- Find correspondences
  - v1: connect person.pname to hasBookSoldAt.aname
  - v2: connect bookstore.sid to hasBookSoldAt.sid
- Create logical relations using referential constraints
  - S1: person(pname) ⋈ writes(pname, bid) ⋈ book(bid)
  - S2: book(bid) ⋈ soldAt(bid, sid) ⋈ bookstore(sid)
  - S3: person(pname)
  - S4: bookstore(sid)
- Look at the target
  - T1: hasBookSoldAt(aname, sid)
- Look at each pair of source and target logical relations and check which correspondences are “covered”
  - <S1,T1,v1> <S2,T1,v2> <S3,T1,v1> <S4,T1,v2>
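
The covering check above can be sketched in Python. This is a simplified illustration under stated assumptions, not the actual RIC-based implementation: each logical relation is modeled as the set of `table.column` names it contains, and a pair covers a correspondence when both of its endpoints appear. The names `SOURCE`, `TARGET`, `CORRESPONDENCES`, and `covered_pairs` are hypothetical.

```python
# Logical relations modeled as the set of qualified columns they contain.
SOURCE = {
    "S1": {"person.pname", "writes.pname", "writes.bid", "book.bid"},
    "S2": {"book.bid", "soldAt.bid", "soldAt.sid", "bookstore.sid"},
    "S3": {"person.pname"},
    "S4": {"bookstore.sid"},
}
TARGET = {"T1": {"hasBookSoldAt.aname", "hasBookSoldAt.sid"}}

# Correspondences: name -> (source column, target column).
CORRESPONDENCES = {
    "v1": ("person.pname", "hasBookSoldAt.aname"),
    "v2": ("bookstore.sid", "hasBookSoldAt.sid"),
}


def covered_pairs(source, target, correspondences):
    """Return (source relation, target relation, covered correspondence names)."""
    result = []
    for s_name, s_cols in sorted(source.items()):
        for t_name, t_cols in sorted(target.items()):
            covered = sorted(
                v for v, (s_col, t_col) in correspondences.items()
                if s_col in s_cols and t_col in t_cols
            )
            if covered:  # keep only pairs that cover at least one correspondence
                result.append((s_name, t_name, covered))
    return result


PAIRS = covered_pairs(SOURCE, TARGET, CORRESPONDENCES)
for pair in PAIRS:
    print(pair)
```

In this sketch no single pair covers both v1 and v2, which is the limitation the next slide points out.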
![Page 6: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/6.jpg)
ASK THE USER ABOUT THE FOLLOWING:
The baseline does not present an entire tuple to match the target relation: hasBookSoldAt(aname,sid)
![Page 7: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/7.jpg)
WHAT THIS PAPER SEEKS TO ACCOMPLISH:
Generate the following:
Compose “writes” and “soldAt” to produce a new semantic connection between “person” and “bookstore”
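
As a rough illustration of what composing the two relationships means at the data level, the sketch below joins `writes(pname, bid)` with `soldAt(bid, sid)` on `bid` and projects onto `(pname, sid)`. The sample rows and the function name `compose` are hypothetical, and the paper expresses the mapping declaratively rather than as a Python function.

```python
# Hypothetical sample rows for writes(pname, bid) and soldAt(bid, sid).
writes = [("Ann", "b1"), ("Bob", "b2"), ("Ann", "b3")]
sold_at = [("b1", "s10"), ("b2", "s10"), ("b2", "s20")]


def compose(writes_rows, sold_at_rows):
    """Join writes and soldAt on bid, then project onto (pname, sid)."""
    stores_by_book = {}
    for bid, sid in sold_at_rows:
        stores_by_book.setdefault(bid, set()).add(sid)
    return {
        (pname, sid)
        for pname, bid in writes_rows
        for sid in stores_by_book.get(bid, ())
    }


# Each result tuple has the shape of the target hasBookSoldAt(aname, sid).
has_book_sold_at = compose(writes, sold_at)
print(sorted(has_book_sold_at))
```

Book `b3` is not sold anywhere in the sample data, so it contributes no target tuple.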
![Page 8: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/8.jpg)
APPROACH: REPRESENTING SEMANTICS OF SCHEMAS
- Create a Conceptual Model (CM) graph
  - Create nodes for classes and attributes
  - Create directed edges for relationships and inverses
    - C1 ---ISA--- C2: subclasses
    - C ---p--- D: relationships
    - C ---p->-- D: functional relationships
  - Duplicate concept nodes to represent recursive relationships
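
One way to hold such a graph in memory is sketched below. The class `CMGraph`, its method names, the edge kinds, the trailing `-` used to label inverse edges, and the attribute names are all assumptions made for illustration; only the class and relationship names `Project`, `Department`, `Employee`, `controlledBy`, and `hasManager` come from the later example slides.

```python
from dataclasses import dataclass, field


@dataclass
class CMGraph:
    """Conceptual model graph: class nodes with attributes and labeled directed edges."""
    attributes: dict = field(default_factory=dict)  # class name -> set of attribute names
    edges: list = field(default_factory=list)       # (source, label, target, kind)

    def add_class(self, name, attrs=()):
        self.attributes.setdefault(name, set()).update(attrs)

    def add_relationship(self, src, label, dst, functional=False):
        """Add a relationship edge plus its inverse, labeled with a trailing '-'."""
        self.add_class(src)
        self.add_class(dst)
        kind = "functional" if functional else "relationship"
        self.edges.append((src, label, dst, kind))
        # The inverse is stored as a plain relationship in this sketch.
        self.edges.append((dst, label + "-", src, "relationship"))

    def add_isa(self, subclass, superclass):
        self.add_class(subclass)
        self.add_class(superclass)
        self.edges.append((subclass, "ISA", superclass, "isa"))

    def functional_edges(self, node):
        return [(label, dst) for src, label, dst, kind in self.edges
                if src == node and kind == "functional"]


g = CMGraph()
g.add_class("Project", {"pnum"})  # attribute name is hypothetical
g.add_relationship("Project", "controlledBy", "Department", functional=True)
g.add_relationship("Department", "hasManager", "Employee", functional=True)
print(g.functional_edges("Project"))
```

A recursive relationship could be handled in this representation by adding a second, renamed copy of the class node, as the last bullet suggests.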
![Page 9: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/9.jpg)
GENERATING MAPPING CANDIDATES
Problem description
- Inputs:
  - A source relational schema S and a target relational schema T
  - A conceptual model (GS and GT, respectively) associated with each relational schema via table semantic mappings
  - A set of correspondences L linking a set L(S) of columns in S to a set L(T) of columns in T
- Goal: a pair of expressions <E1,E2> that are “semantically similar” in terms of modeling the subject matter
![Page 10: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/10.jpg)
MARKED NODES
- The set L(S) of columns gives rise to a set CS of marked class nodes in the graph GS
- Likewise, the set L(T) gives rise to a set CT of marked class nodes in the graph GT
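
A minimal sketch of how the marked sets might be derived, assuming the table semantics are available as a lookup from each column to the class node that owns the matching attribute. The dictionaries, the class names (`Person`, `Author`, `Bookstore`), and the function `marked_nodes` are hypothetical stand-ins.

```python
# Hypothetical table semantics: column -> (class node, attribute) in the CM graph.
SOURCE_SEMANTICS = {
    "person.pname": ("Person", "pname"),
    "book.bid": ("Book", "bid"),
    "bookstore.sid": ("Bookstore", "sid"),
}
TARGET_SEMANTICS = {
    "hasBookSoldAt.aname": ("Author", "aname"),
    "hasBookSoldAt.sid": ("Bookstore", "sid"),
}

# Correspondences L as (source column, target column) pairs.
L = [
    ("person.pname", "hasBookSoldAt.aname"),
    ("bookstore.sid", "hasBookSoldAt.sid"),
]


def marked_nodes(correspondences, source_semantics, target_semantics):
    """Return (CS, CT): class nodes reached from the corresponded columns."""
    cs = {source_semantics[s_col][0] for s_col, _ in correspondences}
    ct = {target_semantics[t_col][0] for _, t_col in correspondences}
    return cs, ct


CS, CT = marked_nodes(L, SOURCE_SEMANTICS, TARGET_SEMANTICS)
print(sorted(CS), sorted(CT))
```

Note that `Book` is not marked in this example because no correspondence touches `book.bid`; it can still appear in a connecting subgraph.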
![Page 11: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/11.jpg)
BASIC ALGORITHM
- Create conceptual subgraphs
  - Find a subgraph D1 connecting concept nodes in CS, and a subgraph D2 connecting concept nodes in CT, such that D1 and D2 are “semantically similar”
- Suggest possible mapping candidates
  - Translate D1 and D2 into algebraic expressions E1 and E2 and return the triple <E1,E2,LM> as a mapping candidate
![Page 12: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/12.jpg)
CREATING CONCEPTUAL SUBGRAPHS
- Notice simple matches
  - A node v in CS corresponds to a node u in CT when v and u have attributes that are associated with corresponding columns via the table semantics
- More complicated rules
  - The connections (v1,v2) and (u1,u2) should be “semantically similar” or at least “compatible” (cardinality constraints, relationships like “is-a” or “part of”)
  - Use edges from pre-selected trees
  - Represent “intuitively meaningful” concepts
  - Favor smaller trees (Occam’s razor)
- Other considerations
  - Favor lossless joins
  - Reject contradictions
![Page 13: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/13.jpg)
EXAMPLE
Looking for a functional tree with a root corresponding to the anchor Proj
![Page 14: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/14.jpg)
EXAMPLE
- Notice simple matches
- Find a tree with minimal cost (edges in pre-selected trees don’t contribute to cost)
- Find a tree containing the largest number of edges in the pre-selected trees

Project ---controlledBy->-- Department --hasManager->-- Employee
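
The minimal-cost criterion can be sketched as a shortest-path search from the root along functional edges, where pre-selected edges cost 0 and every other edge costs 1. This is an assumption-laden simplification rather than the paper's algorithm: it builds the tree as a union of cheapest root-to-target paths, does not implement the second criterion (preferring trees with the most pre-selected edges), and raises `KeyError` if a target is unreachable. The function name and the `hasAuditor` edge are hypothetical.

```python
import heapq


def cheapest_functional_tree(edges, root, targets, preselected=frozenset()):
    """Union of cheapest directed paths from root to each target.

    edges: iterable of (src, label, dst) functional edges.
    Edges in `preselected` cost 0; all others cost 1.
    Returns (total cost, sorted list of edges in the tree).
    """
    adjacency = {}
    for edge in edges:
        adjacency.setdefault(edge[0], []).append(edge)

    # Dijkstra from the root; best[node] = (cost, incoming edge).
    best = {root: (0, None)}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best[node][0]:
            continue  # stale queue entry
        for edge in adjacency.get(node, []):
            new_cost = cost + (0 if edge in preselected else 1)
            dst = edge[2]
            if dst not in best or new_cost < best[dst][0]:
                best[dst] = (new_cost, edge)
                heapq.heappush(heap, (new_cost, dst))

    tree = set()
    for target in targets:
        node = target
        while best[node][1] is not None:  # walk back to the root
            edge = best[node][1]
            tree.add(edge)
            node = edge[0]
    total = sum(0 if edge in preselected else 1 for edge in tree)
    return total, sorted(tree)


EDGES = [
    ("Project", "controlledBy", "Department"),
    ("Department", "hasManager", "Employee"),
    ("Project", "hasAuditor", "Employee"),  # hypothetical competing edge
]
PRESELECTED = frozenset(EDGES[:2])

cost, tree = cheapest_functional_tree(EDGES, "Project", ["Employee"], PRESELECTED)
print(cost, tree)
```

With the two pre-selected edges free, the two-step path through `Department` beats the hypothetical direct edge; without any pre-selected edges the direct edge would win.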
![Page 15: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/15.jpg)
MORE COMPLICATED EXAMPLE
Same answer: Project ---controlledBy->-- Department --hasManager->-- Employee
Still looking for low-cost, minimal trees to connect Employee to Project
![Page 16: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/16.jpg)
DEALING WITH N-ARY RELATIONS
StoreSells(Person, Product)
![Page 17: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/17.jpg)
CONSIDERATIONS FOR REIFIED RELATIONSHIPS
- A path of length 2 passing through a reified relationship node should be considered to be of length 1
- The semantic category of a target tree rooted at a reified relationship induces preferences for similarly rooted (minimal) functional trees in the source (cardinality restrictions, number of roles, subclass relationship to a top-level ontology concept)
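
The first consideration amounts to a small adjustment when measuring path length. The sketch below assumes a path is given as a list of node names and that reified relationship nodes are never adjacent to each other on a path; the function name `path_length` is hypothetical.

```python
def path_length(path_nodes, reified):
    """Count edges on a path, treating the two edges that pass through a
    reified relationship node as a single edge."""
    edge_count = len(path_nodes) - 1
    interior = path_nodes[1:-1]
    return edge_count - sum(1 for node in interior if node in reified)


# StoreSells is a reified relationship node sitting between its participants.
print(path_length(["Person", "StoreSells", "Product"], {"StoreSells"}))
# An ordinary two-edge path is unaffected.
print(path_length(["Project", "Department", "Employee"], set()))
```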
![Page 18: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/18.jpg)
OBTAINING RELATIONAL EXPRESSIONS
![Page 19: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/19.jpg)
EXPERIMENTAL RESULTS
![Page 20: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/20.jpg)
AVERAGE PRECISION
![Page 21: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/21.jpg)
AVERAGE RECALL
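
The transcript does not spell out how the two measures are computed. The sketch below assumes the usual set-based definitions (precision is the fraction of generated mappings that are correct, recall is the fraction of correct mappings that were generated), averaged over test cases. The function names and the sample cases are hypothetical and are not results from the paper.

```python
def precision_recall(generated, correct):
    """Set-based precision and recall for one test case."""
    generated, correct = set(generated), set(correct)
    hits = len(generated & correct)
    precision = hits / len(generated) if generated else 0.0
    recall = hits / len(correct) if correct else 0.0
    return precision, recall


def average(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical test cases: (generated mappings, correct mappings).
cases = [
    ({"m1", "m2"}, {"m1"}),
    ({"m3"}, {"m3", "m4"}),
]
scores = [precision_recall(generated, correct) for generated, correct in cases]
avg_precision = average(p for p, _ in scores)
avg_recall = average(r for _, r in scores)
print(avg_precision, avg_recall)
```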
![Page 22: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/22.jpg)
CONCLUSIONS
- The semantic approach performs at least as well as the RIC-based approach on the datasets studied
- The semantic approach made significant improvements in some cases
- Many of the datasets did not have complicated schemas; the semantic approach provided less benefit in those cases
![Page 23: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/23.jpg)
STRENGTHS/WEAKNESSES
- Strengths
  - Lots of examples
  - Provides a useful solution to a common problem
- Weaknesses
  - Formalism sometimes made things more complicated rather than clearer
  - Assumes a lot of background knowledge
![Page 24: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/24.jpg)
FUTURE WORK
- Embed this functionality into pre-existing mapping tools (they suggest Clio, since much of their work builds on it)
- Add negation to the semantic representation
- Investigate more complex semantic mappings
![Page 25: A Semantic Approach to Discovering Schema Mapping](https://reader036.fdocuments.us/reader036/viewer/2022062814/568167a1550346895ddced0d/html5/thumbnails/25.jpg)
QUESTIONS???