Doan Dai Duong and Le Thi Thu Thuy {Duong_Dai.Doan, Thuy_Thi_Thu.Le}@unb.ca
The University of New Brunswick, Fredericton, NB, Canada
A Unified Framework for the Semantic Integration of XML Databases
First IEEE International Conference on Digital Information Management (ICDIM)
December 06-08, 2006
Presented by
Virendrakumar C. Bhavsar
Agenda
Introduction
XML Declarative Description (XDD)
Modeling of Data Components
Modeling of Processing Components
Conclusion
Introduction
General model of XML database integration
Step 1: Schema Integration
[Figure: local sources (RDS, RDS, OODS) are converted to XML schema1, XML schema2, …, XML schemaN; the XML Database Schema Integration System, together with an ontology, produces the integrated XML schema and a set of mappings]
Integrated data
Query: <student><Fname/><room/><nationality/></student>
<student source="A">
  <Fname>Xuan</Fname><room>G26</room><nationality>Vietnam</nationality>
</student>
<student source="B">
  <Fname>Phuoc</Fname><room>A12</room><nationality>Campuchia</nationality>
</student>
Local data
<student source="B">
  <Fname>Xuan</Fname><room>G26</room><nationality>Vietnam</nationality>
</student>
<student source="B">
  <Fname>Phuoc</Fname><room>A12</room><nationality>Campuchia</nationality>
</student>
Local data
<student source="C">
  <Fname>Xuan</Fname><room>G26</room><nationality>Vietnam</nationality>
</student>
<student source="B">
  <Fname>Phuoc</Fname><room>A12</room><nationality>Campuchia</nationality>
</student>
Local data
Integrated schema
Step 2: Query Processing
Users
Proposed Integration Framework
XDD provides powerful support for all tasks of the framework:
  Input XML query, input XML data, output XML data
  Rules, constraints, mappings
  Metadata
Because it is based on the standard XML format, XDD ties all tasks of the framework together tightly and makes data easy to manipulate
It reduces syntax errors and the time and effort required of programmers and users
[Figure: integration system with XDD as the underlying model. Labels: User query, XML query, Integrated schema, XML Schema, Metadata, Database sources, XML database, XML data, Integrated data]
XML Declarative Description*
XML Declarative Description (XDD) is an XML-based language for information representation
Ordinary XML expressions (ground XML expressions) + variables = non-ground XML expressions
Variables enhance expressive power and allow implicit information to be represented
XML clauses have the form H ← B1, …, Bm, C1, …, Cn
Clauses can express conditions and constraints
* Wuwongse, V., Anutariya, C., Akama, K., and Nantajeewarawat, E. XML Declarative Description (XDD): A Language for the Semantic Web. IEEE Intelligent Systems, Vol. 16, No. 3 (2001) 54-65.
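The ground/non-ground distinction can be illustrated with a minimal Python sketch. It assumes that any text or attribute value starting with `$S:` is a string variable and that a plain dictionary plays the role of a variable binding; the helper names `is_ground` and `instantiate` are hypothetical, and XDD itself defines more variable types than this sketch handles.

```python
import copy
import xml.etree.ElementTree as ET

VAR_PREFIX = "$S:"  # assumption: only string variables ($S:...) are handled


def is_ground(elem):
    """Return True if no text or attribute value in the tree is a variable."""
    for node in elem.iter():
        values = [node.text or ""] + list(node.attrib.values())
        if any(v.strip().startswith(VAR_PREFIX) for v in values):
            return False
    return True


def instantiate(elem, bindings):
    """Return a copy of elem with every bound variable replaced by its value."""
    result = copy.deepcopy(elem)
    for node in result.iter():
        text = (node.text or "").strip()
        if text in bindings:
            node.text = bindings[text]
        for key, value in list(node.attrib.items()):
            if value in bindings:
                node.set(key, bindings[value])
    return result


# A non-ground expression: an ordinary XML element that contains variables.
non_ground = ET.fromstring(
    '<student id="$S:id"><name>$S:name</name><country>$S:country</country></student>'
)
# Illustrative binding values; they are not taken from the framework itself.
ground = instantiate(
    non_ground, {"$S:id": "1", "$S:name": "Xuan", "$S:country": "Vietnam"}
)
print(ET.tostring(ground, encoding="unicode"))
```

Instantiating every variable turns the non-ground expression into a ground one, which is the sense in which a non-ground expression stands for a whole set of concrete XML elements.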
Modeling of Data Components
XML Databases
  Extension (actual data values): ground XML expressions
  Intension (schemas, logical specifications, relationships, indexes and constraints): non-ground XML expressions
XML Queries
  Include a constructor, patterns, and filters
  These correspond to the three parts (H, Bi, Cj) of an XDD rule H ← B1, …, Bm, C1, …, Cn
Query Execution Example
Data source
<Student> <name>John</name> <nationality>Canadian</nationality> <GPA>4</GPA> <phone>234-7856</phone> <ID>3224567</ID> </Student>
<Student> <name>Duong</name> <nationality>Vietnamese</nationality> <GPA>4.2</GPA> <phone>456-3241</phone> </Student>
Query
result1
<Student> <name>John</name> <nationality>Canadian</nationality> <GPA>4</GPA> </Student>
result2
<Student> <name>Duong</name> <nationality>Vietnamese</nationality> <GPA>4.2</GPA> </Student>
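The slide text does not preserve the query itself. Judging from the two results, it appears to keep only `name`, `nationality` and `GPA` for each `Student`. The Python sketch below reproduces that behaviour under this assumption; the `<Students>` wrapper element, the `PATTERN` list and the `run_query` function are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET

# The two Student elements from the data source; the wrapper is an assumption
# added so that the source parses as a single XML document.
DATA_SOURCE = """
<Students>
  <Student><name>John</name><nationality>Canadian</nationality><GPA>4</GPA>
    <phone>234-7856</phone><ID>3224567</ID></Student>
  <Student><name>Duong</name><nationality>Vietnamese</nationality><GPA>4.2</GPA>
    <phone>456-3241</phone></Student>
</Students>
"""

# Assumed query pattern: the child elements every result must contain.
PATTERN = ["name", "nationality", "GPA"]


def run_query(source_xml, pattern):
    """Build one result element per Student that has every pattern field."""
    results = []
    for student in ET.fromstring(source_xml).iter("Student"):
        fields = [student.find(tag) for tag in pattern]
        if any(field is None for field in fields):
            continue  # the pattern does not match this Student
        result = ET.Element("Student")  # constructor part of the query
        for field in fields:
            ET.SubElement(result, field.tag).text = (field.text or "").strip()
        results.append(result)
    return results


results = run_query(DATA_SOURCE, PATTERN)
for result in results:
    print(ET.tostring(result, encoding="unicode"))
```

In XDD terms, the constructed `Student` element plays the role of the head H and the pattern over the data source plays the role of the body; this sketch has no filter part.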
Modeling of Data Components
Mappings
  Describe the correspondence between an object in the integrated schema and its corresponding objects in the local schemas
  Support decomposing XML queries and converting data
  Are modeled by non-ground XML expressions
Modeling of Processing Components
Schema Integration Component
  The main task is to resolve conflicts between the schemas of the participating databases
  Conflicts between the various schemas are resolved in a single pass (one-shot strategy)
  Each local schema is one large non-ground XML expression ($E_variable)
Schema Integration Component
<Integrating_schema>
  <schema name="1">…</schema>
  <schema name="2">…</schema>
  …
  <schema name="n">…</schema>
</Integrating_schema>
<schema name="1"></schema>: $E expression
<schema name="2"></schema>: $E expression
<schema name="n"></schema>: $E expression
XDD can interactively process all schemas as $E expressions
Schema Conflict Classification
Conflicts between schemas can be classified into four main kinds:
Naming conflicts
  Synonyms
  Acronyms
  Homonyms
Structural conflicts
  Missing item conflicts
  Internal path discrepancy conflicts
  Aggregation conflicts
  Generalization/specialization
Constraint conflicts
  Number of occurrences of elements
  Fixed vs. default values
  Constraints on attributes
Data type conflicts
  Disjoint or incompatible data types
  Compatible data types
  IDREF and IDREFS
Aggregation conflict
Input schemas: Professor(FName, MName, LName) and Professor(Name)
Union rule → Professor(FName, MName, LName, Name)
Aggregation checking and data type constructing rule → Professor(Name(FName, MName, LName)); a new data type is created
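The two steps of the aggregation example can be sketched in Python. This is only an illustration under simplifying assumptions: each schema element is reduced to a list of child names, and the knowledge that `Name` aggregates `FName`, `MName` and `LName` is supplied by hand. The function names are hypothetical and are not the XDD rules themselves.

```python
def union_rule(schema_a, schema_b):
    """Merge the child lists of two same-named elements, keeping first-seen order."""
    merged = list(schema_a)
    merged += [child for child in schema_b if child not in merged]
    return merged


def aggregation_rule(children, aggregations):
    """Nest parts under their aggregate element when both appear as siblings.

    `aggregations` maps an aggregate element name to the parts it is made of;
    in this sketch that knowledge is an input, not something that is inferred.
    """
    result = {}
    absorbed = set()
    for whole, parts in aggregations.items():
        if whole in children and all(part in children for part in parts):
            result[whole] = list(parts)  # new complex data type for `whole`
            absorbed.update(parts)
    for child in children:
        if child not in result and child not in absorbed:
            result[child] = []  # simple element without nested parts
    return result


professor_a = ["FName", "MName", "LName"]  # Professor in the first schema
professor_b = ["Name"]                     # Professor in the second schema

merged = union_rule(professor_a, professor_b)
integrated = aggregation_rule(merged, {"Name": ["FName", "MName", "LName"]})
print(merged)
print(integrated)
```

The union step only collects the children of both `Professor` elements; the aggregation step then detects that three of them are parts of the fourth and nests them under it.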
Query Decomposition
The main task is to yield n local subqueries from a global query
Integrated schema: student (id, name, country, field, position)
<student id="$S:id">
  <name>$S:name</name>
  <country>$S:country</country>
</student>
Schema for source B: SATstudent (key, fullname, country, fieldStudy)
<SATstudent key="$S:id" source="B">
  <fullname>$S:name</fullname>
  <country>$S:country</country>
</SATstudent>
Schema for source A: SOMstudent (id, name, nation, program, position)
<SOMstudent id="$S:id" source="A">
  <name>$S:name</name>
  <nation>$S:country</nation>
</SOMstudent>
Query Decomposition
A. Brief view
[Figure: a query enters Query Decomposition, which uses the mappings from global to local and produces one sub query for each local source]
B. Solution
Input XML query
<student id="$S:id">
  <name>$S:name</name>
  <country>$S:country</country>
</student>
XML metadata
• XDD rules for transformation
Output XML queries
<SATstudent key="$S:id" source="B">
  <fullname>$S:name</fullname>
  <country>$S:country</country>
</SATstudent>
<SOMstudent id="$S:id" source="A">
  <name>$S:name</name>
  <nation>$S:country</nation>
</SOMstudent>
Query Decomposition Example
XDD rule
<answer>$E:expression</answer>
  ← <Mapping>
      <student><country>$S:country</country></student>
      <local>$E:expression</local>
    </Mapping>
1. The rule body matches with the mapping
<Mapping>
  <student><country>$S:country</country></student>
  <local>
    <SATstudent source="B"><country>$S:country</country></SATstudent>
    <SOMstudent source="A"><nation>$S:country</nation></SOMstudent>
  </local>
</Mapping>
2. $E:expression is bound to the content of <local>
3. The rule infers
<answer>
  <SATstudent source="B"><country>$S:country</country></SATstudent>
  <SOMstudent source="A"><nation>$S:country</nation></SOMstudent>
</answer>
4. This results in the local query for source A and the local query for source B
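The four steps of this example can be traced in a short Python sketch. It is not an XDD engine: matching is reduced to a structural comparison of tags and text that ignores attributes, and `$E:expression` is modelled as a plain list of the child elements of `<local>`. The function names `same_pattern` and `decompose` are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# The mapping from the example: one global pattern and its local counterparts.
MAPPING = ET.fromstring("""
<Mapping>
  <student><country>$S:country</country></student>
  <local>
    <SATstudent source="B"><country>$S:country</country></SATstudent>
    <SOMstudent source="A"><nation>$S:country</nation></SOMstudent>
  </local>
</Mapping>
""")


def same_pattern(a, b):
    """Structural match: same tag, same stripped text, same children in order."""
    if a.tag != b.tag or (a.text or "").strip() != (b.text or "").strip():
        return False
    if len(a) != len(b):
        return False
    return all(same_pattern(x, y) for x, y in zip(a, b))


def decompose(global_query, mapping):
    """Return {source: local subquery} for a global query, or {} if no match."""
    # Step 1: the global side of the mapping must match the query.
    if not same_pattern(mapping.find("student"), global_query):
        return {}
    # Step 2: bind $E:expression to everything inside <local>.
    e_expression = [copy.deepcopy(child) for child in mapping.find("local")]
    # Step 3: infer the <answer> element from the binding.
    answer = ET.Element("answer")
    answer.extend(e_expression)
    # Step 4: split the answer into one subquery per source.
    return {sub.get("source"): sub for sub in answer}


query = ET.fromstring("<student><country>$S:country</country></student>")
subqueries = decompose(query, MAPPING)
for source, sub in sorted(subqueries.items()):
    print(source, ET.tostring(sub, encoding="unicode").strip())
```

Because a single binding of `$E:expression` carries the patterns for every source, all subqueries become available in one inference step rather than one per source.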
Query Decomposition
Query decomposition uses the special structure of the mappings and applies XDD rules
Subqueries for the distributed data sources are produced simultaneously
Data conversion works similarly: extracted data are converted to the global schema format simultaneously
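Data conversion in the opposite direction can be sketched in the same spirit. The correspondence table below is read off the mapping patterns for sources A and B shown earlier (`key` to `id`, `fullname` to `name`, `nation` to `country`); the table layout, the `to_global` function and the sample records are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Local-to-global correspondences, written by hand from the mapping patterns.
TO_GLOBAL = {
    "SATstudent": {
        "element": "student",
        "attrs": {"key": "id"},
        "children": {"fullname": "name", "country": "country"},
    },
    "SOMstudent": {
        "element": "student",
        "attrs": {"id": "id"},
        "children": {"name": "name", "nation": "country"},
    },
}


def to_global(local_elem):
    """Rename a local element, its attributes and its children to global names."""
    rule = TO_GLOBAL[local_elem.tag]
    result = ET.Element(rule["element"])
    for local_attr, global_attr in rule["attrs"].items():
        if local_attr in local_elem.attrib:
            result.set(global_attr, local_elem.get(local_attr))
    # Keep the source label so the origin of each record stays visible.
    result.set("source", local_elem.get("source", ""))
    for child in local_elem:
        if child.tag in rule["children"]:
            converted_child = ET.SubElement(result, rule["children"][child.tag])
            converted_child.text = (child.text or "").strip()
    return result


# Hypothetical extracted records; the key and id values are made up.
extracted = [
    ET.fromstring('<SATstudent key="7" source="B"><fullname>Phuoc</fullname>'
                  '<country>Campuchia</country></SATstudent>'),
    ET.fromstring('<SOMstudent id="3" source="A"><name>Xuan</name>'
                  '<nation>Vietnam</nation></SOMstudent>'),
]
converted = [to_global(elem) for elem in extracted]
for elem in converted:
    print(ET.tostring(elem, encoding="unicode"))
```

In the framework the same bidirectional mapping serves both directions; this sketch hard-codes the reverse direction in a separate table only to keep the example short.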
Conclusion
XDD is used to model all data components and processing components of the XML database integration framework
Components of the system modeled by XDD can communicate and exchange data easily
A special structure for XDD-based bidirectional mappings is designed. Information is produced efficiently for both query decomposition and data conversion, avoiding data redundancy
The framework can
  Integrate n participating schemas
  Decompose a query into n subqueries at a time