VAMANA (Talk 2) ( vǎ - mǎ - nǎ )
VAMANA (Talk 2)
(vǎ - mǎ - nǎ)
Venkatesh Raghavan & Prof. Elke Rundensteiner
DSRG Talk
1st May 2003
An Efficient XPath Query Engine Exploiting the MASS Index
2
Introduction: purpose of the talk.
Generation of the Execution Tree.
Execution: Running Example 1, Running Example 2.
XPath Expression Execution.
Cost Estimation.
Heuristics and Transformation.
3
Running Examples
E.g. 1: //name/parent::person/descendant::watch
E.g. 2: //name[text() = "Klemens Pelz"]/parent::person
<people>
<person id="person1">
<name> Klemens Pelz </name>
<people>
<person id="person1">
<name> Hayato Cappelletti </name>
<watches>
<watch open_auction="open_auction82" />
4
Bigger Picture
[Architecture diagram. Components shown: XPath Expression, XPath Processor, Execution Tree, VAMANA (XPath Query Engine), MASS Interface, Node Set, MASS (A Multi-Axis Storage Structure for Large XML Documents), and an XQuery Engine (future development).]
5
How many “ROOT(s)” are there?
Root of the Document: we call it “Document Root”.
Root of the expression //name/parent::person/descendant::watch: we call it “First Location Step”.
Root of the Execution Tree: we call it “ROOT”.
6
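XPath Processor
[Pipeline diagram labels: XPath Expression, XPath Processor, Execution Tree.]
E.g. 2: //name[text() = "Klemens Pelz"]/parent::person
Phase I: Parse Tree
[Parse tree diagram. Node labels: ROOT; CONTEXT with the steps "//" name and "Parent" person; PRED containing BIPRED "=" with two OPERANDs, the step "child" text and the LITERAL "Klemens Pelz".]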
7
Contd.
Phase I: Parse Tree
[Parse tree diagram, same as on the previous slide.]
Phase II: Transformed Parse Tree
[Transformed parse tree diagram. Node labels shown: PRED containing BIPRED "=" with two OPERANDs, the step "child" text and the LITERAL "Klemens Pelz".]
8
Phase III: Execution Tree Generation
Phase II: Transformed Parse Tree
[Transformed parse tree diagram. Node labels: ROOT; CONTEXT with the steps "//" name and "Parent" person; PRED containing BIPRED "=" with two OPERANDs, the step "child" text and the LITERAL "Klemens Pelz".]
Phase III: VAMANA Execution Tree
[Execution tree diagram. Node labels: "person" X: Parent; "name" X: //; BI_PREDICATE "EQ" with the operands "" X: child and the literal "Klemens Pelz".]
9
VAMANA Nodes (VNode)
Node Base
VRootNode
MassNode
VBinaryPredicateNode
VExistPredicateNode
VJoinNode
VLiteralNode
10
VNode Structure
Context Side
Expression Side
Root Node
child
11
VNode Flow Structure
Data-flow style of querying, as used by most commercial relational database systems.
Nodes are arranged so that data “flows” from one node to another in a producer-consumer fashion.
Correctness: each node performs some operation on the data that flows through it. The result is produced by the last node on the dataflow chain.
IN SHORT: Data flows upwards. Control flows downwards.
Iterative.
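The "data flows upwards, control flows downwards" idea can be sketched with Python generators. This is only an illustration, not VAMANA code: the classes `VNode`, `Source` and `Filter` and the integer items are hypothetical stand-ins for execution-tree nodes and stored XML nodes. The log shows that a parent asks its child for one item at a time, and that nothing is produced beyond what the consumer requests.

```python
class VNode:
    """Hypothetical stand-in for a node of the execution tree."""

    def __init__(self, label, child=None):
        self.label = label
        self.child = child  # the node this one pulls its input from

    def nodes(self, log):
        raise NotImplementedError


class Source(VNode):
    """Leaf node: yields stored items in order (stands in for a storage lookup)."""

    def __init__(self, label, items):
        super().__init__(label)
        self.items = items

    def nodes(self, log):
        for item in self.items:
            log.append(f"{self.label} produces {item}")
            yield item


class Filter(VNode):
    """Inner node: pulls from its child and passes on the items that match."""

    def __init__(self, label, child, keep):
        super().__init__(label, child)
        self.keep = keep

    def nodes(self, log):
        for item in self.child.nodes(log):  # control flows downwards
            if self.keep(item):
                log.append(f"{self.label} passes {item}")
                yield item  # data flows upwards


log = []
tree = Filter("even", Source("leaf", [1, 2, 3, 4]), lambda n: n % 2 == 0)
first = next(tree.nodes(log))  # ask the top node for a single result
print(first)  # → 2
print(log)    # → ['leaf produces 1', 'leaf produces 2', 'even passes 2']
```

Items 3 and 4 are never touched, because the consumer stopped after the first result. No intermediate result list is built at any level.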
12
Contd.
Iterative.
Currently VAMANA executes nodes iteratively, so no copies of the data are made.
IS IT A PROBLEM?
MASS produces nodes in document order, so it is not a problem.
But some expressions produce nodes in sibling order.
Work in progress.
13
Execution Tree
“name” X: //
“watch” X: AXIS_DESCENDANT
“person” X: AXIS_PARENT
E.g. 1: //name/parent::person/descendant::watch
Context Side
Root Node
14
How Do We EXECUTE ?
Step 1: Set the context node of the root of the expression.
In this example, the context of the first location step is the root of the document.
Step 2: Ask the VAMANA Root Node for nodes.
//name/parent::person/descendant::watch
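To make the two steps concrete, here is a minimal sketch that evaluates Example 1 with Python's standard `xml.etree.ElementTree` instead of MASS. The XML is the fragment from the Running Examples slide with the missing closing tags filled in. The id `person2` for the second person is an assumption made so the document is well formed, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Fragment from the Running Examples slide, completed by assumption.
XML = """
<people>
  <person id="person1">
    <name>Klemens Pelz</name>
  </person>
  <person id="person2">
    <name>Hayato Cappelletti</name>
    <watches>
      <watch open_auction="open_auction82"/>
    </watches>
  </person>
</people>
"""


def descendants_named(context, tag):
    """Descendants of `context` with the given tag, in document order."""
    return [e for e in context.iter(tag) if e is not context]


def run_example_1(document_root):
    """Evaluate //name/parent::person/descendant::watch step by step."""
    # ElementTree keeps no parent pointers, so build a child -> parent map once.
    parent_of = {child: parent
                 for parent in document_root.iter() for child in parent}
    results = []
    # Step 1: the context of the first location step is the document root.
    # Step 2: pull results through the chain of location steps.
    for name in descendants_named(document_root, "name"):   # //name
        person = parent_of.get(name)                        # parent::person
        if person is None or person.tag != "person":
            continue
        for watch in descendants_named(person, "watch"):    # descendant::watch
            if watch not in results:                        # keep each node once
                results.append(watch)
    return results


watches = run_example_1(ET.fromstring(XML))
print([w.get("open_auction") for w in watches])  # → ['open_auction82']
```

Only the second person contributes a result, because the first person has a name but no watch below it.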
15
Step1: Setting Context for the “First Location Step”
“watch” X: AXIS_DESCENDANT
“person” X: AXIS_PARENT
“name” X: //
//name/parent::person/descendant::watch
16
[Execution trace. Legend of node states: INITIAL, FETCHING, OUT OF NODE.]
“watch” X: AXIS_DESCENDANT
“person” X: AXIS_PARENT
“name” X: //
[Node keys shown on the diagram: b.i.c, b.i.c.c, b.i.c.m.c]
//name/parent::person/descendant::watch
17
[Execution trace, continued.]
“watch” X: AXIS_DESCENDANT
“person” X: AXIS_PARENT
“name” X: //
[Node keys shown on the diagram: b.i.c, b.i.c.c, b.i.c.m.c, b.i.c.m.e]
//name/parent::person/descendant::watch
18
[Execution trace, continued.]
“watch” X: AXIS_DESCENDANT
“person” X: AXIS_PARENT
“name” X: //
[Node keys shown on the diagram: b.i.c, b.i.c.m.e, b.i.c.c, b.i.i, b.i.i.c, b.i.i.m.c]
//name/parent::person/descendant::watch
19
IO Operation
a.a , a.b , a.c
a.a.a , a.b.a, a.b.b , a.c.a , a.c.a, a.c.b/z
//y
** Please see handout
20
Example 2
“name” X: //
“person” X: AXIS_PARENT
“ ” X: AXIS_CHILD
“Klemens Pelz”
BI_PREDICATE EQ
Context Side
Expression Side
//name[text() = "Klemens Pelz"]/parent::person
21
[Execution trace for Example 2.]
“person” X: AXIS_PARENT
BI_PREDICATE EQ
“name” X: //
“ ” X: AXIS_CHILD
“Klemens Pelz”
[Node keys and values shown on the diagram: b.i.e, b.i.e.c, b.i.e.c.b, Klemens Pelz]
//name[text() = "Klemens Pelz"]/parent::person
22
Determining Selectivity
Count: the exact number of nodes in the MASS storage structure that match the node test.
IN: the number of tuples fetched from the child VNode.
OUT: the number of tuples produced by the VNode.
I_Tuples: the total number of tuples processed up to and including the current VNode.
Statistics kept per VNode: NodeType, NodeTest, X, Count, IN, OUT, I_Tuples.
23
Example 1: //name/parent::person/emailaddress
NodeType: MASS, NodeTest: name, X: //, Count: 482, IN: 482, OUT: 482
NodeType: MASS, NodeTest: person, X: AXIS_PARENT, Count: 255, IN: 482, OUT: ?
24
Worst Case – Costing
Categorize the axes into three divisions.
Division 1: child | descendant | descendant-or-self
[Diagram: two VNode statistics boxes (NodeType, NodeTest, X, Count, IN, OUT), labelled X and Y.]
Cases:
1. #X > #Y
2. #Y > #X
Worst-case OUT: #X
25
Contd.
Division 2: parent, ancestor, ancestor-or-self, following, following-sibling, preceding, preceding-sibling
[Diagram: two VNode statistics boxes (NodeType, NodeTest, X, Count, IN, OUT), labelled X and Y.]
Cases:
1. #X > #Y
2. #Y > #X
Worst-case OUT: #Y
26
Contd.
Division 3: self
For example: //*/self::X and Y/self::*
[Diagram: two VNode statistics boxes (NodeType, NodeTest, X, Count, IN, OUT), labelled X and Y.]
Cases:
1. #X > #Y: worst-case OUT is #Y
2. #Y > #X: worst-case OUT is #X
27
NodeType: MASS, NodeTest: name, X: //, Count: 482, IN: 482, OUT: 482, I_Tuple: 482
NodeType: MASS, NodeTest: person, X: AXIS_PARENT, Count: 255, IN: 482, OUT: 482, I_Tuple: 737
NodeType: MASS, NodeTest: watch, X: AXIS_DESCENDANT, Count: 488, IN: 482, OUT: 488, I_Tuple: 1225
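The figures above can be reproduced with a short calculation. The sketch below is one reading of the three divisions, inferred from these numbers rather than stated on the slides. It assumes #X is the Count of the step being costed and #Y is its IN, so that Division 1 outputs Count, Division 2 outputs IN, and Division 3 outputs the smaller of the two. It also assumes that I_Tuples is the running sum of Count and that the leading "//" behaves like a Division 1 axis. The names `Step` and `worst_case` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Axis divisions from the costing slides; "//" is treated as Division 1 (assumption).
DIVISION_1 = {"//", "child", "descendant", "descendant-or-self"}
DIVISION_3 = {"self"}
# Every other axis (parent, ancestor, following, ...) falls into Division 2.


@dataclass
class Step:
    node_test: str
    axis: str
    count: int  # exact number of stored nodes matching the node test


def worst_case(steps: List[Step]) -> List[Tuple[str, int, int, int]]:
    """Return (node_test, IN, OUT, I_Tuples) for each step, first step first."""
    rows = []
    incoming = None
    i_tuples = 0
    for step in steps:
        # The first step has no child, so its IN is taken to be its own Count.
        tuples_in = step.count if incoming is None else incoming
        if step.axis in DIVISION_1:
            tuples_out = step.count                  # bounded by matching nodes
        elif step.axis in DIVISION_3:
            tuples_out = min(step.count, tuples_in)  # self cannot add nodes
        else:
            tuples_out = tuples_in                   # Division 2: bounded by input
        i_tuples += step.count                       # running sum of Count
        rows.append((step.node_test, tuples_in, tuples_out, i_tuples))
        incoming = tuples_out
    return rows


example_1 = [
    Step("name", "//", 482),
    Step("person", "parent", 255),
    Step("watch", "descendant", 488),
]
for row in worst_case(example_1):
    print(row)
# → ('name', 482, 482, 482)
# → ('person', 482, 482, 737)
# → ('watch', 482, 488, 1225)
```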
28
What about Binary Operators?
Cost the expression side w.r.t. the child.
Operator = AND | OR | EQ: ALL go out.
Arithmetic operators: ALL go out, because the result cannot be predicted before execution.
29
Contd.
30
Heuristics
The higher the ratio, the better the selectivity.
Generate a multimap <scaled(IN/OUT), VNode>. The rules that apply to each optimizable node can then be applied to it.
Ratio = IN/OUT
Scaled Ratio = IN/OUT scaled to the range 0..1
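A minimal sketch of building such a multimap follows. The slides do not say how the ratio is scaled to 0..1, so dividing by the largest ratio is an assumption here, as are the function name and the use of plain labels in place of VNode objects. Python has no built-in multimap, so a dictionary of lists stands in for one.

```python
from collections import defaultdict


def scaled_ratio_multimap(stats):
    """Map each scaled IN/OUT ratio to the labels of the nodes that have it.

    `stats` is a list of (label, IN, OUT) tuples. Scaling divides by the
    largest ratio (an assumption; the slides leave the method open).
    """
    ratios = [(label, tuples_in / tuples_out)
              for label, tuples_in, tuples_out in stats if tuples_out > 0]
    if not ratios:
        return {}
    top = max(ratio for _, ratio in ratios)
    multimap = defaultdict(list)
    for label, ratio in ratios:
        multimap[round(ratio / top, 4)].append(label)
    # Highest scaled ratio first, so the most selective nodes come first.
    return dict(sorted(multimap.items(), reverse=True))


# IN and OUT values taken from the worst-case costing of Example 1.
ranking = scaled_ratio_multimap([
    ("name", 482, 482),
    ("person", 482, 482),
    ("watch", 482, 488),
])
print(ranking)  # → {1.0: ['name', 'person'], 0.9877: ['watch']}
```

Two nodes share the key 1.0, which is why a multimap rather than a plain map is needed.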
31
Transformation Rule 1:
“name” X: //
“person” X: AXIS_PARENT
BI_PREDICATE EQ
“ ” X: AXIS_CHILD
“Klemens Pelz”
Binary Predicate with text comparison: Value Index
“name” X: //
“Klemens Pelz” X: AXIS_VALUE
“Klemens Pelz”
“name” X: AXIS_PARENT
32
Transformation Rule 2: MASS Node to Join
“name” X: //
“watch” X: AXIS_DESCENDANT
“person” X: AXIS_PARENT
Root Node
“name” X: //
“person” X: AXIS_PARENT
“watch” X: AXIS_DESCENDANT
JOIN X: AXIS_DESCENDANT
//name/parent::person/descendant::watch
33
“*” Removal
Rule: p/descendant::*/child::n ≡ p/descendant::n
where p is a path expression.
Need for this rule: a step with "*" as its node test can spoil the cost estimation.
34
“Axis::self” Removal
Rule: p/descendant::*/self::m ≡ p/descendant::m
Rule: p/descendant-or-self::*/self::m ≡ p/descendant-or-self::m
Need for this rule: a “self” step in combination with * or a node test is not necessary.
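The two “self” rules can be sketched as a rewrite over a list of location steps. The representation of a step as an `(axis, node_test)` pair and the function name are assumptions made for illustration; VAMANA itself applies such rules to its execution tree.

```python
def remove_redundant_self(steps):
    """Fold a self::m step into a preceding descendant::* or descendant-or-self::* step.

    `steps` is a list of (axis, node_test) pairs (a hypothetical representation).
    """
    rewritten = []
    for axis, node_test in steps:
        previous = rewritten[-1] if rewritten else None
        if (axis == "self" and previous is not None
                and previous[0] in ("descendant", "descendant-or-self")
                and previous[1] == "*"):
            # Keep the earlier axis and adopt the node test of the self step.
            rewritten[-1] = (previous[0], node_test)
        else:
            rewritten.append((axis, node_test))
    return rewritten


path = [("child", "p"), ("descendant", "*"), ("self", "m")]
print(remove_redundant_self(path))
# → [('child', 'p'), ('descendant', 'm')]
```

Steps that do not match the pattern are passed through unchanged, so the function can be run over any path.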
35
Reverse Axes Rules
Rule: p/descendant::n/parent::m ≡ p/descendant-or-self::m[child::n]
Rule: p/descendant::n/m ≡ p/descendant::m[parent::n]
Rule: /descendant::m/preceding::n ≡ /descendant::n[following::m]
From the paper: Symmetry in XPath by Dan Olteanu, Holger Meuss, Tim Furche, Francois Br
36
Predicate Axis Rules
Rule: p/descendant::*[child::n] ≡ p[descendant::n]/descendant::*
Predicate Node to Join.
37
Conclusion
Work in progress in THREE main areas:
Framework for XPath expression execution.
Selectivity determination.
Transformation rules.
39
References
1. James Clark and Steve DeRose. XML Path Language (XPath), http://www.w3.org/TR/xpath, 2002.
2. S. Boag, D. Chamberlin, Mary F. Fernandez, D. Florescu, J. Robie and J. Siméon. XQuery 1.0: An XML Query Language. W3C Working Draft, http://www.w3.org/TR/xquery/, 2002.
3. Kurt W. Deschler and Elke Rundensteiner. MASS: Multi Axis Storage Structure, 2002. Technical report in progress.
4. T. Milo and D. Suciu. Index Structures for Path Expressions. In Proceedings of the 7th International Conference on Database Theory, 1999, pages 277-295.
5. Flavio Rizzolo and Alberto Mendelzon. Indexing XML Data with ToXin. WebDB, pages 49-54, Santa Barbara, USA, 2001.
6. Q. Li and B. Moon. Indexing and Querying XML Data for Regular Path Expressions. Proceedings of the 27th International Conference on Very Large Data Bases (VLDB 2001), Rome, Italy, September 2001, pages 361-370.
7. XMark - The XML Benchmark project. http://monetdb.cwi.nl/xml/.