Sum-Max Monotonic Ranked Joins for Evaluating Top-K Twig Queries on Weighted Data Graphs
Yan Qi, Arizona State University
K. Selcuk Candan, Arizona State University
Maria Luisa Sapino, University of Torino
VLDB ’07, September 23-28, 2007, Vienna, Austria
2008. 01. 25
Summarized by Dongjoo Lee, IDS Lab., Seoul National University
Presented by Dongjoo Lee, IDS Lab., Seoul National University
Copyright 2008 by CEBT
Contents
Motivation
Top-k Twig Queries over Weighted Data Graphs
Answering Twig Queries on Weighted Graphs
Sum-Max Monotonicity
Progressive Result Enumeration
HR-Join, MHR-Join
Experiments
Conclusion
Motivation
Query Processing on Metadata with Conflicts (FICSR, SIGMOD ’07)
  Feedback-based InConSistency Resolution and Query Processing on Misaligned Data Sources
FICSR Integrated Representation
Internal FICSR Representation
Simplified Visualization for the User
FICSR Agreements
Based on source analysis and user feedback
Top-k Twig Queries over Weighted Data Graphs
XPath, XQuery
More desirable! Can we find it before the other?
Answering Twig Queries on Weighted Graphs
NP-complete problem
  By reduction from the Group Steiner Tree Problem.
  VLSI design
How to Solve the Problem?
Use ranked-join algorithms for top-k queries
  A query plan with better sub-plans is always more desirable
  Must be monotonic
Twig query?
  Sum-max monotonicity holds!
  We can enumerate the results incrementally!
Sum-Max Monotonicity (1)
[Figure: the same weighted data graph (nodes A, B, C, D, E; edge weights 2, 3, 5, 7, 10, 12) drawn three times, once per query]
  A//C: Cost = 12
  A//D: Cost = 10
  A[//C]//D: Cost = 17 < 22
PROPOSITION 1
  q = T_q(V_q, E_q)
  r = <µ_node, µ_edge>
  SR = {sr_1, sr_2, …, sr_m}
  $\max_{sr_i \in SR} \mathrm{cost}(sr_i) \;\le\; \mathrm{cost}(r) \;\le\; \sum_{sr_i \in SR} \mathrm{cost}(sr_i)$
(max(10, 12) = 12) < (cost = 17) < (12 + 10 = 22)
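The arithmetic of the example can be checked with a few lines of Python. This is only an illustrative sketch: the helper name sum_max_bounds is hypothetical, and the three costs are the ones quoted on the slide (12 for A//C, 10 for A//D, 17 for A[//C]//D).

```python
def sum_max_bounds(sub_costs):
    """Return the (lower, upper) bounds that sum-max monotonicity places
    on the cost of a result assembled from sub-results with these costs."""
    return max(sub_costs), sum(sub_costs)


# Costs of the A//C and A//D sub-results from the slide.
lower, upper = sum_max_bounds([12, 10])

# Cost of the combined A[//C]//D result from the slide.
result_cost = 17

# The combined cost must lie between the max and the sum of the parts.
assert lower <= result_cost <= upper
```

The combined result is cheaper than the sum of its parts, presumably because the two sub-results share part of the graph, while it can never be cheaper than its most expensive part.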
Sum-Max Monotonicity (2)
PROPERTY 1
  answers r_1, r_2
  sets of sub-results R_1, R_2
  $\sum_{sr_i \in R_1} \mathrm{cost}(sr_i) \;\le\; \max_{sr_j \in R_2} \mathrm{cost}(sr_j) \;\Rightarrow\; \mathrm{cost}(r_1) \le \mathrm{cost}(r_2)$
Use for pruning
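A minimal sketch of how such a test might be used for pruning, assuming each candidate answer is known only through the costs of its sub-results. The function name guaranteed_not_worse and the sample costs are hypothetical and do not come from the slides.

```python
def guaranteed_not_worse(sub_costs_r1, sub_costs_r2):
    """Sufficient test from Property 1: if the sum of the sub-result costs
    of r1 is at most the largest sub-result cost of r2, then
    cost(r1) <= cost(r2) without computing either full cost."""
    return sum(sub_costs_r1) <= max(sub_costs_r2)


# Hypothetical sub-result costs for two candidate answers.
r1_parts = [3, 4]
r2_parts = [2, 8]

# 3 + 4 = 7 <= 8, so r2 cannot beat r1 and may be set aside
# when only the best answer is wanted.
r2_can_be_pruned = guaranteed_not_worse(r1_parts, r2_parts)

# The test is only sufficient: a False outcome says nothing about the order.
undecided = not guaranteed_not_worse([12, 10], r2_parts)
```

The test is one-directional: when it fails, the two answers still have to be compared by their actual costs.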
Progressive Result Enumeration
[Figure: join for A[//C]//B over the two cost-ranked streams A//B and A//C, drawn along a cost axis in successive steps]
Sub-result entries shown, written as vertex(cost): v2(2), v3(4), v8(5), v1(7), v9(7), v19(8), v8(9), v1(10), v4(10), v5(10), v15(10), v16(12), v11(13)
Horizon = ∞
Horizon = 14: first result, cost 14 (<= 5 + 9)
Horizon = 11: better result, cost 11 (<= 10 + 7)
  Horizon Tightening
Horizon = 14
  Horizon Relaxation
The best can be returned as the top-1 result
$\text{data overhead} = \dfrac{\#\text{all\_submatches} - \#\text{necessary\_submatches}}{\#\text{all\_submatches}}$
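The horizon idea can be sketched as a small Python generator. This is not the authors' HR-Join operator, only an illustration under stated assumptions: sub-results are joined on a shared vertex id, each stream is a list already sorted by cost, and the caller supplies a combine function whose value lies between the max and the sum of the two input costs. The stream contents and the shared-cost table are hypothetical, chosen so that the two results cost 14 and 11 as on the slide.

```python
import heapq
from itertools import count
from math import inf


def horizon_join(stream_a, stream_b, combine):
    """Yield (key, cost) join results in non-decreasing cost order.

    stream_a, stream_b: lists of (key, cost) sorted by cost.
    combine(key, ca, cb): result cost, assumed sum-max monotonic.
    """
    seen_a, seen_b = {}, {}          # key -> costs read so far
    ia = ib = 0                      # read positions in the two streams
    candidates = []                  # heap of (cost, tiebreak, key)
    tie = count()
    while True:
        next_a = stream_a[ia][1] if ia < len(stream_a) else inf
        next_b = stream_b[ib][1] if ib < len(stream_b) else inf
        # Any result not yet formed uses an unread sub-result, so by the
        # max bound it costs at least the cheaper unread head.
        frontier = min(next_a, next_b)
        horizon = candidates[0][0] if candidates else inf
        if horizon <= frontier:
            if not candidates:
                return               # both streams exhausted, nothing left
            cost, _, key = heapq.heappop(candidates)
            yield key, cost          # safe to emit; horizon then relaxes
            continue
        # Otherwise read the cheaper head and join it with what was seen.
        if next_a <= next_b:
            key, ca = stream_a[ia]
            ia += 1
            seen_a.setdefault(key, []).append(ca)
            for cb in seen_b.get(key, []):
                heapq.heappush(candidates, (combine(key, ca, cb), next(tie), key))
        else:
            key, cb = stream_b[ib]
            ib += 1
            seen_b.setdefault(key, []).append(cb)
            for ca in seen_a.get(key, []):
                heapq.heappush(candidates, (combine(key, ca, cb), next(tie), key))


# Hypothetical streams and overlap costs (not taken from the paper).
a_b = [("v3", 4), ("v8", 5), ("v9", 7), ("v1", 10), ("v11", 13)]
a_c = [("v2", 2), ("v1", 7), ("v19", 8), ("v8", 9), ("v5", 10)]
shared = {"v1": 6}                   # cost counted once when paths overlap
results = list(horizon_join(a_b, a_c, lambda k, ca, cb: ca + cb - shared.get(k, 0)))
```

In this run the v8 candidate (cost 14) is found first and sets the horizon to 14; the v1 candidate (cost 11) tightens it to 11; once both streams have been read past cost 11, the v1 result is emitted and the horizon relaxes back to 14.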
HR-Join: Horizon-based Ranked Join
Result Sieve
  Creates a cost-ranked output stream
  Uses a heap sort algorithm
  Controls the horizon valves
Horizon Valve
  Controls the data availability on a given stream of cost-ranked data
  Uses a horizon variable that is externally controlled
Query Plan using HR-Join Operators
A query twig and sub-queries
M-way HR-Joins (HRM-Joins)
A query twig and sub-queries
Sub Jobs
Sub-result enumeration
  K-shortest simple paths problem
  O(k|V| (|E| + |V|log|V|))
Dealing with “*” wildcards in twigs
  Can be expensive
  Query rewriting
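To make the sub-result enumeration step concrete, here is a minimal Python sketch that lists the k cheapest simple paths between two nodes in cost order. It is a plain best-first search over partial paths, assuming non-negative edge weights. It is not the algorithm behind the bound quoted above (that bound is the one usually cited for Yen's algorithm) and it can be much slower in the worst case. The graph, its weights, and the function name are hypothetical.

```python
import heapq
from itertools import count


def k_cheapest_simple_paths(graph, source, target, k):
    """Return up to k (cost, path) pairs in non-decreasing cost order.

    graph: dict mapping node -> dict of neighbor -> non-negative weight.
    """
    tie = count()                            # tiebreak so paths are never compared
    frontier = [(0, next(tie), [source])]    # heap of partial simple paths
    found = []
    while frontier and len(found) < k:
        cost, _, path = heapq.heappop(frontier)
        node = path[-1]
        if node == target:
            # With non-negative weights, complete paths leave the heap
            # in non-decreasing cost order.
            found.append((cost, path))
            continue
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in path:              # keep the path simple (no repeats)
                heapq.heappush(frontier, (cost + weight, next(tie), path + [nxt]))
    return found


# Hypothetical weighted graph, only for illustration.
graph = {
    "A": {"E": 5, "B": 7},
    "E": {"C": 3},
    "B": {"C": 2, "D": 3},
    "C": {"D": 10},
}
paths_to_c = k_cheapest_simple_paths(graph, "A", "C", 2)
```

Because paths come out in cost order, a routine of this shape can feed a cost-ranked stream one sub-result at a time instead of materializing all matches up front.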
Experiments
Data
  FICSR weighted graph data
Query plans
  HR-Join, M-way HR-Join
  2 significantly different join selectivity distributions
    – ~10% and 1 to 1
Results
Conclusion
Twig query processing over weighted data graphs.
Optimization using HR-Join based on sum-max monotonicity
  HR-Join, MHR-Join