gStore: Answering SPARQL Queries Via Subgraph Matching
![Page 1: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/1.jpg)
Lei Zou¹, Jinghui Mo¹, Lei Chen², M. Tamer Özsu³, Dongyan Zhao¹
1
gStore: Answering SPARQL Queries Via Subgraph Matching
¹Peking University, ²Hong Kong University of Science and Technology, ³University of Waterloo
![Page 2: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/2.jpg)
Outline
• Background & Related Work
• Overview of gStore
• Encoding Technique
• VS*-tree & Query Algorithm
• Experiments
• Conclusions
2
![Page 4: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/4.jpg)
Semantic Web
4
“Semantic Web Technologies” is a collection of standard technologies to realize a Web of Data.
![Page 5: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/5.jpg)
RDF Data Model
5
Figure labels: URI, URI, Literals
![Page 6: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/6.jpg)
RDF Graph
6
Entity Vertex, Literal Vertex
![Page 7: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/7.jpg)
SPARQL Queries
7
SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> “1809-02-12”. ?m <DiedOnDate> “1865-04-15”. }
Query Graph
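A minimal sketch of how the three triple patterns above can be read as a query graph: subjects and objects become vertices, and each predicate becomes a labeled edge. The tuple representation and the function name `build_query_graph` are assumptions made for illustration, not part of gStore.

```python
# Hypothetical representation: one (subject, predicate, object) tuple per triple pattern.
patterns = [
    ("?m", "hasName", "?name"),
    ("?m", "BornOnDate", "1809-02-12"),
    ("?m", "DiedOnDate", "1865-04-15"),
]


def build_query_graph(patterns):
    """Return (vertices, edges); each edge is (source, label, target)."""
    vertices = set()
    edges = []
    for s, p, o in patterns:
        vertices.update((s, o))   # subjects and objects are vertices
        edges.append((s, p, o))   # the predicate labels the edge
    return vertices, edges


vertices, edges = build_query_graph(patterns)
# Variables start with "?"; the remaining vertices are constants (here, literals).
variables = sorted(v for v in vertices if v.startswith("?"))
```

The query graph here has four vertices (two variables and two literals) and three edges, all leaving `?m`.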
![Page 8: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/8.jpg)
Subgraph Match vs. SPARQL Queries
8
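To illustrate the correspondence named in the slide title, the sketch below answers the running SPARQL query by naive backtracking subgraph matching over a toy RDF graph. The data, the entity ids, and the function `match` are assumptions for illustration; this is a brute-force matcher, not gStore's algorithm.

```python
# Toy RDF graph as (subject, predicate, object) triples (illustrative data).
data = [
    ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
    ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
    ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
    ("y:Charles_Darwin", "hasName", "Charles Darwin"),
    ("y:Charles_Darwin", "BornOnDate", "1809-02-12"),
    ("y:Charles_Darwin", "DiedOnDate", "1882-04-19"),
]

query = [
    ("?m", "hasName", "?name"),
    ("?m", "BornOnDate", "1809-02-12"),
    ("?m", "DiedOnDate", "1865-04-15"),
]


def match(query, data, binding=None):
    """Yield every variable binding that maps all query edges onto data edges."""
    binding = binding or {}
    if not query:
        yield dict(binding)
        return
    (s, p, o), rest = query[0], query[1:]
    for ds, dp, do in data:
        if dp != p:  # predicates are constants in this sketch
            continue
        new = dict(binding)
        ok = True
        for term, value in ((s, ds), (o, do)):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, data, new)


names = [b["?name"] for b in match(query, data)]
```

Both people match the first two query edges, but only one satisfies the `DiedOnDate` edge, so `names` holds a single result.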
![Page 9: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/9.jpg)
Naïve Triple Store
9
SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> “1809-02-12”. ?m <DiedOnDate> “1865-04-15”. }
SQL: Select T3.Object From T as T1, T as T2, T as T3 Where T1.Predicate=“BornOnDate” and T1.Object=“1809-02-12” and T2.Predicate=“DiedOnDate” and T2.Object=“1865-04-15” and T3.Predicate=“hasName” and T1.Subject = T2.Subject and T2.Subject = T3.Subject
Too many Self-Joins
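The self-join cost can be seen in a small runnable sketch using Python's built-in `sqlite3` module. The table name `T` follows the slide; the column names `Subject`, `Predicate`, `Object` and the rows are assumptions for illustration. The query selects the `Object` of the `hasName` row so that it returns the name the SPARQL query asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Subject TEXT, Predicate TEXT, Object TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [
        ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
        ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
        ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
        ("y:Charles_Darwin", "hasName", "Charles Darwin"),
        ("y:Charles_Darwin", "BornOnDate", "1809-02-12"),
        ("y:Charles_Darwin", "DiedOnDate", "1882-04-19"),
    ],
)

# One alias of T per triple pattern: three patterns mean two self-joins.
sql = """
SELECT T3.Object
FROM T AS T1, T AS T2, T AS T3
WHERE T1.Predicate = 'BornOnDate' AND T1.Object = '1809-02-12'
  AND T2.Predicate = 'DiedOnDate' AND T2.Object = '1865-04-15'
  AND T3.Predicate = 'hasName'
  AND T1.Subject = T2.Subject AND T2.Subject = T3.Subject
"""
names = [row[0] for row in conn.execute(sql)]
```

Every additional triple pattern in the SPARQL query adds another alias of `T` and another join condition.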
![Page 10: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/10.jpg)
Existing Solutions
Three categories of solutions are proposed to speed up query processing:
1. Property Table: Jena [K. Wilkinson et al. SWDB 03], …
2. Vertically Partitioned Solution: SW-store [D. J. Abadi et al. VLDB 07], …
3. Exhaustive Indexing: RDF-3x [T. Neumann et al. VLDB 08], Hexastore [C. Weiss et al. VLDB 08], …
10
![Page 11: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/11.jpg)
Existing Solutions-Property Table
11
SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> “1809-02-12”. ?m <DiedOnDate> “1865-04-15”. }
SQL: Select People.hasName from People where People.BornOnDate = “1809-02-12” and People.DiedOnDate = “1865-04-15”.
Reducing # of join steps
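A hedged sketch of the property-table idea with Python's `sqlite3`: each property becomes a column of a `People` table, so the running query needs no join at all. The `Subject` key column and the rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per entity, one column per property (illustrative schema).
conn.execute(
    "CREATE TABLE People ("
    "Subject TEXT PRIMARY KEY, hasName TEXT, BornOnDate TEXT, DiedOnDate TEXT)"
)
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?, ?)",
    [
        ("y:Abraham_Lincoln", "Abraham Lincoln", "1809-02-12", "1865-04-15"),
        ("y:Charles_Darwin", "Charles Darwin", "1809-02-12", "1882-04-19"),
    ],
)

# All three triple patterns are answered from a single row.
sql = "SELECT hasName FROM People WHERE BornOnDate = ? AND DiedOnDate = ?"
names = [row[0] for row in conn.execute(sql, ("1809-02-12", "1865-04-15"))]
```

The trade-off is that the schema must know the properties in advance, and entities lacking a property leave NULL cells.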
![Page 12: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/12.jpg)
Existing Solutions-Vertically Partitioned Solution
12
Fast Merge Join
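A minimal sketch of why merge joins are fast in a vertically partitioned layout: each property has its own two-column (subject, object) table kept sorted by subject, so two tables can be joined in a single linear pass. The table contents and the function `merge_join` are assumptions for illustration.

```python
def merge_join(left, right):
    """Join two lists of (subject, object) pairs, both sorted by subject."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            subject = left[i][0]
            # Find the run of rows sharing this subject on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == subject:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == subject:
                j_end += 1
            for _, lobj in left[i:i_end]:
                for _, robj in right[j:j_end]:
                    out.append((subject, lobj, robj))
            i, j = i_end, j_end
    return out


# One table per property, each sorted by subject (illustrative data).
born_on = [("y:Abraham_Lincoln", "1809-02-12"), ("y:Charles_Darwin", "1809-02-12")]
died_on = [("y:Abraham_Lincoln", "1865-04-15"), ("y:Charles_Darwin", "1882-04-19")]

# Apply the constant filters first; filtering preserves the sort order.
born_sel = [row for row in born_on if row[1] == "1809-02-12"]
died_sel = [row for row in died_on if row[1] == "1865-04-15"]
joined = merge_join(born_sel, died_sel)
```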
![Page 13: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/13.jpg)
Existing Solutions- Exhaustive-Indexing
Each SPARQL query statement can be translated into one “range query”.
SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> “1809-02-12”. ?m <DiedOnDate> “1865-04-15”. }
13
Range query & Merge Join
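The following sketch shows one way a triple pattern can become a range query: the triples are stored in several sorted orderings, and a pattern with bound terms maps to a contiguous range in the ordering that starts with those terms. Only two of the possible orderings are built here; the data, the index names `pos` and `pso`, and the helper `range_query` are assumptions for illustration, not the layout of RDF-3x or Hexastore.

```python
from bisect import bisect_left, bisect_right

triples = [
    ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
    ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
    ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
    ("y:Charles_Darwin", "hasName", "Charles Darwin"),
    ("y:Charles_Darwin", "BornOnDate", "1809-02-12"),
    ("y:Charles_Darwin", "DiedOnDate", "1882-04-19"),
]

# Sorted copies in (predicate, object, subject) and (predicate, subject, object) order.
pos = sorted((p, o, s) for s, p, o in triples)
pso = sorted((p, s, o) for s, p, o in triples)


def range_query(index, prefix):
    """Return the contiguous slice of a sorted index whose entries start with prefix."""
    lo = bisect_left(index, prefix)
    # "\U0010ffff" is the largest code point, used as an upper sentinel.
    hi = bisect_right(index, prefix + ("\U0010ffff",))
    return index[lo:hi]


born = [s for _, _, s in range_query(pos, ("BornOnDate", "1809-02-12"))]
died = [s for _, _, s in range_query(pos, ("DiedOnDate", "1865-04-15"))]
# Both lists come back sorted by subject, so a merge join would apply;
# a set intersection stands in for it in this sketch.
subjects = sorted(set(born) & set(died))
names = [o for s in subjects for _, _, o in range_query(pso, ("hasName", s))]
```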
![Page 14: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/14.jpg)
Some Limitations
1. Difficult to handle “wildcard queries”.
2. Difficult to handle updates.
14
![Page 15: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/15.jpg)
Outline
• Background & Related Work
• Overview of gStore
• Encoding Technique
• VS*-tree & Query Algorithm
• Experiments
• Conclusions
15
![Page 16: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/16.jpg)
Intuition of gStore
16
Finding Matches over a Large Graph is not a trivial task.
![Page 17: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/17.jpg)
Preliminaries
17
Entity Vertex, Literal Vertex
![Page 18: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/18.jpg)
Storage Schema in gStore
18
Encoding all neighbors into a “bit-string”, called signature.
![Page 19: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/19.jpg)
Encoding Technique (1)
19
Edge ( hasName, “Abraham Lincoln”)
Property hasName: 0010 0000 0000
3-grams of the label: “Abr”, “bra”, “rah”, “aha”, …
0000 0010 0000 0000
1000 0000 0000 0000
0000 0000 0100 0000
0000 0000 0000 0001
OR: 1000 0010 0100 0001
Edge signature: 0010 0000 0000 1000 0010 0100 0001

Edge signatures of the vertex:
( hasName, “Abraham Lincoln”): 0010 0000 0000 1000 0010 0100 0001
( BornOnDate, “1809-02-12”): 0100 0000 0000 0100 0010 0100 1000
( DiedOnDate, “1865-04-15”): 0000 1000 0000 0000 0010 0100 0000
( DiedIn, “y:Washington_D.c”): 0000 0010 0000 1000 0010 0100 0001
OR (vertex signature): 0110 1010 0000 1100 0010 0100 1001
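A minimal sketch of the encoding on this slide, under stated assumptions: the property hashes to one bit in a 12-bit field, each 3-gram of the neighbor's label hashes to one bit in a 16-bit field, the two fields are concatenated into an edge signature, and the vertex signature is the OR of its edge signatures. The field widths are read off the bit strings on the slide; the MD5-based hash and all function names are hypothetical, so the resulting bits will not equal the slide's values.

```python
import hashlib


def _bit(text, width):
    """Map text to a single set bit among `width` positions using a stable hash."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return 1 << (int.from_bytes(digest[:4], "big") % width)


def edge_signature(prop, label, prop_bits=12, label_bits=16, n=3):
    """Concatenate the property bits with the OR of the label's n-gram bits."""
    prop_part = _bit(prop, prop_bits)
    # Labels shorter than n are hashed as a whole.
    grams = [label[i:i + n] for i in range(len(label) - n + 1)] or [label]
    label_part = 0
    for gram in grams:
        label_part |= _bit(gram, label_bits)
    return (prop_part << label_bits) | label_part


def vertex_signature(adjacent_edges):
    """OR together the signatures of all (property, neighbor label) pairs."""
    sig = 0
    for prop, label in adjacent_edges:
        sig |= edge_signature(prop, label)
    return sig


lincoln_edges = [
    ("hasName", "Abraham Lincoln"),
    ("BornOnDate", "1809-02-12"),
    ("DiedOnDate", "1865-04-15"),
    ("DiedIn", "y:Washington_D.c"),
]
lincoln_sig = vertex_signature(lincoln_edges)
# Print as 28 bits in groups of four, matching the slide's layout.
bits = format(lincoln_sig, "028b")
print(" ".join(bits[i:i + 4] for i in range(0, 28, 4)))
```

Because the vertex signature is an OR, every edge signature is contained in it: `edge_signature(p, l) & lincoln_sig == edge_signature(p, l)` holds for each adjacent edge.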
![Page 20: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/20.jpg)
Encoding Technique (2)
20
![Page 21: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/21.jpg)
Encoding Technique (3)
21
![Page 22: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/22.jpg)
Outline
• Background & Related Work
• Overview of gStore
• Encoding Technique
• VS*-tree & Query Algorithm
• Experiments
• Conclusions
22
![Page 23: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/23.jpg)
A Straightforward Solution (1)
23
Candidate list L1 for u1: 001, 004, 006
Candidate list L2 for u2: 002, 003, 006
![Page 24: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/24.jpg)
A Straightforward Solution (2)
24
Candidate list L1: 001, 004, 006
Candidate list L2: 002, 003, 006
Large Join Space!
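The straightforward solution can be sketched as a filter-then-join: each query vertex collects as candidates every data vertex whose signature contains the query vertex's signature, and the candidate lists are then joined along the query edge. The 4-bit signatures and the data edges below are hypothetical values chosen so that the candidate lists come out as on the slide.

```python
# Hypothetical 4-bit vertex signatures for data vertices 001..006.
vertex_sig = {
    "001": 0b1010, "002": 0b0101, "003": 0b0111,
    "004": 0b1110, "005": 0b0000, "006": 0b1111,
}
# Hypothetical directed edges of the data graph.
data_edges = {("001", "002"), ("004", "005"), ("006", "003")}

# Hypothetical signatures of the two query vertices, joined by an edge u1 -> u2.
u1_sig, u2_sig = 0b1010, 0b0101


def candidates(query_sig):
    """Keep data vertices whose signature has every bit of the query signature set."""
    return [v for v, sig in sorted(vertex_sig.items()) if sig & query_sig == query_sig]


L1 = candidates(u1_sig)
L2 = candidates(u2_sig)

# Every pair in L1 x L2 must be checked against the data graph.
join_space = len(L1) * len(L2)
matches = [(a, b) for a in L1 for b in L2 if (a, b) in data_edges]
```

Here nine pairs are examined to find two matches; with long candidate lists the L1 × L2 product dominates the cost.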
![Page 25: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/25.jpg)
VS-tree
![Page 26: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/26.jpg)
Pruning Technique
26
[Figure: query vertices u1, u2; tree nodes d^3_1, d^3_4, d^3_4, d^3_2 at level G^3 with bit string 10010; vertices 001, 004, 006 and 002, 003, 006 in G*]
Reduced Join Space!
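A simplified sketch of signature-tree pruning: each inner node stores the OR of its children's signatures, so a whole subtree can be skipped as soon as its signature fails to contain the query signature. This is a plain signature tree under assumed toy values; it leaves out the edges between tree nodes that the figure on this slide shows, so it should not be read as the VS-tree itself.

```python
class Node:
    """Tree node; a leaf carries a vertex id, an inner node carries children."""

    def __init__(self, children=None, vertex=None, sig=0):
        self.children = children or []
        self.vertex = vertex
        self.sig = sig
        for child in self.children:
            self.sig |= child.sig  # inner signature = OR of the children


def search(node, query_sig, visited):
    """Return ids of leaves whose signature contains query_sig, pruning top-down."""
    visited.append(node)
    if (node.sig & query_sig) != query_sig:
        return []  # nothing below this node can match
    if node.vertex is not None:
        return [node.vertex]
    found = []
    for child in node.children:
        found.extend(search(child, query_sig, visited))
    return found


# Hypothetical 4-bit signatures for data vertices 001..006.
leaves = {v: Node(vertex=v, sig=s) for v, s in {
    "001": 0b1010, "004": 0b1110,
    "002": 0b0101, "003": 0b0111,
    "005": 0b0000, "006": 0b1111,
}.items()}
root = Node(children=[
    Node(children=[leaves["001"], leaves["004"]]),
    Node(children=[leaves["002"], leaves["003"]]),
    Node(children=[leaves["005"], leaves["006"]]),
])

visited = []
result = search(root, 0b1010, visited)
```

For the query signature `1010`, the subtree holding 002 and 003 is rejected at its inner node, so those two leaves are never visited.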
![Page 27: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/27.jpg)
An Example for Pruning Effect
27
Query:
?x1 y:hasGivenName ?x5
?x1 y:hasFamilyName ?x6
?x1 rdf:type <wordnet_scientist_110560637>
?x1 y:bornIn ?x2
?x1 y:hasAcademicAdvisor ?x4
?x2 y:locatedIn <Switzerland>
?x3 y:locatedIn <Germany>
?x4 y:bornIn ?x3
| Query vertex | Before Pruning | After Pruning |
| --- | --- | --- |
| x1 | 810 | 810 |
| x2 | 424 | 197 |
| x3 | 66 | 66 |
| x4 | 36187 | 6686 |
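As a rough worked example, the numbers above can be multiplied to compare join spaces before and after pruning. Treating the counts as candidate-list sizes and their product as an upper bound on the join space is an assumption of this sketch; the slide lists no counts for x5 and x6, so they are left out.

```python
import math

# Counts per query vertex, copied from the slide.
before = {"x1": 810, "x2": 424, "x3": 66, "x4": 36187}
after = {"x1": 810, "x2": 197, "x3": 66, "x4": 6686}

# Upper bound on the join space: the product of the list sizes.
space_before = math.prod(before.values())
space_after = math.prod(after.values())
reduction = space_before / space_after

print(f"before: {space_before:,}")
print(f"after:  {space_after:,}")
print(f"reduction factor: {reduction:.1f}x")
```

Only x2 and x4 shrink, yet because the bound is a product, the two reductions multiply.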
![Page 28: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/28.jpg)
Query Algorithm-Top-Down
28
![Page 29: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/29.jpg)
Outline
• Background & Related Work
• Overview of gStore
• Encoding Technique
• VS*-tree & Query Algorithm
• Experiments
• Conclusions
29
![Page 30: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/30.jpg)
Datasets
30
| Dataset | Triple # | Size |
| --- | --- | --- |
| Yago | 20 million | 3.1 GB |
| DBLP | 8 million | 0.8 GB |
![Page 31: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/31.jpg)
Exact Queries
31
![Page 32: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/32.jpg)
Wildcard Queries
32
![Page 33: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/33.jpg)
Outline
• Background & Related Work
• Overview of gStore
• Encoding Technique
• VS*-tree & Query Algorithm
• Experiments
• Conclusions
33
![Page 34: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/34.jpg)
Conclusions
• Vertex Encoding Technique;
• An Efficient index Structure: VS-tree;
• A Novel Filtering Technique.
34
![Page 36: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/36.jpg)
Updates- Insertion in G*
36
![Page 37: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/37.jpg)
Updates- Insertion in VS*-tree
37
![Page 38: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/38.jpg)
Updates- Deletion in VS*-tree
38
To be deleted
![Page 39: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/39.jpg)
Framework in gStore
39
![Page 40: gStore: Answering SPARQL Queries Via Subgraph Matching](https://reader035.fdocuments.us/reader035/viewer/2022062723/56813b9d550346895da4d597/html5/thumbnails/40.jpg)
A Straightforward Solution (1)
40
u: 0000 1000
u & 001 = u