RDF Aggregate Queries and Views
Edward Hung, Yu Deng, V.S. Subrahmanian
University of Maryland, College Park
ICDE 2005, April 7, Tokyo, Japan
Maintenance of RDF Aggregate Views
  Introduction of RDF and RDQL
  RDQL Extension for Aggregate Views
  Aggregate View Maintenance Algorithms AMX
  Implementation and Experiments
  Related Work
Introduction
Resource Description Framework (RDF)
  W3C Recommendation
  Represents metadata about resources identifiable on the web (by Uniform Resource Identifier (URI))
  Triple: (Resource, Property, Value)
    (Artist, rdf:type, rdfs:Class)
    (Painter, rdf:type, rdfs:Class)
    (Painter, rdfs:subClassOf, Artist)
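As a minimal sketch, the three example triples can be held as plain Python tuples. The `Triple` type and the `subclasses_of` helper are illustrative names assumed here; they are not part of RDF or of the talk.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One RDF statement in (Resource, Property, Value) form."""
    resource: str
    prop: str
    value: str

# The three example triples from the slide.
schema = [
    Triple("Artist", "rdf:type", "rdfs:Class"),
    Triple("Painter", "rdf:type", "rdfs:Class"),
    Triple("Painter", "rdfs:subClassOf", "Artist"),
]

def subclasses_of(triples, cls):
    """Return the resources declared as direct subclasses of cls."""
    return [t.resource for t in triples
            if t.prop == "rdfs:subClassOf" and t.value == cls]
```

Calling `subclasses_of(schema, "Artist")` scans the triples for `rdfs:subClassOf` statements whose value is `Artist`.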
RDF Schema

<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [<!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xml:base="http://www.auctionschema.com/schema1#">
  <rdfs:Class rdf:ID="Artist"/>
  <rdfs:Class rdf:ID="Painter">
    <rdfs:subClassOf rdf:resource="#Artist"/>
  </rdfs:Class>
  <rdfs:Datatype rdf:about="&xsd;string"/>
  <rdf:Property rdf:ID="fname">
    <rdfs:domain rdf:resource="#Artist"/>
    <rdfs:range rdf:resource="&xsd;string"/>
  </rdf:Property>
</rdf:RDF>

RDF Instance

<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [<!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ns1="http://www.auctionschema.com/schema1#">
  <rdf:Description rdf:about="http://www.artist.net#guyrose">
    <rdf:type rdf:resource="ns1:Painter"/>
    <ns1:fname rdf:datatype="&xsd;string">Guy</ns1:fname>
  </rdf:Description>
</rdf:RDF>
[Figure: RDF graph of the schema and instance. Node labels: Artist, String, Painter, &r1, Guy. Edge labels: fname, subClassOf. &r1 = http://www.artist.net#guyrose]
RDQL: RDF Query Language
SELECT ?highprice
WHERE (?artist, <ns1:lname>, "Rose"),
      (?artist, <ns1:fname>, "Guy"),
      (?artist, <ns1:creates>, ?artifact),
      (?artifact, <ns1:estimated>, ?price),
      (?price, <ns1:high>, ?highprice),
      (?artifact, <ns1:presented>, ?date)
AND 2004-04-01 <= ?date <= 2004-04-30
USING ns1 FOR <http://www.auctionschema.com/schema1#>
The triple patterns in the WHERE clause form the graph pattern.
RDQL Extension for Aggregates and Views
CREATEVIEW AS
SELECT max(?highprice)
WHERE (?artist, <ns1:lname>, "Rose"),
      (?artist, <ns1:fname>, "Guy"),
      (?artist, <ns1:creates>, ?artifact),
      (?artifact, <ns1:estimated>, ?price),
      (?price, <ns1:high>, ?highprice),
      (?artifact, <ns1:presented>, ?date)
AND 2004-04-01 <= ?date <= 2004-04-30
USING ns1 FOR <http://www.auctionschema.com/schema1#>
Aggregate Query
  Aggregate operators, e.g. min, max, sum, count, average
  GROUP BY clause
  Output a table of tuples
Output can be (i) an RDF instance or (ii) a table
  Advantage of (i): allows us to further query the result
  However, (ii) allows any form of table, including an RDF instance if the table consists of a set of RDF tuples.
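The following is a small sketch of what "output a table of tuples" can look like for a GROUP BY aggregate. The rows stand in for variable bindings produced by matching a graph pattern; the data values and the `group_aggregate` helper are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

def group_aggregate(rows, group_key, value_key, agg):
    """Group rows by group_key and apply agg to each group's bag of values."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    # Output: a table of (group, aggregate) tuples, sorted for stable display.
    return sorted((g, agg(vals)) for g, vals in groups.items())

# Hypothetical variable bindings produced by matching a graph pattern.
rows = [
    {"?artist": "guyrose", "?highprice": 800000},
    {"?artist": "guyrose", "?highprice": 500000},
    {"?artist": "other", "?highprice": 70000},
]
table = group_aggregate(rows, "?artist", "?highprice", max)
```

Each tuple in `table` pairs a group value with its aggregate, which is the tabular output form (ii) described above.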
We are extending the syntax of RDQL to allow constants in SELECT clauses, which is equivalent to creating new resources from the constants.
For example, the previous query can be modified as follows:
CREATEVIEW AS
SELECT <ns1:works_by_guyrose>, <ns1:maxprice>, max(?highprice)
WHERE (?artist, <ns1:lname>, "Rose"),
      (?artist, <ns1:fname>, "Guy"),
      (?artist, <ns1:creates>, ?artifact),
      (?artifact, <ns1:estimated>, ?price),
      (?price, <ns1:high>, ?highprice),
      (?artifact, <ns1:presented>, ?date)
AND 2004-04-01 <= ?date <= 2004-04-30
USING ns1 FOR <http://www.auctionschema.com/schema1#>
The result is a valid RDF statement:
(<ns1:works_by_guyrose>, <ns1:maxprice>, "800000"^^ns1:USD)
Aggregate View Maintenance
Relational Approach
  Store all triples in a relational table with schema (Resource, Property, Value)
  OR
  Store resources and values of the same property in a separate relational table with schema (Resource, Value)
  #self-joins = (#triples in where-clause) - 1
  Large number of delta rules during relational view maintenance: expensive
Aggregate View Maintenance
Our Approach
  Localized search in RDF graphs
  Modified version of breadth-first search starting at the inserted/deleted edge
  Auxiliary data are needed for certain aggregate views: min, max, avg
Distributive Aggregate Function
An aggregate function f is distributive w.r.t. a source update operation if and only if the updated value can be computed from its old value and the update, without reference to the source.
Examples: count, sum, average w.r.t. insertion, deletion and update
  For average, we need an additional attribute size, which stores the size of the intermediate result S, in order to compute the correct updated value (or we can use sum and count to calculate it)
max and min are distributive w.r.t. insertion, but not deletion and update
  Auxiliary data computed from S help to avoid the need to refer to the source.
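A short illustration of why max is distributive for insertion but not for deletion. The two source bags below are made-up values, and `insert_max` is a name assumed for this sketch only.

```python
def insert_max(old_max, new_value):
    """max is distributive w.r.t. insertion: old value and update suffice."""
    return max(old_max, new_value)

# Two hypothetical sources with the same max but different contents.
source_a = [800000, 500000]
source_b = [800000, 60000]
assert max(source_a) == max(source_b) == 800000

# After deleting 800000 the correct answers differ, so no function of
# (old max, deleted value) alone can be right for both sources.
after_a = max(v for v in source_a if v != 800000)
after_b = max(v for v in source_b if v != 800000)
```

Both sources present the same old value and the same deleted value, yet the new maxima differ, which is why some auxiliary data derived from S has to be kept.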
[Figure sequence: the graph pattern matched against the RDF graph; BAG: 800000, then 800000, 500000 under SELECT max(?highprice)]
Compute Aggregates Algorithm CAA
Algorithm CAA(I, Q)
/* Input: RDF graph I, query Q */
/* Output: table T(Q, I) */
1) GP ← BuildGP(Q); X ← aggregate variables of Q;
2) Y ← GROUP BY variables of Q;
3) S ← [VRetrieve(θ, GP, X ∪ Y) | θ ∈ MSearchAll(GP, Q, I)];
4) Return T(Q, I) ← TCompute(S, Q);
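The steps of CAA can be sketched in Python under strong simplifying assumptions: a shortened graph pattern, no constraints, no GROUP BY, and a naive scan in place of MSearchAll. The names `match_triple`, `search_all` and `caa` are hypothetical; the slides do not show how BuildGP, VRetrieve or TCompute are implemented.

```python
def match_triple(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    new = dict(binding)
    for p, v in zip(pattern, triple):
        if p.startswith("?"):
            if new.setdefault(p, v) != v:
                return None  # variable already bound to a different value
        elif p != v:
            return None      # constant mismatch
    return new

def search_all(gp, graph):
    """All bindings matching every triple pattern in gp (naive scan)."""
    bindings = [{}]
    for pattern in gp:
        bindings = [b2 for b in bindings for t in graph
                    if (b2 := match_triple(pattern, t, b)) is not None]
    return bindings

def caa(graph, gp, agg_var, agg):
    """Collect the bag S of agg_var values, then aggregate it."""
    s = [b[agg_var] for b in search_all(gp, graph)]
    return agg(s)

# Hypothetical data: a simplified version of the auction example.
graph = [
    ("guyrose", "fname", "Guy"), ("guyrose", "creates", "a1"),
    ("guyrose", "creates", "a2"), ("a1", "high", 800000),
    ("a2", "high", 500000), ("other", "fname", "Ann"),
    ("other", "creates", "a3"), ("a3", "high", 900000),
]
gp = [("?artist", "fname", "Guy"), ("?artist", "creates", "?artifact"),
      ("?artifact", "high", "?highprice")]
result = caa(graph, gp, "?highprice", max)
```

Here the bag S holds the two prices reachable from the artist named Guy, and the aggregate is taken over that bag rather than over the whole graph.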
Aggregate View Maintenance Algorithms AMX
  AMI – Insertion
  AMD – Deletion
  AMT – Triple Modification
  AMR – Resource Modification
Update: Insertion
[Figure sequence: RDF graph with an edge labeled paints; BAG: 800000, 500000, then 800000, 500000, 60000 under SELECT max(?highprice)]
AMI for Insertion
Algorithm AMI(I, Q, A(Q, I), T(Q, I), t)
/* Input: RDF graph I, query Q, auxiliary data A(Q, I), query result T(Q, I), inserted triple t */
/* Output: table T(Q, I ∪ t), auxiliary data A(Q, I ∪ t) */
1) GP ← BuildGP(Q);
2) X ← aggregate variables of Q;
3) Y ← GROUP BY variables of Q;
4) If TMatch(GP, t) == TRUE, then
   a) ΔS ← [VRetrieve(θ, GP, X ∪ Y) | θ ∈ MSearch(GP, Q, t, I ∪ t)];
   b) return (T(Q, I ∪ t), A(Q, I ∪ t)) ← TMaintainI(T(Q, I), ΔS, A(Q, I), Q);
5) else, return (T(Q, I ∪ t), A(Q, I ∪ t)) ← (T(Q, I), A(Q, I));
Algorithm MSearch(GP, Q, t, I)
/* Input: graph pattern GP, query Q, triple t, RDF graph I */
/* Output: Θ = {θ | θ is a pattern matching} */
1) Θ ← ∅;
2) for each t’ ∈ GP s.t. ∃θ’, tθ’ = t’θ’,
   a) for each θ ∈ bSearch(t, t’, GP, I),
      i. if θ satisfies the constraints in Q, then Θ ← Θ ∪ θ;
3) return Θ;
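A rough sketch of the idea behind MSearch: only matchings that use the changed triple t are searched for, by first unifying t with one pattern and then completing the binding. This is an assumption-laden simplification: the names `unify` and `msearch` are hypothetical, the query constraints are omitted, and a naive scan over the graph stands in for bSearch, which in the talk is a modified breadth-first search from the changed edge.

```python
def unify(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    new = dict(binding)
    for p, v in zip(pattern, triple):
        if p.startswith("?"):
            if new.setdefault(p, v) != v:
                return None
        elif p != v:
            return None
    return new

def msearch(gp, t, graph):
    """Bindings of gp over graph in which triple t plays some pattern's role."""
    found = []
    for i, seed_pattern in enumerate(gp):
        seed = unify(seed_pattern, t, {})
        if seed is None:
            continue  # t cannot match this pattern
        partial = [seed]
        for j, pattern in enumerate(gp):
            if j == i:
                continue  # this pattern is already matched by t
            partial = [b2 for b in partial for u in graph
                       if (b2 := unify(pattern, u, b)) is not None]
        found.extend(b for b in partial if b not in found)
    return found

# Hypothetical graph after inserting t (simplified auction example).
t = ("guyrose", "creates", "a3")
graph = [
    ("guyrose", "fname", "Guy"), ("guyrose", "creates", "a1"),
    ("a1", "high", 800000), t, ("a3", "high", 60000),
]
gp = [("?artist", "fname", "Guy"), ("?artist", "creates", "?artifact"),
      ("?artifact", "high", "?highprice")]
delta = [b["?highprice"] for b in msearch(gp, t, graph)]
```

The existing matching through `a1` is not revisited; only the matching that passes through the inserted triple contributes to ΔS.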
Handling GROUP BY
Because of the GROUP BY clause, each tuple in ΔS affects one particular group.
TMaintainI maintains only the affected groups (and their corresponding auxiliary data), using the affecting tuples.
Delete empty groups and insert new groups.
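As an illustration of group-wise maintenance, the sketch below keeps a count per group and touches only the groups named in ΔS. The choice of count as the aggregate, the `maintain_count` name and the data are assumptions made for this example.

```python
def maintain_count(view, delta_groups, sign):
    """Update a {group: count} view in place.

    delta_groups lists the group key of each tuple in ΔS;
    sign is +1 for insertion and -1 for deletion.
    """
    for group in delta_groups:
        view[group] = view.get(group, 0) + sign  # creates new groups on insert
        if view[group] == 0:
            del view[group]                      # drop groups that became empty
    return view

view = {"guyrose": 2}
maintain_count(view, ["other"], +1)   # a new group appears
after_insert = dict(view)
maintain_count(view, ["other"], -1)   # the group empties and is removed
after_delete = dict(view)
```

The `guyrose` group is never read or rewritten, since no tuple in either ΔS belongs to it.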
TMaintainI
Handling sum, count, min, max
  No auxiliary data required
  Suppose f(x) is an aggregate function on attribute x, F the original result, F' the new result
    $F' = F + \sum_{v \in \pi_x(\Delta S)} v$ if f = sum
    $F' = F + |\Delta S|$ if f = count
    $F' = \min([F] \cup \pi_x(\Delta S))$ if f = min
    $F' = \max([F] \cup \pi_x(\Delta S))$ if f = max
  $\pi_x(\Delta S)$ projects a bag of values of x from $\Delta S$
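The four insertion rules translate directly into a few lines of Python. The function name `tmaintain_insert` and its argument layout are assumptions for this sketch; `delta_values` plays the role of the projected bag of x values from ΔS.

```python
def tmaintain_insert(f, old, delta_values):
    """Return the new aggregate F' from the old result F and the delta bag."""
    if f == "sum":
        return old + sum(delta_values)
    if f == "count":
        return old + len(delta_values)
    if f == "min":
        return min([old, *delta_values])
    if f == "max":
        return max([old, *delta_values])
    raise ValueError(f"unsupported aggregate: {f}")

# Inserting the value 60000 into a bag that held 800000 and 500000.
new_max = tmaintain_insert("max", 800000, [60000])
new_min = tmaintain_insert("min", 500000, [60000])
new_sum = tmaintain_insert("sum", 1300000, [60000])
new_count = tmaintain_insert("count", 2, [60000])
```

None of the branches reads the source bag, which is what makes these four functions maintainable without auxiliary data under insertion.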
TMaintainI
Handling average
  We need the size of S
    $size' = size + |\Delta S|$
    $F' = \dfrac{F \cdot size + \sum_{v \in \pi_x(\Delta S)} v}{size'}$
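A worked sketch of the average rule, assuming the old average and size are stored with the view. The name `maintain_avg_insert` and the numbers are illustrative only.

```python
def maintain_avg_insert(avg, size, delta_values):
    """Return (F', size') after inserting delta_values into a bag of given size."""
    new_size = size + len(delta_values)
    new_avg = (avg * size + sum(delta_values)) / new_size
    return new_avg, new_size

# Old bag: 800000 and 500000, so average 650000 with size 2; insert 60000.
new_avg, new_size = maintain_avg_insert(650000, 2, [60000])
```

Multiplying the old average by the old size recovers the old sum, so the update needs only the stored pair (average, size) and the inserted values.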
Update: Deletion
[Figure sequence: RDF graph with an edge labeled paints; BAG: 800000, 500000, 60000, then 500000, 60000 under SELECT max(?highprice)]
AMD for Deletion
Algorithm AMD(I, Q, A(Q, I), T(Q, I), t)
/* Input: RDF graph I, query Q, auxiliary data A(Q, I), query result T(Q, I), deleted triple t */
/* Output: table T(Q, I - t), auxiliary data A(Q, I - t) */
1) GP ← BuildGP(Q);
2) X ← aggregate variables of Q;
3) Y ← GROUP BY variables of Q;
4) If TMatch(GP, t) == TRUE, then
   a) ΔS ← [VRetrieve(θ, GP, X ∪ Y) | θ ∈ MSearch(GP, Q, t, I)];
   b) return (T(Q, I - t), A(Q, I - t)) ← TMaintainD(T(Q, I), ΔS, A(Q, I), Q);
5) else, return (T(Q, I - t), A(Q, I - t)) ← (T(Q, I), A(Q, I));
TMaintainD
Handling min, max
  Min and max are not distributive w.r.t. deletion
  We need to store πx(S), which projects a bag of values of x from S
  The new aggregate value F’ is obtained by:
    F’ = min(πx(S - ΔS)) if f = min
    F’ = max(πx(S - ΔS)) if f = max
  We need to update πx(S) to become πx(S) - πx(ΔS)
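One way to hold the stored bag is Python's `collections.Counter`, whose `-` operator performs bag difference and drops exhausted entries. This is a sketch under that assumption; the talk does not say how the bag is stored, and `tmaintain_delete` is a hypothetical name.

```python
from collections import Counter

def tmaintain_delete(f, bag, delta_values):
    """Return (F', updated bag) after removing delta_values from the stored bag."""
    if f not in ("min", "max"):
        raise ValueError(f"unsupported aggregate: {f}")
    bag = bag - Counter(delta_values)      # bag difference of stored and deleted values
    values = list(bag.elements())
    if not values:
        return None, bag                   # the group became empty
    return (min(values) if f == "min" else max(values)), bag

# Stored bag from the deletion example; the value 800000 is deleted.
stored = Counter([800000, 500000, 60000])
new_max, remaining = tmaintain_delete("max", stored, [800000])
```

Because the bag keeps multiplicities, deleting one of two equal maxima would leave the other in place, which a plain set could not represent.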
Implementation and Experiment
  Implemented in Java
  Jena – RDQL Engine of HP
  Comparison with Relational Approach (standard view maintenance algorithm on relational tables)
    Counting Algorithm in Gupta et al. "Maintaining Views Incrementally", SIGMOD 1993
  Dataset: Chef Moz Project RDF dump
  Data stored in memory
Other Related Work
Volz, Oberle, Studer [DBFUSION’02]
  The first to introduce a view mechanism for RDF data
  Their views require that
    1. the results contain class instances (i.e., a subject or object variable), or
    2. the result itself has the pattern of an RDF statement (i.e., a triple containing subject, predicate and object).
Magkanaraki et al. [ISWC’03]
  Proposed RVL, a view definition language that can also create virtual RDF schemas and restructure class and property hierarchies such that new resources, property values, classes and property types can be created.
None of these works specifically addresses (i) aggregates in RDF or (ii) the problem of maintaining aggregate RDF views.
Summary
Aggregate Views are important for RDF applications
RDQL Extension for Views and Aggregates
Aggregate View Maintenance Algorithms AMX
  Localized search in RDF graphs
Thank you very much!
Questions and Answers