Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.
![Page 1: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/1.jpg)
Toward Scalable Reasoning over Annotated RDF Data Using MapReduce
Chang Liu (1), Guilin Qi (2)
(1) Shanghai Jiao Tong University, (2) Southeast University, China
![Page 2: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/2.jpg)
Motivation
- Growing interest in representing additional information on top of RDF
  - Time, uncertainty, trust, and provenance => Annotated RDF
- Large amounts of data
  - YAGO2
- Problem: Large Scale Reasoning
![Page 3: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/3.jpg)
Motivation (cont’d)
- Recent work on scalable reasoning using MapReduce
  - WebPIE (ISWC ‘09, ESWC ‘10)
  - Fuzzy pD* (ISWC ‘11)
- Our idea
  - Large scale annotated RDF reasoner using MapReduce
![Page 4: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/4.jpg)
Background: Annotated RDF
- Syntax:
- Deductive rules: Subproperty, Subclass, Domain, Range, Generalization
- Example: Subproperty (a)
Zimmermann et al.: A general framework for representing, reasoning and querying with annotated Semantic Web data. Journal of Web Semantics 11, 72-95 (2012)
![Page 5: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/5.jpg)
Background: MapReduce
![Page 6: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/6.jpg)
Naïve Implementation
Subproperty (a)
[Figure: three mappers feeding three reducers. Input annotated triples (X, P, Y) and (P, sp, Q); output annotated triple (X, Q, Y).]

| Key | Value |
| --- | --- |
| P | 1, X, Y |
| P | 2, Q |
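The key/value layout above can be simulated in memory. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, the shuffle phase is replaced by a dictionary, and the annotation domain is assumed to be fuzzy degrees in [0, 1] combined with `min`.

```python
from collections import defaultdict

def combine(a, b):
    # Assumption: fuzzy annotations in [0, 1], combined with min.
    return min(a, b)

def map_triple(triple, annotation):
    """Emit (key, value) pairs so both premises of the rule share the key P."""
    s, p, o = triple
    if p == "sp":
        # Schema triple (P, sp, Q): key on P, tag 2, carry Q.
        yield s, (2, o, annotation)
    else:
        # Data triple (X, P, Y): key on P, tag 1, carry X and Y.
        yield p, (1, s, o, annotation)

def reduce_key(key, values):
    """Join every data value with every schema value grouped under one key."""
    data = [v for v in values if v[0] == 1]
    schema = [v for v in values if v[0] == 2]
    for _, x, y, a1 in data:
        for _, q, a2 in schema:
            yield (x, q, y), combine(a1, a2)

def run(annotated_triples):
    """Simulate map, shuffle (group by key), and reduce in one process."""
    groups = defaultdict(list)
    for triple, annotation in annotated_triples:
        for key, value in map_triple(triple, annotation):
            groups[key].append(value)
    derived = []
    for key, values in groups.items():
        derived.extend(reduce_key(key, values))
    return derived

result = run([
    (("alice", "advises", "bob"), 0.8),
    (("advises", "sp", "knows"), 0.9),
])
print(result)
```

The tags 1 and 2 let the reducer tell data values from schema values without inspecting their shape.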
![Page 7: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/7.jpg)
Challenges and solutions
- Generalization Rule
  - Deletes triples from the data set
  - Large data reconstruction cost
- Solution
  - Only perform at the beginning and at the end
  - Combine Generalization Rule with other rules
    - E.g. when a reducer generates the same triple with two different annotations, it generates one merged annotated triple instead.
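Folding the Generalization Rule into a reducer could look roughly like this. It is a sketch under assumptions: the helper name `generalize` is hypothetical, and annotations are assumed to be fuzzy degrees merged with `max`; other annotation domains would substitute their own join operator.

```python
def generalize(derived):
    """Merge annotated triples that share the same (s, p, o).

    Assumption: annotations are fuzzy degrees and the merge operator is max.
    """
    merged = {}
    for triple, annotation in derived:
        if triple in merged:
            merged[triple] = max(merged[triple], annotation)
        else:
            merged[triple] = annotation
    # Sort only to make the output order deterministic.
    return sorted(merged.items())

reducer_output = [
    (("a", "p", "b"), 0.3),
    (("a", "p", "b"), 0.7),
    (("c", "p", "d"), 0.5),
]
print(generalize(reducer_output))
```

Applying the merge before the reducer writes its output avoids emitting both copies and deleting one of them in a later pass.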
![Page 8: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/8.jpg)
Challenges and solutions (cont’d)
- Unnecessary Derivation
  - E.g.
  - Wastes a lot of computation time
- Solution
  - Incorporate the annotation into the mapped key
  - E.g.
    - Map to ((t1, p), (1, s, o, [1,2]))
    - Map to ((t3, p), (2, q, [3,4]))
    - They will not be grouped together!
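One way to read the keys above is that time is cut into buckets and the bucket label becomes part of the key. The sketch below assumes exactly that. The bucket width, the origin, and all function names are hypothetical choices made so that [1,2] lands in t1 and [3,4] in t3; the slides do not state the actual keying scheme.

```python
from collections import defaultdict

BUCKET_WIDTH = 2  # hypothetical: each bucket covers two time points
ORIGIN = 1        # hypothetical: the first bucket starts at time 1

def buckets(interval):
    """Start times of every bucket overlapped by the closed interval [lo, hi]."""
    lo, hi = interval
    first = ORIGIN + ((lo - ORIGIN) // BUCKET_WIDTH) * BUCKET_WIDTH
    return list(range(first, hi + 1, BUCKET_WIDTH))

def map_triple(triple, interval):
    """Emit one pair per overlapped bucket, keyed on (bucket, property)."""
    s, p, o = triple
    for t in buckets(interval):
        if p == "sp":
            yield (t, s), (2, o, interval)
        else:
            yield (t, p), (1, s, o, interval)

def group(annotated_triples):
    """Simulate the shuffle phase: collect values under their keys."""
    groups = defaultdict(list)
    for triple, interval in annotated_triples:
        for key, value in map_triple(triple, interval):
            groups[key].append(value)
    return dict(groups)

groups = group([
    (("s", "p", "o"), (1, 2)),
    (("p", "sp", "q"), (3, 4)),
])
print(groups)
```

Under this scheme two intervals that each span several buckets meet in more than one group, so a real job would likely need to deduplicate the derived triples.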
![Page 9: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/9.jpg)
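Challenges and solutions (cont’d)
- Fixpoint Calculation
  - Subproperty/subclass rules require fixpoint iteration
- Solution
  - Load subproperty/subclass schema triples into memory
  - Calculate the closure
    - Shortest path calculation
    - Floyd-Warshall style algorithm

$$(x_1, \mathrm{sp}, x_2) : \lambda_1,\ (x_2, \mathrm{sp}, x_3) : \lambda_2,\ \ldots,\ (x_n, \mathrm{sp}, x_{n+1}) : \lambda_n \;\Rightarrow\; (x_1, \mathrm{sp}, x_{n+1}) : \lambda_1 \otimes \cdots \otimes \lambda_n$$

[Figure: the chain $x_1, x_2, \ldots, x_{n+1}$ drawn as a “shortest” path]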
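A Floyd-Warshall style closure over annotated subproperty edges can be sketched as follows. This is an illustration under assumptions, not the authors' code: `sp_closure` is a hypothetical name, the path operator defaults to `min` and the merge operator to `max` (a fuzzy reading of the annotations), and the in-place update relies on the merge operator being idempotent.

```python
def sp_closure(edges, otimes=min, oplus=max):
    """Close a set of annotated (x, sp, y) edges under transitivity.

    edges maps (x, y) to the annotation of (x, sp, y).
    otimes combines annotations along a path; oplus merges alternative paths.
    Both defaults are assumptions (fuzzy min/max).
    """
    nodes = sorted({node for edge in edges for node in edge})
    best = dict(edges)
    for k in nodes:              # allow k as an intermediate node
        for i in nodes:
            if (i, k) not in best:
                continue
            for j in nodes:
                if (k, j) not in best:
                    continue
                via_k = otimes(best[(i, k)], best[(k, j)])
                if (i, j) in best:
                    best[(i, j)] = oplus(best[(i, j)], via_k)
                else:
                    best[(i, j)] = via_k
    return best

closure = sp_closure({("a", "b"): 0.9, ("b", "c"): 0.6, ("a", "c"): 0.5})
print(closure[("a", "c")])
```

Here the direct edge from a to c carries 0.5, while the path through b carries min(0.9, 0.6) = 0.6, so the closure keeps 0.6. The triple loop is cubic in the number of schema nodes, which fits the slide's premise that the schema is small enough to hold in memory.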
![Page 10: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/10.jpg)
Experiment setup
- Dataset
  - Fuzzified DBPedia core ontology
  - fpdLUBM 1000, 2000, 4000, 8000
- Cluster
  - 25 machines with 75 mapper/reducer slots
Liu et al.: Reasoning with Large Scale Ontologies in Fuzzy pD* Using MapReduce. Computational Intelligence Magazine, IEEE 7(2), 54-66 (2012)
![Page 11: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/11.jpg)
Experiment result - fuzzy DBPedia
Dataset: fuzzified DBPedia core ontology
Results:

| #units | 128 | 64 | 32 | 16 | 8 | 4 | 2 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Time (sec.) | 122.653 | 136.861 | 146.393 | 170.859 | 282.802 | 446.917 | 822.269 |
| Speedup | 6.70 | 6.01 | 5.62 | 4.81 | 2.91 | 1.84 | 1.00 |
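The speedup row can be recomputed from the timings, taking the smallest configuration (2 units) as the baseline. The snippet below is only a check on the reported numbers; the function name is hypothetical.

```python
# Wall-clock times in seconds, keyed by number of units (from the results above).
times = {
    128: 122.653,
    64: 136.861,
    32: 146.393,
    16: 170.859,
    8: 282.802,
    4: 446.917,
    2: 822.269,
}

def speedups(times):
    """Speedup of each configuration relative to the one with the fewest units."""
    baseline = times[min(times)]
    return {units: round(baseline / t, 2) for units, t in times.items()}

for units, s in speedups(times).items():
    print(units, s)
```

Going from 2 to 128 units is a 64-fold increase in resources for a 6.70-fold speedup, so scaling on this dataset is clearly sublinear.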
![Page 12: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/12.jpg)
Experiment result - fpdLUBM

| Number of universities | Time of FuzzyPD (minutes) | Time of WebPIE (minutes) |
| --- | --- | --- |
| 1000 | 38.8 | 41.32 |
| 2000 | 66.97 | 74.57 |
| 4000 | 110.40 | 130.87 |
| 8000 | 215.48 | 210.01 |

Experimental results of FuzzyPD and WebPIE
![Page 13: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/13.jpg)
Experiment result - fpdLUBM (cont’d)

| Number of units | Time (minutes) | Speedup |
| --- | --- | --- |
| 128 | 38.80 | 4.01 |
| 64 | 53.15 | 2.93 |
| 32 | 91.58 | 1.70 |
| 16 | 155.47 | 1.00 |

Scalability over number of units
![Page 14: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/14.jpg)
Experiment result - fpdLUBM (cont’d)
[Figure: Scalability over number of units]
![Page 15: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/15.jpg)
Experiment result - fpdLUBM (cont’d)

| Number of universities | Input (Mtriples) | Output (Mtriples) | Time (minutes) | Throughput (Ktriples/second) |
| --- | --- | --- | --- | --- |
| 1000 | 155.51 | 92.01 | 38.8 | 39.52 |
| 2000 | 310.71 | 185.97 | 66.97 | 46.28 |
| 4000 | 621.46 | 380.06 | 110.40 | 57.37 |
| 8000 | 1243.20 | 792.54 | 215.50 | 61.29 |

Scalability over data volume
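The throughput column appears consistent with output triples divided by wall-clock time. That reading is inferred from the arithmetic and is not stated on the slide. The check below recomputes the column under that assumption and compares it with the reported values to within 0.01.

```python
# (universities, input Mtriples, output Mtriples, minutes, reported Ktriples/s)
rows = [
    (1000, 155.51, 92.01, 38.8, 39.52),
    (2000, 310.71, 185.97, 66.97, 46.28),
    (4000, 621.46, 380.06, 110.40, 57.37),
    (8000, 1243.20, 792.54, 215.50, 61.29),
]

def throughput_ktriples_per_sec(output_mtriples, minutes):
    """Assumed definition: derived (output) triples per second, in thousands."""
    return output_mtriples * 1000 / (minutes * 60)

for universities, _, output, minutes, reported in rows:
    computed = throughput_ktriples_per_sec(output, minutes)
    print(universities, round(computed, 2), reported)
```

Throughput rises with data volume, which suggests that fixed per-job overhead is amortized better on the larger inputs.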
![Page 16: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/16.jpg)
Conclusion and Future work
- We show how to design MapReduce algorithms to achieve scalable annotated RDFS reasoning
- Several challenges along with solutions
- Future work
  - More experiments on annotated RDFS ontologies
  - Annotated OWL 2 RL
![Page 17: Chang Liu 1, Guilin Qi 2 1 Shanghai Jiao Tong University 2 Southeast University, China.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c745503460f94926aa6/html5/thumbnails/17.jpg)
Q&A