Post on 24-Sep-2018
MonetDB: the Challenges of a Scientific Database
Milena Ivanova, Niels Nes, Romulo Goncalves, Martin Kersten
CWI, Amsterdam
M. Ivanova et al., CWI
Outline
• MonetDB/SQL • SkyServer porting lessons • Query log lessons • Recycling • Evaluation • Outlook
M. Ivanova et al., CWI
MonetDB Background
H 0@0
T Ra
0@0 1@0 2@0
0.0645 0.1433 0.2811 …
H 0@0
T Dec
0@0 1@0 2@0
1.2079 1.0662 1.2495 …
H 0@0
T U
0@0 1@0 2@0
14.70872 11.71277 12.02889 …
Ra Dec U ... 0.0645 0.1433 0.2811 …
1.2079 1.0662 1.2495 …
14.70872 11.71277 12.02889 …
… … …
PhotoObjAll
Ra BAT Dec BAT U BAT
M. Ivanova et al., CWI
MonetDB Architecture
SQL
MonetDB Server
Tactical Optimizer
MonetDB Kernel
XQuery
MAL
MAL
function user.s3_1():void; X1:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",0); X6:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",1); X9:bat[:oid,:lng] := sql.bind("sys","photoobjall","objid",2); X13:bat[:oid,:oid] := sql.bind_dbat("sys","photoobjall",1); X8 := algebra.kunion(X1,X6); X11 := algebra.kdifference(X8,X9); X12 := algebra.kunion(X11,X9); X14 := bat.reverse(X13); X15 := algebra.kdifference(X12,X14); X16 := calc.oid(0@0); X18 := algebra.markT(X15,X16); X19 := bat.reverse(X18); X20 := aggr.count(X19); sql.exportValue(1,"sys.","count_","int",32,0,6,X20,""); end s3_1;
select count(*) from photoobjall;
M. Ivanova et al., CWI
SkyServer with MonetDB
Goal: To provide SkyServer mirror with similar functionality using MonetDB
Three phases: 1%, 10%, entire SDSS data set Can we • Do better in terms of performance and
functionality? • Improve query processing by novel parallelism and
query cracking techniques?
M. Ivanova et al., CWI
Portability Lessons
• Need for rich SQL environment (PSM) • Cast to SQL:2003 standard
– Replacement of data types and operations – Specific extensions ignored or replaced
• Avoid data redundancy – Auxiliary tables replaced by views:10%
size reduction
M. Ivanova et al., CWI
Spatial Search Lesson
• HTM (Hierarchical Triangular Mesh) – Implemented in C++, C# – Good for point-near-point and point-in-
region queries • Zones
– Implemented in SQL – Good for point-near-point (x3) – Efficient for batch-oriented spatial join(x32) – Enables SQL optimizer usage
M. Ivanova et al., CWI
Query Log Lessons
• Query logs important for both application and science
• Analysed 1.2M queries, August 2006 • Spatial access prevails (83%) • Small core of photo and spectro tables
accessed – 64% photo, 44% spectro, 27% both
M. Ivanova et al., CWI
Common Patterns
• Limited number of query patterns – Correlation to web site interface
• Most popular query (25%) SELECT top 10 p.objID, p.run, p.rerun, p.camcol, p.field, p.obj, p.type, p.ra, p.dec, p.u, p.g, p.r, p.i, p.z, p.Err_u, p.Err_g, p.Err_r, p.Err_i, p.Err_z
FROM fGetNearbyObjEq(195,2.5,3) n, PhotoPrimary p WHERE n.objID = p.objID;
M. Ivanova et al., CWI
Spatial Overlap
• 24% queries overlap • Mean sequence length of 9.4,
max of 6200 • Overlap and equality patterns for script-
based interaction • Zoom in/zoom out patterns for manual
interaction
M. Ivanova et al., CWI
Evaluation on 100GB
• ‘Color-cut’ for low-z quasars SELECT g, run, rerun, camcol, field, objID, FROM Galaxy WHERE ( ( g <= 22) and
(u - g >= -0.27) and (u - g < 0.71) and (g - r >= -0.24) and (g - r < 0.35) and (r - i >= -0.27) and (r - i < 0.57) and (i - z >= -0.35) and (i - z < 0.7) );
• Moving asteroids SELECT objID, sqrt(power(rowv,2) + power(colv,2)) as velocity
FROM PhotoObj WHERE power(rowv,2) + power(colv,2) > 50 and rowv >= 0 and colv >= 0;
M. Ivanova et al., CWI
Staircase to the sky
• Status nov 2008 – 1GB: done – 100GB: done – 2.7 TB DR6
• Platform – Dual quadcore 2.4Ghz,
64GB, 6TB raid5
• Web site
M. Ivanova et al., CWI
Moving ahead
• Progress 2009 – Download DR 7 and installation – Development of export/attach functionality – Development of partial result recycler
MonetDB Background • Tuple-at-a-time pipelined execution
– Materialized views and caches – Semi-automatic
• Operator-at-a-time – Materialized intermediates – Automatic management and low cost
Self-organizing cache of intermediates to speed up query streams
Run-time Support
Recycler Optimizer
MonetDB Architecture SQL
MonetDB Server
Tactical Optimizer
MonetDB Kernel
XQuery
MAL
MAL
Recycle Pool
function user.s1_2(A0:date, ...):void; X5 := sql.bind("sys","lineitem",...); X10 := algebra.select(X5,A0); X12 := sql.bindIdx("sys","lineitem",...); X15 := algebra.join(X10,X12); X25 := mtime.addmonths(A1,A2); ...
function user.s1_2(A0:date, ...):void; X5 := sql.bind("sys","lineitem",...); X10 := algebra.select(X5,A0); X12 := sql.bindIdx("sys","lineitem",...); X15 := algebra.join(X10,X12); X25 := mtime.addmonths(A1,A2); ...
Admission & Eviction
Instruction Matching Run time comparison of • instruction types • argument values
Name Value Data type Size X1 10 :bat[:oid,:date] T1 “sys” :str
T2 “orders” :str …
X1 := sql.bind("sys","orders","o_orderdate",0); …
Y3 := sql.bind("sys","orders","o_orderdate",0);
Exact matching
Instruction Subsumption
Name Value Data type Size X1 10 :bat[:oid,:int] 2000 X3 130 :bat[:oid,:int] 700 X5 150 :bat[:oid,:int] 350 …
X3 := algebra.select(X1,10,80); …
Y3 := algebra.select(X1,20,45);
X5 := algebra.select(X1,20,60); X5
a Cache with Lineage
algebra.join
sql.bind(“C1“)
algebra.select
sql.bind(“C2“)
…
sql.bind(“C1“) X1 :=
algebra.select(X1) X2 :=
sql.bind(“C2“)
X3 :=
algebra.join(X2,X3)
X4 :=
Q1
a Cache with Lineage
algebra.join
sql.bind(“C1“)
algebra.select
sql.bind(“C2“)
X1 := sql.bind(“C1“) X2 := algebra.select(X1) X3 := sql.bind(“C2“) X4 := algebra.join(X2,X3)
algebra.join
sql.bind(“C3“)
…
X1
X2
X3
X4
Q2
Mismatching
algebra.join
sql.bind(“C1“)
algebra.select
sql.bind(“C2“)
X1 := sql.bind(“C1“) X2 := algebra.select(X1) X3 := sql.bind(“C2“) X4 := algebra.join(X2,X3)
algebra.join
sql.bind(“C3“)
…
Y1
Y2
Y3
Y4
Y3 := sql.bind(“C2“)
Y2 := algebra.select(Y1)
Y1 := sql.bind(“C1“)
Y4 := algebra.join(Y2,Y3)
!=X2
!=X3
Q2
Admission Policies Decide about storing the results • KEEPALL
– all instructions advised by the optimizer • CREDIT
– instructions supplied with credits – storage ‘paid’ with 1 credit – reuse returns credits – lack of reuse limits admission and resource
claims
Cache Policies • Decide about eviction of intermediates • Filter ‘top’ instructions without dependents • Pick instructions with smallest utility
– LRU : time of computation or last reuse – BENEFIT : estimated contribution to
performance: CPU and I/O costs, recycling • Triggered by resource limitations (memory
or entries)
SkyServer Evaluation
• 100 GB subset of DR4
• 100-query batch from January 2008 log
• 1.5GB intermediates, 99% reuse
• Join intermediates major contributor to savings
Status aug 2009
• DR 7 fully loaded – Loading and integrity checking – Queries ran and traces collected – 01: real 1m43.142s 5 rows – 02: real 0m10.836s 310 rows – 03: real 9m55.870s 7805794 rows – 04: real 3m46.905s 2088794 rows – 05: real 3m54.591s 264954 rows – 06: real 7m11.867s 584884 rows – 07: real 0m0.501s 1148 rows – 08: real 1m43.935s 58599 rows – 09: real 26m23.968s 33086 rows – 11: real 13m25.067s 11 rows – 12: real 0m0.914s 44 rows – 13: real 0m0.893s 4 rows – 14: real 5m18.018s 207 rows
M. Ivanova et al., CWI
Query traces
[ 21472701 usec @0 _55<tmp_37313>[357175411] := algebra.uselect(_53=<tmp_37322>[585634220],1); ] [ 49 usec @0 _53 := nil:BAT; ] [ 38150716 usec @0 _56<tmp_37322>[357175411] := algebra.semijoin(_45=<tmp_37316>[585634220],_55=<tmp_37313>[357175411]); ] [ 38 usec @0 _45 := nil:BAT; ] [ 284808 usec @0 _55 := nil:BAT; ] [ 31309800 usec @0 _57<tmp_37313>[180377011] := algebra.uselect(_56=<tmp_37322>[357175411],6:sht); ] [ 2059575 usec @0 _56 := nil:BAT; ] [ 32340510 usec @0 _58<tmp_37322>[43684077] := algebra.semijoin(_36=<tmp_37330>[132480668],_57=<tmp_37313>[180377011]); ] [ 91705 usec @0 _36 := nil:BAT; ] [ 229621 usec @0 _59<tmp_37264>[585634220] := batcalc.flt(_26=<tmp_37332>[585634220]); ] [ 25123250 usec @0 _60<tmp_37316>[409973371] := algebra.thetauselect(_59=<tmp_37264>[585634220],A1=22.2999992,">"); ] [ 1208014 usec @0 _59 := nil:BAT; ] [ 30692604 usec @0 _61<tmp_37264>[121305794] := algebra.semijoin(_60=<tmp_37316>[409973371],_57=<tmp_37313>[180377011]); ] [ 295106 usec @0 _60 := nil:BAT; ] [ 4440129 usec @0 _57 := nil:BAT; ] [ 19408689 usec @0 _62<tmp_37313>[132511910] := algebra.kunion(_58=<tmp_37322>[43684077],_61=<tmp_37264>[121305794]); ] [ 105538 usec @0 _58 := nil:BAT; ] [ 320537 usec @0 _61 := nil:BAT; ] [ 49 usec @0 _64<tmp_37264>[132511910] := algebra.markT(_62=<tmp_37313>[132511910],0@0); ] [ 6 usec @0 _62 := nil:BAT; ] [ 5 usec @0 _65<tmpr_37264>[132511910] := bat.reverse(_64=<tmp_37264>[132511910]); ] [ 3
M. Ivanova et al., CWI
• 2264769425 32 algebra.join • 1535721272 285 algebra.leftjoin • 636128509 76 algebra.semijoin • 306371507 59 algebra.uselect • 112269101 10 batcalc.* • 106087174 36 batcalc.- • 101124840 28 algebra.thetauselect • 54344621 2 user.getnearbyobjectsmode • 19421076 391 algebra.kunion • 2577559 25 batcalc.flt • 1242179 8 batcalc.int • 1105366 9 batcalc.+ • 657450 30 bat.append • 402815 570 sql.bind
M. Ivanova et al., CWI
Summary
• Database architecture augmented with recycling intermediates
• Self-organizing technique • Extension to MonetDB transforming
materialization overhead into benefit
Future Work
• Refining cache policies • Opportunities by query class recognition • Automatic switch to suitable policies • Automatic database replication • Distributed processing (Octopus)
30/06/2009 SIGMOD'09 Providence, RI
An Architecture for Recycling Intermediates M. Ivanova, M. L. Kersten, N. Nes, R. Goncalves
31/20
Recycling
Is Green