Aims2012
© 2012 UZH, CSG@IFI
Cooperative Database Caching within Cloud Environments
Andrei Vancea (1), Guilherme Sperb Machado (1), Laurent d’Orazio (2), Burkhard Stiller (1)
(1) Department of Informatics IFI, Communication Systems Group CSG, University of Zürich UZH, Switzerland
(2) Blaise Pascal University - LIMOS, France
vancea,[email protected], [email protected]
AIMS, Luxembourg, Luxembourg, June 6, 2012
Background
Databases
– Client: asks a query (SQL)
– Server: returns the result (tuples)
Client-side caching
– Page Caching, Tuple Caching
– Semantic Caching
• Clients store the results of old queries
• Old results used for answering new queries
Background - Semantic Caching
Semantic Regions
– Query description
– Result set
Query rewriting
– Probe
– Remainder
[Figure: a query enters query rewriting, which uses the query descriptions to produce a probe for the semantic cache and a remainder for the server]
Database Caching & Cloud Computing
Most cloud providers charge for data transfer between the cloud environment and the “outside world” in a pay-as-you-go manner
Database caching within cloud environment
– Improves performance
– Economic benefits
• Amount of data transferred decreases
• Payments for data transferred reduced
Approach
Cooperative Semantic Caching
Share local semantic caches between clients
Use cache entries of other clients
Performance improvements
[Figure: three semantic caches]
Cooperative Semantic Caching
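Q1 : select * from persons where age > 10
– Result cached as region R1 : age > 10
Q3 : select * from persons where age > 7
– Probe: select * from R1
– Remainder: select * from persons where age > 7 and age <= 10
[Figure: the results of the probe and the remainder are combined into the result of Q3]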
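The rewriting of Q3 against the cached region R1 can be sketched in Python for the one-dimensional case. This is only an illustration under stated assumptions, not the CoopSC implementation: the function name `rewrite_range` and the encoding of a predicate as a pair `(lo, hi)` meaning `lo < age <= hi` are hypothetical choices made for this example.

```python
import math

def rewrite_range(query, region):
    """Split a 1-D range query into a probe and remainder pieces.

    Intervals are (lo, hi) pairs meaning lo < age <= hi (assumed encoding).
    Returns (probe, remainders); probe is None when nothing is cached.
    """
    q_lo, q_hi = query
    r_lo, r_hi = region
    lo, hi = max(q_lo, r_lo), min(q_hi, r_hi)
    if lo >= hi:
        # No overlap: the whole query must be answered by the server.
        return None, [query]
    remainders = []
    if q_lo < lo:
        # Part of the query below the cached region.
        remainders.append((q_lo, lo))
    if hi < q_hi:
        # Part of the query above the cached region.
        remainders.append((hi, q_hi))
    return (lo, hi), remainders

# R1 caches "age > 10"; Q3 asks for "age > 7".
r1 = (10, math.inf)
q3 = (7, math.inf)
probe, remainders = rewrite_range(q3, r1)
```

Here `probe` covers `age > 10` (answered from R1) and the single remainder `(7, 10)` corresponds to `age > 7 and age <= 10`, the query sent to the server in the example.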
Potential Use Cases
GIS (Geographic Information System) storage
– Large amount of data (e.g. seismic events)
– Processing done on client side
– Two-dimensional range selections (area)
NetFlow-based architectures
– Routers collect flow records and store them in databases
– Analyzers (intrusion detection, accounting, …) access them
– Range selections (Start Time, IP)
Query Rewriting
Query rewriting
– Probe
– Remote probes
– Remainder
[Figure: query rewriting uses all query descriptions to split a query into a probe (local semantic cache), remote probes (remote semantic caches) and a remainder (server)]
System Design
CoopSC
Cooperative Semantic Caching (CoopSC)
Query types
– Selection (n-Dimensional range predicates)
– select id, name, age from persons where 20 < age and age < 30
Cache organization
– Semantic regions
– Distributed Index – built on top of a P2P overlay
CoopSC - Query Rewriting
Local Rewriting
– Probe
– Local Remainder
• Portion of the query which is not available in the local cache
Distributed Rewriting
– Remote Probes
– Remainder
[Figure: local rewriting splits the query into a probe (local cache) and a local remainder; distributed rewriting uses the distributed index to split the local remainder into remote probes and a remainder]
Distributed Index
Built on top of P2P overlay
Regions and queries represented as rectangular shapes
MX-CIF Quad Tree
– Efficiently find intersection between rectangular shapes
Each region is indexed in the smallest quad which totally contains it
Easy to adapt to n-Dimensional regions/queries
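The rule that a region is indexed in the smallest quad which totally contains it can be sketched in Python for the two-dimensional case. This is a simplified illustration, not the CoopSC index: the function name `smallest_enclosing_quad`, the unit-square default space, the `max_depth` limit and the quadrant numbering are all assumptions, and the mapping of quads onto the P2P overlay is not shown.

```python
def smallest_enclosing_quad(rect, space=(0.0, 0.0, 1.0, 1.0), max_depth=8):
    """Find the smallest quad that totally contains rect.

    Rectangles are (x0, y0, x1, y1). Returns (path, quad), where path lists
    the quadrant chosen at each level (0..3, assumed numbering) and quad is
    the bounding box of the quad that was reached.
    """
    x0, y0, x1, y1 = rect
    sx0, sy0, sx1, sy1 = space
    path = []
    for _ in range(max_depth):
        mx, my = (sx0 + sx1) / 2, (sy0 + sy1) / 2
        if x1 <= mx:
            qx = 0
        elif x0 >= mx:
            qx = 1
        else:
            break  # rect straddles the vertical split line
        if y1 <= my:
            qy = 0
        elif y0 >= my:
            qy = 1
        else:
            break  # rect straddles the horizontal split line
        path.append(2 * qy + qx)
        # Descend into the chosen quadrant.
        sx0, sx1 = (sx0, mx) if qx == 0 else (mx, sx1)
        sy0, sy1 = (sy0, my) if qy == 0 else (my, sy1)
    return path, (sx0, sy0, sx1, sy1)

path, quad = smallest_enclosing_quad((0.1, 0.1, 0.2, 0.2))
```

A rectangle that crosses the centre of the space stays at the root (empty path), which is why large regions end up indexed near the top of the tree.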
Update Handling
Issues
– Invalidation of old entries
– Combining different snapshots can generate inconsistencies
Quad space division (specified update level)
Virtual timestamps stored in database
Each modification increments the virtual timestamp of corresponding quad
Regions store virtual timestamps of quads that they intersect
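The virtual-timestamp bookkeeping can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the class names `QuadTimestamps` and `Region`, the string quad identifiers, and the in-memory dictionary standing in for the database are all hypothetical and not taken from CoopSC.

```python
class QuadTimestamps:
    """Virtual timestamp per quad; a dict stands in for the database."""

    def __init__(self):
        self._ts = {}

    def record_modification(self, quad):
        # Each modification increments the timestamp of the affected quad.
        self._ts[quad] = self._ts.get(quad, 0) + 1

    def snapshot(self, quads):
        # Current timestamps of the given quads (0 if never modified).
        return {q: self._ts.get(q, 0) for q in quads}


class Region:
    """Cached region that remembers the timestamps of the quads it intersects."""

    def __init__(self, quads, timestamps):
        self.stamps = timestamps.snapshot(quads)

    def is_valid(self, timestamps):
        # Valid only if none of the intersected quads changed since caching.
        return self.stamps == timestamps.snapshot(self.stamps)


ts = QuadTimestamps()
region = Region(["q0", "q1"], ts)
valid_at_start = region.is_valid(ts)
ts.record_modification("q2")   # unrelated quad
valid_after_unrelated = region.is_valid(ts)
ts.record_modification("q1")   # intersected quad
valid_after_related = region.is_valid(ts)
```

Only a modification in a quad that the region intersects invalidates it, so a finer quad division invalidates fewer regions at the cost of tracking more timestamps.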
Cloud Computing Scenarios
Cloud Scenario A
Database server running outside the cloud
Clients located inside the cloud
Non-operational use cases
– Example: cloud environment used for running scientific experiments
Cloud Scenario B
Database server running inside the cloud
Clients located inside the cloud
Operational use cases
– Example: corporation using cloud environment as an alternative to building a datacenter
Evaluation
Experiment Design
Measurements
– Response time
– Amount of data transferred
– Payments for data transfer
Experiments
– Cache size
– Update level
Testing sessions
– 5 select testing sessions (50 queries each)
– Update sessions interleaved
Evaluation
Wisconsin benchmark dataset (10,000,000 tuples)
Scenario A
– Database server: Zurich testbed
– 5 Clients: Rackspace
Scenario B
– Database server: Amazon EC2
– 5 Clients: EmanicsLab
Queries
– About 10,000 tuples
– Semantic locality
Scenario A
Data transferred/Payments
CoopSC significantly reduces the number of tuples sent by database server
Amount of money also reduced
Response Time
Rackspace behaves unstably
No performance improvements noticed
Scenario B
Data transferred/Payments
CoopSC significantly reduces the number of tuples sent by database server
Bandwidth payments also reduced
Response Time
CoopSC improves response time
Data transferred/Payments (Updates)
Good behavior for low update rate
Economic and performance benefits
Response Times (Updates)
Response time increases as the update rate grows
Summary & Conclusion
Summary
– Cooperative caching approach used for reducing the load of the database server
– Update statements supported
– CoopSC applied in the context of cloud environments
CoopSC reduces the amount of data transferred between cloud and outside world, which has economic benefits
Performance benefits as long as cloud providers are stable
Questions?
Update Handling - Algorithm
procedure Execute(query)
  quads = query.getIntersectedQuad(updateLevel);
  before = database.getTimestamps(quads);
  plan = rewrite(query, before);
  result = plan.execute();
  after = database.getTimestamps(quads);
  if (before == after) return result;
  else return database.execute(query);
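The procedure above can be tried out with a small Python sketch. It is an illustration under stated assumptions rather than the authors' code: the four callables passed to `execute` are hypothetical stand-ins for the database, the rewriter and the caches, and the "concurrent update" is simulated by a plan that bumps a timestamp while it runs.

```python
def execute(query, quads, get_timestamps, run_plan, run_on_server):
    """Answer a query from caches, falling back to the server on a race."""
    before = get_timestamps(quads)
    result = run_plan(query, before)      # rewrite and execute the plan
    after = get_timestamps(quads)
    if before == after:
        return result                     # no update happened in between
    return run_on_server(query)           # snapshot may be inconsistent

# Hypothetical stand-ins used only for this demonstration.
stamps = {"q0": 0}

def get_timestamps(quads):
    return {q: stamps[q] for q in quads}

def run_plan_quiet(query, before):
    return "from caches"

def run_plan_with_update(query, before):
    stamps["q0"] += 1                     # simulate a concurrent modification
    return "from caches"

def run_on_server(query):
    return "from server"

quiet = execute("Q", ["q0"], get_timestamps, run_plan_quiet, run_on_server)
raced = execute("Q", ["q0"], get_timestamps, run_plan_with_update, run_on_server)
```

When no timestamp changes during execution the cached result is returned; when one does, the query is re-run against the database server, which matches the fallback branch of the procedure.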