CS848 Paper Presentaonkmsalem/courses/CS848W10/...CS848 Paper Presentaon Scalable Query Result...

CS848 Paper Presenta.on Scalable Query Result Caching for

Web Applica9ons

Garrod, Manjhi, Ailamaki, Maggs, Mowry, Olston, Tomasic

PVLDB 2008

Presented by Rehan Rauf

David R. Cheriton School of Computer Science University of Waterloo

18th January 2010

Problem

•  Central database server becomes a boQle‐neck in Content Distributed Networks

•  Replica.on of database among mul.ple servers removes boQle‐neck but … •  requires low latency consistency which conflicts with low latency between user and server •  does not scale linearly with cost.

•  Tradi.onal proxy‐cache based solu.ons remove boQle‐neck but … •  inefficiently maintain consistency •  scalability is limited by low cache hit‐rate

High‐level Architecture of Ferdinand

Proxy Server

DHT Overlay

Publish/ Subscribe

Central Server

Ferdinand Proxy Server

Opera.ons

Each query has a master proxy server

Proxy Server Master Proxy

Server

Central Server

Opera.ons

Server

Central Server

Opera.ons

Server

Central Server

Opera.ons

Server

Central Server

Opera.ons

Server

Central Server

Opera.ons

Server

Central Server

Opera.ons

Server

Central Server

Proxy Server

Opera.ons

Server

Central Server

Proxy Server

Update no.fica.on

Proxy Server Cache

•  When proxy server receives update no.fica.on, the consistency module invalidates the query result.

•  Even if results can be processed using previously cached results, it is not done

Advantages/ Disadvantages

•  Advantages •  High cache hit rate. •  More work offloaded from the backend server

• Disadvantage •  Latency cost of an over all cache miss is greater

Consistency Management

•  Any proxy caching a par.cular query must be no.fied when an update affects the result of that query.

•  Query Update Mul.cast Associa.on (QUMA) is used to ensure this requirement.

QUMA Solu.on

•  Mul.cast groups are created. •  Offline analysis of applica.on’s template database queries is performed. •  Independence analysis is done to determine independence of query‐update pairs.

•  Goals •  Any update no.fica.on published to a group should affect each query subscribed to that group •  Related queries should be clustered into same mul.cast group to reduce number of no.fica.ons.

•  Each cached query subscribes to appropriate mul.cast groups. •  Updates are published to these groups and hence reach the appropriate proxy server.

QUMA Example

QUMA Solu.on

•  Selec.on predicates are only prac.cal when they are equality based. •  The process is not automated yet.

•  For each mul.cast group, there is a master mul.cast group. •  used for communica.on with master proxies.

Cache Miss

Proxy Server

Cache Miss

Proxy Server

subscribe

Cache Miss

Proxy Server

Master Proxy

Cache Miss

Proxy Server

Master Proxy

Cache Miss

Proxy Server

Master Proxy

Cache Miss

Proxy Server

Master Proxy

Central Server

Update

Proxy Server

Master Proxy

Central Server

Update

Proxy Server

Master Proxy

Central Server

Update

Proxy Server

Master Proxy

Central Server

Update

Proxy Server

Master Proxy

Central Server

Update

Proxy Server

Master Proxy

Central Server

Proxy Server

•  Ensures that query cache remains coherent with the central DB but.. •  Guarantees full consistency only for single statement transac.ons.

•  Requires a reliable publish/subscribe system.

•  Requires serializability guarantees from central database

Implementa.on

•  JDBC Driver •  Proxy runs Apache tomcat as sta.c cache •  Pastry overlay as DHT ( is based on PRR tree) •  Scribe for publish scribe

•  does not guarantee reliable delivery. •  Ferdinand cache map is stored in MySQL4 •  Backend database is MYSQL4

Evalua.on

•  Performance comparison with several alterna.ve approaches.

•  Performance of DHT based coopera.ve caching in varying network latency scenarios.

•  Publish/subscribe vs simple broadcast‐based system

Evalua.on

•  Emulab testbed •  Proxy ran on 3 GHz Intel Pen.um Xeon, 1GB ram, 10,000 RPM SCSI disk. •  Benchmark clients on 850 MHz client servers. •  Benchmark

•  TPC‐W bookstore •  RUBiS auc.on •  RUBBos bulle.n board

• Conforms to TPC‐W model of emulated browsers

Evalua.on: Comparison to other approaches

Evalua.on: Cache Miss Rate

Evalua.on: Throughput as a func.on of latency

Evalua.on: Throughput compared to broadcast‐based consistency

Closing Observa.ons

•  No Scalability evalua.on shown.

•  Not suitable for update intensive applica.on.

•  Requires offline analysis of database requests.

•  Failure Scenarios need to be deal with.

•  No consistency for mul. statement transac.ons.

CS848 Paper Presentaonkmsalem/courses/CS848W10/...CS848 Paper Presentaon Scalable Query Result...

Documents

Transcript of CS848 Paper Presentaonkmsalem/courses/CS848W10/...CS848 Paper Presentaon Scalable Query Result...

22nd International Conference on Applied Electromagnetics ...€¦ · Boris Tomasic Digital Beamforming in Extremely Large Conformal Array Antennas Keynote lecture. ICECom 2016 Conference

HadoopDB: An Architectural Hybrid of MapReduce and DBMS ...kmsalem/courses/CS848W10/... · Hadoop is used : To store the data using the HDFS file system For task scheduling, Hadoop’sJobTracker

CS848 Advanced Topics in Databases Database Systems on ...kmsalem/courses/cs848S15/overview.pdf · CS848 Advanced Topics in Databases Database Systems on Modern Hardware Spring 2015

The “Big Data” Ecosystem at LinkedIntozsu/courses/CS848/W19/presentations/C… · Introduction LinkedIn +200 million users Data-driven features collaborative filtering (→ wisdom

Mixer: Mixed-initiative Data Retrieval and Integration …tomasic/doc/2011/GardinerEtAlInteract2011.pdf · Mixer: Mixed-initiative Data Retrieval and Integration by Example Steven

CS848 Paper Presentation MTCache: Transparent …kmsalem/courses/CS848W10/presentations/...CS848 Paper Presentation MTCache: Transparent Mid-Tier ... rather than the backend DBMS ...

DOGMA: A Disk-Oriented Graph Matching algorithmgweddell/cs848/presentations/Dogma... · • All –pair shortest path – not calculated ... •DOGMA_ipd •DOGMA_epd • Approximation

A Survey of Physical Design Techniques of Information Systemsplg.uwaterloo.ca/~karim/resources/pubs/courses/cs848-db.pdfA Survey of Physical Design Techniques of Information Systems

CS848 Similarity Search in Multimedia Databases Dr. Gisli Hjaltason Content-based Retrieval Using Local Descriptors: Problems and Issues from Databases.

Optimizing Queries Using Materialized Views Qiang Wang CS848.

· Web viewKaren Abboud, 1 Rosary Pang, 1 Ana-Katharina Gehrke, 1 Luka Tomasic, 1 Kenny Sidharta, 1 Fatma Mohamed, 1 Daniel Semmy 1 Pharmacy Education Committees, International Pharmaceutical

PROFESSIONAL EATRE AINING - Langara College Fenske... · Vocal Range: Baritone/Tenor ... 2017 Abner Dillion 42nd Street Stewart/Bramble/Warren Barbara Tomasic 2016 Agamemnon Troilus

SPARQL’Query’Answering’over’ OWL’Ontologies’’gweddell/cs848/... · SPARQL’Query’Answering’over’ OWL’Ontologies’’ Presented(By(Mohamed(Sabri(IliannaKollia

Paige Fraser Acting 2019 - langara.ca · • Peter Jorgensen Musical Theatre Masterclass • Singing (Sylvia Zaradic, Barbara Tomasic, Craig Tompkins, Peter Jorgensen) • Harbour

Distributed Model Verification Using Map-Reduce Presented ...kmsalem/courses/CS848W10/... · Distributed Model Verification Using Map-Reduce. 1. Project Objective Using cloud computing

SystemML: Declarative Machine Learning on Sparktozsu/courses/CS848/W19... · Accelerates model development Simplifies deployment DML Source: Spark Summit. Inside Apache SystemML.

CS848: Topics in Databases: Information Integration Topics covered Databases QL Query containment An evaluation of QL.

Implementation and Optimisation Techniquesgweddell/cs848/dlhb-09.pdf · Ian Horrocks Abstract ... detailed descriptions of implementation and optimisation techniques will assume,

CASSANDRA: A DECENTRALIZED STRUCTURED STORAGE SYSTEMdavid/cs848/pslides/Cassandra.pdf · CASSANDRA: A DECENTRALIZED STRUCTURED STORAGE SYSTEM ... Number of CFs not limited per table

Learning to Detect Phishing Emails Ian Fette Norman Sadeh Anthony Tomasic Presented by – Bhavin Madhani.