Caching with “Good Enough” Currency, Consistency, and Completeness Hongfei Guo University of Wisconsin Per-Åke Larson Microsoft Research Raghu Ramakrishnan University of Wisconsin



Motivation — Scaling Google


Motivation: Scaling a DBMS by Caching

[Diagram labels: Application Server (app-specific code), Caching DBMS, Backend DBMS, Updates, Asynchronous Updates]

Problem: How to tell whether the cached data is “good enough” for an application?

NO data quality requirements from the applications!
NO data quality guarantees from the caching DBMS!


Big Picture

Fine-grained data quality-aware database caching model

Apps: Specify data quality requirements in queries [SIGMOD 2004] [SIGMOD 2004 Demo]

Cache admin: Specifies local data quality. Cache: Keeps track of local data quality [VLDB 2005]

Query processing: Enforces data quality constraints [SIGMOD 2004] [VLDB 2005]

System performance evaluation [ongoing work]

[Diagram labels: Application Server, Caching DBMS, Backend DBMS]


Contributions

Goal: fine-grained data quality-aware cache management

Problems
    How does the cache track data quality?
    How does the admin specify cache properties?
    How to maintain the cache efficiently?
    How to enforce data quality constraints for queries?

A comprehensive solution
    Cache properties
    Dynamic cache model
    Efficient cache maintenance and “safety”
    Efficiently enforce data quality checking


Review: Data Quality Metrics (informal)

Currency: The time elapsed since this copy became stale

Consistency: A query result is (snapshot) consistent iff it is as if evaluated from a snapshot of the master database

C&C: Currency & Consistency


Review: Proposed SQL Syntax

BookCopy

bid  title      author
1    databases  Raghu
2    databases  Ullman

ReviewCopy

rid  bid  text
1    1    …
2    1    …
3    2    …

Query:

    SELECT *
    FROM Books B, Reviews R
    WHERE B.bid = R.bid AND B.title = 'Databases'

followed by one of the currency clauses:

    CURRENCY BOUND 10 min ON (B, R)
    CURRENCY BOUND 10 min ON (B), 30 min ON (R)
    CURRENCY BOUND 10 min ON (B, R) BY B.bid

Parts of the clause: currency bound (10 min), consistency class ((B, R)), group by (BY B.bid).

Join result:

bid  title      author  bid  rid  text
1    databases  Raghu   1    1    …
1    databases  Raghu   1    2    …
2    databases  Ullman  2    3    …
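The three clause variants can be read as constraints on the rows that feed the join result. The sketch below is one possible reading, not the authors' implementation: it assumes every cached row carries a hypothetical `snapshot` id (the master snapshot it reflects) and a hypothetical `staleness_min` value, and it checks a single consistency class (B, R) with an optional BY B.bid grouping.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    table: str            # "B" or "R"
    bid: int              # grouping key used by BY B.bid
    snapshot: int         # hypothetical id of the master snapshot the row reflects
    staleness_min: float  # hypothetical: minutes since this copy became stale


def satisfies(rows, bound_min, group_by_bid):
    """Check CURRENCY BOUND <bound_min> min ON (B, R) [BY B.bid].

    Without BY, all rows must come from one snapshot; with BY B.bid,
    only rows sharing a bid must be mutually consistent.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row.bid if group_by_bid else None].append(row)
    for members in groups.values():
        if len({r.snapshot for r in members}) > 1:
            return False  # group mixes snapshots: not consistent
        if any(r.staleness_min > bound_min for r in members):
            return False  # some row is older than the bound
    return True


rows = [
    Row("B", 1, snapshot=7, staleness_min=3),
    Row("R", 1, snapshot=7, staleness_min=3),
    Row("R", 1, snapshot=7, staleness_min=3),
    Row("B", 2, snapshot=8, staleness_min=1),
    Row("R", 2, snapshot=8, staleness_min=1),
]
```

With these rows, the grouped form is satisfied because each bid group is internally consistent, while the ungrouped form is not, since the two groups reflect different snapshots.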


Roadmap

Background
Cache data quality properties
Cache property specification
Enforcing data quality constraints
Future directions and conclusions


Cache Properties

Why Define Cache Properties?

[Diagram labels: Queries with Relaxed C&C Requirements; Query processing; Results; Cache maintenance; “= contract”]


Cache Properties (P+3C)

Presence: per object
Consistency: a set of objects
Completeness: per predicate
Currency: object staleness

Describe local data status


Presence

Example:

    SELECT *
    FROM Authors A
    WHERE authorId = 1

Question: Is an object present at the cache?


Consistency and Currency

Example:

    SELECT *
    FROM Authors A
    WHERE authorId IN (1, 2, 3)
    CURRENCY BOUND 10 min ON (A)

Question: Is a set of objects consistent and no more than 10 minutes old?


Completeness

Example:

    SELECT *
    FROM Authors A
    WHERE city = 'Madison'

Question: Are ALL authors from Madison in the cache?


Basic Concepts

[Diagram labels: Master Database with snapshots H1 and H2; Cache with View 1, View 2, View 3; Object; Tables]


Cache Property Examples

Currency = now – stale point

[Diagram labels: Master Database with snapshots H1 and H2; Cache with View 1, View 2, View 3; Present; Complete; Consistent; Stale point]
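The currency formula can be written out directly. This is a minimal sketch, assuming the cache can determine the stale point of a copy; the function names and the convention that a copy with no stale point has zero currency are illustrative assumptions, not part of the slides.

```python
from datetime import datetime, timedelta
from typing import Optional


def currency(now: datetime, stale_point: Optional[datetime]) -> timedelta:
    """Currency = now - stale point.

    Assumption: a copy that has not become stale (no stale point, or a
    stale point in the future) is treated as having currency zero.
    """
    if stale_point is None or stale_point > now:
        return timedelta(0)
    return now - stale_point


def within_bound(now: datetime, stale_point: Optional[datetime],
                 bound: timedelta) -> bool:
    """True if the copy is no staler than the requested currency bound."""
    return currency(now, stale_point) <= bound
```

For example, a copy that became stale five minutes ago meets a 10-minute bound, while one that became stale thirty minutes ago does not.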


Roadmap

Background
Cache data quality properties
Cache property specification
Enforcing data quality constraints
Future directions and conclusions


Specifying Cache Properties

Specified as integrity constraints

Single view
    Presence constraint
    Consistency constraint
    Completeness constraint

Between two views
    Presence correlation constraint
    Consistency correlation constraint


Presence Constraint

AuthorCopy:

authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

AuthorList_PCT:

authorId
1
2
3

[Diagram labels: Backend DBMS, Caching DBMS]


Presence Constraint

    CREATE VIEW AuthorCopy AS SELECT * FROM Authors

    CREATE TABLE AuthorList_PCT (authorId int)

    ALTER VIEW AuthorCopy ADD PRESENCE
        ON authorId IN (SELECT authorId FROM AuthorList_PCT)

AuthorCopy is a partially materialized view [Zhou et al 2005]. AuthorList_PCT is the control-table and authorId is the control-key.
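The effect of a control-table can be emulated with standard SQL. The sketch below uses Python's built-in `sqlite3` module and an ordinary (non-materialized) view inside a single in-memory database, so it only illustrates how the control-table contents drive which rows are present; it does not use the proposed `ALTER VIEW ... ADD PRESENCE` syntax, and the helper `cached_ids` is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Authors (authorId INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO Authors VALUES
        (1, 'Alice', 'Madison'), (2, 'Bob', 'Madison'), (3, 'Cedric', 'Seattle');

    -- Control-table: lists the control-key values that must be present.
    CREATE TABLE AuthorList_PCT (authorId INTEGER PRIMARY KEY);

    -- Stand-in for the partially materialized view: only rows whose
    -- control-key appears in the control-table are visible.
    CREATE VIEW AuthorCopy AS
        SELECT * FROM Authors
        WHERE authorId IN (SELECT authorId FROM AuthorList_PCT);
""")


def cached_ids(connection):
    """Return the authorIds currently visible through AuthorCopy."""
    cursor = connection.execute("SELECT authorId FROM AuthorCopy ORDER BY authorId")
    return [row[0] for row in cursor]


before = cached_ids(conn)                      # control-table is empty
conn.execute("INSERT INTO AuthorList_PCT VALUES (1), (3)")
after = cached_ids(conn)                       # authors 1 and 3 become present
```

Changing the cache contents is then a matter of inserting into or deleting from the control-table, which is what makes the cache model dynamic.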


Consistency Constraint

    CREATE TABLE CityList_CsCT (city string)

    ALTER VIEW AuthorCopy ADD CONSISTENCY
        ON city IN (SELECT city FROM CityList_CsCT)

CityList_CsCT:

city
Madison

AuthorList_PCT:

authorId
1
2
3

AuthorCopy:

authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

[Diagram labels: Backend DBMS, Cache Region]


Completeness Constraint

    CREATE TABLE CityList_CpCT (city string)

    ALTER VIEW AuthorCopy ADD COMPLETENESS
        ON city IN (SELECT city FROM CityList_CpCT)

CityList_CpCT:

city
Madison
New York

AuthorList_PCT:

authorId
1
3

AuthorCopy:

authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

[Diagram label: Backend DBMS]
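The completeness constraint matters when a query selects by predicate rather than by key. The sketch below is an illustrative reading, with hypothetical function names and a caller-supplied `fetch_remote` stand-in for the backend: a query on city is answered from the cache only if that city is listed in the completeness control-table.

```python
# Cache contents taken from the slide: all three authors are present.
AUTHOR_COPY = [
    {"authorId": 1, "name": "Alice", "city": "Madison"},
    {"authorId": 2, "name": "Bob", "city": "Madison"},
    {"authorId": 3, "name": "Cedric", "city": "Seattle"},
]
# Completeness control-table contents from the slide.
COMPLETE_CITIES = {"Madison", "New York"}


def complete_for(city, complete_cities):
    """True if the cache is guaranteed to hold ALL authors from `city`."""
    return city in complete_cities


def authors_in(city, author_copy, complete_cities, fetch_remote):
    """Answer 'SELECT * FROM Authors WHERE city = ?' locally when safe."""
    if complete_for(city, complete_cities):
        return [a for a in author_copy if a["city"] == city]
    # Rows for this city may be present, but the cache cannot prove
    # that none are missing, so the query goes to the backend.
    return fetch_remote(city)
```

Note that Cedric (Seattle) is present in the cache, yet a query for Seattle authors still goes remote, because presence of one object says nothing about completeness for the predicate.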


Presence Correlation Constraint

    ALTER VIEW BookCopy ADD PRESENCE
        ON authorId IN (SELECT authorId FROM AuthorCopy)

AuthorList_PCT:

authorId
1
2
3

AuthorCopy:

authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

BookCopy:

isbn  authorId  title
111   1         aaa
222   1         bbb
333   2         ccc
444   3         ddd
555   3         eee

[Diagram label: Backend DBMS]


Presence Correlation Constraint

Dependency chain (edges labeled with the control-key):

    AuthorList_PCT -(authorId)- AuthorCopy -(authorId)- BookCopy
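The chain from AuthorList_PCT through AuthorCopy to BookCopy can be traced with plain Python collections. This is a sketch under the assumption that each view holds exactly the backend rows admitted by its presence constraint; the function names are hypothetical.

```python
# Backend tables, using the rows shown on the slide.
AUTHORS = {1: ("Alice", "Madison"), 2: ("Bob", "Madison"), 3: ("Cedric", "Seattle")}
BOOKS = [(111, 1, "aaa"), (222, 1, "bbb"), (333, 2, "ccc"),
         (444, 3, "ddd"), (555, 3, "eee")]


def author_copy(author_list_pct):
    """Authors whose authorId appears in the control-table."""
    return {aid: AUTHORS[aid] for aid in sorted(author_list_pct) if aid in AUTHORS}


def book_copy(cached_authors):
    """Books whose authorId appears in AuthorCopy (presence correlation)."""
    return [book for book in BOOKS if book[1] in cached_authors]


def cached_isbns(author_list_pct):
    """ISBNs present in BookCopy for a given control-table content."""
    return [isbn for isbn, _, _ in book_copy(author_copy(author_list_pct))]
```

Adding an authorId to the control-table therefore brings in both the author row and all of that author's books, which is what allows joins between the two views to be answered locally.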


Consistency Correlation Constraint

    ALTER VIEW BookCopy ADD CONSISTENCY ROOT

AuthorList_PCT:

authorId
1
2
3

AuthorCopy:

authorId  name    city
1         Alice   Madison
2         Bob     Madison
3         Cedric  Seattle

BookCopy:

isbn  authorId  title
111   1         aaa
222   1         bbb
333   2         ccc
444   3         ddd
555   3         eee

[Diagram label: Backend DBMS]


Consistency Correlation Constraint

Dependency chain (edges labeled with the control-key):

    AuthorList_PCT -(authorId)- AuthorCopy -(authorId)- BookCopy


Cache Schema Example

[Diagram nodes: AuthorList_PCT, CityList_CsCT, AuthorCopy, BookCopy, ReviewCopy, ReviewerCopy, ReviewerList_PCT. Edge labels: authorId, authorId, isbn, reviewId, reviewerId]


Roadmap

Background
Cache data quality properties
Cache property specification
Enforcing data quality constraints
Future directions and conclusions


Extension to the Optimizer

Compile-time consistency checking

Run-time currency and inexpensive consistency checking

Cost estimation


Run-time C&C Checking

Currency guard: Check if local view V satisfies currency requirement

Consistency guard: Check if local view V satisfies consistency requirement

[Diagram labels: ChoosePlan; C&C Guard; Local plan using V; Remote plan requesting E]
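The guard-and-switch idea can be sketched as a small dispatcher. This is an illustration only: `ViewState`, its fields, and the callable plans are hypothetical stand-ins for whatever metadata and plan operators the caching DBMS actually maintains.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ViewState:
    staleness_min: float  # hypothetical: minutes since view V became stale
    consistent: bool      # hypothetical: needed rows come from one snapshot


def cc_guard(view: ViewState, bound_min: float, need_consistency: bool) -> bool:
    """Run-time check: does local view V meet the query's C&C requirement?"""
    if view.staleness_min > bound_min:
        return False  # currency guard fails
    if need_consistency and not view.consistent:
        return False  # consistency guard fails
    return True


def choose_plan(view: ViewState, bound_min: float, need_consistency: bool,
                local_plan: Callable[[], List], remote_plan: Callable[[], List]) -> List:
    """Run the local plan over V if the guard passes, else the remote plan."""
    if cc_guard(view, bound_min, need_consistency):
        return local_plan()
    return remote_plan()
```

Because the guard is evaluated at run time, the same compiled plan can switch between local and remote execution as the view's state changes between refreshes.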


Future Directions

Improve current prototype
    Read-write transactions?

Adaptive data quality aware caching policies
    Control-table content?
    Refresh intervals?

Automate cache design/tuning
    How to get a good cache schema? (i.e., cache region granularity, assignment)

Comprehensive performance evaluation
    Cache configurations?
    Comparison with other replication solutions?


Summary

Goal: fine-grained data quality-aware cache management

A comprehensive solution
    Four cache properties
    Dynamic cache model
    Efficient cache maintenance and “safety”
    Efficiently enforce C&C checking

Questions?


So long, and thanks for all the fish!


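Simple Consistency Guards Overhead

[Figure: bar chart of execution time (ms) for queries Qa, Qb, Qc, shown for Local and Remote execution, with each bar split into consistency guard time and query time. Overhead labels on the chart: 16.56%, 14.00%, 1.72%, 1.59%, 1.66%, 1.6%.]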


Single Table Consistency Guard Overhead (Qa is used)

[Figure: bar chart of execution time (ms) for categories A11a, A11b, A12, S11, S12, shown for Local and Remote execution, with each bar split into consistency guard time and query time. Overhead labels on the chart: 62.85%, 16.98%, 71.41%, 6.06%, 8.79%, 7.48%, 2.33%, 4.95%, 58.32%, 23.77%.]