class 20 updates 2 - DASlab – Data Systems Laboratory...

43
updates 2.0 prof. Stratos Idreos HTTP://DASLAB.SEAS.HARVARD.EDU/CLASSES/CS165/ class 20

Transcript of class 20 updates 2 - DASlab – Data Systems Laboratory...

Page 1: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

updates 2.0prof. Stratos Idreos

HTTP://DASLAB.SEAS.HARVARD.EDU/CLASSES/CS165/

class 20

Page 2: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 2

updates

UPDATE table_name SET column1=value1,column2=value2,... WHERE some_column=some_value

INSERT INTO table_name VALUES (value1,value2,value3,...)

Page 3: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 3

the world has changed a little bit by now…

updates

Page 4: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 4

monitor CPU utilization

monitor memory hierarchy utilization

monitor clicks (frequency, locations, specific links, sequences)

Page 5: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 5

insert new entry (a,b,c,d,…) on table x

table xA B C D

update N columns, K trees, statistics, …

A

B

D

Page 6: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 6

write optimized-store

A B C D A B C D

read optimized-store

updates reads

periodic merge

and/or on-the-fly

merge

A case for fractured mirrorsRavishankar Ramamurthy, David J. DeWitt, Qi Su Very Large Databases Journal (VLDBJ), 2003

Page 7: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 7

L1

L2

Page 8: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 8

buffer

any problems

L1

L2

Page 9: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 9

L1

L2

Page 10: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 10

c0 c1 c3

c4

c5

c6

L1

L2

Page 11: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 11

The Transaction Concept:Virtues and LimitationsJim Gray, Tandem TR 81.3, 1981

in-place updates: the cardinal sin

A(t1) A(t2) A(t3)

Page 12: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 12

what can go wrong?failures & >1 queries

Page 13: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 13

update all rows where A=v1 & B=v2

to (a=a/2,b=b/4,c=c-3,d=d+2)

Page 14: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 13

A B C Dsearch (scan/index) to find row to update

select+project actions

update all rows where A=v1 & B=v2

to (a=a/2,b=b/4,c=c-3,d=d+2)

Page 15: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 13

A B C Dsearch (scan/index) to find row to update

select+project actions

list of rowIDs (positions)

A B C D

update all rows where A=v1 & B=v2

to (a=a/2,b=b/4,c=c-3,d=d+2)

(sort)

Page 16: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 13

A B C Dsearch (scan/index) to find row to update

select+project actions

list of rowIDs (positions)

A B C D

update all rows where A=v1 & B=v2

to (a=a/2,b=b/4,c=c-3,d=d+2)

we know what to update but nothing happened yet

(sort)

Page 17: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 14

level 1

CPU

A B C Dlevel 2

read page in L1 update

persist to L2

if problem (power/abort) before we write all pages

we are left with an inconsistent state

WAL: keep persistent notes as we go so we can resume or undo

update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

Page 18: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 15

level 1

CPU

A B C Dlevel 2

read page in L1 update

persist to L2

update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

Page 19: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 15

level 1

CPU

A B C Dlevel 2

read page in L1 update

persist to L2

update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

Page 20: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 15

level 1

CPU

A B C Dlevel 2

read page in L1 update

persist to L2

update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

Page 21: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 15

level 1

CPU

A B C Dlevel 2

read page in L1 update

persist to L2

apply all updates for this page

update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

Page 22: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 15

level 1

CPU

A B C Dlevel 2

read page in L1 update

persist to L2

apply all updates for this page

update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

N updates in persistent storage

Page 23: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 15

level 1

CPU

A B C Dlevel 2

read page in L1 update

persist to L2

apply all updates for this page

update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

N updates in persistent storage

(update in place/out of place) problems when failure

Page 24: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 16

recoveryclassic examplejoe owes mike 100$

both joe and mike have a Bank of Bla account

possible actionsjoe -100 mike + 100

mike + 100 joe - 100

what if there is a failure here?

Page 25: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 16

recoveryclassic examplejoe owes mike 100$

both joe and mike have a Bank of Bla account

possible actionsjoe -100 mike + 100

mike + 100 joe - 100

what if there is a failure here?

actually1)read mike; 2) mike+100; 3) write new mike;

Page 26: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 17

level 1

CPU

A B C Dlevel 2

read page in L1 update

persist to L2

update all rows where A=v1 & B=v2 to (a=a/2,b=b/4,c=c-3,d=d+2)

what do we need to remember (log)

Page 27: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 18

A B C Dsearch (scan/index) to find row to update

select+project actions

list of rowIDs (positions)

A B C D

update all rows where A=v1 & B=v2

to (a=a/2,b=b/4,c=c-3,d=d+2)

(sort)

restart: get set of positions again, and either resume from last written page or undo all previously written pages

Page 28: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 19

memory

disk

buffer updateslog

memory

disk

buffer updates

log

SSD

memory

SSD

buffer updates

log

non volatile memory

disk

Page 29: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 20

flash disks

MaSM: efficient online updates in data warehousesManos Athanassoulis, Shimin Chen, Anastasia Ailamaki, Phillip Gibbons, Radu Stoica ACM SIGMOD Conference, 2011

read/write asymmetry

need to erase a block to write it

finite number of erase cycles

Page 30: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 21

ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead LoggingC. Mohan, Donald J. Haderle, Bruce G. Lindsay, Hamid Pirahesh, Peter M. Schwarz ACM Transactions on Database Systems, 1992

C Mohan, IBM Research ACM SIGMOD Edgar F. Codd Innovations award 1993

(rumor has it he has his own Facebook server) (also check his noSQL lecture)

Page 31: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 22

classic examplejoe owes mike 100$

both joe and mike have a Bank of Bla accountΧwhat about logging and recovery during e-shopping

e.g., shopping cart, wish list, check out

recovery

Quantifying eventual consistency with PBSPeter Bailis, Shivaram Venkataraman, Michael J. Franklin, Joseph M. Hellerstein, Ion Stoica Communications of the ACM, 2014

Page 32: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 23

why noSQLdb noSQL

as apps become more complex

as apps need to be more scalable

complex legacy tuning

expensive …

simple clean

just enough …

newSQL

Mesa: Geo-Replicated, Near Real-Time, Scalable Data WarehousingAshish Gupta et al. International Conference on Very Large Databases (VLDB), 2014

F1: A Distributed SQL Database That ScalesJeff Shute et al. International Conference on Very Large Databases (VLDB), 2013

(more in 265)

Page 33: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 24

remember what we changed so we can undo or resume (depends on applications semantics)

all or nothing

Page 34: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 25

transaction any database query that can/should be seen as a single task

parse optimize

plan( read X write Z … )

Page 35: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 26

AtomicityConsistencyIsolationDurability

Jim Gray, IBM, Tandem, DEC, Microsoft ACM Turing award ACM SIGMOD Edgar F. Codd Innovations award

Page 36: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 27

AtomicityConsistencyIsolationDurability

>1 queries

all or nothing

correct

persistent

Page 37: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 28

queries arrive concurrently q1) select A>v1, B=B+b, C=C-c q2) select B>v2, avg(C) q3) select C<v3 & B>v4, A=C q4) write more queries

A B C D

Goal: process queries in parallel leave db in a consistent state return correct results

Page 38: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 29

query 1

parse optimize

plan( read X write Z … )

query 2

parse optimize

plan( read Z write Z … )

Page 39: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 30

locks

… read X …

… get_rlock(X) read X release_rlock(X) …

… write X …

… get_wlock(X) write X release_wlock(X) … commit

R WR yes noW no no

wait until lock is free

Page 40: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 31

2 phase locking1. get all locks

2. do all tasks

3. commit

4. release all locks

(and variations)

be positive! - optimistic concurrency control

HT

Page 41: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 32

lock granularity

databasetable1 table2

C1 C2

A survey of B-tree logging and recovery techniquesGoetz Graefe ACM Trans. Database Systems, 2012

Page 42: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

/34CS165, Fall 2015 Stratos Idreos 33

textbook: chapters 16, 17, 18

Page 43: class 20 updates 2 - DASlab – Data Systems Laboratory ...daslab.seas.harvard.edu/.../Class20CS165Fall2015.pdfbuffer updates log memory disk buffer updates log SSD memory SSD buffer

DATA SYSTEMSprof. Stratos Idreos

class 20

updates 2.0