basic db architectures & layouts - Harvard...
![Page 1: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/1.jpg)
basic db architectures & layouts
prof. Stratos Idreos
HTTP://DASLAB.SEAS.HARVARD.EDU/CLASSES/CS165/
class 4
![Page 2: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/2.jpg)
CS165, Fall 2016 Stratos Idreos /402
videos for sections 3 & 4 are online
check back every week (1-2 sections weekly)
there is a schedule but more will be added as we go
![Page 3: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/3.jpg)
CS165, Fall 2016 Stratos Idreos /403
database kernel
optimizer
parser
execution
storage buffer pool
in/out
thread pool
transactions
disk
memory
cpu
applications
sql
from last time
![Page 4: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/4.jpg)
CS165, Fall 2016 Stratos Idreos /404
query plan
database kernel
data data data
algorithms/operators
![Page 7: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/7.jpg)
CS165, Fall 2016 Stratos Idreos /405
select name from student
where GPA>3.0
select GPA>3.0
project name
student(id,name,GPA,address,class,…)
result
scan all the data?
logical plan
![Page 8: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/8.jpg)
CS165, Fall 2016 Stratos Idreos /406
select name from student
where GPA>3.0
plan 1: students -> scan GPA>3.0 -> project name -> result
plan 2: students -> index scan GPA>3.0 -> project name -> result
physical plans
![Page 10: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/10.jpg)
CS165, Fall 2016 Stratos Idreos /407
select avg(GPA) from student
where class=2017
select class=2017
project GPA
student(id,name,GPA,address,class,…)
avg GPA
result
data layout
algorithms/ operators
interfaces
data flow
![Page 11: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/11.jpg)
CS165, Fall 2016 Stratos Idreos /408
professor(id,name,…)
course(id,name, profId,…)
student(id,name,…)
enrolled(studentId, courseId,…)
give me all students enrolled in cs165
select student.name from student, enrolled, course where course.name=“cs165” and enrolled.courseId=course.id and student.id=enrolled.studentId
![Page 12: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/12.jpg)
CS165, Fall 2016 Stratos Idreos /409
select course.name=“cs165”
join enrolled.courseid=course.id
student enrolled course
join student.id=enrolled.studentid
project student.name
good plan
select student.name from student, enrolled, course where course.name=“cs165” and enrolled.courseId=course.id and student.id=enrolled.studentId
![Page 13: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/13.jpg)
CS165, Fall 2016 Stratos Idreos /4010
select course.name=“cs165”
join enrolled.courseid=course.id
student enrolled course
join student.id=enrolled.studentid
project student.name
select student.name from student, enrolled, course where course.name=“cs165” and enrolled.courseId=course.id and student.id=enrolled.studentId
pushing selects down
![Page 16: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/16.jpg)
CS165, Fall 2016 Stratos Idreos /4011
select min(A) from R where B<10 and C<80
internal language
logical plan
optimizer rules/cost model/statistics
physical plan execution
internal language
project interface/level of abstraction
![Page 17: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/17.jpg)
CS165, Fall 2016 Stratos Idreos /4012
optimizer
execution
storage
tuning
db kernel
can DBAs make wrong decisions?
can optimizers make wrong decisions?
![Page 18: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/18.jpg)
CS165, Fall 2016 Stratos Idreos /4013
memory hierarchy
data layouts
column-stores basics
![Page 19: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/19.jpg)
CS165, Fall 2016 Stratos Idreos /4014
system where db runs
applications
sql
caches
cpu - cpu - cpu - cpu
memory
disk - disk - disk - disk
cpu registers
smaller/faster
+ flash
+ non volatile memory
memory hierarchy
![Page 20: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/20.jpg)
CS165, Fall 2016 Stratos Idreos /4015
Jim Gray: IBM, Tandem, DEC, Microsoft; ACM Turing award; ACM SIGMOD Edgar F. Codd Innovations Award
disk: 100Kx, Pluto, 2 years
memory: 100x, New York, 1.5 hours
on board cache: 10x, this building, 10 min
on chip cache: 2x, this room, 1 min
registers: my head, ~0
![Page 21: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/21.jpg)
CS165, Fall 2016 Stratos Idreos /4016
registers
on chip cache
on board cache
memory
disk
CPU
memory wall
cheaper
faster
SRAM
DRAM
~1ns
~10ns
~100ns
cache miss: looking for something which is not in the cache
memory miss: looking for something which is not in memory
plot of speed over time: cpu vs. mem
![Page 22: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/22.jpg)
CS165, Fall 2016 Stratos Idreos /4017
design of storage/access methods/algorithms should minimize:
data misses
instruction misses
touch/access only what you need
![Page 23: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/23.jpg)
CS165, Fall 2016 Stratos Idreos /4018
random access & page-based access
…
need to only read x… but have to read all of page 1
page1 page2 page3
data value x
registers
on chip cache
on board cache
memory
disk
CPU
data move
![Page 24: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/24.jpg)
CS165, Fall 2016 Stratos Idreos /4019
…5 10 6 4 12 2 8 9 7 6 7 11 3 9 6
memory level N (size=120 bytes)
memory level N-1
query x<5
page size: 5x8 bytes
scan page 1: 5 10 6 4 12, result: 4 (40 bytes moved)
scan page 2: 2 8 9 7 6, result: 4 2 (80 bytes moved)
scan page 3: 7 11 3 9 6, result: 4 2 3 (120 bytes moved)
![Page 34: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/34.jpg)
CS165, Fall 2016 Stratos Idreos /4020
…5 10 6 4 12 2 8 9 7 6 7 11 3 9 6
memory level N (size=120 bytes)
memory level N-1
query x<5
page size: 5x8 bytes
an oracle gives us the positions
oracle, page 1: 5 10 6 4 12, result: 4 (40 bytes moved)
oracle, page 2: 2 8 9 7 6, result: 4 2 (80 bytes moved)
oracle, page 3: 7 11 3 9 6, result: 4 2 3 (120 bytes moved)
![Page 44: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/44.jpg)
CS165, Fall 2016 Stratos Idreos /4021
when does it make sense to have an oracle
scan = 120 bytes vs. oracle = 120 bytes (and there is no such thing as an oracle, so the oracle is not for free…)
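The scan and oracle walkthroughs on the preceding slides can be replayed in a few lines of C. This is a minimal sketch, not course code: the function names are hypothetical, and it assumes 8-byte values, 5 values per page, whole-page transfers, and a position list sorted in ascending order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { VALUES_PER_PAGE = 5, VALUE_BYTES = 8 };   /* page size: 5x8 bytes */

/* A full scan has to move every page from level N to level N-1. */
size_t bytes_moved_by_scan(size_t n_values)
{
    size_t pages = (n_values + VALUES_PER_PAGE - 1) / VALUES_PER_PAGE;
    return pages * VALUES_PER_PAGE * VALUE_BYTES;
}

/* With known positions we still move whole pages: one transfer per
   distinct page that holds a qualifying position (positions sorted). */
size_t bytes_moved_by_positions(const size_t *positions, size_t n_positions)
{
    size_t pages = 0;
    size_t last_page = SIZE_MAX;           /* sentinel: nothing fetched yet */
    for (size_t i = 0; i < n_positions; i++) {
        size_t page = positions[i] / VALUES_PER_PAGE;
        if (page != last_page) {
            pages++;
            last_page = page;
        }
    }
    return pages * VALUES_PER_PAGE * VALUE_BYTES;
}

/* Play the oracle for "x < bound": write qualifying positions to out. */
size_t qualifying_positions(const int64_t *data, size_t n,
                            int64_t bound, size_t *out)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++)
        if (data[i] < bound)
            out[k++] = i;
    return k;
}
```

On the slide data (5 10 6 4 12 2 8 9 7 6 7 11 3 9 6, query x<5) both functions return 120 bytes, because the three qualifying values sit on three different pages. Had they all been on one page, the positional variant would have moved only 40 bytes.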
![Page 45: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/45.jpg)
CS165, Fall 2016 Stratos Idreos /4022
sequential access: read one block; consume it completely; discard it; read next
what is next?
in parallel/prefetching
hardware/software can better predict/buffer sequential pages to be read
1 2 3 4
![Page 46: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/46.jpg)
CS165, Fall 2016 Stratos Idreos /4023
random access: read one block; consume it partially; discard it; might have to read it again in future; read “random” next;
![Page 47: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/47.jpg)
CS165, Fall 2016 Stratos Idreos /4024
level N-1
level N
buffer pool: remember hot blocks
why not use OS caching?
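One way to picture "remember hot blocks" is a tiny buffer pool that keeps recently used blocks and evicts the least recently used one. This is only a sketch under stated assumptions: the slide does not name a replacement policy (LRU is chosen here for illustration), all names are hypothetical, block ids are assumed non-negative, and the actual read from level N is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { POOL_FRAMES = 2 };     /* tiny on purpose, to make evictions visible */

typedef struct {
    int64_t  block_id;        /* -1 marks an empty frame */
    uint64_t last_used;       /* logical clock value of the last access */
} frame_t;

typedef struct {
    frame_t  frames[POOL_FRAMES];
    uint64_t clock;
    size_t   hits, misses;
} buffer_pool_t;

void pool_init(buffer_pool_t *bp)
{
    for (size_t i = 0; i < POOL_FRAMES; i++) {
        bp->frames[i].block_id = -1;
        bp->frames[i].last_used = 0;
    }
    bp->clock = 0;
    bp->hits = bp->misses = 0;
}

/* Request a block; returns the index of the frame that now holds it. */
size_t pool_get(buffer_pool_t *bp, int64_t block_id)
{
    size_t victim = 0;
    bp->clock++;
    for (size_t i = 0; i < POOL_FRAMES; i++) {
        if (bp->frames[i].block_id == block_id) {   /* hot block: hit */
            bp->frames[i].last_used = bp->clock;
            bp->hits++;
            return i;
        }
        if (bp->frames[i].last_used < bp->frames[victim].last_used)
            victim = i;                             /* least recently used so far */
    }
    /* miss: a real pool would read the block from level N here */
    bp->frames[victim].block_id = block_id;
    bp->frames[victim].last_used = bp->clock;
    bp->misses++;
    return victim;
}
```

Requesting blocks 1, 2, 1, 3 in that order yields one hit and three misses, and block 2 is the one evicted. A database can apply this kind of policy with knowledge of its own access patterns, which is one common argument against relying on OS caching alone.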
![Page 48: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/48.jpg)
CS165, Fall 2016 Stratos Idreos /4025
device block size
os block size
dbms block size
os and db will typically refer to pages
![Page 49: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/49.jpg)
CS165, Fall 2016 Stratos Idreos /4026
employee(id:int, name:varchar(50), office:char(5), telephone:char(10), city:varchar(30), salary:int)
(1, name1, office1, tel1, city1, salary1) (2, name2, office2, tel2, city2, salary2) (3, name3, office3, tel3, city3, salary3) (4, name4, office4, tel4, city4, salary4) (5, name5, office5, tel5, city5, salary5) (6, name6, office6, tel6, city6, salary6) (7, name7, office7, tel7, city7, salary7) (8, name8, office8, tel8, city8, salary8) (9, name9, office9, tel9, city9, salary9)
…
data storage blocks < pages < files
file
remember: the way we store data defines the best possible way we can access it
![Page 50: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/50.jpg)
CS165, Fall 2016 Stratos Idreos /4027
employee(id:int, name:varchar(50), office:char(5), telephone:char(10), city:varchar(30), salary:int)
(1, name1, office1, tel1, city1, salary1) (2, name2, office2, tel2, city2, salary2) (3, name3, office3, tel3, city3, salary3) (4, name4, office4, tel4, city4, salary4) (5, name5, office5, tel5, city5, salary5) (6, name6, office6, tel6, city6, salary6) (7, name7, office7, tel7, city7, salary7) (8, name8, office8, tel8, city8, salary8) (9, name9, office9, tel9, city9, salary9)
…
header
row1 row2 row3 …
slotted pages
page
![Page 51: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/51.jpg)
CS165, Fall 2016 Stratos Idreos /4028
free_offset, N, offset1-length1, offset2-length2,…
free space
slotted page
scan null
update var length
…
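The header fields listed on this slide (free_offset, N, and one offset-length pair per record) can be sketched as a C struct. This is a simplified illustration, not the course's layout: the sizes, the names, and the choice to keep the slot array apart from the data area are all assumptions made here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { PAGE_SIZE = 4096, MAX_SLOTS = 64 };   /* assumed sizes */

typedef struct {
    uint16_t offset;    /* where the record starts inside data[] */
    uint16_t length;    /* record length in bytes */
} slot_t;

typedef struct {
    uint16_t free_offset;         /* first free byte in data[] */
    uint16_t n;                   /* number of slots in use (N) */
    slot_t   slots[MAX_SLOTS];    /* offset1-length1, offset2-length2, ... */
    uint8_t  data[PAGE_SIZE];     /* records first, free space after them */
} slotted_page_t;

void page_init(slotted_page_t *p)
{
    p->free_offset = 0;
    p->n = 0;
}

/* Append a variable-length record; returns its slot id, or -1 if full. */
int page_insert(slotted_page_t *p, const void *rec, uint16_t len)
{
    if (p->n == MAX_SLOTS || (size_t)p->free_offset + len > PAGE_SIZE)
        return -1;
    memcpy(p->data + p->free_offset, rec, len);
    p->slots[p->n].offset = p->free_offset;
    p->slots[p->n].length = len;
    p->free_offset = (uint16_t)(p->free_offset + len);
    return p->n++;
}

/* Look a record up by slot id; returns NULL for an invalid slot. */
const void *page_get(const slotted_page_t *p, int slot, uint16_t *len)
{
    if (slot < 0 || slot >= p->n)
        return NULL;
    *len = p->slots[slot].length;
    return p->data + p->slots[slot].offset;
}
```

Because each record is reached through its slot, records of different lengths can share a page, and a record can be moved inside the page by updating only its slot entry.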
![Page 52: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/52.jpg)
CS165, Fall 2016 Stratos Idreos /4029
some things to “worry” about
how much data we transfer through the memory hierarchy
how many computations we do
level N
level N-1
…
cpu
![Page 53: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/53.jpg)
CS165, Fall 2016 Stratos Idreos /4030
row-store
A B C D
stored contiguously
one page contains all fields of multiple tuples
select A,B,C,D
select A
file
![Page 54: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/54.jpg)
CS165, Fall 2016 Stratos Idreos /4031
row-store column-store
ABCD A B C D
stored contiguously
one page contains fields of a single attribute
select A,B,C,D
select A
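The difference between the two layouts for "select A" can be made concrete with an array of structs versus one array per attribute. A minimal sketch, assuming four 32-bit integer attributes and hypothetical names throughout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Row-store: the four attributes of one tuple sit next to each other. */
typedef struct { int32_t a, b, c, d; } row_t;

/* Column-store: each attribute is its own contiguous array. */
typedef struct { const int32_t *a, *b, *c, *d; } columns_t;

/* "select A" on rows: consecutive A values are sizeof(row_t) bytes apart,
   so B, C and D travel through the memory hierarchy along with them. */
void select_a_rows(const row_t *rows, size_t n, int32_t *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = rows[i].a;
}

/* "select A" on columns: only the A array is touched. */
void select_a_columns(const columns_t *cols, size_t n, int32_t *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = cols->a[i];
}

/* Bytes that have to be read to answer "select A" over n tuples. */
size_t bytes_read_rows(size_t n)    { return n * sizeof(row_t); }
size_t bytes_read_columns(size_t n) { return n * sizeof(int32_t); }
```

For "select A,B,C,D" both layouts read the same number of bytes; for "select A" the row layout reads four times as much in this setup.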
![Page 55: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/55.jpg)
CS165, Fall 2016 Stratos Idreos /4032
history/timeline
~1960s: rows
1970: column storage ideas start appearing
1985: first rather complete column-store model
~2000: open source complete system (monetdb)
2005-now: more ideas and industry adoption of column-store designs (c-store, vertica, vectorwise and then ibm, microsoft, oracle, and more)
rows (throughout)
![Page 56: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/56.jpg)
CS165, Fall 2016 Stratos Idreos /4033
column-store with materialized IDs
ID A   ID B   ID C
R(A,B,C)
header
row1 row2 row3
good idea
![Page 57: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/57.jpg)
CS165, Fall 2016 Stratos Idreos /4034
a1 a2 a3 a4 a5 a6
b1 b2 b3 b4 b5 b6
c1 c2 c3 c4 c5 c6
virtual ids/ positional alignment
positional lookups/joins
A(i) = A + i * width(A)
tuple 1 tuple 2 tuple 3 tuple 4 tuple 5 tuple 6
A B C
fixed-width + dense
columns do not need to have the same width
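The formula A(i) = A + i * width(A) translates directly into pointer arithmetic. The sketch below assumes fixed-width, dense columns as the slide requires; `column_t`, `column_at` and `tuple_at` are hypothetical names, and the two column types (4-byte and 8-byte integers) are chosen only to show that widths may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A fixed-width, dense column: value i lives at base + i * width. */
typedef struct {
    const uint8_t *base;    /* start of the column (A) */
    size_t width;           /* width(A) in bytes */
    size_t n;               /* number of values */
} column_t;

/* A(i) = A + i * width(A): no stored ID and no search needed. */
const void *column_at(const column_t *col, size_t i)
{
    assert(i < col->n);
    return col->base + i * col->width;
}

/* Rebuild tuple i from two columns of different widths; the shared
   position i is the only thing that ties the values together. */
void tuple_at(const column_t *a, const column_t *b, size_t i,
              int32_t *a_out, int64_t *b_out)
{
    memcpy(a_out, column_at(a, i), sizeof *a_out);
    memcpy(b_out, column_at(b, i), sizeof *b_out);
}
```

The position plays the role of a virtual id, so nothing like the materialized ID columns of the previous slide has to be stored.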
![Page 58: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/58.jpg)
CS165, Fall 2016 Stratos Idreos /4035
disk memory
A B C D
A
ABC
row-store engine
ok so now we can selectively read columns but how do we process them?
early tuple reconstruction/materialization
column-store engine
option1
option2
![Page 59: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/59.jpg)
CS165, Fall 2016 Stratos Idreos /4036
registers
on chip cache
on board cache
memory
disk
CPU
memory wall
cheaper
faster
SRAM
DRAM
~1ns
~10ns
~100ns
it is not just memory and disk
we want to move as few data items as possible all the way up to the CPU
![Page 61: basic db architectures & layouts - Harvard Universitydaslab.seas.harvard.edu/.../CS165Fall2016Class4.pdf · CS165, Fall 2016 Stratos Idreos 2 /40 videos for sections 3 & 4 are online](https://reader033.fdocuments.us/reader033/viewer/2022050420/5f8f3ca8134c17431e12011b/html5/thumbnails/61.jpg)
CS165, Fall 2016 Stratos Idreos /4037
select min(C) from R where A<10 & B<20
A B C D
disk memory
write the query plan and the code/logic of each operator
do not forget about intermediate results
describe data layouts at each step
(milestone1 of project)
not having a precise final answer is OK
understanding what matters is key
concepts & designs will be repeated >>1 times
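CS165, Fall 2016 Stratos Idreos 38/40
select min(C) from R where A<10 & B<20
[figure: columns A B C D on disk; the A<10 operator scans column A in memory]
A<10:
    int *input = A;
    for (i = 0; i < tuples; i++, input++) {
        if (*input < 10) {
            *output = i;    /* record the position (ID) of the qualifying value */
            output++;
        }
    }
late reconstruction/materialization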
CS165, Fall 2016 Stratos Idreos 38/40
select min(C) from R where A<10 & B<20
[figure: columns A B C D on disk; in memory the plan runs A<10 on A -> IDs -> fetch B -> B<20 -> IDs -> fetch C -> minC]
late reconstruction/materialization
always sequential access patterns
memory contains only what is needed at any point in time
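The whole plan can be sketched in C as a chain of small operators that pass position lists (IDs) and fetched values between them. This is only one possible reading of the figure: the function names, the caller-provided scratch buffers, and the choice to return `INT_MAX` when no tuple qualifies are all assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Scan a whole column; emit the positions (IDs) of values below bound. */
size_t select_lt(const int *col, size_t tuples, int bound, size_t *ids_out)
{
    size_t n = 0;
    for (size_t i = 0; i < tuples; i++)
        if (col[i] < bound)
            ids_out[n++] = i;
    return n;
}

/* Fetch the values of a column at the given positions. */
void fetch(const int *col, const size_t *ids, size_t n, int *vals_out)
{
    for (size_t i = 0; i < n; i++)
        vals_out[i] = col[ids[i]];
}

/* Filter fetched values; keep the original IDs of those below bound. */
size_t refine_lt(const int *vals, const size_t *ids, size_t n, int bound,
                 size_t *ids_out)
{
    size_t m = 0;
    for (size_t i = 0; i < n; i++)
        if (vals[i] < bound)
            ids_out[m++] = ids[i];
    return m;
}

/* Minimum of n values; INT_MAX stands in for "no qualifying tuple". */
int min_of(const int *vals, size_t n)
{
    int m = INT_MAX;
    for (size_t i = 0; i < n; i++)
        if (vals[i] < m)
            m = vals[i];
    return m;
}

/* select min(C) from R where A<10 and B<20, late materialization.
 * ids1, ids2 and vals are scratch buffers with room for `tuples` entries. */
int min_c_late(const int *a, const int *b, const int *c, size_t tuples,
               size_t *ids1, size_t *ids2, int *vals)
{
    assert(a && b && c && ids1 && ids2 && vals);
    size_t n1 = select_lt(a, tuples, 10, ids1);       /* A<10 -> IDs    */
    fetch(b, ids1, n1, vals);                         /* IDs  -> B vals */
    size_t n2 = refine_lt(vals, ids1, n1, 20, ids2);  /* B<20 -> IDs    */
    fetch(c, ids2, n2, vals);                         /* IDs  -> C vals */
    return min_of(vals, n2);                          /* min(C)         */
}
```

Because `select_lt` emits IDs in increasing order and `refine_lt` preserves that order, each `fetch` only moves forward through its column, and column D is never read at all.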
CS165, Fall 2016 Stratos Idreos 39/40
Notes to remember
column-stores vs row-stores
it all starts with how we store the data
still, the basic concepts are the same
moving data is a major cost component
it is not just about disk…
the whole memory hierarchy matters
CS165, Fall 2016 Stratos Idreos 40/40
Architecture of a Database System (Sections 1,2,3,4)
by J. Hellerstein, M. Stonebraker and J. Hamilton
The Design and Implementation of Modern Column-Oriented Database Systems
by D. Abadi, P. Boncz, S. Harizopoulos, S. Idreos, S. Madden
please keep up with reading!
DATA SYSTEMS
prof. Stratos Idreos
class 4
basic db architectures & layouts