CMU SCS 15-721 :: In-Memory Databases
Andy Pavlo // Carnegie Mellon University // Spring 2016
Lecture #02 – In-Memory Databases
DATABASE SYSTEMS
15-721
CMU 15-721 (Spring 2016)
TODAY’S AGENDA
Background
In-Memory DBMS Architectures
Historical Systems
Peloton Overview
Project #1
BACKGROUND
Much of the history of DBMSs is about avoiding the slowness of disks.
Hardware was much different when the original DBMSs were designed:
→ Uniprocessor (single-core CPU).
→ RAM was severely limited.
→ The database had to be stored on disk.
But now DRAM capacities are large enough that most databases can fit in memory. So why not just use a “traditional” disk-oriented DBMS with a really large cache?
DISK-ORIENTED DBMS
The primary storage location of the database is on non-volatile storage (e.g., HDD, SSD).
→ The database is organized as a set of fixed-length blocks called slotted pages.
The system uses an in-memory (volatile) buffer pool to cache blocks fetched from disk.
→ Its job is to manage the movement of those blocks back and forth between disk and memory.
BUFFER POOL
When a query accesses a page, the DBMS checks to see if that page is already in memory:
→ If it’s not, then the DBMS has to retrieve it from disk and copy it into a frame in its buffer pool.
→ If there are no free frames, then find a page to evict.
→ If the page being evicted is dirty, then the DBMS has to write it back to disk.
Once the page is in memory, the DBMS translates any on-disk addresses to their in-memory addresses.
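The fetch path described above can be sketched in a few lines of Python. This is a minimal illustration rather than any real system's buffer manager: the class name `BufferPool`, the dict that stands in for the disk, and the LRU eviction policy are all assumptions (the slide does not say which replacement policy is used), and page pinning is left out.

```python
from collections import OrderedDict

PAGE_SIZE = 64  # hypothetical page size, kept tiny for the example


class BufferPool:
    """Toy buffer pool: a fixed number of frames in front of a simulated disk."""

    def __init__(self, disk, num_frames):
        self.disk = disk               # page_id -> bytes (stands in for non-volatile storage)
        self.num_frames = num_frames
        self.frames = OrderedDict()    # page_id -> bytearray, ordered from LRU to MRU
        self.dirty = set()             # page ids modified since they were loaded
        self.disk_reads = 0
        self.disk_writes = 0

    def fetch(self, page_id):
        """Return the in-memory copy of a page, loading it from disk on a miss."""
        if page_id in self.frames:
            self.frames.move_to_end(page_id)   # hit: mark as most recently used
            return self.frames[page_id]
        if len(self.frames) >= self.num_frames:
            self._evict()                      # no free frame: make room first
        self.frames[page_id] = bytearray(self.disk[page_id])
        self.disk_reads += 1
        return self.frames[page_id]

    def mark_dirty(self, page_id):
        self.dirty.add(page_id)

    def _evict(self):
        # Assumed policy: evict the least recently used page.
        victim_id, victim = self.frames.popitem(last=False)
        if victim_id in self.dirty:
            self.disk[victim_id] = bytes(victim)   # dirty page: write it back to disk
            self.disk_writes += 1
            self.dirty.discard(victim_id)


disk = {i: bytes(PAGE_SIZE) for i in range(3)}
pool = BufferPool(disk, num_frames=2)
page = pool.fetch(0)
page[0] = 7
pool.mark_dirty(0)
pool.fetch(1)
pool.fetch(2)   # the pool is full, so page 0 is evicted and written back
```

Even in this toy version, every access pays for a lookup and possibly an eviction, which is the overhead the later slides measure.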
DATA ORGANIZATION
[Figure: Index entries (Page Id + Slot #), Page Table, Buffer Pool frames, and the on-disk Database of Slotted Pages]
SLOTTED PAGES
[Figure: slotted page layout with a header, Fixed-length Data Slots (tuple1, tuple2, tuple3), free space, and Variable-length Data (blob1, blob2, blob3)]
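The layout in the figure can be sketched as follows. This is a simplified illustration under stated assumptions: the class name `SlottedPage`, the 2-byte offset and length prefix inside each slot, and keeping the header fields as Python attributes instead of serialized bytes are all choices made for this example, not details from the slides.

```python
class SlottedPage:
    """Toy slotted page: fixed-length slots grow from the front, blobs from the back."""

    def __init__(self, page_size=256, slot_size=16):
        self.page_size = page_size
        self.slot_size = slot_size
        self.data = bytearray(page_size)
        self.num_slots = 0             # header field: number of slots in use
        self.blob_start = page_size    # header field: where variable-length data begins

    def free_space(self):
        return self.blob_start - self.num_slots * self.slot_size

    def insert(self, fixed, blob=b""):
        """Store a fixed-length part plus an optional blob; return the slot number."""
        if len(fixed) > self.slot_size - 4:
            raise ValueError("fixed part does not fit in a slot")
        if self.slot_size + len(blob) > self.free_space():
            raise MemoryError("page is full")
        # Variable-length data is placed at the end of the page, growing backwards.
        self.blob_start -= len(blob)
        self.data[self.blob_start:self.blob_start + len(blob)] = blob
        # The slot stores the blob's offset and length (2 bytes each), then the fixed part.
        prefix = self.blob_start.to_bytes(2, "little") + len(blob).to_bytes(2, "little")
        slot_off = self.num_slots * self.slot_size
        self.data[slot_off:slot_off + self.slot_size] = (prefix + fixed).ljust(self.slot_size, b"\0")
        self.num_slots += 1
        return self.num_slots - 1

    def read(self, slot):
        """Return (zero-padded fixed part, blob) for a slot number."""
        slot_off = slot * self.slot_size
        raw = bytes(self.data[slot_off:slot_off + self.slot_size])
        blob_off = int.from_bytes(raw[0:2], "little")
        blob_len = int.from_bytes(raw[2:4], "little")
        return raw[4:], bytes(self.data[blob_off:blob_off + blob_len])


page = SlottedPage()
slot = page.insert(b"tuple1", b"blob1")
fixed, blob = page.read(slot)
```

A record id in this scheme is just the pair (page id, slot number), which is why the slot array has to stay at fixed positions while blobs can move.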
BUFFER POOL
Every tuple access has to go through the buffer pool manager regardless of whether that data will always be in memory.
→ Always have to translate a tuple’s record id to its memory location.
→ Worker thread has to pin pages that it needs to make sure that they are not swapped to disk.
CONCURRENCY CONTROL
In a disk-oriented DBMS, the system assumes that a txn could stall at any time when it tries to access data that is not in memory.
The DBMS executes other txns at the same time so that if one txn stalls then others can keep running.
→ Has to set locks and latches to provide ACID guarantees for txns.
→ Locks are stored in a separate data structure to avoid being swapped to disk.
LOGGING & RECOVERY
Most DBMSs use STEAL + NO-FORCE buffer pool policies, so all modifications have to be flushed to the WAL before a txn can commit.
Each log entry contains the before and after images of the modified record.
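To see why both images are needed, here is a heavily simplified recovery sketch. It is not ARIES or any specific system's protocol: `LogRecord`, `recover`, and the dict-based "disk" are hypothetical names for this example, and it assumes that an uncommitted txn never overwrote a key after a committed one did, as strict locking would guarantee.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    txn_id: int
    key: str
    before: object   # before image, used to undo
    after: object    # after image, used to redo


def recover(disk_state, log, committed):
    """Rebuild a consistent state from the on-disk state and the WAL."""
    state = dict(disk_state)
    # Redo pass: with NO-FORCE, a committed txn's changes may not have reached disk.
    for rec in log:
        if rec.txn_id in committed:
            state[rec.key] = rec.after
    # Undo pass: with STEAL, an uncommitted txn's changes may have reached disk.
    for rec in reversed(log):
        if rec.txn_id not in committed:
            state[rec.key] = rec.before
    return state


# Txn 1 committed A=2 but the page was never flushed; txn 2 never committed,
# yet its dirty page (B=99) was evicted to disk before the crash.
disk_state = {"A": 1, "B": 99}
log = [LogRecord(1, "A", 1, 2), LogRecord(2, "B", 5, 99)]
recovered = recover(disk_state, log, committed={1})
```

The after image repairs the missing committed write and the before image rolls back the stolen uncommitted one.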
DISK-ORIENTED DBMS OVERHEAD
Measured CPU Cycles:
→ BUFFER POOL: 30%
→ LOCKING: 30%
→ RECOVERY: 28%
→ REAL WORK: 12%
OLTP THROUGH THE LOOKING GLASS, AND WHAT WE FOUND THERE SIGMOD, pp. 981-992, 2008.
IN-MEMORY DBMSs
Assume that the primary storage location of the database is permanently in memory.
Early ideas were proposed in the 1980s, but the approach is now feasible because DRAM prices are low and capacities are high.
WHY NOT MMAP?
Memory-map a database file into DRAM and let the OS be in charge of swapping data in and out as needed.
Use madvise and msync to give hints to the OS about what data is safe to flush.
Notable mmap DBMSs:
→ MongoDB (pre WiredTiger)
→ MonetDB
→ LMDB
Using mmap gives up fine-grained control over the contents of memory.
→ Cannot perform non-blocking memory access.
→ The “on-disk” representation has to be the same as the “in-memory” representation.
→ The DBMS has no way of knowing which pages are in memory.
A well-written DBMS always knows best.
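For reference, the mmap approach discussed above looks roughly like this with Python's standard `mmap` module. It is a sketch under assumptions: the helper name `write_via_mmap` and the 4-page temporary file are invented for the example, `flush` is the module's wrapper around msync, and the `madvise` call is guarded because it is only exposed on some platforms and on Python 3.8 or newer.

```python
import mmap
import os
import tempfile


def write_via_mmap(payload: bytes) -> bytes:
    """Write payload (at most one page) through a file mapping and read it back from the file."""
    size = 4 * mmap.PAGESIZE                  # hypothetical 4-page "database file"
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, size)                # the file must be as large as the mapping
        with mmap.mmap(fd, size) as db:
            db[0:len(payload)] = payload      # an update is just a store into mapped memory
            db.flush(0, mmap.PAGESIZE)        # msync: ask the OS to write the first page back
            if hasattr(db, "madvise") and hasattr(mmap, "MADV_DONTNEED"):
                # Hint that this page is not needed in memory for now.
                db.madvise(mmap.MADV_DONTNEED, 0, mmap.PAGESIZE)
        with open(path, "rb") as f:
            return f.read(len(payload))
    finally:
        os.close(fd)
        os.remove(path)


result = write_via_mmap(b"hello")
```

Nothing in this code says when a page is actually read from or written to disk, apart from the explicit flush. That decision belongs to the OS, which is the loss of control the slide objects to.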
BOTTLENECKS
If I/O is no longer the slowest resource, much of the DBMS’s architecture will have to change to account for other bottlenecks:
→ Locking/latching
→ Cache-line misses
→ Pointer chasing
→ Predicate evaluations
→ Data movement & copying
→ Networking (between application & DBMS)
STORAGE ACCESS LATENCIES
| | L3 | DRAM | SSD | HDD |
|---|---|---|---|---|
| Read Latency | ~20 ns | 60 ns | 25,000 ns | 10,000,000 ns |
| Write Latency | ~20 ns | 60 ns | 300,000 ns | 10,000,000 ns |

LET’S TALK ABOUT STORAGE & RECOVERY METHODS FOR NON-VOLATILE MEMORY DATABASE SYSTEMS SIGMOD, pp. 707-722, 2015.
DATA ORGANIZATION
An in-memory DBMS does not need to store the database in slotted pages, but it will still organize tuples in blocks:
→ Direct memory pointers vs. record ids
→ Fixed-length vs. variable-length data pools
→ Use block checksums to detect when software errors trash the database.
[Figure: Index (Memory Address) pointing into Fixed-Length Data Blocks, alongside Variable-Length Data Blocks]
CONCURRENCY CONTROL
Observation: The cost of a txn acquiring a lock is the same as accessing data.
An in-memory DBMS may want to detect conflicts between txns at a different granularity.
→ Fine-grained locking allows for better concurrency but requires more locks.
→ Coarse-grained locking requires fewer locks but limits the amount of concurrency.
The DBMS can store locking information about each tuple together with its data.
→ This helps with CPU cache locality.
→ Mutexes are too slow. Need to use CAS instructions.
New bottleneck is contention caused by txns trying to access data at the same time.
INDEXES
Main-memory indexes were proposed in the 1980s when cache and memory access speeds were roughly equivalent.
But then caches got faster than main memory:
→ Memory-optimized indexes performed worse than B+trees because they were not cache aware.
Indexes are usually rebuilt in an in-memory DBMS after restart to avoid logging overhead.
QUERY PROCESSING
The best strategy for executing a query plan in a DBMS changes when all of the data is already in memory.
→ Sequential scans are no longer significantly faster than random access.
The traditional tuple-at-a-time iterator model is too slow because of function calls.
→ This problem is more significant in OLAP DBMSs.
Tuple-at-a-time
→ Each operator calls next on its child to get the next tuple to process.
Operator-at-a-time
→ Each operator materializes its entire output for its parent operator.
Vector-at-a-time
→ Each operator calls next on its child to get the next chunk of data to process.
SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND B.value > 100
[Figure: query plan with π (A.id, B.value) on top of ⨝ (A.id=B.id), whose inputs are A and σ (value>100) over B]
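The three processing models can be contrasted on the example query with a small sketch. The tables `A` and `B`, their contents, the hash-join strategy, and the vector size of 2 are all assumptions made for this illustration. Python generators stand in for the `next` calls of the iterator model.

```python
A = [{"id": 1}, {"id": 2}, {"id": 3}]
B = [{"id": 1, "value": 50}, {"id": 2, "value": 150}, {"id": 3, "value": 300}]


def tuple_at_a_time():
    """Each operator is a generator that hands one tuple at a time to its parent."""
    def scan(table):
        for row in table:
            yield row

    def select(child):
        for row in child:
            if row["value"] > 100:
                yield row

    def join(left, right):
        built = {}
        for l in left:                        # build a hash table on A.id
            built.setdefault(l["id"], []).append(l)
        for r in right:                       # probe with one tuple of B at a time
            for l in built.get(r["id"], []):
                yield {**l, **r}

    def project(child):
        for row in child:
            yield (row["id"], row["value"])

    return list(project(join(scan(A), select(scan(B)))))


def operator_at_a_time():
    """Each operator materializes its entire output before the parent runs."""
    filtered = [r for r in B if r["value"] > 100]
    built = {}
    for l in A:
        built.setdefault(l["id"], []).append(l)
    joined = [{**l, **r} for r in filtered for l in built.get(r["id"], [])]
    return [(row["id"], row["value"]) for row in joined]


def vector_at_a_time(vector_size=2):
    """Like the iterator model, but each next call returns a chunk of tuples."""
    def scan(table):
        for i in range(0, len(table), vector_size):
            yield table[i:i + vector_size]

    def select(child):
        for chunk in child:
            out = [r for r in chunk if r["value"] > 100]
            if out:
                yield out

    def join(left, right):
        built = {}
        for chunk in left:
            for l in chunk:
                built.setdefault(l["id"], []).append(l)
        for chunk in right:
            out = [{**l, **r} for r in chunk for l in built.get(r["id"], [])]
            if out:
                yield out

    def project(child):
        for chunk in child:
            yield [(row["id"], row["value"]) for row in chunk]

    return [t for chunk in project(join(scan(A), select(scan(B)))) for t in chunk]
```

All three functions return the same rows. What differs is how many calls cross operator boundaries and how much intermediate data is held at once.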
LOGGING & RECOVERY
The DBMS still needs a WAL on non-volatile storage since the system could halt at any time.
→ Use group commit to batch log entries and flush them together to amortize fsync cost.
→ May be possible to use more lightweight logging schemes if using coarse-grained locking (redo only).
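Group commit can be sketched as a log writer that buffers entries and issues one fsync per batch. This is an illustrative toy, not a production WAL: the class name `GroupCommitLog`, the newline-delimited record format, and the size-only flush trigger are assumptions. A real system would typically also flush on a timer and would not acknowledge a txn's commit until its batch is durable.

```python
import os


class GroupCommitLog:
    """Toy WAL that makes a whole batch of entries durable with a single fsync."""

    def __init__(self, path, batch_size=4):
        self.f = open(path, "ab")
        self.batch_size = batch_size
        self.pending = []          # entries buffered in memory, not yet durable
        self.fsync_calls = 0

    def append(self, entry: bytes):
        self.pending.append(entry)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        self.f.write(b"".join(e + b"\n" for e in self.pending))
        self.f.flush()             # push Python's buffer to the OS
        os.fsync(self.f.fileno())  # force the OS to write the batch to stable storage
        self.fsync_calls += 1
        self.pending.clear()

    def close(self):
        self.flush()
        self.f.close()
```

With a batch size of 4, appending ten entries and then closing the log costs three fsync calls instead of ten.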
The system also still takes checkpoints to speed up recovery time.
Different methods for checkpointing:
→ Old idea: Maintain a second copy of the database in memory that is updated by replaying the WAL.
→ Switch to a special “copy-on-write” mode and then write a dump of the database to disk.
→ Fork the DBMS process and then have the child process write its contents to disk.
LARGER-THAN-MEMORY DATABASES
DRAM is fast, but not all data is accessed with the same frequency or in the same manner.
→ Hot Data: OLTP Operations
→ Cold Data: OLAP Queries
We will study techniques for how to bring back disk-resident data without slowing down the entire system.
NON-VOLATILE MEMORY
Emerging hardware that is able to get almost the same read/write speed as DRAM but with the persistence guarantees of an SSD.
→ Also called storage class memory
→ Examples: Phase-Change Memory, Memristors
It’s not clear how to build a DBMS to operate on this kind of memory. Again, we’ll cover this topic later.
NOTABLE IN-MEMORY DBMSs
Oracle TimesTen
P*TIME
Dali / DataBlitz
Altibase
SAP HANA
VoltDB / H-Store
Microsoft Hekaton
Harvard Silo
TUM HyPer
MemSQL
IBM DB2 BLU
Apache Geode
TIMESTEN
Originally SmallBase from HP Labs in 1995.
Multi-process, shared memory DBMS.
→ Single-version database using two-phase locking.
→ Dictionary-encoded columnar compression.
Bought by Oracle in 2005.
ORACLE TIMESTEN: AN IN-MEMORY DATABASE FOR ENTERPRISE APPLICATIONS VLDB, pp. 1033-1044, 2004.
DALI / DATABLITZ
Developed at AT&T Labs in the early 1990s.
Multi-process, shared memory storage manager using memory-mapped files.
Employed additional safety measures to make sure that erroneous writes to memory do not corrupt the database.
→ Meta-data is stored in a non-shared location.
→ A page’s checksum is always tested on a read; if the checksum is invalid, recover page from log.
DALI: A HIGH PERFORMANCE MAIN MEMORY STORAGE MANAGER VLDB, pp. 48-59, 1994.
P*TIME
Korean in-memory DBMS from the 2000s. Performance numbers are still impressive. Lots of interesting features: → Uses differential encoding (XOR) for log records. → Hybrid storage layouts. → Support for larger-than-memory databases.
Sold to SAP in 2005. Now part of HANA.
P*TIME: HIGHLY SCALABLE OLTP DBMS FOR MANAGING UPDATE-INTENSIVE STREAM WORKLOAD VLDB, pp. 1033-1044, 2004.
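The XOR-based differential encoding mentioned on the slide can be illustrated with a small sketch. It shows only the general idea, not P*TIME's actual log format: a log record stores the XOR of the before and after images, so the same record can be used for both redo and undo. The function names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("images must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_log_record(before: bytes, after: bytes) -> bytes:
    # Store only the difference between the old and new images.
    return xor_bytes(before, after)


def redo(before: bytes, record: bytes) -> bytes:
    # before XOR (before XOR after) == after
    return xor_bytes(before, record)


def undo(after: bytes, record: bytes) -> bytes:
    # after XOR (before XOR after) == before
    return xor_bytes(after, record)
```

Bytes that did not change XOR to zero, so a record for a small update is mostly zeros. Because XOR is commutative and associative, records of this form can in principle be applied in any order.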
PELOTON DBMS
CMU’s in-memory hybrid relational DBMS
→ Multi-version concurrency control.
→ Tile-based storage manager.
→ Multi-threaded architecture.
→ Based on PostgreSQL 9.3.
Currently supports most of SQL-92.
TILE STORAGE ARCHITECTURE
[Figure: a logical relation and its physical representation as tile groups]
Logical Relation: attributes attr1, attr2, attr3, attr4; tuples tuple1 through tuple6.
Physical Representation:
→ Tile Group A (tuple1, tuple2): Tile A-1 [attr1, attr2], Tile A-2 [attr3, attr4].
→ Tile Group B (tuple3 through tuple6): Tile B-1 [attr1, attr2], Tile B-2 [attr3, attr4].
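The physical layout in the figure can be approximated with a toy model. This is a sketch under assumptions, not Peloton's storage manager: `Tile`, `TileGroup`, and `new_group` are hypothetical names, and rows are kept as Python tuples purely for readability.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Tile:
    """Holds a subset of the attributes for the tuples in one tile group."""
    attrs: List[str]
    rows: List[tuple] = field(default_factory=list)


@dataclass
class TileGroup:
    """Tiles that together hold all attributes for the same set of tuples."""
    tiles: List[Tile]

    def insert(self, record: Dict[str, object]) -> int:
        # Split the record vertically across the tiles; return its offset.
        for tile in self.tiles:
            tile.rows.append(tuple(record[a] for a in tile.attrs))
        return len(self.tiles[0].rows) - 1

    def get(self, offset: int) -> Dict[str, object]:
        # Stitch the full tuple back together from every tile.
        out: Dict[str, object] = {}
        for tile in self.tiles:
            out.update(zip(tile.attrs, tile.rows[offset]))
        return out


def new_group() -> TileGroup:
    # Layout taken from the figure: [attr1, attr2] and [attr3, attr4].
    return TileGroup([Tile(["attr1", "attr2"]), Tile(["attr3", "attr4"])])
```

A relation is then a list of such groups, for example one group for tuple1 and tuple2 and another for tuple3 through tuple6. Nothing in the model forces every group to use the same attribute split.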
TILE STORAGE ARCHITECTURE
SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND B.value > 100
[Figure: query plan over tables A and B with a join (⨝) on A.id=B.id, a selection (σ) on value>100, and a projection (π) of A.id, B.value]
Physical Tile Group: Tile A-1 [attr1, attr2], Tile A-2 [attr3, attr4]
Logical Tile Group: Tile A1 [id], Tile B3 [value]
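The slide shows a logical tile group built from Tile A1 [id] and Tile B3 [value]. One way to read this is sketched below. It assumes, beyond what the slide states, that a logical tile stores offsets into physical tiles rather than copies of the values. The `LogicalTile` class, its methods, and the sample data are hypothetical.

```python
from typing import List, Tuple

# A physical tile is modeled as a plain list of rows (an assumption).
PhysicalTile = List[tuple]


class LogicalTile:
    """Holds references into physical tiles instead of copies of the data."""

    def __init__(self) -> None:
        self.columns: List[Tuple[str, PhysicalTile, int]] = []
        self.positions: List[List[int]] = []  # one offset list per column

    def add_column(self, name, tile, col_idx, offsets) -> None:
        # Record which physical column this logical column points at,
        # and which physical rows it selects.
        self.columns.append((name, tile, col_idx))
        self.positions.append(list(offsets))

    def materialize(self) -> List[dict]:
        # Follow the offsets only when the values are actually needed.
        n = len(self.positions[0]) if self.positions else 0
        return [
            {name: tile[self.positions[c][r]][idx]
             for c, (name, tile, idx) in enumerate(self.columns)}
            for r in range(n)
        ]


tile_a1 = [(1, "x"), (2, "y"), (3, "z")]  # physical columns: id, other
tile_b3 = [(50,), (150,), (300,)]         # physical column: value

# Pretend the join and filter kept physical rows 1 and 2 on both sides.
lt = LogicalTile()
lt.add_column("A.id", tile_a1, 0, [1, 2])
lt.add_column("B.value", tile_b3, 0, [1, 2])
```

In this model, operators in the plan would pass `LogicalTile` objects to each other and copy values out of the physical tiles only when `materialize()` is called.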
PROJECT #1
Implement an in-memory hash join operator that supports four different join types:
→ INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN.
You are free to implement either the “classic” algorithm or the GRACE hash join algorithm.
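To make the four join types concrete, here is a minimal sketch of the “classic” (non-partitioned) hash join over lists of dictionaries. It is written in Python for brevity, although the project itself is in C++, and it does not use Peloton's executor API: the `hash_join` function, its parameters, and the `(left_row, right_row)` output shape are assumptions made for the example.

```python
from collections import defaultdict


def hash_join(left, right, key, how="inner"):
    """Classic hash join; `how` is "inner", "left", "right", or "full".

    Output rows are (left_row, right_row) pairs; a missing side is None.
    """
    if how not in ("inner", "left", "right", "full"):
        raise ValueError(f"unknown join type: {how}")

    # Build phase: hash the right input on the join key.
    table = defaultdict(list)
    for i, rrow in enumerate(right):
        if rrow[key] is not None:  # a NULL key never matches
            table[rrow[key]].append(i)

    matched_right = set()
    out = []
    # Probe phase: look up each left row in the hash table.
    for lrow in left:
        hits = table.get(lrow[key], []) if lrow[key] is not None else []
        for i in hits:
            matched_right.add(i)
            out.append((lrow, right[i]))
        if not hits and how in ("left", "full"):
            out.append((lrow, None))  # pad the unmatched left row

    if how in ("right", "full"):
        # Emit the right rows that no left row matched.
        out.extend((None, rrow) for i, rrow in enumerate(right)
                   if i not in matched_right)
    return out
```

The outer variants differ from the inner join only in what happens to unmatched rows, which is why the sketch tracks the set of right-side rows that were hit during the probe phase.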
PROJECT #1 – TESTING
We are providing you with a C++ unit test for you to check your implementation. We also have a SQL batch script that will execute a couple of different queries.
We strongly encourage you to do your own additional testing.
→ Make sure that you disable the other join types to force the optimizer to always pick hash join plans.
PROJECT #1 – GRADING
We will run additional tests beyond what we provided you for grading.
→ Bonus points will be given to the student with the fastest implementation.
→ We will use Valgrind when testing your code.
All source code must pass the ClangFormat formatting checker.
→ See the Peloton documentation for formatting guidelines.
DEVELOPMENT ENVIRONMENT
Peloton only builds on 64-bit Linux. You can do development on either Linux or OS X (through a VM).
→ We have a Vagrant config file to automatically create a development Ubuntu VM for you.
This is CMU, so I’m going to assume that each of you is capable of getting access to a machine.
GITHUB PRIVATE REPO
If you want to use GitHub for your projects, you must use a private repo for Projects #1 and #2.
Sign up for a student account on GitHub to get five free private repositories: https://education.github.com/pack
PROJECT #1
Due Date: February 8th, 2016 @ 11:59pm
Projects will be turned in using Autolab.
Full description and instructions: http://15721.courses.cs.cmu.edu/spring2016/project1.html
PARTING THOUGHTS
Disk-oriented DBMSs are a relic of the past.
→ Most databases fit entirely in DRAM on a single machine.
The world has finally become comfortable with in-memory data storage and processing.
Never use mmap for your DBMS.
NEXT CLASS
Transactions & Concurrency Control