Database Architecture 2 & Storage - Stanford...
Transcript of Database Architecture 2 & Storage - Stanford...
![Page 1: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/1.jpg)
Database Architecture 2 & Storage
Instructor: Matei Zahariacs245.stanford.edu
![Page 2: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/2.jpg)
Outline
Relational DBMS architecture
Alternative architectures & tradeoffs
Storage hardware
CS 245 2
![Page 3: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/3.jpg)
Summary from Last Time
System R mostly matched the architecture of a modern RDBMS» SQL» Many storage & access methods» Cost-based optimizer» Lock manager» Recovery» View-based access control
CS 245 3
![Page 4: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/4.jpg)
Differentiating by Workload
Two big classes of commercial RDBMS today
Transactional DBMS: focus on concurrent, small, low-latency transactions (e.g. MySQL, Postgres, Oracle, DB2) → real-time apps
Analytical DBMS: focus on large, parallel but mostly read-only analytics (e.g. Teradata, Redshift, Vertica) → “data warehouses”
CS 245 4
![Page 5: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/5.jpg)
How To Design Components for Transactional vs Analytical DBMS?
Component Transactional DBMS
Analytical DBMS
Data storage B-trees, row oriented storage
Column-oriented storage
Locking Fine-grained, very optimized
Coarse-grained (few writes)
Recovery Log data writes, minimize latency
Log queries
CS 245 5
![Page 6: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/6.jpg)
Outline
Relational DBMS architecture
Alternative architectures & tradeoffs
Storage hardware
CS 245 6
![Page 7: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/7.jpg)
How Can We Change the DBMS Architecture?
CS 245 7
![Page 8: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/8.jpg)
Decouple Query Processing from Storage ManagementExample: “data lake” architecture (Hadoop, S3, etc)
Large-scalefile systems or
blob storesGFS
File formats& metadata
Processingengines
MapReduce
CS 245 8
![Page 9: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/9.jpg)
Decouple Query Processing from Storage ManagementPros:» Can scale compute independently of storage
(e.g. in datacenter or public cloud)» Let different orgs develop different engines» Your data is “open” by default to new tech
Cons:» Harder to guarantee isolation, reliability, etc» Harder to co-optimize compute and storage» Can’t optimize across many compute engines» Harder to manage if too many engines!
CS 245 9
![Page 10: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/10.jpg)
Change the Data Model
Key-value stores: data is just key-value pairs, don’t worry about record internals
Message queues: data is only accessed in a specific FIFO order; limited operations
ML frameworks: data is tensors, models, etc
CS 245 10
![Page 11: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/11.jpg)
Change the Compute Model
Stream processing: Apps run continuously and system can manage upgrades, scaleup, recovery, etc
Eventual consistency: handle it at app level
CS 245 11
![Page 12: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/12.jpg)
Different Hardware Setting
Distributed databases: need to distribute your lock manager, storage manager, etc, or find system designs that eliminate them
Public cloud: “serverless” databases that can scale compute independently of storage (e.g. AWS Aurora, Google BigQuery)
CS 245 12
![Page 13: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/13.jpg)
Example: AWS Aurora Serverless
CS 245 13
![Page 14: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/14.jpg)
Outline
Relational DBMS architecture
Alternative architectures & tradeoffs
Storage hardware
CS 245 15
![Page 15: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/15.jpg)
CPU
DRAM
StorageDevices
Typical Server
CPU
...
I/O Controller
CS 245 16
Network Card
![Page 16: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/16.jpg)
Storage Performance Metrics
CS 245 17
latency (s)
throughput (bytes/s)
storage capacity(bytes, bytes/$)
CPU
![Page 17: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/17.jpg)
CS 245 18
![Page 18: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/18.jpg)
Storage Latency
RegistersL1 CacheL2 Cache
Memory
Disk
12
10
150
Tape /Optical Robot
109
106
This CampusThis Room
My Head
10 min
2 hr
2 Years
1 min
Pluto
2,000 Years
Andromeda
CS 245 19
Sacramento
![Page 19: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/19.jpg)
Max Attainable Throughput
Varies significantly by device» 100 GB/s for RAM» 2 GB/s for NVMe SSD» 130 MB/s for hard disk
Assumes large reads (≫1 block)!
CS 245 20
![Page 20: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/20.jpg)
Storage Cost
$1000 at NewEgg today buys:» 0.25 TB of RAM» 9 TB of NVMe SSD» 50 TB of magnetic disk
CS 245 21
![Page 21: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/21.jpg)
Hardware Trends over Time
Capacity/$ grows exponentially at a fast rate (e.g. double every 2 years)
Throughput grows at a slower rate (e.g. 5% per year), but new interconnects help
Latency does not improve much over time
CS 245 22
![Page 22: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/22.jpg)
Terms: Platter, Head, ActuatorCylinder, TrackSector (physical),Block (logical), Gap
…
Most Common Permanent Storage: Hard Disks
CS 245 24
![Page 23: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/23.jpg)
Top View
CS 245 25
![Page 24: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/24.jpg)
block xin memory
?
I wantblock X
Disk Access Time
CS 245 26
![Page 25: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/25.jpg)
Time = Seek Time +Rotational Delay +Transfer Time +Other
Disk Access Time
CS 245 27
![Page 26: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/26.jpg)
3-5X
X
1 N
Cylinders Traveled
Time
CS 245 28
Seek Time
![Page 27: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/27.jpg)
Typical Seek TimeRanges from» 4 ms for high end drives» 15 ms for mobile devices
In contrast, SSD access time ranges from» 0.02 ms: NVMe» 0.16 ms: SATA
CS 245 29
![Page 28: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/28.jpg)
Head Here
Block I Want
CS 245 30
Rotational Delay
![Page 29: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/29.jpg)
R = 1/2 revolution
HDDSpindle[rpm]
Averagerotational
latency [ms]4,200 7.145,400 5.567,200 4.17
10,000 3.0015,000 2.00
Typical HDD figures
Source: Wikipedia, "Hard disk drive performance characteristics"
R=0 for SSDs
CS 245 31
Average Rotational Delay
![Page 30: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/30.jpg)
Transfer rate T is around 50-130 MB/s
Transfer time: size / T for contiguous read
Block size: usually 512-4096 bytes
CS 245 32
Transfer Rate
![Page 31: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/31.jpg)
So Far: Random Block Access
What about reading the “next” block?
CS 245 33
![Page 32: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/32.jpg)
If We Do Things Right (Double Buffer, etc)
Time to get = block size / t + negligible
Potential slowdowns:» Skip gap» Next track» Discontinuous block placement
CS 245 34
Sequential access generally much faster than random access
![Page 33: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/33.jpg)
…. unless we want to verify!need to add (full) rotation + block size / t
CS 245 35
Cost of Writing: Similar to Reading
![Page 34: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/34.jpg)
To Modify Block:(a) Read Block(b) Modify in Memory(c) Write Block[(d) Verify?]
CS 245 36
Cost To Modify a Block?
![Page 35: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/35.jpg)
Performance of DRAM
The same basic issues with “lookup time” vs throughput apply to DRAM
Min read from DRAM is a cache line (64 bytes)
Even 64-byte random reads may not be as fast as sequential ones due to prefetching, page table, controllers, etc
CS 245 37
Place co-accessed data together!
![Page 36: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/36.jpg)
Example
Suppose we’re accessing 8-byte records in a DRAM with 64-byte cache line sizes
How much slower is random vs sequential?
CS 245 38
In the random case, we are reading 64 bytes for every 8 bytes we need, so we expect to max out the throughput at least 8x sooner.
![Page 37: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/37.jpg)
Storage Hierarchy
Typically want to cache frequently accessed data at a high level of the storage hierarchy to improve performance
CS 245 39
CPU
CPU Cache
DRAM
Disk
(KBs-MBs)
(GBs)
(TBs)
![Page 38: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/38.jpg)
Sizing Storage Tiers
How much high-tier storage should we have?
Can determine based on workload & cost
CS 245 40
The 5 Minute Rule for Trading MemoryAccesses for Disc AccessesJim Gray & Franco PutzoluMay 1985
![Page 39: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/39.jpg)
The Five Minute RuleSay a page is accessed every X seconds
Assume a disk costs D dollars and can do Ioperations/sec; cost of keeping this page on disk is
Cdisk = Ciop / X = D / (I X)
Assume 1 MB of RAM costs M dollars and holds Ppages; then the cost of keeping it in DRAM is:
Cmem = M / P
CS 245 41
![Page 40: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/40.jpg)
Five Minute Rule
This tells us that the page is worth caching when Cmem < Cdisk, i.e.
CS 245 42
X <
Source: The Five-minute Rule Thirty Years Later and its Impact on the Storage Hierarchy
![Page 41: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/41.jpg)
Disk ArraysMany flavors of “RAID”: striping, mirroring, etcto increase performance and reliability
logically one disk
CS 245 43
![Page 42: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/42.jpg)
Common RAID Levels
CS 245 44Image source: Wikipedia
Striping across2 disks: addsperformance butnot reliability
Mirroring across2 disks: addsreliability but notperformance (except for reads)
Striping + 1 parity disk: addsperformance and reliability atlower storage cost
![Page 43: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/43.jpg)
Coping with Disk Failures
Detection» E.g. checksum
Correction» Requires redundancy
CS 245 45
![Page 44: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/44.jpg)
Single Disk» E.g., error-correcting codes on read
Disk Array
Logical Physical
CS 245 46
At What Level Do We Cope?
![Page 45: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/45.jpg)
Logical Block Copy A Copy B
CS 245 47
Operating System
E.g., network-replicated storage
![Page 46: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/46.jpg)
Database System
E.g.,
Log
Current DB Last week’s DB
CS 245 48
![Page 47: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/47.jpg)
SummaryStorage devices offer various tradeoffs in terms of latency, throughput and cost
In all cases, data layout and access pattern matter because random ≪ sequential access
Most systems will combine multiple devices
CS 245 49
![Page 48: Database Architecture 2 & Storage - Stanford …web.stanford.edu/.../slides/03-System-Architecture-p2.pdfSummary from Last Time System R mostly matched the architecture of a modern](https://reader035.fdocuments.us/reader035/viewer/2022081612/5f442e6d676d563b1b741865/html5/thumbnails/48.jpg)
Assignment 1
Explores the effect of data layout for a simple in-memory database» Fixed set of supported queries» Implement a row store, column store,
indexed store, and your own custom store!
CS 245 50
Now posted on website!