2015.11
Jae Hyung Kim, Ph.D. Candidate, Department of Computer Science, Yonsei University
Big Data Platforms: History and Motivations
Index
1. Introduction
2. RDBMS vs Big Data Platforms
3. Growing Big Data Platforms
1. Introduction
• History & Motivations
– RDBMS
• History & Motivations (cont’d)
(Figure labels: User, Shared Data, Concurrent Access, Handling Failures)
• Transaction
– A powerful abstraction that forms the “interface contract” between an application program and a transactional server
(Figure: Application Lifecycle = Program Start, Begin Transaction, ..., Commit Transaction, Program End; the Transaction Boundary is marked on this lifecycle)
• Transaction (cont’d)
– The core requirement on a DBMS is ACID guarantees for the set of operations in the same transaction
– A concurrency control component guarantees the isolation properties of transactions, for both committed and aborted transactions
– A recovery component guarantees the atomicity and durability of transactions
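The transaction boundary and the atomicity guarantee can be illustrated with a minimal sketch. It uses Python's standard sqlite3 module only as a convenient stand-in for a transactional server; the account table, the names alice and bob, and the transfer function are hypothetical and not part of the slides.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst inside a single transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        # The debit above has been rolled back: all or nothing.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM account"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # commits
failed = transfer(conn, "alice", "bob", 500)  # aborts and rolls back
```

The second call debits the source account before it detects the overdraft, yet the stored balances are unchanged afterwards, which is the atomicity half of ACID seen from the application side.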
• RDBMS Architecture – Heavy!!!
– Layers: Language and Interface Layer, Query Decomposition and Optimization Layer, Query Execution Layer, Access Layer, Storage Layer
– Request execution threads: to facilitate disk I/O parallelism between different requests
(Figure labels: Clients, Requests, Database Server, Request execution threads, Data Access, Database)
• RDBMS Architecture – How data is stored
– Page
1) The minimum unit of data transfer between disk and main memory
2) The unit of caching in memory
– Slot: addressed as a page number + a slot number
– A database usually has a certain amount of preallocated disk space that consists of one or more extents
– Each extent is a range of pages that are contiguous on disk
– A page number → a disk number + a physical address on disk, by looking up an entry in an extent table and adding a relative offset
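The page-number translation described above can be sketched as follows. This is an illustrative model only: the Extent fields, the 8 KB PAGE_SIZE, and the example extent table are assumptions, not the layout of any particular DBMS.

```python
from bisect import bisect_right
from dataclasses import dataclass

PAGE_SIZE = 8192  # assumed page size in bytes


@dataclass(frozen=True)
class Extent:
    first_page: int     # first page number covered by this extent
    num_pages: int      # number of contiguous pages in the extent
    disk: int           # disk number holding the extent
    start_address: int  # byte address of the extent's first page on that disk


def locate_page(extent_table, page_number):
    """Translate a page number into (disk number, physical address).

    extent_table must be sorted by first_page.
    """
    starts = [e.first_page for e in extent_table]
    i = bisect_right(starts, page_number) - 1  # last extent starting at or before the page
    if i < 0:
        raise KeyError(page_number)
    extent = extent_table[i]
    relative = page_number - extent.first_page
    if relative >= extent.num_pages:
        raise KeyError(page_number)
    # Extent base address plus the relative offset of the page inside the extent.
    return extent.disk, extent.start_address + relative * PAGE_SIZE


EXTENT_TABLE = [
    Extent(first_page=0, num_pages=100, disk=0, start_address=0),
    Extent(first_page=100, num_pages=50, disk=1, start_address=1_000_000),
]
```

For instance, page 120 falls in the second extent at relative position 20, so it resolves to disk 1 at address 1,000,000 + 20 × 8192.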
• RDBMS Computational Model – Page model
– Parallelized transaction execution
– Requests → processing of pages (read or write)
– ACID properties of transactions → page based
– Concurrency control and recovery should be based on the page model
– t = r(x) r(y) r(z) w(u) w(x)
(Figure: partial order over the steps r(x), r(y), r(z), w(u), w(x))
※ The details of how data is manipulated within the local variables of the executing programs are mostly irrelevant
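As a small sketch of the page model, a transaction can be treated as nothing more than a sequence of read and write steps on pages. The Step type and helper names below are hypothetical. The ordering rule used in must_be_ordered (two steps on the same page, at least one of them a write, must be ordered) is the usual textbook formulation and is stated here as an assumption about what the slide's partial-order figure showed.

```python
import re
from itertools import combinations
from typing import NamedTuple


class Step(NamedTuple):
    action: str  # "r" for read, "w" for write
    page: str    # name of the page the step touches


def parse_transaction(text):
    """Parse a string such as 'r(x)r(y)w(x)' into a list of steps."""
    return [Step(a, p) for a, p in re.findall(r"([rw])\((\w+)\)", text)]


def read_set(steps):
    return {s.page for s in steps if s.action == "r"}


def write_set(steps):
    return {s.page for s in steps if s.action == "w"}


def must_be_ordered(steps):
    """Pairs of steps on the same page where at least one step is a write."""
    return [
        (a, b)
        for a, b in combinations(steps, 2)
        if a.page == b.page and "w" in (a.action, b.action)
    ]


t = parse_transaction("r(x)r(y)r(z)w(u)w(x)")
```

For the example transaction t, only r(x) and w(x) touch the same page with a write involved, so under this rule they are the only pair whose relative order is forced; the remaining steps could run in parallel.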
• Needs for huge data from Google
– More than 15,000 commodity-class PCs
– Multiple clusters distributed worldwide
– Thousands of queries served per second
– One query reads 100s of MB of data
– One query consumes 10s of billions of CPU cycles
– Google stores dozens of copies of the entire Web!
Conclusion: Need a large, distributed, highly fault-tolerant file system
→ A traditional DBMS cannot tolerate this
2. RDBMS vs Big Data Platforms
• Problems of RDBMS
– RDBMS’s clustering
(Figure labels: Data Copy Cost, Transaction, Maintenance Cost)
– Performance does not increase as expected
• Problems of RDBMS
– Scale-up vs Scale-out (Cost perspective)
Intel Xeon E5-2697 v3 (Haswell-EP): Intel (Socket 2011-v3) / tetradeca (14) cores / 28 threads / 64(32)-bit / 2.6GHz / DDR4 / 40 PCI-Express lanes
Intel Core i5-6600, 6th generation (Skylake): Intel (Socket 1151) / DDR4 / DDR3L / 64-bit / quad core / 4 threads / 3.3GHz / Intel HD 530 / 16 PCI-Express lanes
₩250,000 (Core i5-6600)
₩3,400,000 (Xeon E5-2697 v3)
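The cost argument can be made concrete with a little arithmetic. The sketch below assumes the lower price belongs to the Core i5-6600 and the higher price to the Xeon E5-2697 v3, and it deliberately ignores everything except the CPU itself (motherboards, memory, networking, power), so it only illustrates the direction of the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    name: str
    cores: int
    threads: int
    price_krw: int  # price in Korean won (assumed pairing, see lead-in)


XEON = Cpu("Intel Xeon E5-2697 v3", cores=14, threads=28, price_krw=3_400_000)
CORE_I5 = Cpu("Intel Core i5-6600", cores=4, threads=4, price_krw=250_000)


def price_per_core(cpu):
    return cpu.price_krw / cpu.cores


def chips_for_same_budget(cheap, expensive):
    """How many of the cheap chips fit in the price of one expensive chip."""
    return expensive.price_krw // cheap.price_krw


n = chips_for_same_budget(CORE_I5, XEON)
scale_out_cores = n * CORE_I5.cores

print(f"{XEON.name}: {price_per_core(XEON):,.0f} won per core")
print(f"{CORE_I5.name}: {price_per_core(CORE_I5):,.0f} won per core")
print(f"One Xeon buys {n} Core i5 chips, {scale_out_cores} cores vs {XEON.cores}")
```

Under these assumptions the commodity chip costs roughly a quarter as much per core, which is the cost-side motivation for scaling out across many cheap machines rather than scaling up a single server.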
• Google File System
– Beginning of the big data platforms
– Influenced Hadoop
– Chunk: analogous to a block, except larger (typically 64MB)
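Because a file is stored as fixed-size chunks, a byte offset in a file maps to a chunk by simple integer division. The sketch below assumes the typical 64MB chunk size mentioned above; the function names are hypothetical and no real GFS client API is implied.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64MB, the typical chunk size


def chunk_index(byte_offset):
    """Index of the chunk that contains the given byte offset of a file."""
    return byte_offset // CHUNK_SIZE


def split_range(offset, length):
    """Split a byte range into (chunk index, offset within chunk, length) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // CHUNK_SIZE
        within = offset % CHUNK_SIZE
        # Read up to the end of the current chunk or the end of the range.
        n = min(CHUNK_SIZE - within, end - offset)
        pieces.append((index, within, n))
        offset += n
    return pieces
```

A read that straddles a chunk boundary therefore turns into one request per chunk, each of which can be served by a different machine.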
• Google File System
– Read Algorithm (1/2)
• Google File System
– Read Algorithm (2/2)
• Google File System
– Write Algorithm (1/4)
• Google File System
– Write Algorithm (2/4)
• Google File System
– Write Algorithm (3/4)
• Google File System
– Write Algorithm (4/4)
• Hadoop
– HDFS + MapReduce
(Figure label: 128MB file (e.g. /data/hdfs/block1) on Local Filesystem)
• Hadoop
– HDFS + MapReduce (Computational Model)
(Figure label: On Local Filesystem)
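The MapReduce computational model can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce phases using word count as the job; it does not use Hadoop, and all function names are hypothetical.

```python
from collections import defaultdict


def map_words(split):
    """Map phase: emit a (word, 1) pair for every word in one input split."""
    for word in split.split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce phase: combine the values of one key into a single result."""
    return key, sum(values)


def run_job(splits):
    """Run map over every split, shuffle, then reduce each key."""
    intermediate = [pair for split in splits for pair in map_words(split)]
    grouped = shuffle(intermediate)
    return dict(reduce_counts(k, v) for k, v in sorted(grouped.items()))


counts = run_job(["big data", "big platforms"])
```

In a real cluster each split would correspond to a block stored on some node, the map calls would run where the blocks live, and only the shuffled intermediate pairs would cross the network.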
3. Growing Big Data Platforms
• Gartner’s hype cycle 2012
• Gartner’s hype cycle 2013
• Gartner’s hype cycle 2014
• Gartner’s hype cycle 2015
– Big data dropped from the cycle; big data is now in practice
Q&A
• Thank you