Big Data in a Relational World - Kerry Osborne's Oracle Blog (kerryosborne.oracle-guy.com/papers/Big...)
Big Data in a Relational World
Presented by: Kerry Osborne
JPMorgan Chase, December 2012
whoami
• Never Worked for Oracle
• Worked with Oracle DB Since 1982 (V2)
• Working with Exadata since early 2010
• Work for Enkitec (www.enkitec.com)
  - Enkitec owns an Exadata Half Rack (V2/X2)
  - Enkitec owns an Oracle Big Data Appliance
• Exadata Book (recently translated to Chinese)
Blog: kerryosborne.oracle-guy.com
Twitter: @KerryOracleGuy
Hadoop Aficionado
What’s the Point?
• Data Volumes are Increasing Rapidly
• Cost of Processing / Storing is High
• Scalability is a Big Concern
And …
Hadoop Is A Virus
* Stolen from Orbitz
Google Trends
Google Trends
Google Trends
Disjointed Presentation ???
• Big Data Basics
• Oracle Stuff
• Architectures
• Integration Approaches
• Products
• Exadoop Case Study
So What is “Big Data”?
• Not My Favorite Term
• Lots of Hype
• Not the Right Tool for Every Job
* Many describe it using 3 (or occasionally 4) V’s
How Many V’s?
• Volume
• Velocity
• Variety
• Value (Value Density)
Well, How Did We Get Here?
Website Growth
Google Stack
Google File System (GFS)
Map Reduce, BigTable, Chubby
Google Applications
Open Source Hadoop Stack
Hadoop File System (HDFS)
Hadoop Map Reduce, HBase, ZooKeeper
Applications, Hive, Pig
GFS Design Goals
• Inexpensive Commodity Components - failure expected
• Optimize for Large Files
• High Bandwidth More Important than Low Latency
• Typical Workload - Write Once Read Many
• High Append Concurrency
Map Reduce Design Goals
• Provide Scalability
  - add more machines and it goes faster
• Minimize Network Usage
  - they realized network resources are scarce
  - Move the Work to the Data!
• Simplify Parallel Distributed Programming
  - hides the details of:
    - parallelization
    - fault-tolerance
    - locality optimization
    - load balancing
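To make the programming model concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps in Python. It is a toy word count, not Hadoop code: the function names (map_phase, shuffle, reduce_phase, run_job) are illustrative assumptions, and the real framework supplies the parallelization, fault tolerance, locality optimization, and load balancing that this sketch leaves out.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence in a single process."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

For example, run_job(["big data big hype", "big pipe"]) counts "big" three times and every other word once. The programmer writes only the map and reduce functions; everything in between is the framework's job.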
Hadoop Meets Exadata
Oracle Open World – October, 2012
Presented by: Kerry Osborne
Hi
Traditional RDBMS Architecture
[Diagram labels: DB Server, Compute work, Storage, Plumbing]
Traditional Oracle Architecture
[Diagram labels: RAC, Cache (SGA), work, workers, dbwr, lgwr, etc…, Block Mapper (ASM), Storage]
HDFS/Hadoop Architecture
[Diagram labels: Name Node, Job Tracker, work, HA ?; two datanodes, each with tasktracker, workers, Storage]
HDFS/Hadoop Architecture
[Diagram labels: Block Mapper (namenode), Job Tracker, work, HA ?; two datanodes, each with tasktracker, workers, Storage]
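The "block mapper" role and the "move the work to the data" idea can be sketched with a toy model in Python. This is only an illustration under stated assumptions: the file name, block IDs, node names, and the schedule function are all hypothetical, and the real namenode and job tracker are far more involved.

```python
# Hypothetical metadata: which blocks make up a file, and which nodes hold each block.
BLOCK_MAP = {"cdr.log": ["blk_1", "blk_2"]}
BLOCK_LOCATIONS = {
    "blk_1": ["datanode1", "datanode2"],
    "blk_2": ["datanode2", "datanode3"],
}


def schedule(filename, free_slots):
    """Assign one task per block, preferring a node that already stores the block.

    free_slots maps node name to available task slots and is updated in place.
    Returns {block: (node, is_local)}.
    """
    assignments = {}
    for block in BLOCK_MAP[filename]:
        holders = BLOCK_LOCATIONS[block]
        local = [node for node in holders if free_slots.get(node, 0) > 0]
        # Fall back to any node with a free slot; the block then crosses the network.
        candidates = local or [node for node, slots in free_slots.items() if slots > 0]
        if not candidates:
            raise RuntimeError("no free task slots")
        node = candidates[0]
        free_slots[node] -= 1
        assignments[block] = (node, node in holders)
    return assignments
```

The lookup tables stand in for the block mapper, and the preference for a node that already holds the block is the locality optimization: the task goes to the data rather than the data to the task.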
Digression: Internode Communication
Exadata Architecture
[Diagram labels: RAC, Block Mapper (ASM), Cache, work, workers; two Storage Nodes, each with workers, Storage]
HDFS/Hadoop Architecture
[Diagram labels: Block Mapper (namenode), Job Tracker, work, HA ?; two datanodes, each with tasktracker, workers, Storage]
Oracle + Hadoop Integration
Obligatory Marketing Slide
Oracle Big Data Appliance
• Prebuilt Hadoop Stack in a Rack
• Engineered System
• Open Source Software
• Includes Cloudera Distribution
Oracle Big Data Appliance
BDA Software
Top Secret Feature of BDA
Integration Options
Many Ways to Skin the Cat
• Fuse
• Sqoop
• Oracle Big Data Connectors
Fuse – External Tables
Sqoop (SQL-to-Hadoop)
• Graduated from Incubator Status in March 2012
• Slower (no direct path?)
• Quest has a plug-in (oraoop)
• Bi-Directional
Oracle Big Data Connectors
Oracle Loader for Hadoop - OLH
Oracle Direct Connector for HDFS - ODCH
Oracle R Connector for Hadoop - ORCH
Oracle Data Integrator Application Adapter for Hadoop
Note: All Connectors are One Way
Oracle Data Integrator Application Adapter for Hadoop
ODIAAH ?
Oracle R Connector for Hadoop (ORCH)
• Provides ability to pull data from Oracle RDBMS
• Provides ability to pull data from HDFS
• Provides access to local file system
• Not really a loader tool
• Most useful for analysts
Oracle Loader for Hadoop (OLH)
• Implemented as a MapReduce job (oraloader.jar)
• Saves CPU on DB Server
• Can convert to Oracle datatypes
• Can partition data and optionally sort it
• Online - direct into Oracle tables
  - Can load into Oracle via JDBC or OCI Direct Path
• Offline - generate preprocessed files in HDFS (DP format)
Oracle Direct Connector for HDFS (ODCH)
• Uses External Tables
• Fastest - 12 TB per hour
• Can load DP files preprocessed by OLH
• Allows Oracle SQL to query HDFS data
• Doesn’t require loading into Oracle
• Downside - uses DB CPUs
Exadoop
* Mad Scientist Project
Exadoop
Unusual Situation!
• Exadata Half Rack with 4 Spare Storage Servers
• Company Playing with “Big Data” Technology
• Exadata Cells Very Similar to BDA Servers
• 4 Cells ≈ Mini BDA! (happy face)
Exadoop Layout
[Diagram legend: Exa Compute Nodes; Exa Storage Nodes (108TB raw); Hadoop Cluster (144TB raw)]
[Diagram labels: Exa Half Rack with 4 Compute Nodes and 7 Storage Nodes (252TB); Exadoop; Big Fat Pipe]
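The capacities in the layout are consistent with seven identical storage cells. Here is a quick check in Python, assuming every cell contributes the same raw capacity and that the Hadoop cluster is built from the 4 spare cells. The function name split_capacity is illustrative.

```python
def split_capacity(total_raw_tb, total_cells, hadoop_cells):
    """Split raw capacity between Exadata storage and Hadoop.

    Assumes all storage cells have identical raw capacity.
    Returns (per_cell_tb, exadata_raw_tb, hadoop_raw_tb).
    """
    per_cell_tb = total_raw_tb / total_cells
    exadata_cells = total_cells - hadoop_cells
    return per_cell_tb, exadata_cells * per_cell_tb, hadoop_cells * per_cell_tb


# Figures from the layout: 7 storage nodes, 252TB raw, 4 cells given to Hadoop.
per_cell, exadata_raw, hadoop_raw = split_capacity(252, 7, 4)
print(per_cell, exadata_raw, hadoop_raw)
```

Under those assumptions each cell holds 36TB raw, which leaves 3 cells (108TB) for Exadata storage and 4 cells (144TB) for the Hadoop cluster, matching the legend.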
Exadoop Applications
• Telecom Company
• Call Detail Records Dumped by Switches
• Loaded into HDFS via Flume
Exadoop – Proposed Architecture
[Diagram legend: Exa Compute; Exa Storage; Hadoop Cluster]
[Diagram labels: SIP Server, Flume Agent, CDR, HDFS, Packet Sniffer, HBase, Error Codes, Apex App, Java App]
Exadoop Applications
Wrap Up
Is Hadoop the right tool for the job?
Maybe
All the Cool Kids Are Doing It!
Questions?
Contact Information: Kerry Osborne
kerry.osborne@enkitec.com
kerryosborne.oracle-guy.com
www.enkitec.com