BDaaS On The Cloud: Challenges And Optimizations
Abhishek Somani, 20th January 2017
Why Cloud?
Where Big Data falls short:
• 6-18 month implementation time
• Only 27% of Big Data initiatives were classified as "Successful" in 2014
• Rigid and inflexible infrastructure
• Non-adaptive software services
• Highly specialized systems
• Difficult to build and operate
• Only 13% of organizations achieve full-scale production
• 57% of organizations cite the skills gap as a major inhibitor
3
1. Flexible infrastructure
2. Pay only for what you actually use
3. Shared storage
4. Heterogeneous clusters
4
Why Cloud?
• Cloud Compute (Cluster) management
  – Challenges
  – Solutions
  – Advanced Optimizations
• Cloud Storage
  – Challenges
  – Solutions and Optimizations
5
Agenda
1. Properties:
   a. Ephemeral
   b. Volatile (Spot for AWS, Preemptible for GCP)
2. Challenges:
   a. Scale as per workload
   b. Separation of compute and storage
   c. Job histories, log files, and results all need to be persisted
   d. Adapting YARN/HDFS to take ephemeral cloud nodes into account
6
Cloud Compute
Up-scaling for MR jobs
Sequence (diagram): the user submits a job to the Resource Manager, which launches the MR AppMaster on Node 1's NodeManager. The AppMaster sends a ContainerRequest; the Resource Manager allocates resources on Node 2 (containers C1, C2). Based on task progress, the Resource Manager sends an up-scale request to the Cluster Manager, which adds Node 3 (containers C3, C4).
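The up-scaling decision above can be sketched in a few lines of Python. This is an illustrative model, not Qubole or YARN code: `ClusterManager`, `slots_per_node`, and the sizing rule are assumptions standing in for the real cluster-manager protocol.

```python
# Hypothetical sketch: the Resource Manager tallies pending container
# requests and asks the cluster manager for more nodes when demand
# exceeds free capacity. Names and numbers are illustrative.

class ClusterManager:
    def __init__(self, nodes=2):
        self.nodes = nodes          # cluster starts with two nodes

    def add_nodes(self, count):
        self.nodes += count


def upscale_if_needed(cluster, pending_containers, free_slots,
                      slots_per_node=4):
    """Request enough extra nodes to cover the container backlog."""
    deficit = pending_containers - free_slots
    if deficit > 0:
        extra = -(-deficit // slots_per_node)   # round up to whole nodes
        cluster.add_nodes(extra)
    return cluster.nodes
```

For example, 10 pending containers against 2 free slots is a deficit of 8, which at 4 slots per node means requesting 2 extra nodes.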
Generic Up-scaling
Sequence (diagram): the MR, Spark, and Tez AppMasters all route their demand through the Resource Manager, which issues an up-scale request; the Cluster Manager adds a node (Node 2) in response.
Down-scaling
Sequence (diagram): the NodeManagers send status updates for their containers (C1-C4) to the Resource Manager, which evaluates that the cluster is being underutilized and can be down-scaled, and selects the node whose estimated task completion time is lowest.
Graceful Shutdown
Sequence (diagram): the user submits Jobs 1-3 and the Resource Manager allocates containers (C1, C3) across Node 1 and Node 3. When Job 1 completes, its node is gracefully decommissioned and the Cluster Manager removes it.
1. Upscaling
   a. Engine-specific algorithms
   b. Cannot just look at expected time (parallelism matters)
2. Downscaling
   a. Decommissioning takes time
   b. Need to consider hour boundaries
   c. Stuck on mapper output
10
Why is it hard?
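The down-scaling heuristic described above (pick the node whose running tasks finish soonest, and only remove it near its per-hour billing boundary) can be sketched as follows. The field names and the 10-minute window are assumptions for illustration, not actual Qubole internals.

```python
# Illustrative sketch of the down-scaling candidate selection:
# a node is eligible only within `window_min` minutes of completing
# a billed hour; among eligible nodes, pick the one whose estimated
# task completion time is lowest.

def pick_downscale_candidate(nodes, now, window_min=10):
    """nodes: list of dicts with 'launch_time' (epoch seconds) and
    'est_completion' (seconds of remaining work). Returns a node
    dict, or None if no node is near its hour boundary."""
    candidates = []
    for n in nodes:
        uptime = now - n['launch_time']
        mins_into_hour = (uptime % 3600) / 60
        if 60 - mins_into_hour <= window_min:
            candidates.append(n)
    if not candidates:
        return None
    return min(candidates, key=lambda n: n['est_completion'])
```

A node that is 30 minutes into its billed hour is skipped even if idle, because removing it forfeits paid-for capacity; this is the "hour boundaries" point above.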
Job History – Terminated Cluster
Sequence (diagram): the user clicks a UI link in the Qubole UI; the Proxy authenticates the request, finds that the cluster is down, and proxifies the link to the Job History Server, which fetches the jhist file from cloud storage and returns the rendered job history.
1. Volatile Nodes
   a. Lower-priced nodes bought in an auction (Spot Nodes in AWS, Preemptible in GCE)
2. Hybrid Clusters
   a. Mix of stable and volatile nodes to improve stability
3. Heterogeneous Clusters
   a. Preferred machine types may not be available
   b. Preferred machine types may be more expensive than larger machines
4. Autoscaling Optimizations
   a. Packing of tasks
   b. Upload intermediate data to cloud storage
   c. Recommission nodes
13
Advanced Optimizations
1. Cloud Compute (Cluster) management
   a. Challenges
   b. Scaling
   c. Advanced Optimizations
2. Cloud Storage
   a. Challenges
   b. Solutions and Optimizations
14
Agenda
1. Properties:
   a. Simple key-value store
   b. Inexpensive
   c. Accessed via REST APIs/SDKs
   d. Is the source of truth
2. Challenges:
   a. Connection establishment is expensive
   b. Copying/moving is expensive... no rename
3. Some positives:
   a. Prefix listing
   b. PUTs are atomic: the file is created when the upload completes, unlike HDFS where it is created on first write
   c. Multipart uploads
15
Cloud Storage
• Naive
• Smart
• Up to 1000x improvement
16
Prefix Listing
# Naive: one listObject call per path
for path in ['/x/y/a', '/x/y/b', '/x/z/c', ...]:
    result << listObject(path)

# Smart: a single prefix listing, filtered client-side
pathList = listPrefix('/x')
while (entry = pathList.next()):
    if entry in ['/x/y/a', '/x/y/b', '/x/z/c', ...]:
        result << entry
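The naive-vs-smart comparison above can be made runnable with a plain dict standing in for the object store. `list_naive`/`list_smart` and the call counters are illustrative assumptions, not a real SDK; the point is that the smart version makes one listing call instead of one call per path.

```python
# Dict standing in for a key-value object store (illustrative, not S3 API).
store = {'/x/y/a': 1, '/x/y/b': 2, '/x/z/c': 3, '/q/r': 4}

def list_naive(paths):
    """Naive: one round trip per path."""
    calls = 0
    result = []
    for path in paths:
        calls += 1                      # one API call per path
        if path in store:
            result.append(path)
    return result, calls

def list_smart(prefix, paths):
    """Smart: a single prefix listing, filtered client-side."""
    calls = 1                           # one listing call total
    listing = [k for k in store if k.startswith(prefix)]
    result = [k for k in sorted(listing) if k in paths]
    return result, calls
```

With N paths under a common prefix, the naive version costs N calls while the smart version costs one listing (paged in practice), which is where the large speedups on the previous slide come from.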
Storage Optimizations
1. Split computation: divide input files into tasks for Map-Reduce/Spark/Presto
2. Recovering partitions
3. Listing paths that match a wildcard pattern ('/x/y/z/*/*')
4. And many more...
17
Prefix Listing - Use Cases
• Normally:
– Write data to a temporary location, then atomically rename it to the final location
• With S3:
– Write data directly to the final location
– Atomic PUTs deal with speculation/retries
• The default in Hive; DirectFileOutputCommitter in MR/Spark
• Tricky: retries/speculation must use the same path
18
Direct Writes
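The "tricky" point above can be sketched concretely: speculative attempts of the same task must target the same final key, so whichever atomic PUT completes last simply overwrites an identical object. The dict stands in for S3 and `run_task_attempt` is a hypothetical name, not an actual committer.

```python
# Sketch of why atomic PUTs make direct writes safe under speculation:
# the object appears only when the PUT completes, and all attempts of a
# task write identical data to a deterministic final path.

s3 = {}

def atomic_put(key, data):
    # S3 PUT is all-or-nothing: no partially written object is visible.
    s3[key] = data

def run_task_attempt(task_id, data):
    final_key = f'/out/part-{task_id:05d}'   # deterministic final path
    atomic_put(final_key, data)
    return final_key

# Two speculative attempts of task 0 leave exactly one consistent object.
k1 = run_task_attempt(0, b'rows')
k2 = run_task_attempt(0, b'rows')
```

If the attempts wrote to attempt-specific paths instead, a retry would leave a duplicate output file, which is why the path must depend only on the task, not the attempt.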
• Object caches (per bucket): high gain for role-based accounts
• Connection pools
• Read-ahead optimizations
• Streaming upload
19
S3 Optimizations
• RubiX: block-level file cache
• Metadata caching for ORC and Parquet
20
Cache! Cache! Cache!
• Cache blocks on local disks
• Open Source
• Engine agnostic
• Works well with auto-scaling
• Consistent Hashing to assign files or blocks to nodes.
21
RubiX
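The consistent-hashing point above can be illustrated with a minimal hash ring: blocks map to nodes via hash order, so when the cluster auto-scales, only the blocks nearest the new or removed node move. This is a generic sketch of the technique, not RubiX's actual implementation.

```python
import bisect
import hashlib

def _h(s):
    # Stable hash for placing both nodes and block keys on the ring.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    """Minimal consistent-hash ring with virtual nodes."""
    def __init__(self, nodes, vnodes=100):
        self.ring = sorted((_h(f'{n}#{i}'), n)
                           for n in nodes for i in range(vnodes))
        self.keys = [h for h, _ in self.ring]

    def node_for(self, block_key):
        # First ring position at or after the key's hash, wrapping around.
        idx = bisect.bisect(self.keys, _h(block_key)) % len(self.ring)
        return self.ring[idx][1]
```

Adding a fourth node to a three-node ring moves roughly a quarter of the blocks and leaves the rest cached where they were, which is why this plays well with auto-scaling.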
22
RubiX
23
Metadata Caching: ORC File Format
24
Metadata Caching: Parquet File Format
• Cache on a Redis server running on the master
• Effective and efficient split computation with predicate push-down (PPD)
• ORC and Parquet
• Engine agnostic
25
Metadata Caching
Thank You!
20th January 2017