BDAAS on the Cloud

BDaaS On The Cloud: Challenges And Optimizations Abhishek Somani 20th January 2017


Page 1: BDAAS on the Cloud

BDaaS On The Cloud: Challenges And Optimizations

Abhishek Somani, 20th January 2017

Page 2: BDAAS on the Cloud

Why Cloud?

Page 3: BDAAS on the Cloud

Where Big Data falls short:

• 6-18 month implementation time
• Only 27% of Big Data initiatives were classified as “Successful” in 2014
• Only 13% of organizations achieve full-scale production
• 57% of organizations cite skills gap as a major inhibitor

[Diagram labels: Rigid and inflexible infrastructure; Non-adaptive software services; Highly specialized systems; Difficult to build and operate]

Page 4: BDAAS on the Cloud

Why Cloud?

1. Flexible Infrastructure
2. Pay only for what you actually use
3. Shared Storage
4. Heterogeneous Clusters

Page 5: BDAAS on the Cloud

Agenda

• Cloud Compute (Cluster) management
  – Challenges
  – Solutions
  – Advanced Optimizations
• Cloud Storage
  – Challenges
  – Solutions and Optimizations

Page 6: BDAAS on the Cloud

Cloud Compute

1. Properties:
   a. Ephemeral
   b. Volatile (Spot for AWS, Preemptible for GCP)
2. Challenges:
   a. Scale as per workload
   b. Separation of compute and storage
   c. Job histories, log files, and results all need to be persisted
   d. Adapting YARN/HDFS to take into account ephemeral cloud nodes

Page 7: BDAAS on the Cloud

Up-scaling for MR jobs

[Diagram. Components: User, Resource Manager, Cluster Manager, MR AppMaster, NodeManagers on Node 1, Node 2 and Node 3, containers C1 to C4. Labels: Submit Job, Launches MR AM, ContainerRequest, Allocate Resources, Task Progress, Up Scale Request, Add Node.]
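The diagram shows an AppMaster turning task progress into an up-scale request. A minimal sketch of how such a request might be sized is given here; the function name, its parameters, and the "finish within a target time" policy are all assumptions made for illustration, not the algorithm used in the talk.

```python
import math


def nodes_to_add(pending_tasks, avg_task_secs, slots_per_node,
                 current_nodes, target_secs):
    """Hypothetical sizing rule: how many extra nodes to request (0 if none)."""
    if pending_tasks == 0:
        return 0
    # Number of task "waves" one slot can run inside the target time.
    # At least one wave, so the slot count never exceeds the pending tasks:
    # extra nodes cannot speed up a job with little parallelism.
    waves = max(1, int(target_secs // avg_task_secs))
    slots_needed = math.ceil(pending_tasks / waves)
    nodes_needed = math.ceil(slots_needed / slots_per_node)
    return max(0, nodes_needed - current_nodes)


# 100 one-minute tasks, 4 slots per node, 2 nodes, 5-minute target.
print(nodes_to_add(100, 60, 4, 2, 300))
# 3 long tasks already fit on one node, so nothing is requested.
print(nodes_to_add(3, 600, 4, 1, 60))
```

The second call illustrates why expected time alone is not enough: the job is far from its target, yet adding nodes would not help because only three tasks can run in parallel.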

Page 8: BDAAS on the Cloud

Generic Up-scaling

[Diagram. Components: Resource Manager, Cluster Manager, MR AppMaster, Spark AppMaster, Tez AppMaster. Labels: Up Scale Request, Add Node.]

Page 9: BDAAS on the Cloud

Down-scaling

[Diagram. Components: User, Resource Manager, Cluster Manager, NodeManagers on Node 1, Node 2 and Node 3 running containers C1 to C4 for Job 1, Job 2 and Job 3. Labels: Submits Job, Allocates container, Status Update, Job1 Completes, Evaluates cluster is being underutilized and can be down scaled, Selects node whose estimated task completion time is lowest, Decommission Node, Graceful Shutdown, Remove Node.]
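The two decisions named in the down-scaling diagram (is the cluster underutilized, and which node finishes its tasks soonest) can be sketched as follows. The data layout, the `min_utilization` threshold, and the function name are assumptions for illustration only.

```python
def pick_node_to_remove(nodes, slots_per_node, min_utilization=0.5):
    """nodes maps a node name to the estimated remaining seconds of each
    task running on it. Returns the node to decommission, or None."""
    if len(nodes) <= 1:
        return None  # never remove the last node
    running = sum(len(tasks) for tasks in nodes.values())
    utilization = running / (slots_per_node * len(nodes))
    if utilization >= min_utilization:
        return None  # cluster is busy enough, keep every node
    # A node is free once its slowest task ends, so pick the node whose
    # slowest task has the least time remaining (idle nodes count as 0).
    return min(nodes, key=lambda name: max(nodes[name], default=0))


cluster = {"node1": [120, 30], "node2": [10], "node3": [300]}
print(pick_node_to_remove(cluster, slots_per_node=4))
```

The chosen node would then be decommissioned gracefully, that is, allowed to finish its running containers before it is removed.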

Page 10: BDAAS on the Cloud

Why is it hard?

1. Upscaling
   a. Engine specific algorithms
   b. Cannot just look at expected time (parallelism matters)
2. Downscaling
   a. Decommissioning takes time
   b. Need to consider hour boundaries
   c. Stuck on mapper output
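The "hour boundaries" point can be read as: when a node is billed per started hour, releasing it early saves nothing, so removal is best timed near the end of a paid hour. A small sketch under that assumption follows; the per-hour billing period and the five-minute window are illustrative defaults, not values from the talk.

```python
def should_release(uptime_secs, billing_period_secs=3600, window_secs=300):
    """True when the node is within window_secs of the end of the billing
    period it has already been charged for."""
    remaining = billing_period_secs - (uptime_secs % billing_period_secs)
    return remaining <= window_secs


print(should_release(600))    # 50 paid minutes still unused
print(should_release(3400))   # close to the boundary
```

Because decommissioning itself takes time, the window would need to be at least as long as a typical graceful shutdown.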

Page 11: BDAAS on the Cloud

Job History – Terminated Cluster

Page 12: BDAAS on the Cloud

Job History – Terminated Cluster

[Diagram. Components: User, Qubole UI, Proxy, Cluster, Job History Server. Labels: Clicks UI link, Proxifies Link, Authenticates the request, Finds cluster is down, Fetches jhist file from cloud, Jhist file, Rendered JobHist.]
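The request flow for a terminated cluster can be sketched as a single proxy handler. Everything below is hypothetical: the function name, the `jobhistory/<job id>.jhist` key layout, the dict standing in for cloud storage, and the string responses are placeholders chosen only to make the control flow concrete.

```python
def handle_history_request(user, job_id, authorized_users,
                           cluster_is_up, cloud_store):
    """Hypothetical proxy logic for a job history link."""
    # Authenticate the request first.
    if user not in authorized_users:
        return "403 Forbidden"
    # A live cluster can serve its own job history.
    if cluster_is_up:
        return f"forwarded to cluster for {job_id}"
    # Cluster is down: fetch the persisted jhist file from cloud storage
    # and hand it to a job history server for rendering.
    jhist = cloud_store.get(f"jobhistory/{job_id}.jhist")
    if jhist is None:
        return "404 Not Found"
    return f"rendered job history: {jhist}"


store = {"jobhistory/job_42.jhist": "<events>"}
print(handle_history_request("alice", "job_42", {"alice"}, False, store))
```

This only works if the cluster persists its jhist files to cloud storage while it is still running.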

Page 13: BDAAS on the Cloud

Advanced Optimizations

1. Volatile Nodes
   a. Lower priced nodes bought in an auction (Spot Nodes in AWS, Preemptible in GCE)
2. Hybrid Clusters
   a. Mix of stable and volatile nodes to improve stability
3. Heterogeneous Clusters
   a. Preferred machine types may not be available
   b. Preferred machine types may be more expensive than larger machines
4. Autoscaling Optimizations
   a. Packing of tasks
   b. Upload intermediate data to cloud storage
   c. Recommission nodes
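The heterogeneous-cluster point (the preferred machine type may be unavailable, or cost more than a larger machine) amounts to a small cost comparison. The sketch below assumes capacity is measured in cores and that offers arrive as simple tuples; the type names and prices are made up for the example.

```python
import math


def cheapest_option(required_cores, offers):
    """offers: list of (type_name, cores, hourly_price, available).
    Returns (type_name, node_count, hourly_cost) or None."""
    best = None
    for name, cores, price, available in offers:
        if not available:
            continue  # preferred type may simply not be obtainable
        count = math.ceil(required_cores / cores)
        cost = count * price
        if best is None or cost < best[2]:
            best = (name, count, cost)
    return best


offers = [
    ("small", 4, 0.5, True),     # the preferred type in this example
    ("large", 16, 1.5, True),    # bigger, yet cheaper per core here
    ("xlarge", 32, 2.0, False),  # not available right now
]
print(cheapest_option(16, offers))
```

Here four "small" nodes would cost 2.0 per hour, while one "large" node covers the same cores for 1.5.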

Page 14: BDAAS on the Cloud

Agenda

1. Cloud Compute (Cluster) management
   a. Challenges
   b. Scaling
   c. Advanced Optimizations
2. Cloud Storage
   a. Challenges
   b. Solutions and Optimizations

Page 15: BDAAS on the Cloud

Cloud Storage

1. Properties:
   a. Simple key value store
   b. Inexpensive
   c. Accessed via REST APIs/SDK
   d. Is the source of truth
2. Challenges:
   a. Connection establishment is expensive
   b. Copying/moving is expensive... no rename
3. Some positives:
   a. Prefix listing
   b. PUTs are atomic: the file is created when it is uploaded, unlike HDFS where it is created on first write
   c. Multipart
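To see why "no rename" makes moving expensive, consider what a directory rename turns into on a key value store: one copy and one delete per object. The in-memory `ObjectStore` class below is a stand-in written for this sketch, not a real SDK; it only counts requests.

```python
class ObjectStore:
    """Toy object store that counts every request made against it."""

    def __init__(self):
        self.objects = {}
        self.requests = 0

    def put(self, key, data):
        self.requests += 1
        self.objects[key] = data

    def list_prefix(self, prefix):
        self.requests += 1
        return sorted(k for k in self.objects if k.startswith(prefix))

    def copy(self, src, dst):
        self.requests += 1
        self.objects[dst] = self.objects[src]

    def delete(self, key):
        self.requests += 1
        del self.objects[key]


def rename_prefix(store, src_prefix, dst_prefix):
    """Emulate a directory rename: copy then delete every object."""
    for key in store.list_prefix(src_prefix):
        store.copy(key, dst_prefix + key[len(src_prefix):])
        store.delete(key)


store = ObjectStore()
for name in ("a", "b", "c"):
    store.put(f"tmp/job1/{name}", b"data")
store.requests = 0
rename_prefix(store, "tmp/job1/", "final/")
print(store.requests, sorted(store.objects))
```

The request count grows linearly with the number of objects, and the emulated rename is not atomic: a failure midway leaves objects under both prefixes.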

Page 16: BDAAS on the Cloud

Storage Optimizations

Prefix Listing

• Naive

    for path in ['/x/y/a', '/x/y/b', '/x/z/c', …]:
        result << listObject(path)

• Smart

    pathList = listPrefix('/x')
    while (entry = pathList.next()):
        if entry in ['/x/y/a', '/x/y/b', '/x/z/c', …]:
            result << entry

• Up to 1000x improvement
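The naive and smart variants from this slide can be made runnable against a toy store that counts requests. `CountingStore`, its `page_size` parameter, and the two helper functions are assumptions for this sketch; `page_size` models a store that returns a bounded number of keys per list call.

```python
import math


class CountingStore:
    """In-memory stand-in for an object store that counts list requests."""

    def __init__(self, keys, page_size=1000):
        self.keys = sorted(keys)
        self.page_size = page_size  # assumed keys returned per list call
        self.requests = 0

    def list_object(self, path):
        # One request per path, whether or not the object exists.
        self.requests += 1
        return [k for k in self.keys if k == path]

    def list_prefix(self, prefix):
        matches = [k for k in self.keys if k.startswith(prefix)]
        # One request per page of results, and at least one request.
        self.requests += max(1, math.ceil(len(matches) / self.page_size))
        return matches


def naive_listing(store, wanted):
    result = []
    for path in wanted:
        result.extend(store.list_object(path))
    return result


def smart_listing(store, prefix, wanted):
    wanted_set = set(wanted)
    return [e for e in store.list_prefix(prefix) if e in wanted_set]


keys = [f"/x/y/{i}" for i in range(500)]
wanted = keys[:300]
naive_store, smart_store = CountingStore(keys), CountingStore(keys)
naive_listing(naive_store, wanted)
smart_listing(smart_store, "/x", wanted)
print(naive_store.requests, smart_store.requests)
```

With a page size of 1000, one prefix listing replaces up to 1000 per-object calls, which is consistent with the "up to 1000x" figure on the slide.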

Page 17: BDAAS on the Cloud

Prefix Listing - Use Cases

1. Split Computation: Divide input files into tasks for Map-Reduce/Spark/Presto
2. Recovering Partitions
3. List paths matching a regex pattern (‘/x/y/z/*/*’)
4. and many more ..
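Use case 3 can be sketched as: take the longest wildcard-free prefix of the pattern, list it once, and filter the entries locally. The helper names are hypothetical, and the sketch assumes `*` matches within a single path segment; a plain list of keys stands in for the result of one prefix listing.

```python
import re


def literal_prefix(pattern):
    """Longest leading part of the pattern that contains no wildcard."""
    idx = pattern.find("*")
    return pattern if idx == -1 else pattern[:idx]


def pattern_to_regex(pattern):
    # Each '*' matches any run of characters except the path separator.
    parts = [re.escape(p) for p in pattern.split("*")]
    return re.compile("^" + "[^/]*".join(parts) + "$")


def list_matching(keys, pattern):
    prefix = literal_prefix(pattern)
    # Stands in for a single prefix-listing call against the store.
    candidates = [k for k in keys if k.startswith(prefix)]
    regex = pattern_to_regex(pattern)
    return [k for k in candidates if regex.match(k)]


keys = ["/x/y/z/a/1", "/x/y/z/b/2", "/x/y/z/a/b/3", "/x/y/q/a/1"]
print(list_matching(keys, "/x/y/z/*/*"))
```

The alternative, walking the tree one directory level at a time, would issue one list call per intermediate directory.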

Page 18: BDAAS on the Cloud

Direct Writes

• Normally:
  – Write data to temporary location, then atomically rename to final location
• With S3:
  – Write data to final location
  – Atomic PUTs deal with speculation/retries
• By default in Hive, DirectFileOutputCommitter in MR/Spark
• Tricky: retries/speculation must use same path
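The "same path" requirement can be illustrated with a toy model in which a dict assignment plays the role of an atomic PUT. The `part-00007` naming and both function names are assumptions for this sketch and are not taken from DirectFileOutputCommitter.

```python
def output_key(table_path, task_id):
    # Derived from the task id only, never from the attempt number, so a
    # retry or a speculative attempt targets exactly the same object.
    return f"{table_path}/part-{task_id:05d}"


def run_attempt(store, table_path, task_id, attempt, rows):
    """attempt is deliberately ignored when choosing the output key."""
    data = "\n".join(rows)  # the whole object is prepared before the PUT
    # Dict assignment models an atomic PUT: readers see either the old
    # object or the complete new one, never a partial file.
    store[output_key(table_path, task_id)] = data


store = {}
run_attempt(store, "s3://bucket/table", 7, attempt=0, rows=["a", "b"])
run_attempt(store, "s3://bucket/table", 7, attempt=1, rows=["a", "b"])
print(list(store))
```

If the key included the attempt number, the two attempts would leave two objects in the final location and the table would contain duplicate rows.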

Page 19: BDAAS on the Cloud

S3 Optimizations

• Object caches (per bucket): High gain for role-based accounts
• Connection pools
• Read ahead optimizations
• Streaming upload
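As one illustration of the read-ahead bullet, the reader below fetches a larger range than requested so that following sequential reads are served from memory. The class, its `read_ahead` size, and the bytes object standing in for a remote object are assumptions for this sketch.

```python
class ReadAheadReader:
    """Serves small sequential reads from one larger ranged fetch."""

    def __init__(self, blob, read_ahead=64 * 1024):
        self.blob = blob            # stands in for the remote object
        self.read_ahead = read_ahead
        self.requests = 0           # number of simulated range requests
        self.buf_start = 0
        self.buf = b""

    def _fetch(self, start, length):
        self.requests += 1
        return self.blob[start:start + length]

    def read(self, offset, length):
        end = offset + length
        buffered = (self.buf_start <= offset
                    and end <= self.buf_start + len(self.buf))
        if not buffered:
            # Fetch more than was asked for to cover upcoming reads.
            self.buf_start = offset
            self.buf = self._fetch(offset, max(length, self.read_ahead))
        rel = offset - self.buf_start
        return self.buf[rel:rel + length]


blob = bytes(range(256)) * 4
reader = ReadAheadReader(blob, read_ahead=256)
for offset in range(0, 256, 16):
    reader.read(offset, 16)
print(reader.requests)
```

Sixteen 16-byte reads cost a single request here; since establishing connections is expensive, fewer and larger requests tend to pay off for sequential scans.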

Page 20: BDAAS on the Cloud

Cache! Cache! Cache!

• RubiX: Block level file cache
• Metadata caching for ORC and Parquet

Page 21: BDAAS on the Cloud

RubiX

• Cache blocks on local disks
• Open Source
• Engine agnostic
• Works well with auto-scaling
• Consistent Hashing to assign files or blocks to nodes
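Consistent hashing is what lets a block cache coexist with auto-scaling: when a node joins or leaves, only a fraction of the blocks change owner. The ring below is a generic textbook sketch, not RubiX's implementation; the class name, the `replicas` count, and the block key format are assumptions.

```python
import bisect
import hashlib


def _hash(value):
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual points per node, for balance
        self.ring = []            # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            bisect.insort(self.ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def owner(self, key):
        # First point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self.ring, (_hash(key), "")) % len(self.ring)
        return self.ring[idx][1]


ring = HashRing(["node1", "node2", "node3"])
blocks = [f"s3://bucket/file{i}#block{i % 8}" for i in range(1000)]
before = {b: ring.owner(b) for b in blocks}
ring.add("node4")  # cluster up-scales
moved = [b for b in blocks if ring.owner(b) != before[b]]
print(len(moved), "of", len(blocks), "blocks changed owner")
```

Every block that moves goes to the new node; blocks cached on the existing nodes stay where they are, so their local copies remain useful.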

Page 22: BDAAS on the Cloud

RubiX

Page 23: BDAAS on the Cloud

Metadata Caching: ORC File Format

Page 24: BDAAS on the Cloud

Metadata Caching: Parquet File Format

Page 25: BDAAS on the Cloud

Metadata Caching

• Cache on a Redis server running on master
• Effective and efficient split computation with predicate pushdown (PPD)
• ORC and Parquet
• Engine agnostic
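The idea of combining cached file metadata with predicate pushdown during split computation can be sketched as follows. A plain dict stands in for the Redis server, and the footer layout (a list of chunks with an offset, a length, and per-column min/max statistics) is a simplification invented for this example rather than the actual ORC or Parquet format.

```python
class FooterCache:
    def __init__(self, read_footer):
        self.entries = {}            # plain dict standing in for Redis
        self.read_footer = read_footer
        self.footer_reads = 0        # reads that had to go to storage

    def get(self, path, modified_at):
        key = (path, modified_at)    # a rewritten file gets a new key
        if key not in self.entries:
            self.footer_reads += 1
            self.entries[key] = self.read_footer(path)
        return self.entries[key]


def compute_splits(files, cache, column, min_value):
    """Return (path, offset, length) for every chunk that may contain
    rows with column >= min_value."""
    splits = []
    for path, modified_at in files:
        for chunk in cache.get(path, modified_at):
            _low, high = chunk["stats"][column]
            if high >= min_value:    # otherwise the chunk is pruned
                splits.append((path, chunk["offset"], chunk["length"]))
    return splits


footers = {
    "a.orc": [
        {"offset": 0, "length": 100, "stats": {"ts": (1, 10)}},
        {"offset": 100, "length": 100, "stats": {"ts": (11, 20)}},
    ],
    "b.orc": [{"offset": 0, "length": 50, "stats": {"ts": (0, 5)}}],
}
cache = FooterCache(lambda path: footers[path])
files = [("a.orc", 1), ("b.orc", 1)]
print(compute_splits(files, cache, "ts", 15))
print(compute_splits(files, cache, "ts", 15), cache.footer_reads)
```

The second planning pass reads no footers from storage, and because the cache is keyed by file rather than by engine, any engine that plans splits could share it.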

Page 26: BDAAS on the Cloud

Thank You!

20th January 2017