Three Perspectives & Two Problems Shivnath Babu Duke University.


Page 1: Three Perspectives & Two Problems Shivnath Babu Duke University.

Three Perspectives & Two Problems

Shivnath Babu
Duke University

Page 2: Three Perspectives & Two Problems Shivnath Babu Duke University.

Outline
• I want to highlight two problems / thoughts
• First some context

Page 3: Three Perspectives & Two Problems Shivnath Babu Duke University.

Three Perspectives
• The Cloud era is ringing in interesting changes
  o Increasingly overlapping roles
  o Joe Schmoe can now provision a 100-node Hadoop cluster in minutes
  o Administrators in traditional roles are getting laid off

[Diagram labels: System Designers / Developers; Users of the System; System Administrators]

Page 4: Three Perspectives & Two Problems Shivnath Babu Duke University.

Three Perspectives
• The Cloud era is ringing in interesting changes
  o Software abstractions / packaging / release cycle have changed
  o More visibility into how users use the software

[Diagram labels: System Designers / Developers; Users of the System; System Administrators]

Page 5: Three Perspectives & Two Problems Shivnath Babu Duke University.

Problem 1: Automated Experiment-driven System Management

Page 6: Three Perspectives & Two Problems Shivnath Babu Duke University.

Taking the (Next) Bite Out of System Administration
• Cloud has automated some system administration tasks
• Can we automate others:
  o System tuning (configuration parameters, SQL queries, MapReduce jobs)
  o Detecting and repairing data corruption (disaster recovery)
  o Software / service testing

Page 7: Three Perspectives & Two Problems Shivnath Babu Duke University.

Database Performance Tuning

[Figure: 2-dim projection of an 11-dim surface]

Page 8: Three Perspectives & Two Problems Shivnath Babu Duke University.

MapReduce Job Tuning in Hadoop

[Figure: 2-dim projection of a 13-dim surface]

Page 9: Three Perspectives & Two Problems Shivnath Babu Duke University.

Taking the (Next) Bite Out of System Administration
• Cloud has automated some system administration tasks
• Can we automate others:
  o System tuning (configuration parameters, SQL queries, MapReduce jobs)
  o Detecting and repairing data corruption (disaster recovery)
  o Software / service testing

Page 10: Three Perspectives & Two Problems Shivnath Babu Duke University.

Data Corruption
• Stored data becomes different from what it is supposed to be
  o Bugs in software / firmware
  o Alpha particles, bit rot
  o Human mistakes
• Bad things have happened
  o Data loss
  o System unavailability
  o Incorrect results

[Diagram labels: Stored Data; Applications; File-System; Storage; Database]

Page 11: Three Perspectives & Two Problems Shivnath Babu Duke University.

Taking the (Next) Bite Out of System Administration
• Cloud has automated some system administration tasks
• Can we automate others:
  o System tuning (configuration parameters, SQL queries, MapReduce jobs)
  o Detecting and repairing data corruption (disaster recovery)
  o Software / service testing

Page 12: Three Perspectives & Two Problems Shivnath Babu Duke University.

Key Insight: Need to Run “Experiments”
• System tuning:
  o Running workload under various system settings
• Detecting data corruption:
  o Running integrity checks to verify data correctness
• Software / service testing:
  o Running the tests

[Diagram labels: Stored Data; Applications; File-System; Storage; Database]

Challenge: Where / How / When to run experiments?
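The "running workload under various system settings" idea can be pictured as a simple experiment loop. This is only a minimal sketch under stated assumptions, not the method from the talk: `measure` is a hypothetical callback that runs the workload under one setting and returns its running time, and the parameter names in the usage example are made up for illustration.

```python
import itertools
from typing import Callable, Dict, Iterable, List, Tuple

Setting = Dict[str, int]


def enumerate_settings(space: Dict[str, Iterable[int]]) -> List[Setting]:
    """Expand a parameter space into every combination of values (a full grid)."""
    names = sorted(space)
    combos = itertools.product(*(space[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


def run_experiments(
    space: Dict[str, Iterable[int]],
    measure: Callable[[Setting], float],
) -> List[Tuple[float, Setting]]:
    """Run one experiment per setting and return (running_time, setting) pairs,
    fastest first. `measure` is a hypothetical hook: it must run the workload
    under the given setting and return the observed running time."""
    results = [(measure(setting), setting) for setting in enumerate_settings(space)]
    results.sort(key=lambda pair: pair[0])
    return results


if __name__ == "__main__":
    # Illustrative only: a synthetic cost function stands in for a real workload.
    space = {"reduce_tasks": [1, 2, 4], "sort_buffer_mb": [50, 100]}
    fake = lambda s: abs(s["reduce_tasks"] - 2) * 10 + abs(s["sort_buffer_mb"] - 100) / 10
    best_time, best_setting = run_experiments(space, fake)[0]
    print(best_setting, best_time)
```

A full grid grows exponentially with the number of parameters, so on an 11- or 13-dimensional surface a real tool would have to choose which experiments to run rather than enumerate them all.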

Page 13: Three Perspectives & Two Problems Shivnath Babu Duke University.

Cloud is Part of the Answer
• Take snapshots of production data at low overhead
• Fire up production-like instances of the system
  o Pay-as-you-go, elasticity
• Run the experiments

[Diagram labels: Production Data (Applications; File-System; Storage; Database); Data on system for doing experiments (Applications; File-System; Storage; Database)]
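As one illustration of an integrity-check experiment that could run against a snapshot instead of the production copy, the sketch below compares file checksums with a previously recorded manifest. It is an assumption-laden example, not the talk's mechanism: the manifest format and the function names `build_manifest` and `find_corruption` are hypothetical, and checksums are only one of many possible integrity checks.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Record a checksum for every file under `root` (hypothetical manifest format)."""
    return {
        str(path.relative_to(root)): file_checksum(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_corruption(snapshot_root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return relative paths in the snapshot that are missing or whose
    checksum differs from the one recorded in the manifest."""
    suspect = []
    for relative, expected in manifest.items():
        candidate = snapshot_root / relative
        if not candidate.is_file() or file_checksum(candidate) != expected:
            suspect.append(relative)
    return sorted(suspect)
```

Running the check on a snapshot keeps the read load off the production system, which is the point the slide makes about where experiments should run.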

Page 14: Three Perspectives & Two Problems Shivnath Babu Duke University.

Power of Experiments to the People

[Diagram labels: Resources; Declarative Language; Plan optimized sequence of expts; Conduct expts automatically; Declarative benchmarking & tuning; Protecting against data corruption]

Page 15: Three Perspectives & Two Problems Shivnath Babu Duke University.

Problem 2: Data-Parallel Computing for the Masses

Page 16: Three Perspectives & Two Problems Shivnath Babu Duke University.

Challenges
• Joe Schmoe can now provision a 100-node Hadoop cluster in minutes. Is that enough?
• Joe may need answers to:
  o How many reduce tasks to use in MapReduce job J for getting the best perf. on my 8-node production cluster?
  o My current cluster needs more than 6 hours to process 1 day’s worth of data. Want to reduce that to under 3 hours. How many and what type of Amazon EC2 nodes to use?
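The second question can be framed as picking the cheapest cluster whose estimated running time meets the deadline. The sketch below assumes the running-time estimates are already available from somewhere (measurements or a prediction model); the `Candidate` type, the node-type names, and every number in the usage example are hypothetical placeholders, not real EC2 prices or measured results. Billing granularity is ignored.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """One possible cluster configuration (all fields are illustrative inputs)."""
    node_type: str
    num_nodes: int
    est_hours: float            # estimated running time for the job
    price_per_node_hour: float  # assumed hourly price of one node

    @property
    def est_cost(self) -> float:
        # Simple cost model: nodes x hourly price x hours.
        return self.num_nodes * self.price_per_node_hour * self.est_hours


def cheapest_within_deadline(
    candidates: Iterable[Candidate], deadline_hours: float
) -> Optional[Candidate]:
    """Return the lowest-cost candidate that finishes within the deadline,
    or None if no candidate is fast enough."""
    feasible = [c for c in candidates if c.est_hours <= deadline_hours]
    return min(feasible, key=lambda c: c.est_cost, default=None)


if __name__ == "__main__":
    # Placeholder numbers only, chosen to make the tradeoff visible.
    options = [
        Candidate("type-a", 4, est_hours=5.0, price_per_node_hour=0.10),
        Candidate("type-b", 4, est_hours=2.5, price_per_node_hour=0.40),
        Candidate("type-b", 6, est_hours=2.0, price_per_node_hour=0.40),
    ]
    print(cheapest_within_deadline(options, deadline_hours=3.0))
```

The hard part in practice is obtaining `est_hours` for configurations that have never been run.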

Page 17: Three Perspectives & Two Problems Shivnath Babu Duke University.

Performance Vs. Price Tradeoff

[Figure: two bar charts over Node Type on Amazon EC2 (m1.small, m1.large, m1.xlarge), each with series for 2 nodes, 4 nodes, and 6 nodes. One chart plots Execution Time (sec), axis 0 to 12000; the other plots Cost ($), axis $0.00 to $6.00.]

Page 18: Three Perspectives & Two Problems Shivnath Babu Duke University.

Spectrum
• Database Systems
  o SQL
  o Known data-access patterns
  o Fixed set of operators
  o Cost-based optimizers, What-if engines
• Grid Computing
  o Python / R / Java
  o Unknown data-access patterns
  o Black-box functions
• Newer Data-Parallel Systems

Page 19: Three Perspectives & Two Problems Shivnath Babu Duke University.

Starfish: Self-Tuning Analytics on Big Data

[Architecture diagram labels]
• What-if Engine
• Workload-level tuning: Workload Optimizer, Elastisizer
• Workflow-level tuning: Workflow-aware Optimizer/Scheduler
• Job-level tuning: Just-in-Time Optimizer, Profiler, Sampler
• Data Manager: Metadata Mgr., Intermediate Data Mgr., Data Layout & Storage Mgr.

Page 20: Three Perspectives & Two Problems Shivnath Babu Duke University.

MapReduce Job Tuning in Hadoop

[Figure: two panels, True Surface and Estimated Surface]

Page 21: Three Perspectives & Two Problems Shivnath Babu Duke University.

Summary
• Three perspectives: Developer, User, & Administrator
• Two problems:
  o Automated Experiment-driven System Management
  o Data-Parallel Computing for the Masses