Presentation by Carl Erhard & Zahid Mian

Post on 22-Feb-2016


Transcript of Presentation by Carl Erhard & Zahid Mian

1

HaLoop: Efficient Iterative Data Processing On Large Scale Clusters by Yingyi Bu, Bill Howe, Magdalena Balazinska, & Michael D. Ernst

Presentation by Carl Erhard & Zahid Mian

2

Many of the slides in this presentation were taken from the authors’ website, which can be found here: http://www.ics.uci.edu/~yingyib/

Citations

3

• Introduction / Motivation
• Example Algorithms Used
• Architecture of HaLoop
• Task Scheduling
• Caching and Indexing
• Experiments & Results
• Conclusion / Discussion

Agenda

4

Motivation

• MapReduce can’t express recursion/iteration
• Lots of interesting programs need loops
– graph algorithms
– clustering
– machine learning
– recursive queries
• Dominant solution: use a driver program outside of MapReduce
• Hypothesis: making MapReduce loop-aware affords optimization
– …and lays a foundation for scalable implementations of recursive languages
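The driver-program workaround can be sketched as follows (illustrative single-machine Python; `run_job` and `converged` are stand-ins for a real job submission and fixpoint test, not Hadoop APIs):

```python
# Illustrative sketch of the "driver program" workaround: iteration is
# managed by an external loop that resubmits a MapReduce-style job each pass.

def run_job(state):
    # Placeholder for one full MapReduce pass; here it just moves each
    # value halfway toward 0 so the loop has something to converge on.
    return {k: v / 2 for k, v in state.items()}

def converged(old, new, eps=0.1):
    # Fixpoint test, evaluated by the driver between job launches.
    return all(abs(old[k] - new[k]) < eps for k in old)

def driver(state, max_iters=50):
    for i in range(max_iters):
        new_state = run_job(state)       # a full job launch every iteration;
        if converged(state, new_state):  # invariant data is reloaded each time
            return new_state, i + 1
        state = new_state
    return state, max_iters
```

The cost HaLoop targets is visible in the sketch: every trip around the loop pays the full job-startup, load, and shuffle price.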

Bill Howe, UW

5

Thesis – Make a Loop Framework

• Observation: MapReduce has proven successful as a common runtime for non-recursive declarative languages
– HIVE (SQL)
– Pig (RA with nested types)
• Observation: Many people roll their own loops
– Graphs, clustering, mining, recursive queries
– Iteration managed by external script
• Thesis: With minimal extensions, we can provide an efficient common runtime for recursive languages
– Map, Reduce, Fixpoint

Bill Howe, UW

7

PageRank in MapReduce

[Diagram: PageRank dataflow in MapReduce. Map/reduce tasks join L-split0 and L-split1 with Ri ("Join & compute rank"), a second map/reduce pair performs the aggregate fixpoint evaluation, and the client checks "Converged?" before setting i = i + 1 or reporting done.]

Bill Howe, UW

8

PageRank in MapReduce

1. L is loaded on each iteration
2. L is shuffled on each iteration
3. Fixpoint evaluated as a separate MapReduce job per iteration
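The join-and-compute-rank step described here can be sketched as a single map/reduce pass (toy single-machine Python; the function names and data are mine, and the real jobs run distributed on Hadoop):

```python
# Toy sketch of the join step: the linkage table L and the rank table R_i
# are both mapped, grouped ("shuffled") on source URL, and the reducer
# distributes each page's rank along its outlinks.

def map_phase(L, R):
    grouped = {}
    for src, dests in L.items():          # L is re-read every iteration
        grouped.setdefault(src, {})["links"] = dests
    for src, rank in R.items():           # R_i from the previous iteration
        grouped.setdefault(src, {})["rank"] = rank
    return grouped

def reduce_phase(grouped):
    new_rank = {}
    for src, vals in grouped.items():
        dests = vals.get("links", [])
        for d in dests:                   # distribute rank along outlinks
            new_rank[d] = new_rank.get(d, 0.0) + vals["rank"] / len(dests)
    return new_rank

L = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
R = {"a": 1.0, "b": 1.0, "c": 1.0}
R1 = reduce_phase(map_phase(L, R))
```

Note that `map_phase` touches all of L even though L never changes between iterations, which is exactly the waste the slide is pointing at.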

[Diagram: the same PageRank dataflow, with steps 1–3 above marked on the job graph.]

L is loop invariant, but it is loaded and shuffled on every iteration anyway.

What’s the Problem?

Bill Howe, UW

9

Example 2: Descendant Query

Friend: find all friends within two hops of Eric.

R0 = {Eric, Eric}
R1 = {Eric, Elisa}
R2 = {Eric, Tom; Eric, Harry}
R3 = {}
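The query above can be sketched as a repeated join with the Friend table plus duplicate elimination (toy Python; the data follows the slide, the function is my own illustration):

```python
# Sketch of the descendant query: repeatedly join the newest frontier with
# the Friend table and drop people we have already seen.

friend = {"Eric": ["Elisa"], "Elisa": ["Tom", "Harry"]}

def descendants(start, hops):
    seen = {start}
    frontier = {start}                      # R0 = {(Eric, Eric)}
    rounds = []
    for _ in range(hops):
        # join frontier with Friend, then dupe-eliminate against seen
        nxt = {f for p in frontier for f in friend.get(p, [])} - seen
        rounds.append(sorted(nxt))          # R1, R2, ... (new people only)
        seen |= nxt
        frontier = nxt
    return rounds
```

Running `descendants("Eric", 3)` reproduces the slide: R1 finds Elisa, R2 finds Tom and Harry, and R3 is empty, so the fixpoint is reached.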

Bill Howe, UW

10

Descendant Query in MapReduce

[Diagram: descendant-query dataflow in MapReduce. A Join step (compute the next generation of friends) reads Friend0, Friend1, and Si; a Dupe-elim step (remove the ones we’ve already seen) follows; the client asks "Anything new?" before setting i = i + 1 or reporting done.]

Bill Howe, UW

11

Descendant Query in MapReduce

1. Friend is loaded on each iteration
2. Friend is shuffled on each iteration

Friend is loop invariant, but it is loaded and shuffled on every iteration anyway.

[Diagram: the same descendant-query dataflow — Join (compute next generation of friends) and Dupe-elim (remove the ones we’ve already seen) — with steps 1–2 above marked on the job graph.]

What’s the Problem?

Bill Howe, UW

12

HaLoop – The Solution

HaLoop offers the following solutions to these problems:

1. A New Programming Model & Architecture for Iterative Programs

2. Loop-Aware Task Scheduling
3. Caching for Loop Invariant Data
4. Caching for Fixpoint Evaluation

13

HaLoop Architecture

14

HaLoop Architecture

Note: The loop control (i.e. determining when execution has finished) is pushed from the application into the infrastructure.

15

• HaLoop will work given that the following is true:

• In other words, the next result is a join of the previous result and loop-invariant data L.
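In the paper’s notation, this condition can be written as the recurrence below (a sketch of the general form; R0 is the initial result and L the loop-invariant table):

```latex
R_{i+1} = R_0 \cup (R_i \bowtie L)
```

Each iteration joins the previous result with L, and the loop terminates when a fixpoint (or the iteration bound) is reached.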

HaLoop Architecture

16

• AddMap and AddReduce
Used to add a map and reduce step to the loop body
• SetFixedPointThreshold
Set a bound on the distance between iterations
• ResultDistance
A function that returns the distance between iterations
• SetMaxNumOfIterations
Set the maximum number of iterations the loop can take

HaLoop Programming Interface

17

• SetIterationInput
A function which returns the input for a given iteration
• AddStepInput
A function which allows injection of additional data between the map and reduce phases
• AddInvariantTable
Add a table which is loop invariant
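A toy driver wired up with these hooks might look like this (hypothetical Python sketch of my own; HaLoop’s real API is Java and runs on a cluster, so the names only mirror the slide):

```python
# Hypothetical sketch of HaLoop's driver hooks: AddMap/AddReduce,
# SetFixedPointThreshold, ResultDistance, SetMaxNumOfIterations,
# SetIterationInput. Single-machine and illustrative only.

class LoopJob:
    def __init__(self):
        self.steps = []            # (map_fn, reduce_fn) pairs: the loop body
        self.threshold = 0.0       # SetFixedPointThreshold
        self.max_iters = 10        # SetMaxNumOfIterations
        self.distance = None       # ResultDistance
        self.iteration_input = None  # SetIterationInput

    def add_map_reduce(self, map_fn, reduce_fn):   # AddMap / AddReduce
        self.steps.append((map_fn, reduce_fn))

    def set_fixedpoint_threshold(self, t): self.threshold = t
    def set_result_distance(self, fn): self.distance = fn
    def set_max_iterations(self, n): self.max_iters = n
    def set_iteration_input(self, fn): self.iteration_input = fn

    def run(self):
        prev = None
        for i in range(self.max_iters):
            data = self.iteration_input(i, prev)
            for map_fn, reduce_fn in self.steps:
                groups = {}                     # shuffle: group map output by key
                for rec in data:
                    for k, v in map_fn(rec):
                        groups.setdefault(k, []).append(v)
                data = [reduce_fn(k, vs) for k, vs in groups.items()]
            if prev is not None and self.distance(prev, data) <= self.threshold:
                break                           # fixpoint reached
            prev = data
        return data

def l1_distance(a, b):
    da, db = dict(a), dict(b)
    return sum(abs(da[k] - db[k]) for k in da)

job = LoopJob()
job.add_map_reduce(lambda rec: [(rec[0], rec[1] / 2)],
                   lambda k, vs: (k, sum(vs)))
job.set_result_distance(l1_distance)
job.set_fixedpoint_threshold(0.5)
job.set_max_iterations(20)
job.set_iteration_input(lambda i, prev: prev if prev is not None else [("x", 8.0)])
result = job.run()   # halves 8.0 each pass until successive results differ by <= 0.5
```

The point of the interface is that the loop control (distance test, iteration bound, per-iteration input) lives in the framework, not in an external driver script.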

18

APIs in Action – PageRank in HaLoop

19

PageRank In HaLoop

[Diagram: loop body MapRank → ReduceRank → MapAggregate → ReduceAggregate, with inputs R0 and L.]

20

PageRank In HaLoop

[Diagram: the same loop body, highlighting MapRank.]

R0 ∪ L (tagged by source file; #1 = L, #2 = R0):

Source URL   Dest/Rank            Source File
a.com        1.0                  #2
a.com        b.com,c.com,d.com    #1
b.com        1.0                  #2
b.com        (no outlinks)        #1
c.com        1.0                  #2
c.com        a.com, e.com         #1
d.com        1.0                  #2
d.com        b.com                #1
e.com        1.0                  #2
e.com        d.com,c.com          #1

Only these values are given to the reducer.

21

PageRank In HaLoop

[Diagram: the same loop body, highlighting ReduceRank ("Calculate New Rank").]

R0 ∪ L → Calculate New Rank:

Destination   New Rank
b.com         1.5
c.com         1.5
d.com         1.5
a.com         1.5
e.com         1.5
b.com         1.5
d.com         1.5
c.com         1.5

22

PageRank In HaLoop

[Diagram: the same loop body, highlighting MapAggregate, which is the identity.]

(The ReduceRank output table from the previous slide is passed through unchanged.)

23

PageRank In HaLoop

[Diagram: the same loop body, highlighting ReduceAggregate ("Aggregate"), whose output is R1.]

Destination   New Rank
a.com         1.5
b.com         3.0
c.com         3.0
d.com         3.0
e.com         1.5

24

PageRank In HaLoop

[Diagram: the same loop body; R1 has been produced.]

R1 (aggregated ranks):

Destination   New Rank
a.com         1.5
b.com         3.0
c.com         3.0
d.com         3.0
e.com         1.5

R0 (previous ranks):

Destination   New Rank
a.com         1.0
b.com         1.0
c.com         1.0
d.com         1.0
e.com         1.0

Compare R0 and R1. If the distance is not under the threshold, repeat.
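The comparison step can be sketched with a ResultDistance function (illustrative Python; the L1 distance here is my choice of metric, and the values follow the slide):

```python
# Sketch of the fixpoint test: ResultDistance compares successive rank
# tables and the loop stops once the distance falls under the threshold.

def result_distance(prev, cur):
    # L1 distance between rank vectors; illustrative choice of metric
    return sum(abs(prev[k] - cur[k]) for k in prev)

R0 = {"a.com": 1.0, "b.com": 1.0, "c.com": 1.0, "d.com": 1.0, "e.com": 1.0}
R1 = {"a.com": 1.5, "b.com": 3.0, "c.com": 3.0, "d.com": 3.0, "e.com": 1.5}

threshold = 0.5
repeat = result_distance(R0, R1) > threshold   # distance is 7.0, so iterate again
```

Because HaLoop caches reducer output, this evaluation can run distributed, without the extra MapReduce job the plain-Hadoop version needs.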

25

PageRank In HaLoop

[Diagram: the loop body begins the next iteration with R1 ∪ L, again running Calculate New Rank, Identity, and Aggregate.]

R1 ∪ L (tagged by source file; #1 = L, #2 = R1):

Source URL   Dest/Rank            Source File
a.com        1.5                  #2
a.com        b.com,c.com,d.com    #1
b.com        1.5                  #2
b.com        (no outlinks)        #1
c.com        1.5                  #2
c.com        a.com, e.com         #1
d.com        1.5                  #2
d.com        b.com                #1
e.com        1.5                  #2
e.com        d.com,c.com          #1

26

• One goal of HaLoop is to schedule map/reduce tasks on the same machine as the data.
– Scheduling the first iteration is no different than in Hadoop.
– Subsequent iterations put tasks that access the same data on the same physical node.

HaLoop Inter-Iteration Locality

27

• The master node keeps a map of node ID → filesystem partition.

• When a node becomes free, the master tries to assign a task related to data contained on that node.

• If a task is required on a node with a full load, it will utilize a nearby node.
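The assignment policy can be sketched as follows (toy Python of my own; the real scheduler lives inside the Hadoop JobTracker and also accounts for rack locality):

```python
# Toy sketch of loop-aware task assignment: the master remembers which
# partition each node processed last iteration and prefers to reassign
# the same partition to the same node; leftover work spills to free nodes.

def assign_tasks(pending, history, capacity):
    """pending: set of partition ids still to schedule;
    history: node -> partition that node held last iteration;
    capacity: node -> remaining task slots."""
    assignment = {}
    # First pass: honor inter-iteration locality where slots allow.
    for node, part in history.items():
        if part in pending and capacity.get(node, 0) > 0:
            assignment[part] = node
            capacity[node] -= 1
            pending = pending - {part}
    # Second pass: spill remaining partitions onto the least-loaded nodes
    # (the "nearby substitute node" case; its cache must then be rebuilt).
    for part in sorted(pending):
        node = max(capacity, key=capacity.get)
        assignment[part] = node
        capacity[node] -= 1
    return assignment

assignment = assign_tasks({"p0", "p1", "p2"},
                          {"n1": "p0", "n2": "p1"},
                          {"n1": 1, "n2": 0, "n3": 2})
```

In the example, n1 keeps its partition p0, but n2 is full, so p1 (and the unowned p2) fall through to n3.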

HaLoop Scheduling

28

• Mapper Input Cache
• Reducer Input Cache
• Reducer Output Cache
• Why is there no Mapper Output Cache?
• HaLoop indexes cached data
– Keys and values stored in separate local files
– Reduces I/O seek time (forward only)

Caching And Indexing

29

Approach: Inter-iteration caching

Mapper input cache (MI)

Mapper output cache (MO)

Reducer input cache (RI)

Reducer output cache (RO)

[Diagram: the loop body (map tasks feeding reduce tasks), with the four cache points marked.]

Bill Howe, UW

30

RI: Reducer Input Cache

Bill Howe, UW

• Provides:
– Access to loop-invariant data without map/shuffle
• Used by:
– Reducer function
• Assumes:
1. Mapper output for a given table constant across iterations
2. Static partitioning (implies: no new nodes)
• PageRank
– Avoid shuffling the network at every step
• Transitive Closure
– Avoid shuffling the graph at every step
• K-means
– No help

31

RO: Reducer Output Cache

Bill Howe, UW

• Provides:
– Distributed access to output of previous iterations
• Used by:
– Fixpoint evaluation
• Assumes:
1. Partitioning constant across iterations
2. Reducer output key functionally determines reducer input key
• PageRank
– Allows distributed fixpoint evaluation
– Obviates extra MapReduce job
• Transitive Closure
– No help
• K-means
– No help

32

MI: Mapper Input Cache

Bill Howe, UW

• Provides:
– Access to non-local mapper input on later iterations
• Used:
– During scheduling of map tasks
• Assumes:
1. Mapper input does not change
• PageRank
– Subsumed by use of reducer input cache
• Transitive Closure
– Subsumed by use of reducer input cache
• K-means
– Avoids non-local data reads on iterations > 0

33

• The cache must be reconstructed when:
– The hosting node fails
– The hosting node has a full load (the M/R job needs to be scheduled on a different, substitute node)
• The rebuilding process is transparent to the user.

Cache Rebuilding

34

Results: Page Rank

- Run only for 10 iterations
- Join and aggregate in every iteration
- Overhead in first step for caching input
- Catches up soon and outperforms Hadoop
- Low shuffling time: time between RPC invocation by reducer and sorting of keys

35

Results: Descendant Query

- Join and duplicate elimination in every iteration.
- Less striking performance on LiveJournal: a social network with high fan-out; excessive duplicate generation dominates the join cost, so reducer input caching is less useful.

36

Reducer Input Cache Benefit

[Chart: overall run time for transitive closure on the Billion Triples dataset (120 GB), 90 small instances on EC2.]

Bill Howe, UW

37

Reducer Input Cache Benefit

[Charts: overall run time for transitive closure — Billion Triples dataset (120 GB) on 90 EC2 small instances, and LiveJournal (12 GB).]

Bill Howe, UW

38

Reducer Input Cache Benefit

Bill Howe, UW

[Charts: reduce and shuffle time of the join step for transitive closure — Billion Triples dataset (120 GB) on 90 EC2 small instances, and LiveJournal (12 GB).]

39

Reducer Input Cache Benefit

[Diagram: the PageRank-in-MapReduce dataflow again — "Join & compute rank" over L-split0, L-split1, and Ri, followed by the aggregate fixpoint evaluation — showing the steps the reducer input cache avoids.]

40

Reducer Output Cache Benefit

Bill Howe, UW

[Charts: fixpoint evaluation time (s) vs. iteration # — LiveJournal dataset on 50 EC2 small instances, and Freebase dataset on 90 EC2 small instances.]

41

Mapper Input Cache Benefit

Bill Howe, UW

5% non-local data reads; ~5% improvement

42

Conclusions

• Relatively simple changes to MapReduce/Hadoop can support arbitrary recursive programs
– TaskTracker (cache management)
– Scheduler (cache awareness)
– Programming model (multi-step loop bodies, cache control)
• Optimizations
– Caching loop-invariant data realizes the largest gain
– Good to eliminate the extra MapReduce step for termination checks
– Mapper input cache benefit inconclusive; need a busier cluster
• Future Work
– Analyze expressiveness of Map Reduce Fixpoint
– Consider a model of Map (Reduce+) Fixpoint

43

The Good …

• HaLoop extends MapReduce:
– Easier programming of iterative algorithms
– Efficiency improvement due to loop awareness and caching
– Lets users reuse major building blocks from existing application implementations in Hadoop
– Fully backward compatible with Hadoop

44

The Questionable …

• Only useful for algorithms which can be expressed as a join of the previous iteration’s result with loop-invariant data.
• Imposes constraints: a fixed partition function for each iteration.
• Does not improve asymptotic running time: still O(M+R) scheduling decisions and O(M*R) state kept in memory. And more overhead…
• Not completely novel: iMapReduce and Twister.
• People still do iteration using traditional MapReduce: Google, Nutch, Mahout…

45

BACKUP

46

47

APIs in Action – Descendant Query

48

Descendant Query in HaLoop

[Diagram: loop body MapJoin → ReduceJoin → MapDistinct → ReduceDistinct, with inputs ∆S0 and F.]

49

Descendant Query in HaLoop

[Diagram: the same loop body, highlighting MapJoin.]

∆S0 ∪ F (tagged by source file; #1 = F, #2 = ∆S0):

Eric    Eric    #2
Eric    Elisa   #1
Elisa   Tom     #1
Elisa   Harry   #1

50

Descendant Query in HaLoop

[Diagram: the same loop body, highlighting ReduceJoin.]

Join output (tagged with iteration number 1):

Eric    Eric    1
Eric    Elisa   1
Elisa   Tom     1
Elisa   Harry   1
Eric    Tom     1
Eric    Harry   1
Elisa   Eric    1

51

Descendant Query in HaLoop

[Diagram: the same loop body, highlighting MapDistinct/ReduceDistinct, whose new tuples form ∆S1.]

∆S1:

Eric    Eric
Eric    Elisa
Elisa   Tom
Elisa   Harry
Eric    Tom
Eric    Harry
Elisa   Eric

52

Descendant Query in HaLoop

[Diagram: the next iteration begins with inputs ∆S1 and F.]

∆S1 (tagged #2):

Eric    Eric    #2
Elisa   Eric    #2
Tom     Eric    #2
Harry   Eric    #2
Tom     Elisa   #2
Harry   Elisa   #2

53

Descendant Query in HaLoop

[Diagram: the same loop body, highlighting ReduceJoin.]

Join output (tagged with iteration number 2):

Eric    Eric    2
Elisa   Eric    2
Tom     Eric    2
Harry   Eric    2
Tom     Elisa   2
Harry   Elisa   2

54

Descendant Query in HaLoop

[Diagram: the same loop body, highlighting the distinct step with an added Step Input.]

Join output (iteration 2):

Eric    Eric    2
Elisa   Eric    2
Tom     Eric    2
Harry   Eric    2
Tom     Elisa   2
Harry   Elisa   2

Step Input (tuples from earlier iterations):

Eric    Eric    1
Eric    Elisa   1
Elisa   Tom     1
Elisa   Harry   1
Eric    Tom     1
Eric    Harry   1
Elisa   Eric    1
Eric    Eric    0

Distinct output:

Eric    Eric
Tom     Eric
Harry   Eric
Tom     Elisa
Harry   Elisa
Eric    Elisa
Eric    Tom
Eric    Harry
Elisa   Tom
Elisa   Harry

Step Input allows the union of all previous iterations to be supplied as input to the distinct step.

55

Fixpoint Algorithm Example