Ning Li Jordan Parker

Scalable Cluster Resource Management

Cluster Resource Management: Scalable Approaches

Ning Li

Jordan Parker

Mid-semester Status Report

CS 736 – Fall 2000

Why Study Cluster Resource Management?

• Clusters have become increasingly popular for large parallel computing.
  – Web Servers

• Clusters are growing increasingly large, on the order of thousands of nodes.

• Clusters are providing multiple services.

Multiple Services: Example

• An Internet Service Provider is hosting many different websites for clients
  – How do you schedule according to the amount of bandwidth a client is paying for?
    • Proportional Share
    • Cluster Reserves
    • Our technique is more scalable.

Overview

• Introduction / Reason for Research

• Related Work

• Infrastructure

• Evaluation

Related Work

• Andrea C. Arpaci-Dusseau, David E. Culler, Alan Mainwaring, Scheduling with Implicit Information in Distributed Systems, Sigmetrics'98 Conference on the Measurement and Modeling of Computer Systems

• Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, Cluster-Based Scalable Network Services, Proc. 1997 Symposium on Operating Systems Principles (SOSP-16), St-Malo, France, Oct. 1997.

• M. Aron, P. Druschel, and W. Zwaenepoel. Cluster reserves: A mechanism for resource management in cluster-based network servers. In Proceedings of ACM SIGMETRICS 2000, June 2000.

• Waldspurger, C.A. and Weihl, W.E., Lottery Scheduling: Flexible Proportional-Share Resource Management, Proceedings of the First Symposium on Operating Systems Design and Implementation, Monterey CA, November 1994, pp. 1-11.

• NS – Network Simulator Manual, http://www.isi.edu/nsnam/ns/ns-documentation.html.

What makes us different?

• Goal: to provide a scalable solution for resource management.

• Other papers focused primarily on just having good management
  – This often meant 1 manager for all the nodes.
  – Clearly this could present a scalability bottleneck.

• Effectiveness: Other solutions are probably better for smaller clusters; we hope to be better for large (>1000 nodes) clusters.

The Management Scheme

• Cluster Reserves with multiple managers
  – Mainly a comparison

• A new lottery-like algorithm (Banks)

• A hierarchical management network

Infrastructure

• The Hierarchical Algorithms

• Use NS to simulate our algorithms

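Hierarchical View

[Figure: a three-level tree. Node 1 is at the top, nodes 2, 3, and 4 are on the middle level, and nodes 5 through 12 are on the bottom level.]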

A Problem and a Solution

• Problem: not scalable

• Solution: Hierarchy! + Fault Tolerance (a nice little example, perhaps with 2 levels of managers)

Approach 1:

• Modify the "Cluster Reserves" optimization algorithm
  – Use it when a manager manages nodes
  – AND when a level n+1 manager manages level n managers.

Approach 2:

• Introduce a bank account mechanism
  – Use the bank algorithm for a manager managing nodes
  – Use the transfer strategy for a level n+1 manager managing level n managers

Problem Specification:

N: # of nodes in a cluster
S: # of service classes
T: a vector of N elements, T_i: resource (# of tickets) on node i
T_total: total resource in cluster (not in "cluster" paper)
r and u: NxS matrices, r_ij and u_ij: the percentage resource allocation and resource usage, respectively, at node i for service class j.
D: a vector of S elements, D_j: the desired percentage resource allocation for service class j over the cluster.

Input: r and u and the vectors T and D
Output: an NxS matrix R, R_ij: the new percentage resource allocation for service class j on node i.
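As a rough illustration of the problem specification, the inputs could be held in plain Python lists, with the cluster-wide share of each class derived from the per-node percentages. The helper name `cluster_share` and the example numbers are hypothetical, not taken from the project code.

```python
from typing import List

Matrix = List[List[float]]  # N rows (nodes) x S columns (service classes)


def cluster_share(r: Matrix, T: List[float]) -> List[float]:
    """Cluster-wide percentage held by each class, given per-node percentages r."""
    t_total = sum(T)  # T_total: total tickets in the cluster
    n_classes = len(r[0])
    return [
        sum(r[i][j] * T[i] for i in range(len(T))) / t_total
        for j in range(n_classes)
    ]


# Hypothetical example: N = 2 nodes, S = 2 service classes.
T = [100.0, 300.0]                # tickets on each node
r = [[50.0, 50.0], [20.0, 80.0]]  # current per-node percentage allocation
D = [40.0, 60.0]                  # desired cluster-wide percentages
shares = cluster_share(r, T)      # compare these against D
```

Here class 0 holds less than its desired 40 percent of the cluster, which is the kind of gap the manager is meant to close.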

Solution Step 1:

• Compute the least feasible deviation between desired and actual allocations.

  \[ \min \; \sum_{j=1}^{S} \left| \sum_{i=1}^{N} R_{ij} T_i - T_{total} D_j \right| \qquad (1) \]

• Resource allocations on any cluster node should sum to no more than 100.

  \[ \sum_{j=1}^{S} R_{ij} \le 100 \quad \text{for any } i \in 1..N \]

• On any node, the new allocation should be no more than the usage if the node is not a resource sink, i.e. if the previous allocation exceeds the usage.

  \[ R_{ij} \le u_{ij} \quad \text{if } r_{ij} > u_{ij}, \text{ for any } i, j \]
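The objective (1) and the two constraints can be checked for a candidate matrix R with a few lines of Python. This is only a sketch that evaluates a given R; it does not solve the optimization, and the function names and example values are assumptions made for illustration.

```python
def deviation(R, T, D):
    """Objective (1): sum over classes of |sum_i R_ij*T_i - T_total*D_j|."""
    t_total = sum(T)
    return sum(
        abs(sum(R[i][j] * T[i] for i in range(len(T))) - t_total * D[j])
        for j in range(len(D))
    )


def is_feasible(R, r, u):
    """Check the per-node sum constraint and the non-sink usage constraint."""
    for i, row in enumerate(R):
        if sum(row) > 100:
            return False  # allocations on a node exceed 100 percent
        for j, R_ij in enumerate(row):
            if r[i][j] > u[i][j] and R_ij > u[i][j]:
                return False  # class under-used its share, yet R_ij exceeds usage
    return True


# Hypothetical example with N = 2 nodes and S = 2 classes.
T = [100, 300]
D = [40, 60]
r = [[50, 50], [20, 80]]   # previous allocation
u = [[30, 50], [20, 80]]   # previous usage (class 0 under-used node 0)
ideal = [[40, 60], [40, 60]]
capped = [[30, 60], [40, 60]]
```

In this example `ideal` has zero deviation but violates the usage constraint on node 0, while `capped` is feasible at the cost of a nonzero deviation.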

Solution Step 2:

Compute the new resource allocations s.t.

1) the deviation computed in the first step is achieved, and

2) the computed resource allocations are close to the ideal allocation (D) (different from paper, to see which is better)

  \[ \min \; \sum_{i=1}^{N} \sum_{j=1}^{S} (R_{ij} - D_j)^2 \qquad (2) \]
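To make the role of the second step concrete, the sketch below evaluates objective (2) for two hypothetical candidates that both reach the same step 1 deviation, and keeps the one closer to D. The names `closeness` and `pick_closest` are illustrative assumptions; a real implementation would hand both steps to an optimizer.

```python
def closeness(R, D):
    """Objective (2): sum over nodes and classes of (R_ij - D_j)^2."""
    return sum((R_ij - D[j]) ** 2 for row in R for j, R_ij in enumerate(row))


def pick_closest(candidates, D):
    """Among candidates that already achieve the step 1 deviation, pick the closest to D."""
    return min(candidates, key=lambda R: closeness(R, D))


# Hypothetical example: two nodes with 100 tickets each, D = (40, 60).
D = [40, 60]
even = [[40, 60], [40, 60]]    # every node mirrors the desired split
skewed = [[20, 80], [60, 40]]  # same cluster-wide totals, uneven per node
best = pick_closest([skewed, even], D)
```

Both candidates give each class the same cluster-wide total, so step 1 cannot tell them apart; step 2 prefers the even split.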

A New Idea/Addition

• Distribute unassigned cluster resources to the service classes that need them

• Since the manager knows when and how much resource a service class contributed before, it can give appropriate priority to those classes when assigning unused resources.

Approach 2: Bank Account Mechanism

• Each manager has a bank.

• Each bank has an account for each service class.

• Each account records the # of tickets saved and when they were deposited.

• Depositing, drawing, and transferring tickets are used together to achieve both performance isolation and resource utilization.
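One possible shape for the bank data structure is sketched below: one account per service class, each holding time-stamped deposits. The class and method names are hypothetical, and two details are assumptions rather than part of the design above: tickets are drawn oldest first, and transferred tickets are re-stamped with the transfer time.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Tuple


@dataclass
class Account:
    # (deposit time, tickets), oldest deposit first
    deposits: Deque[Tuple[float, float]] = field(default_factory=deque)

    def balance(self) -> float:
        return sum(tickets for _, tickets in self.deposits)

    def deposit(self, now: float, tickets: float) -> None:
        if tickets > 0:
            self.deposits.append((now, tickets))

    def draw(self, tickets: float) -> float:
        """Draw up to `tickets`, oldest deposits first; return the amount drawn."""
        drawn = 0.0
        while self.deposits and drawn < tickets:
            when, amount = self.deposits.popleft()
            take = min(amount, tickets - drawn)
            drawn += take
            if amount > take:  # put back the unused part of this deposit
                self.deposits.appendleft((when, amount - take))
        return drawn


class Bank:
    """One bank per manager, with one account per service class."""

    def __init__(self, n_classes: int) -> None:
        self.accounts: Dict[int, Account] = {j: Account() for j in range(n_classes)}

    def transfer_to(self, other: "Bank", j: int, tickets: float, now: float) -> float:
        """Move up to `tickets` of class j to another manager's bank."""
        moved = self.accounts[j].draw(tickets)
        other.accounts[j].deposit(now, moved)
        return moved


# Hypothetical usage: two managers, two service classes.
a, b = Bank(2), Bank(2)
a.accounts[0].deposit(now=0, tickets=10)
a.accounts[0].deposit(now=3, tickets=5)
drawn = a.accounts[0].draw(12)
moved = a.transfer_to(b, j=0, tickets=2, now=4)
```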

Bank Algorithm: part 1

Checking each service class j on each node i: compare previous ticket usage u_ij, allocation r_ij, and desired allocation D_j

  1. u_ij < r_ij and r_ij <= D_j: R_ij = u_ij; deposit D_j - R_ij to its bank account
  2. u_ij < r_ij and r_ij > D_j: R_ij = min(u_ij, D_j); deposit D_j - R_ij to its bank account if it's greater than 0
  3. u_ij = r_ij and r_ij < D_j: R_ij = D_j (or R_ij = u_ij + k, where k is a small #)
  4. u_ij = r_ij and r_ij >= D_j: R_ij = D_j
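The four cases can be written as a single function. This is a sketch under two assumptions: usage never exceeds the previous allocation (so anything that is not u_ij < r_ij is treated as u_ij = r_ij), and case 3 uses the R_ij = D_j variant rather than u_ij + k. The function name `part1` is hypothetical.

```python
def part1(u_ij, r_ij, D_j):
    """Return (R_ij, tickets to deposit, case number) for one class on one node."""
    if u_ij < r_ij:
        if r_ij <= D_j:
            case, R_ij = 1, u_ij            # under-used a share at or below the target
        else:
            case, R_ij = 2, min(u_ij, D_j)  # under-used a share above the target
        # deposit the unused part of the desired allocation, if any
        return R_ij, max(D_j - R_ij, 0), case
    # u_ij == r_ij: the class used everything it was given
    case = 3 if r_ij < D_j else 4
    return D_j, 0, case
```

For example, a class that used 10 of its 20 tickets with a target of 30 falls in case 1: it is allocated 10 and banks 20.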

 

Bank Algorithm: part 2

Let t_i be the # of tickets currently allocated on node i.

  IF t_i >= T_i: normalize the tickets so that t_i = T_i
  ELSE: check the balance B_ij in the bank account for each class j in case 4 above
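The normalization branch might look like the sketch below, assuming "normalize" means scaling every class on the node by the same factor; the slides do not pin this down, and the name `normalize` is hypothetical.

```python
def normalize(row, T_i):
    """Scale one node's per-class tickets so they sum to T_i when over-allocated."""
    t_i = sum(row)
    if t_i <= T_i:
        return list(row)  # nothing to scale down; the ELSE branch applies instead
    return [x * T_i / t_i for x in row]


# Hypothetical example: 150 tickets allocated on a node that only has 100.
scaled = normalize([60, 90], 100)
```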

Bank Algorithm: part 2 (continued)

Option 1: check classes in decreasing balance order

  let b_ij = min(B_ij, h), where h is a relatively small #
  R_ij += b_ij, and draw b_ij from j's bank account
  t_i += b_ij
  until t_i >= T_i

Option 2: check all classes in case 4 above with balance >= 0; allocate T_i - t_i tickets to these classes proportional to their bank account, and draw from the bank account accordingly
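Both options can be sketched for a single node as follows. Several details are assumptions made to get runnable code: option 1 repeats its pass until the node is full or no eligible balance remains, grants are capped so t_i never exceeds T_i, and option 2 never grants more than a class has banked. The function names and the `eligible` parameter (the case 4 classes) are hypothetical.

```python
def option1(R_i, balances, T_i, h, eligible):
    """Hand out at most h tickets per class per pass, richest account first."""
    R, B = list(R_i), dict(balances)
    t_i = sum(R)
    while t_i < T_i:
        granted = False
        for j in sorted(eligible, key=lambda c: B[c], reverse=True):
            b = min(B[j], h, T_i - t_i)
            if b <= 0:
                continue
            R[j] += b   # allocate the drawn tickets
            B[j] -= b   # draw them from j's bank account
            t_i += b
            granted = True
            if t_i >= T_i:
                break
        if not granted:
            break  # every eligible account is empty
    return R, B


def option2(R_i, balances, T_i, eligible):
    """Split the spare tickets in proportion to the eligible classes' balances."""
    R, B = list(R_i), dict(balances)
    spare = T_i - sum(R)
    total = sum(B[j] for j in eligible)
    if spare <= 0 or total <= 0:
        return R, B
    for j in eligible:
        grant = min(spare * B[j] / total, B[j])
        R[j] += grant
        B[j] -= grant
    return R, B


# Hypothetical node with T_i = 100 and 80 tickets already allocated.
R1, B1 = option1([30, 30, 20], {0: 15, 1: 4, 2: 0}, 100, h=5, eligible=[0, 1])
R2, B2 = option2([30, 30, 20], {0: 30, 1: 10, 2: 0}, 100, eligible=[0, 1])
```

Option 1 favors the class with the largest balance but in small steps of h, whereas option 2 settles the spare capacity in one proportional pass.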

Bank Algorithm: part 3

If there are still unassigned tickets, assign them to the classes in case 4 above in proportion to their share or their need.

Notes and Other Strategies:

• Note: Tickets in a bank account have an associated time-stamp and expire after reaching a certain age.

• Strategy: The manager could force some compensation if t_i >= T_i on all the nodes before adjustment and some classes have a high balance in their accounts. The manager could allocate a reasonable amount of tickets as in option 2 above, then normalize so that t_i equals T_i.

• Strategy: Some class on some node may choose to reserve some tickets for its use on this same node in the near future, rather than deposit them in the bank. We'll check this option.

Transfer Strategy: Very simple

• Based on the previous usage report from lower-level managers, the current manager transfers tickets from one account to another where they are badly needed.

Transfer Strategy: More detailed (if needed)

Check each class-manager pair <j,i> in decreasing usage/share order, i.e. check first the classes that most need more tickets:

  check j's account on other managers l, where usage/share is low
  transfer min(B_lj, b) tickets from j's account on manager l to j's account on manager i, where b is a constant
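A sketch of the detailed transfer strategy follows. The thresholds `high` and `low` that decide when usage/share counts as "needy" or "low" are assumptions, since the slides leave them open, and the function name and example values are hypothetical. Shares are assumed to be nonzero.

```python
def transfer(balances, usage, share, b, high=1.0, low=0.5):
    """balances[i][j], usage[i][j], share[i][j]: manager i, service class j."""
    B = [list(row) for row in balances]

    def ratio(i, j):
        return usage[i][j] / share[i][j]

    # class-manager pairs <j, i>, neediest (highest usage/share) first
    pairs = sorted(
        ((j, i) for i in range(len(B)) for j in range(len(B[i]))),
        key=lambda p: ratio(p[1], p[0]),
        reverse=True,
    )
    moves = []  # (class j, from manager l, to manager i, tickets)
    for j, i in pairs:
        if ratio(i, j) < high:
            break  # remaining pairs do not need more tickets
        for l in range(len(B)):
            if l != i and ratio(l, j) <= low and B[l][j] > 0:
                amount = min(B[l][j], b)
                B[l][j] -= amount
                B[i][j] += amount
                moves.append((j, l, i, amount))
    return B, moves


# Hypothetical example: 2 lower-level managers, 2 service classes, b = 5.
usage = [[40, 10], [5, 30]]
share = [[40, 20], [20, 30]]
new_balances, moves = transfer([[0, 8], [12, 0]], usage, share, b=5)
```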

Thinking of better strategies. :-)

• Any ideas?

Network View

[Figure: network view of the cluster, showing nodes 1 through 12.]

Full Network Overview

[Figure: the cluster, nodes 1 through 12, connected to a WAN.]

Failure Design

• Essentially, we tried to create a tree-like structure

• Thus, we handle a node failure and the recovery from it much like removing a node from a tree

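Minor Node (6) Failure

[Figure: the tree before and after the failure of node 6. In both views node 1 is at the top, nodes 2, 3, and 4 are on the middle level, and nodes 5 through 12 are on the bottom level.]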

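1st Level Manager (2) Failure

[Figure: before the failure, node 1 is above managers 2, 3, and 4, with nodes 5 through 12 on the bottom level. After the failure, node 5 takes the place of manager 2: node 1 is above managers 5, 3, and 4, with nodes 6 through 12 on the bottom level.]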

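2nd Level Manager (1) Failure

[Figure: before the failure, node 1 is above managers 2, 3, and 4, with nodes 5 through 12 on the bottom level. After the failure, node 2 takes the place of node 1 and node 5 takes the place of node 2: node 2 is above managers 5, 3, and 4, with nodes 6 through 12 on the bottom level.]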
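The three failure cases are consistent with one rule: when a node fails, its first child is promoted into its place, recursively. The sketch below applies that rule. The rule itself is inferred from the figures, and the assignment of leaf nodes to managers (5, 6, 7 under 2; 8, 9, 10 under 3; 11, 12 under 4) is an assumption, since the slides do not state it.

```python
class Node:
    def __init__(self, ident, children=None):
        self.ident = ident
        self.children = children or []


def vacate(node):
    """Subtree that remains once `node` leaves its slot (first child promoted)."""
    if not node.children:
        return None
    first, rest = node.children[0], node.children[1:]
    filled = vacate(first)  # the promoted child's old slot is refilled the same way
    return Node(first.ident, ([filled] if filled else []) + rest)


def fail(root, failed):
    """Return a new tree with the node `failed` removed."""
    if root.ident == failed:
        return vacate(root)
    kept = [sub for sub in (fail(c, failed) for c in root.children) if sub]
    return Node(root.ident, kept)


def levels(root):
    """Node identifiers level by level, top first."""
    out, current = [], [root]
    while current:
        out.append([n.ident for n in current])
        current = [c for n in current for c in n.children]
    return out


def example_tree():
    leaf = Node
    return Node(1, [
        Node(2, [leaf(5), leaf(6), leaf(7)]),
        Node(3, [leaf(8), leaf(9), leaf(10)]),
        Node(4, [leaf(11), leaf(12)]),
    ])


after_leaf = levels(fail(example_tree(), 6))
after_manager = levels(fail(example_tree(), 2))
after_root = levels(fail(example_tree(), 1))
```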

Node Insertion

• Simply find a manager with room for more nodes

• If there is no space, simply turn a leaf node into a manager
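Insertion could be sketched as a breadth-first search for a manager with a free slot, falling back to turning a leaf into a manager. The `fanout` limit (maximum nodes per manager), the breadth-first order, and the choice of the first leaf encountered are all assumptions; the tree is a hypothetical dictionary from node id to child ids.

```python
from collections import deque


def insert(children, root, new, fanout):
    """Attach `new` to the tree in place and return the id of its parent."""
    queue, first_leaf = deque([root]), None
    while queue:
        n = queue.popleft()
        kids = children[n]
        if not kids:  # a leaf: remember the first one as a fallback
            if first_leaf is None:
                first_leaf = n
            continue
        if len(kids) < fanout:  # a manager with space
            kids.append(new)
            children[new] = []
            return n
        queue.extend(kids)
    # no manager has space: make a leaf node into a manager
    parent = first_leaf if first_leaf is not None else root
    children[parent].append(new)
    children[new] = []
    return parent


# Hypothetical 12-node tree with at most 3 nodes per manager.
tree = {1: [2, 3, 4], 2: [5, 6, 7], 3: [8, 9, 10], 4: [11, 12]}
tree.update({leaf: [] for leaf in range(5, 13)})
first_parent = insert(tree, 1, 13, fanout=3)   # manager 4 still has a free slot
second_parent = insert(tree, 1, 14, fanout=3)  # all managers full: leaf 5 is promoted
```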

Why discuss failure?

• Not relevant to the performance of our scheduler; we don’t even plan to simulate it (unless we have lots of free time), but …

• It does show that the network layout we’ve designed could easily handle failures

• Making the tree balance itself and handling failures could be relatively straightforward

Network Simulator - NS

• Our Components
  – A new Agent class: RsrcAgent
    • Agents are servers running on a node
  – A script to create the ns input file
    • Specifies the network layout
      – Number of nodes
      – Nodes per manager
    • Specifies the request trace
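The layout part of such a script might start from something like the sketch below, which turns the two parameters (number of nodes, nodes per manager) into manager-to-node links. It assumes nodes are numbered level by level starting at 1, as in the hierarchy figures; the function name is hypothetical, and emitting the actual ns input file is left out.

```python
def build_layout(num_nodes, nodes_per_manager):
    """Return (manager, node) links for a tree numbered level by level from 1."""
    if num_nodes < 1 or nodes_per_manager < 1:
        raise ValueError("both parameters must be positive")
    links = []
    for node in range(2, num_nodes + 1):
        # node 1 is the top manager; each manager gets nodes_per_manager nodes
        manager = (node - 2) // nodes_per_manager + 1
        links.append((manager, node))
    return links


# Hypothetical example matching the 12-node figures: 3 nodes per manager.
links = build_layout(12, 3)
parent_of = {node: manager for manager, node in links}
```

With these parameters, nodes 2 through 4 report to node 1, and nodes 5 through 12 are spread over managers 2, 3, and 4.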

NS implementation status

• Look at code

Evaluation

• NS should make it easy

• Just extract information from nodes about load balance

• More importantly, look at the rate at which queries are handled by the nodes