Chapter 5
Design and Implementation of Scheduling Algorithms
In this chapter we present the implementation of grid scheduling algorithms. The first section
covers static load balancing algorithms[43][19]. The second implements heuristic based algorithms
as an improvement over the static methods[19]; Min-Min and Min-Max are implemented here.
Finally, nature inspired algorithms are experimented with so that comparative results can be
presented for these algorithms[19][66].
5.1 Design and Implementation of Static Algorithms
Static algorithms are mainly used for load balancing among grid resources. These algorithms
work on a pool of resources whose properties are known in advance and are treated as fixed once
the resources are detected. As opposed to dynamic algorithms, static algorithms are not aware of
resources or resource properties that change at run time. Load balancing is concerned with the
distribution of workload among the grid resources in the system. The distribution of jobs to
resources can be simulated by designing a scheduler based on one of the following models:
• Push model: The server distributes the jobs to known resources.
• Pull model: Resources request jobs from the server.
• Combined model: The individual grid resources decide when more work can be taken and
send a request for work to the server, and the server decides whether to send the jobs.
The execution of a job may fail when a) the resource fails after discovery, b) the resource
is not free for execution, or c) the job requirement exceeds the capability of the resource. Such
jobs are rescheduled in the next iteration of execution.
As discussed in Chapter 4, the hierarchical scheduler gives a scheduling opportunity at
two points, the Global and the Local level. An application here is a problem consisting of a large
number of computations, ranging from 1K to 100K and above. These applications are received in
multiples. The Global scheduler queues these applications and prepares a schedule for them.
The Local scheduler, as discussed, consists of a pool of resources. The Global scheduler holds a
collection of multiple Local schedulers for application distribution. Each Local scheduler has its
own local scheduling algorithm and decomposes an application into a set of jobs using problem
decomposition techniques. These techniques are listed in[78]:
• Task Decomposition
• Data Decomposition
• Hybrid Decomposition
The jobs are placed in a queue for scheduling, and the local scheduling algorithm prepares the
mapping table between jobs and resources.
5.1.1 FCFS
The First Come First Serve (FCFS) algorithm[57] is the simplest form of scheduling. At the Global
level the scheduler picks the application from the queue of applications. The Global scheduler also
maintains the queue of Local clusters along with their status. The next free cluster gets the
incoming application for execution. The Local scheduler applies FCFS to its jobs: the first job in
the job queue is assigned to the first free resource in the resource queue.
Global-FCFS
1. Initialize queue of applications
2. Initialize queue of local schedulers
3. repeat
4. receive application
5. find the next free Local scheduler in queue.
6. allocate application
7. change status of that Local scheduler to busy
Local-FCFS
1. Initialize queue of jobs
2. Initialize queue of resources
3. repeat
4. extract job from queue
5. find the next free resource in queue
6. allocate job
7. change status to busy
If no resource is free, jobs are not scheduled until a resource in the queue becomes free.
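The Local-FCFS steps above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this work: the function name `local_fcfs`, the job and resource labels, and the assumption that every resource is free at the start are all hypothetical.

```python
from collections import deque

def local_fcfs(jobs, resources):
    """Assign jobs in arrival order to free resources in queue order.

    Returns (mapping, waiting): mapping is a list of (job, resource)
    pairs, waiting holds the jobs left unscheduled because no resource
    was free.
    """
    job_queue = deque(jobs)
    free = deque(resources)        # assumption: all resources start free
    mapping = []
    while job_queue and free:
        job = job_queue.popleft()  # first job in the job queue
        res = free.popleft()       # next free resource; it becomes busy
        mapping.append((job, res))
    return mapping, list(job_queue)

mapping, waiting = local_fcfs(["j1", "j2", "j3"], ["r1", "r2"])
```

Jobs left in `waiting` correspond to the case where no resource is free; they would be retried once a resource is released.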
5.1.2 RR
The Round Robin (RR) algorithm[56] deviates slightly from the First Come First Serve algorithm.
As in FCFS, at the Global level the scheduler picks the application from the queue of applications.
The Global scheduler also maintains the queue of Local clusters along with their status. In
addition, the scheduler keeps one queue per Local scheduler. The queues of the Local schedulers are
given applications in a circular manner. The incoming application queue ends up empty since the
scheduler allocates all applications.
The Local scheduler applies RR in a very similar manner to the jobs to be scheduled after
decomposition. Each resource has a queue of jobs scheduled to it. All jobs are scheduled in one go.
Global-RR
1. Initialize queue of applications
2. Initialize circular queue of local schedulers
3. Initialize queue for each local scheduler
4. repeat
5. receive application
6. allocate it to next Local scheduler in queue
7. update respective local scheduler queue
8. until application queue is empty
Local-RR
1. Initialize queue of jobs
2. Initialize circular queue of resources
3. Initialize queue for each resource
4. repeat
5. extract job from queue
6. allocate it to next resource in queue
7. update respective resource queue
8. until job queue is empty
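A minimal Python sketch of the Local-RR steps is given below. The function name `local_rr` and the sample labels are illustrative assumptions; the circular queue of resources is modelled here with `itertools.cycle`.

```python
from itertools import cycle

def local_rr(jobs, resources):
    """Distribute every job over per-resource queues in circular order."""
    queues = {r: [] for r in resources}  # one queue for each resource
    ring = cycle(resources)              # circular queue of resources
    for job in jobs:                     # runs until the job queue is empty
        queues[next(ring)].append(job)
    return queues

queues = local_rr(["j1", "j2", "j3", "j4", "j5"], ["r1", "r2"])
```

Unlike the FCFS sketch, no job is left waiting: all jobs are placed in one pass, regardless of whether the target resource is currently busy.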
5.2 Design and Implementation of Heuristic Based Algorithms
Heuristic algorithms[19][116] use a knowledge base for successive scheduling decisions.
This category of algorithms uses matching criteria between the resource and the job to be assigned.
We have two implementations of this class: Min-Min heuristics and Min-Max heuristics.
5.2.1 Min-Min Heuristics
In this approach the job with minimum size is assigned to the resource which can produce the
output in the shortest span of time. Thus more jobs can be assigned to faster processors to
increase system throughput. While there are scheduling requests from applications, the scheduler
allocates an application to a host by selecting the best match from the pool of applications
and the pool of available hosts. The selection strategy can be based on a prediction of the
computing power of the host. The algorithm is as follows:
Min-Min Heuristics
1. for all tasks ti in meta-task Mv (in an arbitrary order)
2. for all hosts mj (in a fixed arbitrary order)
3. CTij = ETij + dj
4. do until all tasks with high QoS request in Mv are mapped
5. for each task with high QoS in Mv, find a host in the QoS qualified host set that obtains
the earliest completion time
6. find the task tk with the minimum earliest completion time
7. assign task tk to the host ml that gives it the earliest completion time
8. delete task tk from Mv
9. update dl
10. update CTil for all i
11. end do
12. do until all tasks with low QoS request in Mv are mapped
13. for each task in Mv find the earliest completion time and the corresponding host
14. find the task tk with the minimum earliest completion time
15. assign task tk to the host ml that gives it the earliest completion time
16. delete task tk from Mv
17. update dl
18. update CTil for all i
19. end do
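The core loop of the listing can be sketched in Python as shown below. The sketch reads CTij as the completion time of task i on host j, ETij as its expected execution time, and dj as the time at which host j becomes ready; this reading of the symbols is an assumption. For brevity it runs a single pass over all tasks and omits the split into high-QoS and low-QoS tasks. The name `min_min` and the sample values are hypothetical.

```python
def min_min(et, ready):
    """Min-Min mapping without the QoS split.

    et[i][j] : expected execution time of task i on host j (ET)
    ready[j] : time at which host j becomes free (d)
    Returns a list of (task, host, completion_time) in mapping order.
    """
    ready = list(ready)               # do not modify the caller's list
    unmapped = set(range(len(et)))
    schedule = []
    while unmapped:
        # for each unmapped task: earliest completion time and its host
        best = {}
        for i in unmapped:
            j = min(range(len(ready)), key=lambda h: et[i][h] + ready[h])
            best[i] = (et[i][j] + ready[j], j)
        # pick the task whose earliest completion time is the minimum
        k = min(best, key=lambda i: (best[i][0], i))
        ct, host = best[k]
        schedule.append((k, host, ct))
        unmapped.remove(k)            # delete the task from the meta-task
        ready[host] = ct              # update the ready time of that host
    return schedule

schedule = min_min([[4, 6], [3, 5], [8, 2]], [0, 0])
```

Completion times are recomputed from the updated ready times on every pass, which corresponds to the "update CT" steps of the listing.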
5.2.2 Min-Max Heuristics
In the Min-Max scheduling approach, the job with minimum size
is assigned to the resource which will take the highest time to compute the result. The algorithm
is similar to Min-Min, differing only in the matching criterion.
5.3 Design and Implementation of Nature Inspired Algorithms
Research has attempted to convert natural phenomena into scheduling algorithms.
These algorithms include Ant Colony Optimization (ACO), Tabu Search (TS), Simulated
Annealing (SA) and Genetic Algorithm (GA). We experiment
with these algorithms[108][72].
5.3.1 Ant Colony Optimization
The ACO algorithm uses a colony of artificial ants that behave as co-operative agents in a
mathematical space where they are allowed to search and reinforce pathways (solutions) in order
to find the optimal ones. A solution that satisfies the constraints is feasible. After initialization
of the pheromone trails, ants construct feasible solutions, starting from random nodes, and then the
pheromone trails are updated. At each step ants compute a set of feasible moves and select the
best one (according to some probabilistic rules) to carry out the rest of the tour. The transition
probability is based on the heuristic information and the pheromone trail level of the move. The
higher the values of the pheromone and the heuristic information, the more profitable it is to
select this move and resume the search. In the beginning, the pheromone level is set to a small
positive constant, and the ants update this value after completing the construction stage.
ACO
1. Initialize the pheromone
2. while stopping criterion not satisfied do
3. Position each ant in a starting node
4. repeat
5. for each ant do
6. Choose next node by applying the state transition rule
7. end for
8. until every ant has built a solution
9. Update the pheromone
10. end while
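To make the listing concrete, the following Python sketch applies the same loop to the mapping of jobs to resources. It is one possible formulation under stated assumptions, not the implementation evaluated in this chapter: a "node" is taken to be a (job, resource) pair, ants visit jobs in a fixed order rather than from random starting nodes, the heuristic information is the inverse of the completion time of a move, only the best-so-far solution is reinforced, and execution times must be positive. All names and parameter values (`aco_schedule`, `rho`, `tau0`, and so on) are hypothetical.

```python
import random

def aco_schedule(et, n_ants=10, n_iter=30, rho=0.1, tau0=0.1, seed=0):
    """et[i][j]: execution time (> 0) of job i on resource j.

    Returns (assignment, makespan); assignment[i] is the resource of job i.
    """
    rng = random.Random(seed)
    n_jobs, n_res = len(et), len(et[0])
    # initialize the pheromone to a small positive constant
    tau = [[tau0] * n_res for _ in range(n_jobs)]
    best, best_span = None, float("inf")
    for _ in range(n_iter):                  # stopping criterion: fixed count
        for _ in range(n_ants):              # each ant builds one solution
            load = [0.0] * n_res
            assign = []
            for i in range(n_jobs):
                # transition weight = pheromone * heuristic information
                weights = [tau[i][j] / (load[j] + et[i][j])
                           for j in range(n_res)]
                j = rng.choices(range(n_res), weights=weights)[0]
                assign.append(j)
                load[j] += et[i][j]
            span = max(load)
            if span < best_span:
                best, best_span = assign, span
        # pheromone update: evaporate, then reinforce the best-so-far moves
        for i in range(n_jobs):
            for j in range(n_res):
                tau[i][j] *= 1.0 - rho
            tau[i][best[i]] += 1.0 / best_span
    return best, best_span

et = [[4, 6], [3, 5], [8, 2]]
assignment, span = aco_schedule(et)
```

The fixed random seed only makes repeated runs reproducible; it does not guarantee that the optimum is found.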
5.3.2 GA
GA is applied by analogy to derive the scheduling decision from a set of chromosomes. Mutation and
crossover of these chromosomes generate the optimized final state. Genetic algorithms combine
exploitation of past results with the exploration of new areas of the search space by using
survival of the fittest techniques combined with a structured yet randomized information exchange.
The generalized procedure of GA for scheduling is:
GA
1. Generate initial population of jobs
2. Select two random parents from the initial population
3. Perform crossover to produce a child
4. Perform mutation on the child
5. Find the fitness of the child
6. The schedule is the best chromosome
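The procedure can be sketched in Python as below. The listing leaves several choices open, so the following are assumptions of this sketch: a chromosome is a list giving the resource index of each job, fitness is the makespan (lower is better), crossover is one-point, the steps are repeated for a fixed number of generations, and a child replaces the worst chromosome only if it is fitter. The names `makespan` and `ga_schedule` and all parameter values are hypothetical.

```python
import random

def makespan(chrom, et, n_res):
    """Fitness: largest total execution time on any resource."""
    load = [0.0] * n_res
    for job, res in enumerate(chrom):
        load[res] += et[job][res]
    return max(load)

def ga_schedule(et, pop_size=20, generations=200, p_mut=0.1, seed=0):
    """et[i][j]: execution time of job i on resource j (needs >= 2 jobs)."""
    rng = random.Random(seed)
    n_jobs, n_res = len(et), len(et[0])
    # 1. generate the initial population of random job-to-resource maps
    pop = [[rng.randrange(n_res) for _ in range(n_jobs)]
           for _ in range(pop_size)]
    for _ in range(generations):
        p1, p2 = rng.sample(pop, 2)          # 2. two random parents
        cut = rng.randrange(1, n_jobs)       # 3. one-point crossover
        child = p1[:cut] + p2[cut:]
        for g in range(n_jobs):              # 4. mutation, gene by gene
            if rng.random() < p_mut:
                child[g] = rng.randrange(n_res)
        # 5. fitness of the child decides whether it enters the population
        worst = max(range(pop_size),
                    key=lambda k: makespan(pop[k], et, n_res))
        if makespan(child, et, n_res) < makespan(pop[worst], et, n_res):
            pop[worst] = child
    # 6. the schedule is the best chromosome
    best = min(pop, key=lambda c: makespan(c, et, n_res))
    return best, makespan(best, et, n_res)

et = [[4, 6], [3, 5], [8, 2]]
best, span = ga_schedule(et)
```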
5.3.3 Fuzzy Algorithm
The fuzzy algorithm adopts the size of the job and the CPU utilization as the input variables for
fuzzy sets and defines a set membership function. It is composed of three design phases:
1. Information Policy:
The information policy indicates the significance of information regarding the system.
Based on this information, information gathering fuzzy rules are used to determine whether
the system workload is heavy or not. We use the CPU utilization value and the number of rows
(job size) as the values to be found out. The CPU utilization is found using the Simple
Network Management Protocol (SNMP) component and the Advent package, and the job size is
obtained from the application division module.
2. Negotiation Policy:
The Negotiation Policy consists of the fuzzy logic module, which decides which job
should be given to which processor. It takes the job size as input and, using the ‘Best
Distribution’ logic, draws up the mapping.
3. Migration Policy:
Using file transfer we send the data and the code files to the clients connected in the grid,
and the clients then perform the required calculations.
The Fuzzy Logic Module consists of two functions: a) Schedule: This function accepts the
jobs from the job division module, initializes all the variables, and calls Solve.
b) Solve: This function calculates the best possible distribution for the given array of jobs by
looking for the distribution under which the time required by all the clients is the same.
The fuzzy logic is based upon the proportionality relations
T ∝ JobSize
T ∝ CPUUtilization
where T is the time required by a client.
The Solve function tries to find the best possible distribution by bringing T1, T2, ..., Tn
as close to each other as possible. The best distribution is the one with the least difference
between the times required by all the clients:
T1 ≅ T2 ≅ T3 ≅ ... ≅ Tn
Thus it achieves the best possible load balancing. In summary, the fuzzy algorithm:
• is based on: time required ∝ system load, and time required ∝ application/job size
• finds the optimal schedule by finding T1 ≅ T2 ≅ T3 ≅ ... ≅ Tn
• evaluates all possible divisions by considering CPU utilization and job size
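The balancing goal can be illustrated with a short Python sketch. Note that it does not enumerate all possible divisions as Solve is described to do; instead it substitutes a closed-form proportional split, under the assumed model that the time of client k grows as rows_k / (1 - u_k), where u_k is its CPU utilization in the range 0 <= u_k < 1. The model, the function name `balanced_split` and the sample values are all hypothetical.

```python
def balanced_split(total_rows, cpu_util):
    """Split total_rows among clients so estimated times are nearly equal.

    Assumed model: time of client k is proportional to
    rows_k / (1 - cpu_util[k]), so equal times mean rows proportional
    to the free capacity 1 - cpu_util[k].
    """
    free = [1.0 - u for u in cpu_util]        # requires 0 <= u < 1
    total_free = sum(free)
    shares = [int(total_rows * f / total_free) for f in free]
    # rows lost to rounding go, one at a time, to the client whose
    # estimated time after receiving one more row would be the lowest
    for _ in range(total_rows - sum(shares)):
        k = min(range(len(shares)), key=lambda i: (shares[i] + 1) / free[i])
        shares[k] += 1
    return shares

shares = balanced_split(100, [0.5, 0.0, 0.75])
```

An idle client (utilization 0.0) receives the largest share and the most heavily loaded client the smallest, which keeps the estimated times T1, T2, T3 close to each other.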
Figure 5.1: User Interface to submit applications and selecting algorithms
5.3.4 Results
The execution time is measured as job completion time on a grid of 3 clusters: the first with 4
processors, the second with 3 processors and the third with 3. The problem selected for testing
is matrix chain multiplication. The following table gives the results in terms of the time
required to complete all the jobs. In Table 5.1, J indicates jobs and M indicates a matrix with
size in the range of 50 to 500. The table gives the time in seconds required to execute the jobs.
Tables 5.2 to 5.6 show the results obtained from the execution of the ACO, GA and Fuzzy algorithms.
Table 5.1: Execution time of algorithms applied at Global and Local level scheduler
Figure 5.2: Comparative performance of algorithms for number of jobs and applications
Table 5.2: (a)Measured time for ACO, GA and Fuzzy
Table 5.3: (b)Measured time for ACO, GA and Fuzzy
Table 5.4: (c)Measured time for ACO, GA and Fuzzy
Table 5.5: (d)Measured time for ACO, GA and Fuzzy
Table 5.6: (e)Measured time for ACO, GA and Fuzzy
5.4 Observations
The implementation demonstrates that the proposed grid works successfully. It allows the
flexibility to add algorithms and select them for scheduling. The performance of ACO, GA and
Fuzzy is seen to improve on that of RR and FCFS. The former algorithms are more accurate, and
hence their job failure and denial rates are lower than those of the classical methods. We find
nature inspired algorithms to be candidate algorithms for further research.