A Real-Time Cloud Based Client Job Scheduler

Yash [email protected]

A Project Report Submitted in Partial Fulfillment of the Requirements for the Degree of

Master of Science in Computer Science

Supervised By
Dr. Minseok Kwon
Associate Professor

Department of Computer Science
Rochester Institute of Technology

Rochester, NY

May 6, 2013

The project “A Real-Time Cloud Based Client Job Scheduler” has been examined and approved by the following examination committee:

Dr. Hans-Peter Bischof, Professor, Project Committee Chair

Dr. Xumin Liu, Assistant Professor, Project Committee Reader

Dr. Stanislaw Radziszowski, Professor, Project Committee Observer

Acknowledgement

My accomplishment depends largely on the guidance and encouragement of many others, who in one way or another contributed and provided their valuable assistance. First and foremost, I would like to take this opportunity to express my deepest appreciation to my committee chair Dr. Minseok Kwon, who not only gave me the golden opportunity to work on this project but also provided immense support throughout the process. In addition, my sincere thanks go to the rest of the committee members, Dr. Xumin Liu, Dr. Hans-Peter Bischof and Dr. Stanislaw Radziszowski, for their supervision as well as for providing necessary information regarding the project. Last but not least, special thanks to my parents, friends and loved ones for their cooperation, which helped me complete this project.

Abstract

In spite of recent advancements in hardware technology, mobile devices are still considered to be resource-constrained in terms of memory, computation power and battery life. Because of these limits, compute-intensive tasks cannot be executed solely on a single mobile device. Such tasks can instead be forwarded to cloud servers to achieve better response time. These servers are geographically distributed and can therefore exhibit varied response times depending on their location and the job being executed. Currently, mobile devices do not have a mechanism to distribute a job onto a cloud server by evaluating processing and response time. Hence, there is a need for a job scheduler to be installed on such devices. The job scheduler's responsibilities include maintaining a job queue and distributing the jobs among the available cloud servers. Schedulers are also expected to take the network latency and processing time of each individual cloud server into consideration when evaluating the suitability of a particular server to execute a task. This project proposes a job scheduling architecture which selects a particular cloud server based on the network latency between the mobile client and the server along with the time to process a task. The optimum cloud server is selected based on a predefined objective of executing a task in the minimum possible time. Such scheduling will not only improve the response time for a given task but also minimize the cost of processing, thereby improving the overall end user experience.

Contents

1 Introduction

2 Background
2.1 Client devices
2.2 Cloud Servers
2.3 Scheduler
2.4 Related Work

3 Problem Statement

4 Solution
4.1 System Design
4.2 Architecture

5 Experiment
5.1 Experiment Setup
5.2 The Round Trip time Experiment
5.2.1 Weekday Vs Weekend Time test
5.2.2 3G Vs Wi-Fi
5.3 Measuring Process Execution Time Experiment
5.4 Face recognition Experiment

6 Conclusion
6.1 Conclusion

Bibliography

List of Figures

2.1 Distributed System.
2.2 Grid Computing with collaborating Meta Schedulers.
2.3 Meta Scheduler.
4.1 Architecture.
5.1 Round Trip Time Graph for Virginia weekend.
5.2 Round Trip Time Graph for Virginia weekday.
5.3 Average Round Trip Time Graph.
5.4 Round Trip Time Graph for Virginia 3G.
5.5 Round Trip Time Graph for Virginia Wi-Fi.
5.6 Average Round Trip Time.
5.7 Execution Time Graph for Virginia N=8000.
5.8 Execution Time Graph for Virginia N=10,000.
5.9 Average Execution Time N=5000.
5.10 Average Execution Time N=8000.
5.11 Average Execution Time N=10,000.
5.12 Response Time Graph for Schedulers.
5.13 Job distribution N=500.
5.14 Job distribution N=1000.

List of Tables

4.1 Example.
5.1 Client Devices.
5.2 AWS Servers.

Chapter 1

Introduction

Mobile devices run applications that require number-crunching power and memory. Such applications include security encryption and decryption, face recognition, character recognition, image processing, DNA computation, natural language processing, machine learning, etc. Although mobile devices enable easier information exchange, running high-end applications on a mobile device can easily exhaust all of its computational resources. Therefore, one way to process a job for such an application is to divide it into server-side and client-side processes, thereby reducing the load on the mobile device. Outsourcing the job to a cloud server is one such solution. Cloud services work on a pay-as-you-go basis: the cost associated with a cloud server is proportional to its usage. For optimum use, almost all cloud service providers offer a set of schedulers. These schedulers are designed to utilize the processing power of all available computers or resources within the system and thereby significantly improve performance. However, all of this scheduling is done at the cloud server, which means the client device has already selected a server from a pool of cloud servers.

Since cloud servers are geographically distributed and each server has a unique set of resources, the client device has an opportunity to select the most viable server. For instance, if the chosen cloud server is located far away from the client or if network latency is high, communication between them suffers, which increases the response time and thereby the processing cost. Likewise, if the selected server is overloaded with requests, it takes comparatively longer to process a job, which affects the overall response time and cost. Hence, it is essential to have a job scheduler at the client side which is intelligent enough to select the optimum available server and thus helps reduce the overall response time. There are various parameters, such as network bandwidth, network latency, processing speed, memory availability and cost, that a scheduler can consider in selecting a cloud server. The challenge for a scheduler designed for a mobile device is that it should not consume a lot of resources while offering its scheduling capabilities.

We can define job scheduling as a set of rules that control the workflow and the execution queue of tasks among the available servers. The scheduler described here is a real-time client-side job scheduler for mobile devices. It notes the network latency and processing time required by each job for a particular server and also calculates the approximate overall execution time required by the server for the next job. The scheduler uses this prediction to select the most efficient available cloud server at a given point in time from a pool of cloud servers. This helps minimize the overall response time, which is proportional to the economic cost of the cloud usage. The prediction reuses measurements from jobs that have already run, so it adds very little overhead, which makes it suitable for mobile devices. Therefore, this scheduler will enable mobile devices to process various compute-intensive tasks by overcoming resource restrictions in a cost-effective way.

Chapter 2

Background

With recent advances in technology, the ways of communication have undergone a phenomenal change, and the usage of mobile devices is becoming more popular day by day. The authors of [4] have proposed an architecture that partially offloads execution from a smartphone to a computational infrastructure hosted by a cloud of smartphone clones. I concur with the idea of offloading part of the job execution; however, in a real-world scenario a person usually carries only one phone at any given time. Thus, to form a cloud of smartphones, one would have to ask others nearby who possess a smartphone or similar device to permit the use of their available computing resources. This, however, is restrictive, since the job that needs to be executed may contain confidential and sensitive information which one may not wish to share. Therefore, this architecture would be applicable in scenarios where sharing of data is not a concern.

2.1 Client devices

There are many different mobile operating systems currently in use. Android is one of the most widely used. It is based on the Linux kernel and includes middleware and key applications by Google Inc. [2]. Android is an open source operating system. For this project, the experiment is conducted using an Android device and also a Windows machine, which helps evaluate the impact of the job scheduler on the Android device in comparison to a Windows machine.

2.2 Cloud Servers

For cloud servers, I will be using Amazon Web Services (AWS). AWS provides an infrastructure web service platform in the cloud. It is a collection of remote computing machines that together make up a cloud-based platform. It is an online service which can be accessed from anywhere with the help of a web browser. It offers various functionalities that developers can use to create an environment for end users. These web services can be accessed over HTTP, using REST and SOAP protocols. Users are charged based on usage; it is a pay-as-you-go service.

Amazon Elastic Compute Cloud (EC2) [1] is one of the main components of the AWS cloud computing platform. Amazon EC2 is a web service that provides resizable compute capacity in the cloud. EC2 allows its users to rent virtual instances to run their own applications and to scale their deployment. A user creates an Amazon Machine Image (AMI) for these virtual instances, which can contain any software the user desires and can be customized. Operations such as launching, terminating and rebooting can be performed on an instance.

Amazon Simple Storage Service (S3) is an online storage service offered by Amazon Web Services. It provides online storage space through web service interfaces such as REST and SOAP. The main aim of this service is to provide scalability and high availability at low latency and low cost. The user is charged according to the storage space used and can easily scale the storage space up or down. There are various plugins available for using this web service. AWS provides an authentication system to access its web services, which makes use of a private key, a public key and an X.509 certificate to authenticate the user. When an instance is launched, AWS creates two types of IP addresses for it: a public IP address and a private IP address. I will use the public IP address in my implementation.

2.3 Scheduler

Scheduling is the process that decides the allotment of resources to tasks. In distributed systems, the primary responsibility of a job scheduler is to queue and map different tasks to computing resources while ensuring they do not exhaust the available resources. The job scheduler presented in this project schedules a job based on predefined objectives, namely execution time and network latency. It is important to understand the role and behavior of a scheduler, as the experiment deals with the design of a job scheduler in a distributed environment.

Figure 2.1: Distributed System.

Figure 2.1 shows a heterogeneous distributed system consisting of multiple nodes with their respective computing resources. When a list of tasks is submitted to a node, it either processes the job or forwards it to other nodes. In other words, the node needs to be intelligent enough to make this decision. Each node, therefore, has a scheduler which assists in the decision-making. Let us first consider the case of processing the job locally. Since the scheduler is aware of all the local resources and their availability, it can easily schedule the job accordingly. In the second case, the scheduler decides to forward the task in hand. For this purpose the scheduler performs the following:

• Connectivity process
The scheduler needs to know the number of nodes that exist in the network. To achieve this, the scheduler maintains a record of nodes or gets the information from an external source. The scheduler also needs to determine the connectivity of each node, which it could achieve through periodic latency checks.

• Resource Knowledge Gathering
For efficient scheduling, the scheduler should be aware of the resources that are available at each node. The easiest way for a scheduler to do this is by maintaining data about each node and its resources. The scheduler also keeps the other nodes informed about locally available resources by communicating and exchanging information periodically.

• Job Submission Management
This step involves the actual forwarding of a job to a node and retrieving the failure or success output. However, before this happens, the scheduler also needs to track the status of each node, i.e. whether the node is busy or available. This is important because the scheduler needs to maintain fairness in the system by not overloading any particular node at any given instant.

2.4 Related Work

In [3], the authors describe a similar scenario of distributed job scheduling and data flow management in a grid-based system.

Figure 2.2: Grid Computing with collaborating Meta Schedulers.

The paper discusses two different schedulers, namely the Local Scheduler (LS) and the Meta Scheduler (MS):

• Local Scheduler
During a job process request, the Local Scheduler examines the locally available resources and determines whether they will suffice for processing and completing the job. LS takes on the job request only if the resource conditions are satisfied. Following job completion, LS communicates with MS and updates it about resource availability.

• Meta Scheduler
A job request is always received by the Meta Scheduler. Based on the information constantly provided by LS, MS determines whether the job can be processed locally. If the local resources are already taken, MS reaches out to the MS of other nodes for assignment and completion of the pending job request, for which it needs to perform the following: Connectivity process, Resource Knowledge Gathering and Job Submission Management.

Figure 2.3: Meta Scheduler.
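The division of labour between the two schedulers can be sketched in a few lines of Python. This is only an illustration of the description above, not code from [3]: the class names, the abstract `free_units` resource counter and the `submit` method are assumptions made for the example, and a peer's Local Scheduler is consulted directly instead of exchanging messages between Meta Schedulers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LocalScheduler:
    """Accepts a job only when enough local resources are free."""
    free_units: int  # hypothetical abstract resource count

    def try_accept(self, needed_units: int) -> bool:
        if needed_units > self.free_units:
            return False
        self.free_units -= needed_units
        return True


@dataclass
class MetaScheduler:
    """Receives every job request and falls back to peers when local fails."""
    name: str
    local: LocalScheduler
    peers: List["MetaScheduler"] = field(default_factory=list)

    def submit(self, needed_units: int) -> Optional[str]:
        """Return the name of the node that took the job, or None."""
        if self.local.try_accept(needed_units):
            return self.name
        for peer in self.peers:
            # Simplification: ask the peer's local scheduler directly.
            if peer.local.try_accept(needed_units):
                return peer.name
        return None
```

With two nodes A (2 free units) and B (4 free units), a job needing 2 units stays on A, while a following job needing 3 units is forwarded to B.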

Chapter 3

Problem Statement

Smartphones are resource-constrained in terms of battery life, CPU power and memory. Hence, in order to perform a compute-intensive task, an application offloads this task to cloud servers using a scheduler. The cost of these geographically distributed cloud servers depends on the consumption of computing resources. If the chosen cloud server is situated far away from the client or is currently under load from other requests, the response time for the service increases, which results in a bad user experience. Also, if the chosen server is expensive for a relatively small job size, the overall processing cost increases. Hence, the selection of the cloud server is important, as it determines the response time and the cost of the computation. Considering the limited resources on mobile devices, it becomes necessary to design an intelligent but lightweight job scheduler. The job scheduler presented in this project aims to solve the above problem by selecting an optimum cloud server from a pool of cloud servers to improve the response time and minimize the cost.

Chapter 4

Solution

4.1 System Design

The scheduler in my experiment does not deal with any local scheduling. Thus, it needs to be similar to the Meta Scheduler described earlier. Let us consider the three processes that the scheduler performs, namely connectivity, resource knowledge gathering and job submission management, with regard to the experiment being conducted.

• Connectivity process
A network map of the system is not loaded dynamically; instead, the scheduler maintains a static list of the nodes in the system.

• Resource Knowledge Gathering
Querying each cloud server about its current capacity and maintaining a list of resources is a complex process that would incur communication overhead. Instead, the scheduler predicts the available resources of a server from its previous response.

• Job Submission Management
The primary function of a job scheduler is to forward the job. However, keeping track of failure or success and also maintaining the availability of each node would overload the scheduler. Therefore, on failure it simply reschedules the job.
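A minimal sketch of the first and third simplifications, assuming a fixed tuple of server names and a caller-supplied `send` function, may help make them concrete. The server names, the `submit_with_reschedule` helper, the use of `ConnectionError` to signal failure and the `max_attempts` limit are all hypothetical choices for this illustration; the report does not specify them.

```python
from typing import Callable, Sequence, Tuple

# Static list of nodes. These names are placeholders, not the hosts
# used in the experiment.
SERVERS: Tuple[str, ...] = ("server-a", "server-b", "server-c")


def submit_with_reschedule(
    job: object,
    pick: Callable[[Sequence[str]], str],
    send: Callable[[str, object], object],
    max_attempts: int = 3,
) -> Tuple[str, object]:
    """Forward a job; on failure simply pick again and resend.

    No per-node availability table is kept, which mirrors the
    lightweight behaviour described in the list above.
    """
    last_error = None
    for _ in range(max_attempts):
        server = pick(SERVERS)
        try:
            return server, send(server, job)
        except ConnectionError as exc:  # assumed failure signal
            last_error = exc
    raise RuntimeError(f"job failed after {max_attempts} attempts") from last_error
```

Here `pick` stands for whichever selection policy is in use, so the same retry loop works for any of the scheduling approaches.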

The scheduler performs all the duties of a Meta Scheduler, apart from the differences mentioned above, which are due to the limited resources on a mobile device. The scheduling approaches used in the experiment are explained below.

• Random
This approach randomly selects a server from a given pool of cloud servers. On the assumption that it gives good performance in the average case, it forms the base case for comparison.

• Greedy Approach
A greedy algorithm makes the locally optimal choice in each iteration. The criteria for the greedy approach are minimum network latency and minimum response time of each task. In each iteration, it notes the latency and response time, on which future scheduling is based. This enables us to evaluate and analyze the merits and demerits of a client-side job scheduler.

• Randomized Greedy approach
The randomized greedy approach is the greedy approach modified with a hint of randomness. In simple terms, it randomly selects a few servers from the pool and applies the greedy approach to the job queue using only the selected servers. Depending on the overall performance, it will select a new cloud server and may drop the worst performing one.
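The three approaches can be sketched as small selection functions. This is an illustrative reading of the descriptions above and not the project's implementation: the `estimate` dictionary (a predicted time per server, lower is better), the function names and the rule used by `refresh_active` to swap one server are assumptions.

```python
import random
from typing import Dict, List


def pick_random(servers: List[str], rng: random.Random) -> str:
    """Random approach: any server, chosen uniformly."""
    return rng.choice(servers)


def pick_greedy(servers: List[str], estimate: Dict[str, float]) -> str:
    """Greedy approach: the server with the smallest current estimate."""
    return min(servers, key=lambda s: estimate[s])


def refresh_active(
    active: List[str],
    pool: List[str],
    estimate: Dict[str, float],
    rng: random.Random,
) -> List[str]:
    """Randomized greedy upkeep: drop the worst active server and
    bring in a random server from the rest of the pool."""
    outside = [s for s in pool if s not in active]
    if not outside:
        return list(active)
    worst = max(active, key=lambda s: estimate[s])
    kept = [s for s in active if s != worst]
    return kept + [rng.choice(outside)]
```

Under this reading, the randomized greedy scheduler calls `pick_greedy` on its small active subset for each job and calls `refresh_active` from time to time, so that a server outside the subset still gets a chance to prove itself.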

The proposed solution also includes a threshold value for each server, maintained by the scheduler. This threshold defines the number of jobs the server can handle at a given instant. If a particular server exceeds its threshold, the scheduler shifts that server to a lower priority.
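One simple way to express the threshold rule is as a sort key, sketched below. The function name and the three dictionaries (`pending` jobs, `threshold` and `estimate` per server) are hypothetical; the report does not say how the priority shift is implemented.

```python
from typing import Dict, List


def order_by_priority(
    servers: List[str],
    pending: Dict[str, int],
    threshold: Dict[str, int],
    estimate: Dict[str, float],
) -> List[str]:
    """Servers at or below their threshold come first; servers that
    exceed it are pushed to the back. Within each group the server
    with the smaller time estimate is preferred."""
    def key(server: str):
        over_threshold = pending[server] > threshold[server]
        return (over_threshold, estimate[server])

    return sorted(servers, key=key)
```

A server that is fast but already holds more jobs than its threshold is therefore tried only after every server that still has capacity.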

4.2 Architecture

The client device has a set of tasks to be processed. The scheduler, running on the device itself, distributes them using cloud and network information, which the connection manager is responsible for maintaining.

• Cloud Information
The scheduler maintains a list of available cloud servers and their corresponding IP addresses and, depending on the algorithm, it maintains the following:

– Execution time
The time to execute a single task. This helps the scheduler estimate the computing resources of a particular cloud.

– Wait time
Depending on the execution time and the number of jobs allotted to the particular cloud, the scheduler calculates an estimated wait time. The purpose of the wait time is to avoid overloading a particular cloud.

• Network Information
The scheduler maintains a record of the current network latency by periodically sending and receiving a chunk of data over the network. This helps the scheduler select the cloud which has minimum network latency, eventually avoiding network overload. In the experiment, the chunk of data is a test image.
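The per-server bookkeeping described above can be sketched as a small record type. The field names and the exact wait-time formula are assumptions: the text only says that the wait time depends on the execution time and the number of allotted jobs, and the formula below is chosen so that an idle server's wait time equals its response time (execution time plus network latency), which matches the starting point of the worked example later in this section.

```python
from dataclasses import dataclass


@dataclass
class CloudRecord:
    """What the scheduler remembers about one cloud server."""
    name: str
    ip_address: str
    execution_time: float  # seconds for one task, last observed
    latency: float         # seconds, last measured round trip
    allotted_jobs: int = 0  # jobs sent and not yet completed

    @property
    def response_time(self) -> float:
        # response time = execution time + network latency
        return self.execution_time + self.latency

    def estimated_wait(self) -> float:
        # Assumed formula: every allotted job, plus the job about to
        # be sent, costs one response time.
        return (self.allotted_jobs + 1) * self.response_time
```

For example, a record with an execution time of 0.75 s and a latency of 0.25 s has an estimated wait of 1 s when idle and 3 s once two jobs are outstanding. The address in any such record would be the public IP of the instance.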

Figure 4.1 illustrates the architecture of the experiment.

Figure 4.1: Architecture.

In order to use the above information, the scheduler has a dedicated resource calculator which is responsible for estimating the processing time, which in turn helps the scheduler estimate and schedule the jobs to different servers. To further understand the job of the resource calculator, consider the following example for the greedy approach. Let us say there are 4 jobs of size n waiting to be processed by two cloud servers C1 and C2. Assume the response times of the two servers are R1 = 1 second and R2 = 2 seconds for one job of size n (where response time = execution time + network latency). Initially the wait time for each server is equal to its response time, i.e. W1 = 1 second and W2 = 2 seconds.

Table 4.1 illustrates the response time and wait time of each server after each round.

Table 4.1: Example.

Round                   R1 (s)   W1 (s)   R2 (s)   W2 (s)
0                       1        1        2        2
1 (1st job)             1        2        2        2
2 (2nd job)             1        2        2        4
3 (3rd job)             1        3        2        4
4 (1st job complete)    1.5      2        2        4
5 (4th job)             1.5      3.5      2        4

• Round 1
When the first job is scheduled, the scheduler checks the wait time of each server and selects the one with the minimum wait time. Thus, in our case, it selects C1. When a job is allotted to a server, the wait time of that server is increased by the server's current response time. Therefore, both W1 and W2 are now 2 seconds for the next round.

• Round 2
The next job can select either of the servers, as both have the same wait time. Assuming it selects C2, the wait time for C2 is increased to 4 seconds.

• Round 3
The scheduler selects C1 for the 3rd job and updates its wait time to 3 seconds.

• Round 4
In this round, the 1st job is completed; let us assume it took longer than the estimated response time of 1 second. Upon completion, the scheduler reduces the wait time by the previous response time, giving W1 = 2 seconds. It also replaces the stored response time with the newly measured one, which is 1.5 seconds in our case.

• Round 5
This round uses the new response time when calculating the wait time for the 4th job, thereby setting W1 to 3.5 seconds. This way, the scheduler adjusts to the measurements and always tries to estimate the most accurate wait time.

From the example, we can see that the greedy approach selects servers based on the minimum wait time. Since the wait time is recalculated after every round and is based on the current and previous response times of each server, the scheduler tends to select the optimal server at a given point in time.
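The five rounds can be replayed with a short script. This is a sketch of the bookkeeping as the example describes it, not the project's resource calculator; the `Server` class and the `pick`, `assign` and `complete` helpers are names introduced here. In round 2 both wait times are equal, so the script assigns the job to C2 explicitly, following the assumption made in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Server:
    name: str
    response_time: float  # R: last known response time for one job (s)
    wait_time: float      # W: estimated wait time (s)


def pick(pool: List[Server]) -> Server:
    """Greedy choice: smallest wait time (first server on a tie)."""
    return min(pool, key=lambda s: s.wait_time)


def assign(server: Server) -> None:
    """Allot one job: the wait time grows by the current response time."""
    server.wait_time += server.response_time


def complete(server: Server, measured: float) -> None:
    """Finish one job: remove the old estimate, store the measured time."""
    server.wait_time -= server.response_time
    server.response_time = measured


def replay_example() -> List[Tuple[float, float]]:
    """Return (W1, W2) after each of the five rounds."""
    c1 = Server("C1", response_time=1.0, wait_time=1.0)
    c2 = Server("C2", response_time=2.0, wait_time=2.0)
    pool = [c1, c2]
    history = []

    assign(pick(pool))            # round 1: C1 has the smaller wait time
    history.append((c1.wait_time, c2.wait_time))
    assign(c2)                    # round 2: tie, the text assumes C2
    history.append((c1.wait_time, c2.wait_time))
    assign(pick(pool))            # round 3: C1 again
    history.append((c1.wait_time, c2.wait_time))
    complete(c1, measured=1.5)    # round 4: 1st job took 1.5 s
    history.append((c1.wait_time, c2.wait_time))
    assign(pick(pool))            # round 5: 4th job goes to C1
    history.append((c1.wait_time, c2.wait_time))
    return history
```

The returned wait times, (2, 2), (2, 4), (3, 4), (2, 4) and (3.5, 4), agree with the values given for rounds 1 to 5.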

Chapter 5

Experiment

5.1 Experiment Setup

The experiment setup mainly comprises two parts: client devices and cloud servers. Client devices include a Windows machine and Android devices.

Table 5.1: Client Devices.

Device           CPU                              RAM      Talk Time (hrs)
Motorola Droid   430MHz TMS320                    256 MB   6.4
Motorola Atrix   1 GHz dual-core ARM Cortex-A9    1 GB     8.8
Lenovo Y550      Intel C2D T6400 2.00 GHz         3 GB     N/A

The pool of cloud servers is a distributed heterogeneous system. To set up the environment for the experiment, I will be using five different Amazon Web Services centers: East (North Virginia), West (Oregon), Asia Pacific (Singapore), Europe (Ireland) and South America (Sao Paulo). At each center, the following types of servers have been utilized.

Table 5.2: AWS Servers.

Type     CPU units      CPU cores   Memory
Micro    up to 2 ECUs   1           613 MiB
Small    1 ECU          1           1.7 GiB
Medium   2 ECUs         1           3.7 GiB
Large    4 ECUs         2           7.5 GiB

• Amazon Linux AMI
The Amazon Linux AMI is an EBS-backed, PV-GRUB image. It includes Linux 3.4, AWS tools, and repository access to multiple versions of MySQL, PostgreSQL, Python, Ruby, and Tomcat.

The following experiments were performed:

5.2 The Round Trip time Experiment

In this experiment, the idea is to understand the latency between the client device and the cloud server. The client simply sends an image file to a server and the server returns it. The key quantity to note is the total round trip time. To measure latencies for the different Amazon Web Services centers (Virginia, Oregon, Singapore, Ireland and Sao Paulo), I have pinged each server once every minute for two days. To capture the variance in network traffic, I have run the experiment on both a weekday and a weekend.
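The measurement loop can be sketched as follows. The `echo` argument is a hypothetical stand-in for the real network call (send the image bytes, receive them back); it and the injectable `clock` parameter are assumptions that keep the sketch independent of any particular transport. The once-per-minute pacing is left to the caller.

```python
import time
from statistics import mean
from typing import Callable, List


def measure_round_trips(
    echo: Callable[[bytes], bytes],
    payload: bytes,
    rounds: int,
    clock: Callable[[], float] = time.perf_counter,
) -> List[float]:
    """Time `rounds` send-and-return exchanges, in seconds."""
    samples = []
    for _ in range(rounds):
        start = clock()
        reply = echo(payload)
        elapsed = clock() - start
        if reply != payload:
            raise ValueError("server returned different data")
        samples.append(elapsed)
    return samples


def average_round_trip(samples: List[float]) -> float:
    """Average used when comparing centers or networks."""
    return mean(samples)
```

Averaging the samples per center gives the kind of figure that is compared across centers, days and network types in the remainder of this section.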

5.2.1 Weekday Vs Weekend Time test

The following graphs represent the round trip time to the server at the N. Virginia center for both the weekend and the weekday.

Figure 5.1: Round Trip Time Graph for Virginia weekend.

Figure 5.2: Round Trip Time Graph for Virginia weekday.

A general observation from Figures 5.1 and 5.2 is that the round trip time is higher on weekends in comparison to a weekday. The average round trip time for all the servers, across the various AWS centers, is shown in the graph below.

Figure 5.3: Average Round Trip Time Graph.

Figure 5.3 corroborates the earlier observation that the round trip time was higher on the weekend. In addition, we observe that the average round trip time from the client device, which is located in Rochester, NY, to the Virginia center was the minimum, and hence its transfer rate was the fastest. This contrasts with the average round trip time to Singapore, which appears to be the maximum. Hence, we can deduce that geographical distance may play a role in round trip time and transfer rate.

5.2.2 3G Vs Wi-Fi

Furthermore, to understand the transfer rate with respect to a mobile device, a similar experiment was conducted. The idea here is to compare 3G and Wi-Fi networks. The test is conducted for 200 rounds. The respective graphs for the N. Virginia center follow.

Figure 5.4: Round Trip Time Graph for Virginia 3G.

Figure 5.5: Round Trip Time Graph for Virginia Wi-Fi.

Figure 5.6: Average Round Trip Time.

Figure 5.6 shows the average round trip time in milliseconds at the various AWS centers. It is clear that the Wi-Fi network is faster than the 3G network. It is also evident from Figure 5.6 that Singapore, being the most distant from Rochester, NY, has the maximum round trip time on both 3G and Wi-Fi networks. However, the same does not hold for the center with the minimum response time: Virginia, despite being the closest, does not have the best response time on the 3G network, although it does on the Wi-Fi network.

5.3 Measuring Process Execution Time Experiment

This experiment measures the processing time for different types of servers, i.e. micro, small and medium. This helps in identifying the time it takes for different servers to execute a given job. The setup is similar to that of the round trip time experiment. For this test, one of the course-work assignments is utilized, in which a list of size n is randomly created and then sorted by an algorithm that takes O(n^2) time. For this test, the value of n is 5000, 8000 and 10000. Each job is repeated 1000 times on each server. The following graphs illustrate the different server types at the N. Virginia center for job sizes N=8000 and N=10,000.

Figure 5.7: Execution Time Graph for Virginia N=8000.

Figure 5.8: Execution Time Graph for Virginia N=10,000.

We see that as the size of the job increases, the micro server is unable to handle the job load, and hence there is a massive variation between its performance under minor load and under bulk load.
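The kind of job timed in this experiment can be sketched as below. The report does not name the sorting algorithm used in the course-work assignment, so insertion sort is used here purely as an assumed example of an O(n^2) sort; `time_sort_job` and its `seed` parameter are likewise hypothetical.

```python
import random
import time
from typing import List, Optional, Tuple


def quadratic_sort(values: List[float]) -> List[float]:
    """Insertion sort: O(n^2) comparisons on random input."""
    result = list(values)
    for i in range(1, len(result)):
        item = result[i]
        j = i - 1
        # Shift larger elements one slot to the right.
        while j >= 0 and result[j] > item:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = item
    return result


def time_sort_job(n: int, seed: Optional[int] = None) -> Tuple[float, List[float]]:
    """Create a random list of size n, sort it, and time the sort."""
    rng = random.Random(seed)
    data = [rng.random() for _ in range(n)]
    start = time.perf_counter()
    ordered = quadratic_sort(data)
    elapsed = time.perf_counter() - start
    return elapsed, ordered
```

Because the running time grows quadratically, going from n = 5000 to n = 10000 roughly quadruples the work, which is why the larger job sizes separate the server types more clearly.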

Figure 5.9: Average Execution Time N=5000.

Figure 5.10: Average Execution Time N=8000.

Figure 5.11: Average Execution Time N=10,000.

We can observe that similar servers at different centers take almost the same time to process the job. We also see that the micro server, despite being the least powerful, is still able to process the job quicker than the small server, and its processing time is almost equal to that of the medium server for job sizes 5000 and 8000. Thus, for smaller jobs, the micro server seems to be ideal. However, for job size N=10000, we observe that its performance is impacted significantly.

5.4 Face recognition Experiment

The face recognition experiment, provided by Professor Minseok Kwon, is an application which is computationally heavy with regard to processing time. This application, running on the cloud server, is used to assess the scheduler. The software automatically identifies a person from a digital image by comparing facial features from the image. The experiment setup is as follows. Multiple instances of the face recognition application are run as a service from several locations on Amazon Web Services (AWS). This creates a pool of cloud servers distributed across the globe. AWS offers different configurations, each of which varies in CPU power and memory. Deploying instances with different configurations at each location helps in understanding the accuracy of the scheduler with respect to workload distribution. For this, eight servers are deployed, of which six run on AWS and two are local servers. The number of jobs sent by the client device plays a crucial role in understanding the threshold of each server. Thus, for testing the schedulers, the job sizes that are scheduled are 50, 100, 200, 500 and 1000. The job request sent by the client device to the server consists of a pair of digital images. One image represents the face of a person, while the other image comprises multiple human faces, including the face of the person that needs to be compared and identified. The following graph displays the total execution time for the different schedulers with respect to different job sizes.

Figure 5.12: Response Time Graph for Schedulers.

Figure 5.12 is generated from the experiment conducted. We observe that for N=50, the random scheduler is more efficient than the other two schedulers. This is because the greedy and randomized greedy schedulers both carry out the additional work of running a test round to gather network and cloud information for future job scheduling. For larger job sizes, from N=100 onwards, the greedy and randomized greedy schedulers perform better than the random scheduler. For job size N=1000, the greedy scheduler is clearly the best performing one. Let us consider the following job distribution graphs for job sizes N=500 and N=1000.

Figure 5.13: Job distribution N=500.

For the random scheduler, which is designed to distribute jobs arbitrarily, we can see that a larger number of jobs was assigned to server 8 and that the job distribution remains uneven among the other servers. The greedy and randomized greedy schedulers allocate jobs consistently among the servers based on the performance of each and, as a result, do not tend to burden any specific server.

Figure 5.14: Job distribution N=1000.

The N=1000 graph above is analogous to the N=500 graph. The greedy and randomized greedy schedulers assign jobs after determining the effectiveness of each server, while the random scheduler assigns jobs inconsistently. The only difference here is that the random scheduler for N=1000 allocates a larger number of jobs to servers 5 and 8.

Chapter 6

Conclusion

6.1 Conclusion

Each of the approaches has been evaluated against the following parameters, with the random scheduler as the base case.

• Performance
The scheduler should achieve optimal performance under all conditions. Maximum performance can be regarded as minimum response time. From the chart displaying the response time, we can conclude that the response times of the greedy and randomized greedy approaches were nearly equal to that of the random approach for small job sizes, and that they were superior to the random approach for the larger job sizes.

• Fast decision making
The scheduler should not be an overhead on the system. Memory consumption and CPU utilization must be low, and the execution time of the scheduler must be minimal. Since the scheduling process for the greedy and randomized greedy approaches does not involve major computation, the execution time for scheduling is almost insignificant.

• Precise classification
The scheduler must be able to identify the different available servers and classify them according to their performance. It must be able to schedule the workload accurately. Since the greedy and randomized greedy approaches assign jobs to servers based on their previous response times, and looking at the job distribution charts in Figures 5.13 and 5.14, we can infer that these two approaches assign a greater number of jobs to better performing servers, such as servers 7 and 8 in our experiment. However, they do not totally overload these servers, and at the same time they utilize the other servers according to their performance.

• Fairness
The scheduler must maintain fairness in terms of distributing the workload among the various cloud servers. In simple terms, it should not exhaust any server. From the job distribution charts in Figures 5.13 and 5.14, we can infer that the greedy and randomized greedy approaches are capable of distributing the jobs across the servers without overloading any particular one.

Based on the experiments conducted, we can infer that an application needing more processing power can offload such tasks to a cloud server. To optimize cloud scheduling, the greedy approach has proved to be better performing when dealing with larger job sizes, and the random scheduler with smaller job sizes.

Bibliography

[1] Amazon elastic compute cloud. http://docs.aws.amazon.com/AWSEC2/2007-08-29/GettingStartedGuide/.

[2] Android (operating system). http://en.wikipedia.org/wiki/Android_(operating_system).

[3] Norman Bobroff, Gargi Dasgupta, Liana Fong, Yanbin Liu, Balaji Viswanathan, Fabio Benedetti, and Jonathan Wagner. A distributed job scheduling and flow management system. volume 42, pages 63–70, New York, NY, USA, January 2008. ACM.

[4] Byung-Gon Chun and Petros Maniatis. Augmented smartphone applications through clone cloud execution. In Proceedings of the 12th conference on Hot topics in operating systems, HotOS’09, pages 8–8, Berkeley, CA, USA, 2009. USENIX Association.

[5] Gonzalo Huerta-Canepa and Dongman Lee. A virtual cloud computing provider for mobile devices. In Proceedings of the 1st ACM Workshop on Mobile Cloud Computing & Services: Social Networks and Beyond, MCS ’10, pages 6:1–6:5, New York, NY, USA, 2010. ACM.

[6] G. Mc Evoy and B. Schulze. Understanding scheduling implications for scientific applications in clouds. In Proceedings of the 9th International Workshop on Middleware for Grids, Clouds and e-Science, MGC ’11, pages 3:1–3:6, New York, NY, USA, 2011. ACM.
