Wide Web Load Balancing Algorithm Design
Yingfang Zhang
Outline of the Talk
• Introduction to the wide web load balancing problem and related work
• Load balancing algorithm design and load balancing algorithms
• Sequence chart of the simulation program
• Traffic characteristics of web servers and clients
• Test plan
• Simulation results
• Conclusion and future directions
Introduction: Load Balancing Problem
• Web systems play an essential role in providing and retrieving information.
• Heavy request load causes web servers to overload and response times to lengthen.
• Replicated web systems are widely used today to spread client requests across servers and shorten the response time.
• The key problem is how to allocate the requests efficiently so as to shorten the response time.
[Figure: clients send requests toward Web Server 1 ... Web Server N. Which web server to choose?]
Related Works on Web Systems
• NCSA Scalable Web Server …
• CISCO Distributed Director …
• GIT Feo et al., Dynamic Server Selection, …
NCSA Scalable Web Server
Advantage: It is simple. It is used within a LAN and ignores path information (e.g., distance, hop count, bandwidth); typically it uses a round-robin strategy.
Disadvantages:
1. It balances only web servers that are located in the same domain.
2. It controls only part of the requests, because of name caching in intermediate name servers.
CISCO Distributed Director
Advantage:
1. It balances geographically separated servers.
Disadvantage:
1. Because it centralizes the requests, it becomes a bottleneck.
GIT Feo et al., Dynamic Server Selection
[Figure: Web Server 1 ... Web Server n with LB Agents. Labels: "Server push load status" and "Client probe response time".]
Advantage:
1. The user has multiple web servers to choose from and can select one based on web server performance.
Disadvantage:
1. Congestion results when clients rush to the lightly loaded servers.
Load Balancing Algorithm
Advantages:
1. It combines path information (distance, hop count, bandwidth, dynamic traffic delay) with web information (processing power, number of pending requests, size of pending requests, web server count) to guide the allocation of requests. The load balancing algorithms can therefore be used in a wide web system. Because static and dynamic information are combined, the load balancing algorithms eliminate the congestion that occurs in dynamic server selection.
2. By design, the LBA can control all requests, in contrast to DNS, and it avoids the bottleneck found in the Cisco Distributed Director.
Load Balancing Algorithm
Web Components and Their Interaction
[Figure: clients, routers, LBA 1, LBA 2, and Web Servers 1 to 3 spread over Subnets 1 to 4 and connected through the network. Numbered message flow: 1. Request, 2. Request, 3. Request/Web Address, 4. Request/Web Address, 5. Request, 6. Document, 7. Document, 8. Document. LBA Messages are also exchanged.]
Two Metrics
1. Current Load
$L_{1,t}/S_1 = L_{2,t}/S_2 = L_{3,t}/S_3 = \dots = L_{k,t}/S_k$
The goal is to make the loads even among the web servers.
2. Average Response Time (ART)
ART load-balancing status = min{ART}
The goal is to shorten the response time.
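The two metrics can be illustrated with a minimal Python sketch. The slides do not define the symbols, so the code assumes that L is the current load of a server and S is its processing power; the function names are hypothetical and only serve the illustration.

```python
from statistics import mean

def normalized_loads(loads, powers):
    """Return L_i / S_i for each server (assumed meaning: load over processing power)."""
    return [load / power for load, power in zip(loads, powers)]

def imbalance(loads, powers):
    """Gap between the largest and smallest L_i / S_i; 0 means the ratios are all equal."""
    ratios = normalized_loads(loads, powers)
    return max(ratios) - min(ratios)

def average_response_time(response_times):
    """ART over a set of completed requests (times in seconds)."""
    return mean(response_times)

# Loads proportional to processing power: the current-load metric is satisfied.
print(imbalance([10, 20, 40], [1, 2, 4]))   # 0.0
# The third server is under-used relative to its power.
print(imbalance([10, 20, 10], [1, 2, 4]))   # 7.5
print(average_response_time([0.01, 0.02, 0.03]))
```

An allocation policy can then be judged on either metric: how close the imbalance stays to zero, or how small the resulting ART is.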
Factors That Affect Load Balancing
• The size of the request.
• The web processing power.
• The number of pending requests at the web server.
• The size of pending requests at the web server.
• The distance of the path between clients and web servers.
• The available bandwidth along the path.
• The hop count along the path.
• The traffic status.
• The number of web servers.
• The number of clients.
• The number of requests.
Load Balancing Algorithm Design
The load balancing algorithms consider two portions of a request's processing time:
1. Time from the client to the web server. It includes:
• Transmission delay
• Queueing delay
2. Time spent at the web server. It includes:
• Queueing delay at the web server
• Processing time at the web server
Idea: balance those times. Ultimately, the load balancing algorithm should try to shorten the end-to-end response time of the requests.
Load Balancing Algorithm Design
Factors related to the first portion of the time are the path information, e.g., distance, hop count, bandwidth, and dynamic traffic delay.
Factors related to the second portion of the time are the web information, e.g., web processing power and number of pending requests.
Load Balancing Algorithm I
1. LBA-I: Use all static information
Selected web server = the server i that minimizes {distance/bandwidth + load of web server i / processing power of web server i}
• The ratio of distance to bandwidth measures the path.
• The ratio of web server load to processing power measures the web server status.
There are two variations of the algorithm:
• LBA-I-1: Adds hop count.
• LBA-I-2: Considers only the web server status and ignores the path information.
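The LBA-I selection rule and its two variations can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the `Server` fields, the function names, and in particular the way hop count enters the cost (as an added term scaled by a hypothetical `hop_weight`) are not specified in the talk.

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    distance: float   # static path distance from the LBA to the server
    bandwidth: float  # static path bandwidth
    hop_count: int    # static hop count (used only by LBA-I-1)
    load: float       # load attributed to the server so far
    power: float      # processing power of the server

def cost(server, variant="LBA-I", hop_weight=1.0):
    """Cost of sending the next request to `server` under the given variant."""
    web_term = server.load / server.power
    if variant == "LBA-I-2":
        return web_term                       # web status only, path ignored
    path_term = server.distance / server.bandwidth
    if variant == "LBA-I-1":
        path_term += hop_weight * server.hop_count   # assumed form of "add hop count"
    return path_term + web_term

def select_server(servers, variant="LBA-I"):
    """Pick the server with the minimum cost."""
    return min(servers, key=lambda s: cost(s, variant))

servers = [
    Server("near-busy", distance=10, bandwidth=5, hop_count=2, load=8, power=2),
    Server("far-idle", distance=40, bandwidth=5, hop_count=6, load=0, power=2),
]
print(select_server(servers, "LBA-I").name)    # near-busy (cost 6.0 vs 8.0)
print(select_server(servers, "LBA-I-2").name)  # far-idle  (cost 4.0 vs 0.0)
```

The example shows why the variants can disagree: once the path term is dropped, the idle but distant server wins.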
Load Balancing Algorithm II
LBA-II: LBAs communicate with each other. When an LBA makes an assignment decision, it passes this information to the other LBAs, which update their assignment tables.
Advantage: Improves the precision of the estimated web server loads.
Disadvantage: Generates heavy communication overhead and takes available bandwidth away from web access.
A variation of the algorithm, LBA-II-1: pass the assignment information only to neighboring LBAs.
Advantage: Reduces communication overhead.
Disadvantage: Decreases the precision of the estimated web server loads.
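The trade-off between LBA-II and LBA-II-1 can be made concrete with a small sketch. It is a simplification under assumptions: each LBA picks the server with the fewest known assignments (path terms are left out), messages are delivered instantly, and the class and function names are hypothetical.

```python
class LBA:
    def __init__(self, name, server_names):
        self.name = name
        self.table = {s: 0 for s in server_names}  # known assignments per server
        self.peers = []                            # LBAs notified of our decisions
        self.messages_sent = 0

    def assign(self):
        """Assign one request and notify peers so they can update their tables."""
        target = min(self.table, key=self.table.get)
        self.table[target] += 1
        for peer in self.peers:
            peer.on_assignment(target)
            self.messages_sent += 1
        return target

    def on_assignment(self, server_name):
        self.table[server_name] += 1

def connect(lbas, neighbors=None):
    """Full mesh (LBA-II) when neighbors is None, else neighbor-only (LBA-II-1)."""
    for lba in lbas:
        if neighbors is None:
            lba.peers = [p for p in lbas if p is not lba]
        else:
            lba.peers = [p for p in lbas if p.name in neighbors.get(lba.name, ())]

# LBA-II: every LBA hears about every assignment.
a, b, c = (LBA(n, ["s1", "s2"]) for n in "abc")
connect([a, b, c])
a.assign()
print(c.table)   # c already knows that s1 received a request

# LBA-II-1: x and z are not neighbors, so z's table goes stale.
x, y, z = (LBA(n, ["s1", "s2"]) for n in "xyz")
connect([x, y, z], neighbors={"x": ["y"], "y": ["x", "z"], "z": ["y"]})
x.assign()
print(z.table)   # z has not heard about x's assignment
```

With the full mesh, each assignment costs one message per other LBA; with neighbor-only updates the message count drops, at the price of tables that no longer agree.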
Load Balancing Algorithm III
LBA-I and LBA-II use static information to measure the path traffic.
LBA-III: Uses dynamic path information. It periodically sends probing packets to probe the path bandwidth.
Advantage: Improves the precision of the estimated path traffic.
Disadvantages:
1. Generates heavy communication overhead and reduces available bandwidth.
2. The performance of the algorithm depends on the path probing period.
Load Balancing Algorithm IV
LBA-I, LBA-II, and LBA-III use the number of assigned requests to measure the web server loads.
LBA-IV: Uses the size of pending requests to measure the web server load status. Web servers periodically send this information to the LBAs.
Advantage: Improves the precision of the estimated web server loads.
Disadvantages:
1. Generates heavy communication overhead and reduces available bandwidth.
2. The performance of the algorithm depends on the web server reporting period.
There are two variations of the algorithm:
1. LBA-IV(E): Long reporting period. Within a period, a first-order random walk model is used to estimate the load of the web server.
Advantage: Reduces the communication overhead.
Disadvantage: Decreases the precision of the estimated web server loads.
2. LBA-IV(Tc): Overload alarm threshold. Web servers send asynchronous overload alarm messages to the LBAs.
Advantage: Reduces the communication overhead.
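The overload-alarm idea behind LBA-IV(Tc) can be sketched as below. The details are assumptions made for illustration: the threshold is expressed in pending bytes, an alarm is sent only when the threshold is first crossed, no "cleared" message is modeled, and the `WebServer` class and `notify` callback are hypothetical names.

```python
class WebServer:
    def __init__(self, name, threshold):
        self.name = name
        self.threshold = threshold   # Tc: pending bytes at which an alarm is raised
        self.pending_bytes = 0
        self.overloaded = False

    def enqueue(self, size, notify):
        """Queue a request; raise one asynchronous alarm when Tc is first crossed."""
        self.pending_bytes += size
        if not self.overloaded and self.pending_bytes >= self.threshold:
            self.overloaded = True
            notify(self.name, self.pending_bytes)

    def complete(self, size):
        """Finish a request; re-arm the alarm once the load drops below Tc."""
        self.pending_bytes -= size
        if self.pending_bytes < self.threshold:
            self.overloaded = False

alarms = []
server = WebServer("s1", threshold=10_000)
for size in (4000, 4000, 4000, 1000):
    server.enqueue(size, lambda name, pending: alarms.append((name, pending)))
print(alarms)   # [('s1', 12000)]: four requests queued, one alarm message
```

Compared with periodic reporting, the message count here grows with the number of overload episodes rather than with elapsed time, which is the source of the reduced overhead.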
Simulation Program Design
1. The program uses discrete event simulation.
2. There are 4 types of nodes: Web Server, LBA, Router, Client.
3. There are 6 types of packets:
• Request packet sent by the client
• Document packet sent by the web server
• Path probing message sent by the LBA
• Current-load-report message sent by the web server
• Overload-alarm message sent by the web server
• Load balancing coordination message sent by the LBA
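A discrete event simulation of this kind is commonly built around a time-ordered event queue. The sketch below shows that core loop with a request travelling from client to LBA to web server; it is not the talk's simulator, routers are omitted, and the delays and handler names are invented for the example.

```python
import heapq
import itertools

class Simulator:
    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._ids = itertools.count()   # tie-breaker: equal-time events stay FIFO

    def schedule(self, delay, handler, *args):
        """Queue handler(*args) to run `delay` seconds after the current time."""
        heapq.heappush(self._queue, (self.now + delay, next(self._ids), handler, args))

    def run(self):
        """Pop events in time order, advancing the clock to each event."""
        while self._queue:
            self.now, _, handler, args = heapq.heappop(self._queue)
            handler(*args)

sim = Simulator()
log = []   # (time, node, packet type, request id)

def client_sends(request_id):
    log.append((sim.now, "client", "REQUEST", request_id))
    sim.schedule(0.002, lba_forwards, request_id)      # assumed client-to-LBA delay

def lba_forwards(request_id):
    log.append((sim.now, "lba", "REQUEST", request_id))
    sim.schedule(0.005, server_replies, request_id)    # assumed LBA-to-server delay

def server_replies(request_id):
    log.append((sim.now, "server", "DOCUMENT", request_id))

sim.schedule(0.0, client_sends, 1)
sim.schedule(0.001, client_sends, 2)
sim.run()
for entry in log:
    print(entry)
```

The other packet types (path probing, load report, overload alarm, coordination) would be further handlers scheduled on the same queue.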
Sequence Chart of Simulation Program
[Figure: sequence chart among Client, Router, LBA, LBA, and Web Server, showing the Request, Req./Web Address, Document, Probe-Path, Load-Report, Overload, and LBA Message exchanges.]
Web Server and Client Traffic Characteristics
Log files of five web servers are analyzed:
1. Department-wide server at UCCS, run on an Alpha workstation, owl.uccs.edu.
2. Campus-wide server at UCCS, run on an Alpha workstation, www.uccs.edu.
3. ClarkNet WWW server; ClarkNet is a full Internet access provider for the Metro Baltimore-Washington DC area.
4. EPA WWW server located at Research Triangle Park, NC.
5. BU-Web-Client in the Boston University Computer Science Department.
• Analyze the statistical characteristics of the workload of the web servers and clients.
• Compare those characteristics with other reports.
Characteristics of Document Type
[Figure: bar chart of the share of requests (0% to 80%) by document type (.gif, .mpg, .html, .au, .ps, .pl, .jpg, .avi, .wav, .mid, .mpeg, .jpeg, .pdf, other) for owl, uccs, clarknet, epa, and bugs.]
The document types most frequently requested by clients are images and HTML, which account for over 80% of all requests.
Characteristics of Document Size
Most documents requested by clients are 1 to 5 KB in size.
[Figure: distribution of document size (0% to 50%) over the bins <100B, <1K, 1~2K, 2~3K, 3~4K, 4~5K, 5~6K, 6~7K, 7~8K, 8~9K, 9~10K, and >10K for owl, uccs, epa, clarknet, and bugs.]
Characteristics of Request Time Interval
[Figure: distribution of request time intervals (0% to 60%) for owl, uccs, epa, bugs, and clarknet.]
Most request time intervals are less than 1 sec.
Test Plan
The following factors affect the performance of the algorithms:
1. Network topology
2. Number of requests
3. Request time interval
4. Bandwidth
We will vary those parameters to test the performance of the algorithms and collect the following statistics:
1. Average response time
2. Web queueing delay time
3. Router queueing delay time
4. Transmission delay time
5. Propagation delay time
6. Processing time
7. Various communication overheads
Performance of Load Balancing Algorithms
• Distribution of request time interval as in the previous figure.
• Distribution of document size as in the previous figure.
• Request count ranges from 800 to 10000.
• Average bandwidth is 5 Mbps.
• New-Jersey Network with 116 nodes and 22 links.
• Transmission delay (90%) dominates the response time.

Algorithm    Avg. Response Time (s)  Avg. Web Queuing Delay (s)  Avg. Transmission Delay (s)  Avg. Overhead Messages  Period
LBA-I        0.016226                0.000308                    0.014775                     0                       No
LBA-I-1      0.016226                0.000308                    0.014775                     0                       No
LBA-I-2      0.017708                0.000035                    0.016528                     0                       No
RR           0.017663                0.000037                    0.016483                     0                       No
Random       0.017924                0.000053                    0.016706                     0                       No
LBA-II(I)    0.016425                0.000292                    0.015014                     400%                    No
LBA-II(I-2)  0.017688                0.000027                    0.016582                     400%                    No
LBA-III      0.270639                0.253463                    0.017001                     2%                      2 s
LBA-IV-1(3)  0.019813                0.0039813                   0.015300                     2%                      5 s
Performance of Load Balancing Algorithms
• Distribution of request time interval as in the previous figure.
• Distribution of document size as in the previous figure.
• Request count ranges from 800 to 10000.
• Average bandwidth is 5 Mbps.
• r50 Network with 890 nodes and 217 links.
• Transmission delay (80%) dominates the response time.

Algorithm    Avg. Response Time (s)  Avg. Web Queuing Delay (s)  Avg. Transmission Delay (s)  Overhead Messages       Period
LBA-I        0.0288746               0.0013823                   0.0244098                    0                       No
LBA-I-1      0.027373                0.001351                    0.0228792                    0                       No
LBA-I-2      0.032093                0.0005428                   0.0267744                    0                       No
RR           0.0319824               0.0004043                   0.0266921                    0                       No
Random       0.032012                0.000417                    0.026793                     0                       No
LBA-II(I)    0.030891                0.0013901                   0.0253481                    400%                    No
LBA-II(I-2)  0.032236                0.000980                    0.026345                     400%                    No
LBA-III      0.207321                0.176451                    0.029456                     7%                      5 s
LBA-IV-1(3)  0.056710                0.01856                     0.0277451                    2%                      5 s
Performance of Load Balancing Algorithm
[Figures: average response time (s) versus bandwidth (Mb). One plot compares LBA-I, LBA-I-1, LBA-I-2, RR, Random, LBA-II(I), LBA-II(I-2), LBA-III, and LBA-IV-1(3); separate plots follow for LBA-I, LBA-I-1, LBA-I-2, RR, LBA-II(I), Random, LBA-II(I-2), LBA-III, and LBA-IV(3).]
Performance of Load Balancing Algorithms
We change the request interval time.
[Figure: distribution of request intervals (0% to 100%) over the bins <=0.00001, 0.0001~0.001, and 0.001~0.01; x-axis: Request Interval (sec).]
Compared with the previous request interval distribution, requests in the above distribution arrive 10000 times faster.
Performance of Load Balancing Algorithms
• Distribution of request time interval as in the above figure.
• Distribution of document size as in the previous figure.
• Request count ranges from 800 to 3000.
• New-Jersey Network with 116 nodes and 22 links.
• Web queuing delay (80%) dominates the response time.

Algorithm    Avg. Response Time (s)  Avg. Web Queuing Delay (s)  Avg. Transmission Delay (s)  Avg. Overhead Messages  Period
LBA-I        0.202547                0.191662                    0.009676                     0                       No
LBA-I-1      0.202547                0.191662                    0.009676                     0                       No
LBA-I-2      0.058024                0.046689                    0.010152                     0                       No
RR           0.058280                0.046959                    0.010118                     0                       No
Random       0.058544                0.047286                    0.010064                     0                       No
LBA-II(I)    0.173958                0.162056                    0.009247                     400%                    No
LBA-II(I-2)  0.054626                0.042309                    0.009703                     400%                    No
LBA-III      0.592967                0.561779                    0.010113                     2%                      0.005 s
LBA-IV-1(3)  0.054954                0.043393                    0.010429                     100%                    0.005 s
Performance of Load Balancing Algorithms
[Figures: average response time (s) versus bandwidth (Mb). Two plots compare LBA-I, LBA-I-1, LBA-I-2, RR, Random, LBA-II(I), LBA-IV, and LBA-II(I-2), the second also including LBA-III; separate plots follow for LBA-I, LBA-I-1, LBA-I-2, RR, Random, LBA-II(I), LBA-II(I-2), and LBA-IV.]
Conclusions
1. Algorithms LBA-I and LBA-I-1 perform better when the transmission delay dominates the response time. They are independent of any period and do not generate any overhead messages.
2. Algorithm LBA-II performs better when the web queueing delay dominates the response time. It generates very heavy overhead messages and is independent of the reporting period.
3. Algorithm LBA-III performs worse in both cases, whether the transmission delay or the web queueing delay dominates the response time. It generates very heavy overhead messages and depends on the reporting period.
4. Algorithm LBA-IV performs better when the web queueing delay dominates the response time. It generates overhead messages and depends on the reporting period.
Network Design Issues
If the transmission delay dominates the response time, we have the following suggestions for network design:
1. Reduce document size.
2. Choose a proper ratio of web servers to clients.
3. Choose proper processing power for the web servers.
4. Choose proper locations for the web servers.
Future Directions
1. Use real and larger networks to test the proposed load balancing algorithms.
2. Investigate algorithm performance under heavy web server load.
3. Investigate aggregate server/LBA reporting and the impact of reporting frequencies.
4. Implement the load balancing algorithms in a prototype.