Load balancing theory and practice
Transcript of Load balancing theory and practice
Welcome
Me:
• Dave Rosenthal
• Co-founder of FoundationDB
• Spent the last three years building a distributed transactional NoSQL database
• It’s my birthday
Any time you have multiple computers working on a job, you have a load balancing problem!
Warning
There is an ugly downside to learning about load balancing: TSA checkpoints, grocery store lines, and traffic lights may become even more frustrating.
What is load balancing?
Wikipedia: “…methodology to distribute workload across multiple computers … to achieve optimal resource utilization, maximize throughput, minimize response time, and avoid overload”
All part of the latency curve
The latency curve

[Chart: latency vs. jobs/second (log scale), with the curve divided into Nominal, Interesting, Saturation, and Overload regions]
Goal for real-time systems

[Chart: the latency curve, highlighting the goal of low latency at a given load]
Goal for batch systems

[Chart: the latency curve, highlighting the goal of high jobs/second at a reasonable latency]
The latency curve

[Chart: latency (ms, log scale) vs. load from 0 to 1]
Better load balancing strategies can dramatically improve both latency and throughput
Load balancing tensions
• We want to shorten queues to get better latency
• We want to lengthen queues to keep a “buffer” of work on hand during irregular traffic, which gives better throughput
• For distributed systems, equalizing queue lengths across nodes sounds good
Can we just limit queue sizes?
[Chart: % of dropped jobs vs. queued job limit (0-20)]
Simple strategies
• Global job queue: for slow tasks
• Round robin: for highly uniform situations
• Random: probably won’t screw you
• Sticky: for cacheable situations
• Fastest of N tries: trades throughput for latency. I recommend N = 2 or 3.
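The “fastest of N tries” strategy can be sketched as request hedging: issue the same request to N servers and keep whichever answer comes back first. This is a minimal illustration, not the talk’s code; the server callables and thread-pool setup are my own assumptions.

```python
import concurrent.futures
import random

def fastest_of_n(servers, request, n=2):
    """Send the request to n randomly chosen servers and return the
    first response; the duplicated work buys lower tail latency."""
    chosen = random.sample(servers, n)
    with concurrent.futures.ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(server, request) for server in chosen]
        done, not_done = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED)
        for f in not_done:
            f.cancel()  # best effort; an already-running duplicate still finishes
        return next(iter(done)).result()
```

With N = 2 or 3 the extra load is modest, while a single slow server no longer dictates the response time.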
Use a global queue if possible
[Chart: latency under 80% load (log scale) vs. cluster size 1-10, comparing random assignment against a global job queue]
Options for information transfer
• None (rare)
• Latency (most common)
• Failure detection
• Explicit:
  – Load average
  – Queue length
  – Response times
FoundationDB’s approach
1. Request goes to a random one of three servers
2. The server either answers the query or replies “busy” if its queue is longer than the queue limit estimate
3. Queries that got “busy” are sent to a second random server with a “must do” flag set

Queue limit = 25 * 2^(20*P)
• A global queue limit is implicitly shared by estimating the fraction of incoming requests (P) that are flagged “must do”
• Converges to a P(redirect)/queue-size equilibrium
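The server side of this rule fits in a few lines. This is a minimal sketch of the mechanism described above; the object fields and function names are my own, not FoundationDB’s.

```python
def queue_limit(p_must_do):
    """Each server's estimate of the shared queue limit, from the
    observed fraction P of incoming requests flagged "must do":
    limit = 25 * 2^(20*P)."""
    return 25 * 2 ** (20 * p_must_do)

def handle(server, request):
    """Accept the request, or reply "busy" so the client retries a
    second server with the "must do" flag set.  `server` is a
    hypothetical object tracking its queue and its estimate of P."""
    limit = queue_limit(server.p_must_do)
    if request.must_do or len(server.queue) <= limit:
        server.queue.append(request)
        return "accepted"
    return "busy"
```

When servers get busier, more requests come back flagged “must do”, P rises, and every server’s limit estimate rises with it, which is what drives the system toward the equilibrium mentioned above.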
FDB latency curve before/after
[Two charts: latency (log scale) vs. operations per second (up to 1,200,000), before and after the change]
Tackling load balancing
• Queuing theory: one useful insight
• Simulation: do this
• Instrumentation: do this
• Control theory: know how to avoid this
• Operations research: read about this for fun
  – Blackett: shield planes where they are not shot!
The one insight: Little’s law
Q = R*W
• (Q)ueue size = (R)ate * (W)ait-time
• Q is the average number of jobs in the system
• R is the average arrival rate (jobs/second)
• W is the average wait time (seconds)
• Holds for any (!) steady-state system
  – Or sub-system, or joint system, or…
Little’s law example 1
Q = R*W
• We get 1,000,000 requests per second (R = 1E6)
• We take 100 ms to service each request
• Q = 1E6 * 0.100
• Little’s law: the average queue depth is 100,000!
Little’s law example 2
W = Q/R
• We have 100 users in the system making continuous requests (Q = 100)
• We get 10,000 requests per second (R = 1E4)
• W = 100 / 10,000
• Little’s law: the average wait time is 10 ms
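Both examples reduce to one line of arithmetic each:

```python
def queue_size(rate, wait):
    """Little's law: Q = R * W."""
    return rate * wait

def wait_time(queue, rate):
    """Rearranged: W = Q / R."""
    return queue / rate

# Example 1: 1,000,000 requests/second, 100 ms per request
q = queue_size(1e6, 0.100)      # ~100,000 jobs in flight on average
# Example 2: 100 concurrent users, 10,000 requests/second
w = wait_time(100, 10_000)      # 0.01 seconds (10 ms) average wait
```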
Little’s law ramifications
Q = R*W
• In a distributed system:
  – R scales up
  – W stays the same, or gets a bit worse
• To maintain performance, you’re going to need a whole lot of jobs in flight
The rest of queuing theory
Erlang:
• A language
• A man (Agner Krarup Erlang)
• And a unit! (Q from Little’s law, AKA offered load, is measured in dimensionless Erlang units)
• The Erlang-B formula (for limited-length queues)
• The Erlang-C formula (P(waiting))
Abandon hope
[Chart: real-world applicability vs. complexity of the math for queuing theory results, with Little’s law marked]
Simulation
The best way to explore distributed system behavior
Quiz
Model: jobs of random durations; 80% load.
Goal: minimize average job latency.

Which task should we work a bit more on?
• First task received
• Last task received
• Shortest task
• Longest task
• Random task
• Task with least work remaining
• Task with most work remaining
Simulation code snippets
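The talk’s snippets aren’t reproduced in this transcript. As a stand-in, here is a toy version of the quiz’s setup: a single server, exponential job sizes, Poisson arrivals at 80% load, and a pluggable policy for which queued job to run next. The model details are my assumptions, not the talk’s exact code.

```python
import random

def simulate(pick, n_jobs=20000, load=0.8, seed=1):
    """Toy single-server, non-preemptive simulation: job sizes are
    exponential with mean 1, arrivals are Poisson at rate `load`.
    `pick(queue)` returns the index of the queued job to run next.
    Returns mean latency (arrival to completion)."""
    rng = random.Random(seed)
    arrivals, t = [], 0.0
    for _ in range(n_jobs):
        t += rng.expovariate(load)                  # inter-arrival gap
        arrivals.append((t, rng.expovariate(1.0)))  # (arrival time, size)
    queue, latencies, now, i = [], [], 0.0, 0
    while len(latencies) < n_jobs:
        while i < n_jobs and arrivals[i][0] <= now:  # admit arrivals
            queue.append(arrivals[i]); i += 1
        if not queue:                 # idle: jump ahead to next arrival
            now = arrivals[i][0]
            continue
        arrived, size = queue.pop(pick(queue))
        now += size                   # run the chosen job to completion
        latencies.append(now - arrived)
    return sum(latencies) / n_jobs

first_received = lambda queue: 0      # FIFO
shortest_task  = lambda queue: min(range(len(queue)), key=lambda j: queue[j][1])
```

In this toy model, picking the shortest queued task gives a noticeably lower mean latency than first-task-received, which is the flavor of result the next slide compares across all seven policies.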
Simulation results at 80% load
[Bar chart: average latency (0-50) by policy: task with most work remaining, least work remaining, random, longest, shortest, last received, first received]
Simulation results at 95% load
[Bar chart: average latency (log scale, 10-100,000) for the same policies at 95% load]
FoundationDB’s approach
• Strategy validated using simulation, then used for a single server’s fiber scheduling
• High priority: work on the next task to finish
• But be careful to enqueue incoming work from the network with highest priority—we want to know about all our jobs to make good decisions
• Low priority: catch up on housekeeping (e.g. non-log writing)
Load spikes
[Charts: queue behavior in a low-load vs. a high-load system]
Bursts of job requests can destroy latency. The effect is quadratic: A burst produces a queue of size B that lasts time proportional to B. On highly-loaded systems, the effect is multiplied by 1/(1-load), leading to huge latency impacts.
Burst-avoiding tip
1. Search for any delay/interval in your system
2. If system correctness depends on the delay/interval being exact, first fix that
3. Now change that delay/interval to randomly wait 0.8-1.2 times the nominal time on each execution
YMMV, but this tends to diffuse system events more evenly in time and help utilization and latency.
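The jitter in step 3 is one line. A sketch (the function name is my own):

```python
import random

def jittered(interval, rng=random.random):
    """Return the nominal interval scaled by a uniform factor in
    [0.8, 1.2], so that periodic work on many machines drifts out of
    phase over time instead of firing in synchronized bursts."""
    return interval * (0.8 + 0.4 * rng())

# e.g. in a polling loop, replace time.sleep(interval) with:
#     time.sleep(jittered(interval))
```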
Overload
[Chart: the latency curve, with the overload region highlighted]
Overload
What happens when work comes in too fast?
• Somewhere in your system a queue is going to get huge. Where?
• Efficiency drops due to:
  – Sloshing
  – Poor caching
• Unconditionally accepting new work means no information is transferred back to the upstream system!
Overload (cont’d): Sloshing
Loading 10 million rows into a popular NoSQL K/V store shows sloshing:

[Chart: throughput sloshing back and forth over 12.5 minutes]
Overload (cont’d): No sloshing
Loading 10 million rows into FDB shows smooth behavior:
System queuing

[Diagram: incoming jobs A-E feeding three nodes, each with its own queue]
System queuing

[Diagram: job A dispatched to Node 1’s queue; B-E still incoming]
Internal queue buildup

[Diagram: jobs A-D piled up in Node 1’s queue while Nodes 2 and 3 sit empty]
Even queues, external buildup

[Diagram: one job per node (C, B, A), with D, E, … waiting in an external queue in front of the system]
Our approach
“Ratekeeper”
• Active management of internal queue sizes prevents sloshing
• Avoids every subcomponent needing its own well-tuned load balancing strategy
• Queue information is explicitly sent at 10 Hz back to a centrally-elected control algorithm
• When queues get large, slow system input
• Pushes latency into an external queue at the front of the system using “tickets”
Ratekeeper in action
[Chart: operations per second (0-1,400,000) over 600 seconds with Ratekeeper active]
Ratekeeper internals
What can go wrong
Well, we are controlling the queue depths of the system, so, basically, everything in control theory…
Namely, oscillation:
Recognizing oscillation
• Something moving up and down :)
  – Look for low utilization of parallel resources
  – Zoom in!
• Think about sources of feedback—is there some way that a machine getting more work done feeds either less or more work to that machine in the future? (Probably yes.)
What oscillation looks like
[Chart: utilization % (0-70) of Node A and Node B over seconds 1-5]
What oscillation looks like
[Chart: the same data zoomed in to seconds 2.0-2.3, showing rapid oscillation between Node A and Node B]
Avoiding oscillation
• This is control theory—avoid it if possible!
• The major thing to know: control gets harder as frequencies get higher (e.g. Bose headphones)
• Two strategies:
  – Control on a longer time scale
  – Introduce a low-pass filter in the control loop (e.g. an exponential moving average)
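An exponential moving average is a few lines; this sketch (class and parameter names are my own) shows the low-pass idea:

```python
class LowPass:
    """Exponential moving average: a tiny low-pass filter for a
    control loop's input signal.  Smaller alpha = heavier smoothing,
    i.e. control on a longer effective time scale."""
    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.value = None
    def update(self, sample):
        if self.value is None:
            self.value = sample            # seed with the first sample
        else:
            self.value += self.alpha * (sample - self.value)
        return self.value
```

Feeding a fast square wave through it yields a nearly flat line: the high-frequency component the controller would otherwise chase is filtered out.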
Instrumentation
If you can’t measure it, you can’t make it better
Things that might be nice to measure:
• Latencies
• Queue lengths
• Causes of latency?
Measuring latencies
Our approach:
• We want information about the distribution, not just the average
• We use a “Distribution” class
  – addSample(X)
  – Stores 500+ samples
  – Throws away half of them when it hits 1000 samples, and halves the probability of accepting new samples
  – Also tracks exact min, max, mean, and stddev
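A sketch of such a class (names and details are my reconstruction from the description above, not FoundationDB’s code):

```python
import math
import random

class Distribution:
    """Keeps a bounded random sample of observations plus exact
    summary stats.  When the sample hits `limit`, discard half at
    random and halve the probability of keeping new samples, so the
    retained points stay a roughly uniform sample of everything seen."""
    def __init__(self, limit=1000):
        self.limit = limit
        self.keep_prob = 1.0
        self.samples = []
        self.n = 0
        self.min = float("inf")
        self.max = float("-inf")
        self.total = 0.0
        self.total_sq = 0.0
    def add_sample(self, x):
        # exact stats are always updated
        self.n += 1
        self.min = min(self.min, x)
        self.max = max(self.max, x)
        self.total += x
        self.total_sq += x * x
        # the sample set is updated probabilistically
        if random.random() < self.keep_prob:
            self.samples.append(x)
            if len(self.samples) >= self.limit:
                self.samples = random.sample(self.samples, self.limit // 2)
                self.keep_prob /= 2
    @property
    def mean(self):
        return self.total / self.n
    @property
    def stddev(self):
        return math.sqrt(max(0.0, self.total_sq / self.n - self.mean ** 2))
```

The bounded sample set is what lets you ask distribution questions (medians, percentiles) after the fact, while min/max/mean/stddev stay exact.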
Measuring queue lengths
Our approach:
• Track the % of time that a queue is at zero length
• Measure queue length snapshots at intervals
• Watch out for oscillations
  – Slow ones you can see
  – Fast ones look like noise (which, unfortunately, is also what noise looks like)
  – “Zoom in” to exclude the possibility of micro-oscillations
Measuring latency from blocking
• Easy to calculate:
  – L = (b0² + b1² + … + bN²) / elapsed
  – Total all squared blocking times over some interval, then divide by the duration of the interval
• Measures the impact of unavailability on mean latency for random traffic
• Example: is this server’s high latency explained by this lock?
• Doesn’t count catch-up time
Summary
Thanks for listening, and remember:
• Everything has a latency curve
• Little’s law
• Randomize regular intervals
• Validate designs with simulation
• Instrument
May your queues be small, but not empty
Prioritization/QOS
• Can help in systems under partial load
• Vital in systems that handle batch and real-time loads simultaneously
• Be careful that high-priority work doesn’t generate further high-priority work while other jobs sit in the queue. This can lead to poor utilization, analogous to the internal queue buildup case.
Congestion pricing
• My favorite topic
• Priority isn’t just a function of the benefit of your job
• To be a good citizen, you should subtract the costs to others
• For example, jumping to the front of a long queue has costs proportional to the queue size
Other FIFO alternatives?
• LIFO
  – Removes the incentive to line up early
  – In situations where there is adequate capacity to serve everyone, can yield better waiting times for everyone involved