Post on 17-Jan-2016
LINUX SCHEDULING
Evolution in the 2.6 Kernel
Kevin Lambert, Maulik Mistry, Cesar Davila, Jeremy Taylor
Main Topics
• General Process Scheduling
• 2.6 Kernel Processing (short-term)
• I/O Scheduling (disk requests)
General Scheduler Considerations
1 - Preemptive vs. Cooperative
2 - I/O-bound vs. CPU-bound
3 - Throughput vs. Latency
PRIORITY!
Text Editor vs. Video Encoder
• Which one has more priority in Linux?
• Which requires more processing?
• Which one requires more I/O?
• Which one has greater priority?
Preemption
• Due to the timeslice running out
• Due to the running process having a lower priority than a task that has just become runnable
Where Did We Come From?
Pre-2.6 Schedulers
• Didn’t utilize SMP very well
  • Single runqueue lock meant idle processors awaiting lock release
• Preemption not possible
  • Lower-priority task can execute while high-priority task waits
• O(n) complexity
  • Slows down with larger input
Where Are We Now?
The 2.6 Scheduler
• Each CPU has a separate runqueue
• 140 FIFO priority lists
  • 1-100 are for real-time tasks
  • 101-140 are for user tasks
• Active and Expired runqueues
• O(1) complexity
  • Constant time thanks to runqueue swap
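The active/expired idea above can be sketched in a few lines of Python. This is a simplified model, not the kernel code: the class and method names (`PrioArray`, `RunQueue`, `pick_next`, `expire`) are hypothetical, priorities are 0-based (0 is highest), and the bitmap of non-empty lists is a detail the slide does not mention but that makes the "find the best list" step independent of the number of tasks.

```python
from collections import deque

NUM_PRIOS = 140  # number of priority lists; index 0 is the highest priority


class PrioArray:
    """140 FIFO lists plus a bitmap marking which lists are non-empty."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIOS)]
        self.bitmap = 0  # bit p is set while queues[p] holds a task

    def enqueue(self, task, prio):
        self.queues[prio].append(task)
        self.bitmap |= 1 << prio

    def dequeue_best(self):
        # The lowest set bit identifies the highest-priority non-empty list.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        task = self.queues[prio].popleft()
        if not self.queues[prio]:
            self.bitmap &= ~(1 << prio)
        return task, prio


class RunQueue:
    """Per-CPU runqueue holding an active and an expired array."""

    def __init__(self):
        self.active = PrioArray()
        self.expired = PrioArray()

    def add(self, task, prio):
        self.active.enqueue(task, prio)

    def expire(self, task, prio):
        # A task that used up its timeslice waits in the expired array.
        self.expired.enqueue(task, prio)

    def pick_next(self):
        if self.active.bitmap == 0:
            # Swapping two references is the constant-time "runqueue swap".
            self.active, self.expired = self.expired, self.active
        if self.active.bitmap == 0:
            return None  # nothing runnable at all
        return self.active.dequeue_best()


rq = RunQueue()
rq.add("editor", 110)
rq.add("encoder", 125)
rq.add("audio", 50)

order = []
for _ in range(3):
    task, prio = rq.pick_next()
    order.append(task)
    rq.expire(task, prio)  # pretend every task uses its whole timeslice
```

After the three picks the active array is empty, so the next `pick_next()` swaps the arrays and starts again from the highest-priority task without scanning any task list.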
Where Are We Now? (cont)
The 2.6 Scheduler
• Preemption
• Dynamic task prioritization
  • Up to -5 niceness for I/O-bound
  • Up to +5 niceness for CPU-bound
  • Remember, less niceness is good… in this case.
• SMP load balancing
  • Checks runqueues every 200 ms
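A rough sketch of how the ±5 adjustment could combine with a task's nice value. The function name `effective_prio` and the `adjust` parameter are illustrative assumptions; how the scheduler decides that a task is I/O-bound or CPU-bound is not modeled here. The sketch uses 0-based priority numbers, so the 40 user-task levels are 100-139 and nice 0 maps to 120.

```python
MAX_RT_PRIO = 100  # first user-task priority in 0-based numbering
MAX_PRIO = 140     # total number of priority levels


def effective_prio(nice, adjust):
    """Return the priority list a user task would be placed on.

    nice   -- static nice value, -20 (most favored) to 19
    adjust -- dynamic adjustment, -5 (I/O-bound) to +5 (CPU-bound)
    """
    if not -20 <= nice <= 19:
        raise ValueError("nice must be between -20 and 19")
    if not -5 <= adjust <= 5:
        raise ValueError("adjust must be between -5 and +5")
    prio = 120 + nice + adjust
    # Clamp so that a user task never enters the real-time range
    # and never falls off the end of the priority lists.
    return max(MAX_RT_PRIO, min(MAX_PRIO - 1, prio))


interactive = effective_prio(nice=0, adjust=-5)  # I/O-bound text editor
batch = effective_prio(nice=0, adjust=5)         # CPU-bound video encoder
```

With equal nice values, the I/O-bound task lands ten lists ahead of the CPU-bound one, which is why the text editor still feels responsive while the encoder runs.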
CFS – The Future is Now!
Completely Fair Scheduler
• Merged into the 2.6.23 kernel
• Runs task with the “gravest need”
• Guarantees fairness (CPU usage)
• No runqueues!
  • Uses a time-ordered red-black binary tree
  • Leftmost node is the next process to run
Red/Black Tree Rules
1) Every node has two children, each colored either red or black.
2) Every tree leaf node is colored black.
3) Every red node has both of its children colored black.
4) Every path from the root to a tree leaf contains the same number (the "black-height") of black nodes.
Cited from http://mathworld.wolfram.com/Red-BlackTree.html
Good animation at http://www.geocities.com/SiliconValley/Network/1854/Rbt.html
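The four rules can be checked mechanically. Below is a small sketch of such a checker; the `Node` class and function names are hypothetical, `None` stands in for a black leaf (which covers rules 1 and 2), and the checker looks only at the coloring rules, not at key ordering or at how a tree rebalances on insert.

```python
class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color  # "R" for red, "B" for black
        self.left = left    # None represents a black leaf
        self.right = right


def black_height(node):
    """Return the black-height of a subtree, or raise ValueError
    if the subtree violates rule 3 or rule 4."""
    if node is None:
        return 1  # a leaf is black and counts once
    if node.color not in ("R", "B"):
        raise ValueError("node %r has no valid color" % node.key)
    if node.color == "R":
        for child in (node.left, node.right):
            if child is not None and child.color == "R":
                raise ValueError("red node %r has a red child" % node.key)
    left = black_height(node.left)
    right = black_height(node.right)
    if left != right:
        raise ValueError("black-height differs below node %r" % node.key)
    return left + (1 if node.color == "B" else 0)


def is_red_black(root):
    try:
        black_height(root)
        return True
    except ValueError:
        return False


good = Node(10, "B", Node(5, "R"), Node(15, "R"))
red_red = Node(10, "B", Node(5, "R", Node(2, "R")), Node(15, "R"))
lopsided = Node(10, "B", Node(5, "B"), None)
```

Here `good` passes, `red_red` breaks rule 3, and `lopsided` breaks rule 4 because the left path contains one more black node than the right path.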
CFS Features (cont)
• No timeslices!... sort of
  • Uses wait_runtime (individual) and fair_clock (queue-wide)
  • Processes build up CPU debt
  • Different priorities “spend” time differently
  • Half priority task sees time pass twice as fast
• O(log n) complexity
  • Only marginally slower than O(1) at very large numbers of inputs
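A toy model of the "time passes twice as fast" idea. Two simplifications are assumed here and should be read as such: a binary heap (`heapq`) stands in for the red-black tree, since both give O(log n) access to the smallest key, and a single per-task virtual runtime replaces the wait_runtime/fair_clock bookkeeping named above. The names `FairQueue`, `run_next`, `simulate` and the `weight` parameter are hypothetical.

```python
import heapq
import itertools


class FairQueue:
    """Always runs the task that has accumulated the least virtual time."""

    def __init__(self):
        self._heap = []                # entries: (vruntime, seq, name, weight)
        self._seq = itertools.count()  # tie-breaker for equal virtual times

    def add(self, name, weight=1.0, vruntime=0.0):
        heapq.heappush(self._heap, (vruntime, next(self._seq), name, weight))

    def run_next(self, slice_ms):
        """Run the neediest task for slice_ms of real time and requeue it."""
        vruntime, _, name, weight = heapq.heappop(self._heap)
        # A task with half the weight is charged twice the virtual time,
        # so it moves away from the front of the queue twice as quickly.
        vruntime += slice_ms / weight
        heapq.heappush(self._heap, (vruntime, next(self._seq), name, weight))
        return name


def simulate(tasks, slices, slice_ms=10):
    """Return the real CPU time (ms) each task received."""
    queue = FairQueue()
    for name, weight in tasks.items():
        queue.add(name, weight)
    cpu = {name: 0 for name in tasks}
    for _ in range(slices):
        cpu[queue.run_next(slice_ms)] += slice_ms
    return cpu


usage = simulate({"normal": 1.0, "half": 0.5}, slices=30)
```

Over the 30 slices the half-weight task receives half as much real CPU time as the normal task, even though the scheduler never consults a fixed timeslice table.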
IO Scheduling
• Minimize latency on disk seeks
• Prioritize processes’ IO requests
• Efficiently share disk bandwidth between processes
• Guarantee that requests are issued before a deadline
• Avoid starvation
CFQ
What it's good for:
• Default IO scheduler for Red Hat Enterprise Linux 4
• Distributes bandwidth equally among IO requests and is excellent for multi-user environments
• Performs well across the widest range of applications and IO system designs, including those that require balancing
• Considered anticipatory because a process's queue idles at the end of synchronous IO, allowing further IO from that process to be handled
How CFQ Works
• Assigns requests to queues and priorities based on the process they are coming from
• Current time recorded when task enters runqueue
• Traffic divides into a fixed number of buckets (64 by default)
  • Hash code from networking atm
  • Round robin all non-empty buckets
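The bucket scheme above can be illustrated with a short sketch. The hash used here (process id modulo the bucket count) is a placeholder assumption, not the hash function the slide refers to, and the names `BucketScheduler`, `submit`, and `dispatch` are hypothetical.

```python
from collections import deque

NUM_BUCKETS = 64  # default bucket count given on the slide


class BucketScheduler:
    def __init__(self, num_buckets=NUM_BUCKETS):
        self.buckets = [deque() for _ in range(num_buckets)]
        self.cursor = 0  # bucket at which the next scan starts

    def submit(self, pid, request):
        # Placeholder hash: pid modulo the number of buckets.
        self.buckets[pid % len(self.buckets)].append((pid, request))

    def dispatch(self):
        """Return the next request, visiting non-empty buckets in turn."""
        n = len(self.buckets)
        for step in range(n):
            idx = (self.cursor + step) % n
            if self.buckets[idx]:
                self.cursor = (idx + 1) % n  # resume after this bucket
                return self.buckets[idx].popleft()
        return None  # every bucket is empty


sched = BucketScheduler()
for req in ("a1", "a2", "a3"):
    sched.submit(1, req)   # a process issuing many requests
sched.submit(2, "b1")      # a process issuing a single request

served = []
while True:
    item = sched.dispatch()
    if item is None:
        break
    served.append(item[1])
```

Process 2's single request is served second rather than last, so a heavy requester cannot starve a light one. With a fixed bucket count, two processes can hash to the same bucket (for example pids 1 and 65 here) and then share one turn.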
How CFQ Works
• IO scheduler uses a per-queue function (not per-bucket)
• Runnable tasks use a 'fair clock' with runnable tasks (1/N) to increase priority
• Several innovations made for CFQ V2
http://www.redhat.com/magazine/008jun05/features/schedulers/