Post on 27-Aug-2018
Why Elapsed Time First?Ø End-to-End Latency
q Get absolute time on each end?
Ø Elapsed Time:q Round Trip Time (ping)
q Same Clock Source
4
Clock1 Clock2
Start_>me(clock1) end_>me(clock2)
Synchronized/AsynchronizedClocks?
Start_>me(clock1)End_>me(clock1)
Clock1
MeasureTimeonasingleHost
Get a coarse grained estimationØ Use shell built-in command
q time
q “real”: Wall Time Elapsed (Just an estimation, Don’t rely on it)
q “user”: Execution Time in User Space
q “sys”: Execution Time in Kernel Space (syscall)
5
Getanoverviewofyourprogram’sresponse/execu>on>me.
Measure Elapse Time: gettimeofday()Ø gettimeofday() [http://linux.die.net/man/2/sched_setscheduler]
q return struct timeval, includes tv_sec and tv_usec
q NOT ok for measuring overhead on standard kernel configuration
q Wall clock time may change• User/other program(NTP) changes clock
6
NormalEvalua>on(5seconds)
Inten>onallychangethewallclock,whenprogramisrunning
Don’t Relied on Wall-Time ClockØ Shell built-in command “time”
7
WhenMeasuringElapseTime,don’trelyonWTC.
Use POSIX clock_gettime()Ø Sources:
q CLOCK_REALTIME: Wall Time, affected by discontinuous jump
q CLOCK_MONOTIC: Not affected by discontinuous jump , but affected by incremental adjust (e.g. NTP).• Clock cannot jump, but may skew
q CLOCK_MONOTIC_RAW: Not affected by NTP• More accurate for very short intervals
Ø Precision: Clock_getres()
8
Forshortlatencymeasurementinasinglehost,tryCLOCK_MONOTIC_RAW
Use Processor Cycles (x86 x64)
Ø FIX CPU FrequencyØ Disable Hyperthreading
Ø Rdtsc q read CPU cycles directly (need to fix CPU frequency)
q cat /proc/cpuinfo to get CPU frequencyq on a 1GHZ CPU, ticks 1,000,000,000 times per second• if you use rdtsc to record time, pay attention to this value
• cat /proc/cpuinfo # get CPU frequency
• cat /sys/devices/system/clocksource/clocksource0/current_clocksource
9
Generallyspeaking,rdtsc()givesyoubestaccuracyandtrivialoverhead.However,inmul>-coresystem,Cyclecountersfordifferentcoresarenotalwayssynchronized!
Inter-domain TSC Synchronization �(x86 x64)
Ø Issues: Non-Synchronized Clocks / Offsetsq Clock Tree + DPLL
Ø Workaround: q Enable Invariant TSC (INTEL, since Haswell)q rdtscp()[1]
Multilcore CPU Chip
Core0
Tsc Reg
Core1
Tsc Reg
rdtsc() rdtsc()
sched@core0 sched@core1
Multilcore CPU Chip
Core0
InvariantTsc Reg
Core1
Tsc Reg
rdtscp()rdtscp()
sched@core0 sched@core1
Tsc Reg
[1]Howtobenchmarkcodeexecu>onTimesonIntel:h`ps://www.intel.com/content/www/us/en/embedded/training/ia-32-ia-64-benchmark-code-execu>on-paper.html
FYI: Other Time SourcesRTC (Real-Time Clock)
q Available on most computers (not on RPi 2 or 3 unless you add it)q Low precision (as low as 0.5 seconds)
Hardware Timersq Might be used to generate interrupts, might be queryableq Run at a variety of frequenciesq Programmable Interval Timer (PIT)q High-Performance Event Timer (HPET)q Programmable Interrupt Controller (PIC)q Advanced Programmable Interrupt Controller (APIC)
Processor Cyclesq Timestamp Counter (TSC) on x86, 64-bitq Cycle Counter (CCNT) on ARM, 32-bit, 64-cycle divider, not accessible in user
modeq Potentially very high accuracy
11Source:CSE522Sh`p://www.cse.wustl.edu/~cdgill/courses/cse522/slides/03_Timers_Timing.pptx
PointersØ Time related system calls in the Linux kernel
q https://0xax.gitbooks.io/linux-insides/content/Timers/timers-7.htmlØ Clock_gettime()
q Man 2 clock_gettimeq https://linux.die.net/man/2/clock_gettime
Ø RDTSCq http://www.mcs.anl.gov/~kazutomo/rdtsc.html
Ø RDTSCPq https://www.intel.com/content/www/us/en/embedded/training/ia-32-ia-64-
benchmark-code-execution-paper.htmlØ Invariant TSC
q Pitfall of TSC usageq http://oliveryang.net/2015/09/pitfalls-of-TSC-usage/
Ø (ARM) High Resolution Timing on Raspberry Piq https://blog.regehr.org/archives/794
12
Why Elapsed Time First?Ø End-to-End Latency
Ø Contributor:q propagation delay: staticq queuing delaysq node processing delaysq routing changes
14
Clock1 Clock2
Start_>me(clock1) end_>me(clock2)
Get Clock Synchronized?Ø Query NTP Server, get clock synchronized
15
Clock1 Clock2
Start_>me(clock1) end_>me(clock2)
NTPServer
Background: NTPØ Hierarchical NTP servers: Clock StrataØ Stratum0: reference clock
Ø Stratum1: primary time servers
16Source:h`ps://en.wikipedia.org/wiki/Network_Time_Protocol
TheU.S.NavalObservatoryAlternateMasterClockStratum0:high-precision>mekeepingdevicesatomic(cesium)clocks
Sadly, Wall TimeØ timedatectl: ubuntuØ ntpd: POSIX
Ø “Low” Precision:q Milliseconds ~ Seconds
q ~1ms in Local Area Network, using your own NTP server[1]
17SetupanNTPserver:h`ps://ubuntuforums.org/showthread.php?t=862620
Precision Time Protocol: Better Accuracy
Ø On LAN:q sub-microsecond range
q making it suitable for measurement and control systems
q PTP v.s. NTP• hardware support present in various network interface controllers
(NIC) and network switches.
18
LAN
PTPmaster
PTPslave PTPslave
Warning:WenevertriedtousePTPonEC2.Iamaskingyoutodosomeresearch.
PTP vs. NTPØ NTP
q NTP servers with hardware timestamping and a good oscillator
q Server located as close to clients as possible
q Switches are low latency and lightly loadedq Redundancy: Clients use multiple servers
q Monitoring: servers monitor each other (peer stats)
Ø PTPq Switches are transparent clocks or boundary clocks
q Or use telecom profile technology (ITU-T G.8265.1 or G.8275.2)
q Redundant Grand Masters monitor each other
q Hardware slaves (e.g. PCIe card) when possible
19
PointersØ Linux PTP
q http://linuxptp.sourceforge.net/q SO_TIMESTAMPS
q Supports the Linux PTP Hardware Clock (PHC) subsystem by using the clock_gettime family of calls, including the new clock_adjtimex system call. clo
Ø PTP daemon man pageq sudo apt-get install ptpd
q man ptpd
Ø NTP v.s PTP: How do you get accuracyq http://www.atis.org/tam/presentations/ntp_vs_ptp_darnold.pdf
q https://blog.meinbergglobal.com/2013/11/22/ntp-vs-ptp-network-timing-smackdown/
20
Scheduling in Classes
Multiple schedulers are implemented as different scheduling classes.
Normal:q SCHED_NORMAL/SCHED_OTHER: regular, interactive CFS tasksq SCHED_BATCH: low priority, non-interactive CFS tasksq SCHED_IDLE: very low priority tasks
Real-time:q SCHED_RR: round-robin q SCHED_FIFO: first-in, first-out q SCHED_DEADLINE: earliest deadline first
CSE522S–AdvancedOpera>ngSystems 22
Real-Time Scheduling
23
Real-time tasks execute repeatedly (usually are periodic) under some time constraint
Task Task Task
E.g.,ataskisreleasedtoexecuteevery5msec,andeachinvoca>onhasadeadlineof5msec
Separatepriorityrangefromnice:• Priori>esrangefrom1(low)to99(high)
0ms 5ms 10msTime
Source:CSE522S–AdvancedOpera>ngSystems:hQp://www.cse.wustl.edu/~cdgill/courses/cse522/slides/09_real_>me_sched.pptx
Real-Time OS Support
Goal is to achieve predictable execution:
Sources of uncertainty (and solutions):q Scheduling preemptions (real-time scheduling)
q Interrupts (can mask interrupts)q Migrations (can pin tasks to cores)
q OS latency & jitter (RT_PREEMPT patch set)
Source:CSE522S–AdvancedOpera>ngSystems:hQp://www.cse.wustl.edu/~cdgill/courses/cse522/slides/09_real_>me_sched.pptx24
Ideal: Realworld:
Preempt Migrate
SCHED_RR
Round-robin scheduling
Among tasks of equal priority:Ø Rotate through all tasks
Ø Each task gets a fixed time slice
Cannot run if higher priority tasks are runnable
25
Task1 Task2 Task3
Time
Task1 Task2 Task3 Task1 Task2 Task3
Source:CSE522S–AdvancedOpera>ngSystems:hQp://www.cse.wustl.edu/~cdgill/courses/cse522/slides/09_real_>me_sched.pptx
SCHED_FIFO
First-in, First-out scheduling
Ø The first enqued task of highest priority executes to completionØ A task will only relinquish a processor �
when it completes, yields, or blocks
26
Task1 Task2 Task3
Time
Source:CSE522S–AdvancedOpera>ngSystems:hQp://www.cse.wustl.edu/~cdgill/courses/cse522/slides/09_real_>me_sched.pptx
SCHED_DEADLINE
Earliest Deadline First (EDF) scheduling
Ø Whichever task has next deadline gets to run
Ø Theory exists to analyze such systemsØ Linux implements bandwidth reservation to prevent deadline abuse
27
Task3 Deadline:8Exec>me:2Task2 Deadline:12
Exec>me:3Task1 Deadline:5Exec>me:4
Time0
Task1 Task2 Task3
5 8 12
Source:CSE522S–AdvancedOpera>ngSystems:hQp://www.cse.wustl.edu/~cdgill/courses/cse522/slides/09_real_>me_sched.pptx
Time0
Task1 Task2Task3
5 8 12
MissDeadlineWhenusingSCHED_FIFO
Scheduler Setup – Basic Ø Two classes, would always schedule RT class first
q RT class: static priority, 1 (lowest) to 99 (highest)• Preemptive scheduling• SCHED_FIFO, SCHED_RR to schedule processes with same
priority• Can be used to implement static priority (like rate monotonic)• SCHED_DEADLINE: Exists in Kernel, however, no libc wrapper:
§ You need write a Syscall Wrapper by yourself§ Or: https://github.com/jlelli/schedtool-dl
• Default Reserve 5% for other classes (50ms every 1s)§ /proc/sys/kernel/sched_rt_period_us 1000000§ /proc/sys/kernel/sched_rt_runtime_us 950000
q Non-RT class: SCHED_OTHER with Complete Fair Scheduler
28
Scheduler Setup – Priorities Ø chrt command (can also check task priorities)
http://www.cyberciti.biz/faq/howto-set-real-time-scheduling-priority-process/
q sudo chrt –f –p 99 4800 # pid 4800 with priority 99 and fifo
Ø sched_scheduler [http://linux.die.net/man/2/sched_setscheduler]
29
#include <sched.h> int main() { … struct sched_param sched; sched.sched_priority = 98; if (sched_setscheduler(getpid(), SCHED_FIFO, &sched) < 0) { exit(EXIT_FAILURE); } … }
Scheduler Setup – AffinitiesØ taskset command (can also check task affinities) [
http://linux.die.net/man/2/sched_setscheduler]
q sudo taskset -c 2,3 4800 # pid 4800 runs on cores 2-3
Ø sched_setaffinity [http://linux.die.net/man/2/sched_setscheduler]
30
#include <sched.h> int main() { … unsigned long mask = 1; if (sched_setaffinity(getpid(), sizeof(mask), &mask) < 0) { exit(EXIT_FAILURE); } … }
Scheduler Setup – Preemptive Ø Scheduler is triggered every HZ quantum
Ø cat /boot/config-* | grep CONFIG_HZq For most desktops, value is 1000. ticked every 1ms
Ø CONFIG_NO_HZ = y q Temporarily disable timer interrupt when system is idle or there
is only single task running
Ø CONFIG_HIGH_RES_TIMERS = y http://elinux.org/High_Resolution_Timers
Ø Can recompile kernel to change these values
31
Trace Your System by using ftraceØ ftrace
q Traces the internal operations of the kernel q Static tracepoints within the kernel (event tracing)
• Scheduling• interrupts
Ø Trace-cmdq Front-End (user-level) utility for ftraceq Example:
• sudo trace-cmd record -e sched_switch ./myapp• Dump trace.dat
Ø Kernel Sharkq GUI trace-cmd reader
• kernelshark trace.dat
32KernelShark:h`p://rostedt.homelinux.com/kernelshark/
Generate Periodic TasksØ Video decoding, sensor processing, etc.Ø Use Signals and Timers
Ø Many other approaches in pointers
34
struct sigaction sa; … sa.sa_sigaction = work; sigaction(SIGRTMIN, &sa, NULL); … struct sigevent timer_event; timer_event.sigev_signo = SIGRTMIN; … timer_create(CLOCK_REALTIME, &timer_event, &timer); timer_settime(timer, TIMER_ABSTIME, &timerspec, NULL); …
PointersØ Periodically running a task
q http://man7.org/linux/man-pages/man2/timer_create.2.htmlq http://courses.cs.vt.edu/~cs5565/spring2012/projects/project1/posix-timers.cq http://www.embedded-linux.co.uk/tutorial/periodic_threads
Ø Video playersq https://wiki.litmus-rt.org/litmus/Publications
Ø Get time in Linuxq gettimeofday: http://linux.die.net/man/2/gettimeofdayq rdtsc: http://www.mcs.anl.gov/~kazutomo/rdtsc.html
Ø Fix CPU frequenciesq http://www.mjmwired.net/kernel/Documentation/cpu-freq/governors.txtq https://wiki.archlinux.org/index.php/CPU_Frequency_Scaling
35
PointersØ Linux schedulers
q https://www.kernel.org/doc/Documentation/scheduler/q http://oreilly.com/catalog/linuxkernel/chapter/ch10.html
Ø Set priorityq chrt: http://www.cyberciti.biz/faq/howto-set-real-time-scheduling-priority-process/
q sched_scheduler: http://linux.die.net/man/2/sched_setscheduler
Ø Set CPU affinity on multi-core:q taskset: http://linux.die.net/man/1/tasksetq Check freq: http://www.advenage.com/topics/linux-timer-interrupt-frequency.php
Ø Linux real-time patches:q Linux Preempt RT: https://rt.wiki.kernel.org/index.php/Main_Pageq LITMUS-RT: http://www.litmus-rt.org/q RTAI: https://www.rtai.org/ q SCHED_DEADLINE: http://gitorious.org/sched_deadline
36