Red Hat Enterprise Linux-6-Performance Tuning Guide-En-US
5/19/2018 Red Hat Enterprise Linux-6-Performance Tuning Guide-En-US
Red Hat Subject Matter Experts
Red Hat Enterprise Linux 6
Performance Tuning Guide
Optimizing subsystem throughput in Red Hat Enterprise Linux 6
Edition 4.0
Red Hat Subject Matter Experts
Edited by
Don Domingo
Laura Bailey
Legal Notice
Copyright 2011 Red Hat, Inc. and others.
This document is licensed by Red Hat under the Creative Commons Attribution-ShareAlike 3.0 Unported License. If you distribute this document, or a modified version of it, you must provide attribution to Red Hat, Inc. and provide a link to the original. If the document is modified, all Red Hat trademarks must be removed.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, Red Hat Enterprise Linux, the Shadowman logo, JBoss, MetaMatrix, Fedora, the Infinity Logo, and RHCE are trademarks of Red Hat, Inc., registered in the United States and other countries.
Linux is the registered trademark of Linus Torvalds in the United States and other countries.
Java is a registered trademark of Oracle and/or its affiliates.
XFS is a trademark of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries.
MySQL is a registered trademark of MySQL AB in the United States, the European Union and other countries.
Node.js is an official trademark of Joyent. Red Hat Software Collections is not formally related to or endorsed by the official Joyent Node.js open source or commercial project.
The OpenStack Word Mark and OpenStack Logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
All other trademarks are the property of their respective owners.
Abstract
The Performance Tuning Guide describes how to optimize the performance of a system running Red Hat Enterprise Linux 6. It also documents performance-related upgrades in Red Hat Enterprise Linux 6. While this guide contains procedures that are field-tested and proven, Red Hat recommends that you properly test all planned configurations in a testing environment before applying them to a production environment. You should also back up all your data and pre-tuning configurations.
http://creativecommons.org/licenses/by-sa/3.0/
Table of Contents
Preface
1. Document Conventions
1.1. Typographic Conventions
1.2. Pull-quote Conventions
1.3. Notes and Warnings
2. Getting Help and Giving Feedback
2.1. Do You Need Help?
2.2. We Need Feedback!
Chapter 1. Overview
1.1. How to read this book
1.1.1. Audience
1.2. Release overview
1.2.1. New features in Red Hat Enterprise Linux 6
1.2.1.1. New in 6.6
1.2.1.2. New in 6.5
1.2.2. Horizontal Scalability
1.2.2.1. Parallel Computing
1.2.3. Distributed Systems
1.2.3.1. Communication
1.2.3.2. Storage
1.2.3.3. Converged Networks
Chapter 2. Red Hat Enterprise Linux 6 Performance Features
2.1. 64-Bit Support
2.2. Ticket Spinlocks
2.3. Dynamic List Structure
2.4. Tickless Kernel
2.5. Control Groups
2.6. Storage and File System Improvements
Chapter 3. Monitoring and Analyzing System Performance
3.1. The proc File System
3.2. GNOME and KDE System Monitors
3.3. Performance Co-Pilot (PCP)
3.3.1. PCP Architecture
3.3.2. PCP Setup
3.4. irqbalance
3.5. Built-in Command-line Monitoring Tools
3.6. Tuned and ktune
3.7. Application Profilers
3.7.1. SystemTap
3.7.2. OProfile
3.7.3. Valgrind
3.7.4. Perf
3.8. Red Hat Enterprise MRG
Chapter 4. CPU
Topology
Threads
Interrupts
4.1. CPU Topology
4.1.2. Tuning CPU Performance
4.1.2.1. Setting CPU Affinity with taskset
4.1.2.2. Controlling NUMA Policy with numactl
4.1.3. Hardware performance policy (x86_energy_perf_policy)
4.1.4. turbostat
4.1.5. numastat
4.1.6. NUMA Affinity Management Daemon (numad)
4.1.6.1. Benefits of numad
4.1.6.2. Modes of operation
4.1.6.2.1. Using numad as a service
4.1.6.2.2. Using numad as an executable
4.2. CPU Scheduling
4.2.1. Realtime scheduling policies
4.2.2. Normal scheduling policies
4.2.3. Policy Selection
4.3. Interrupts and IRQ Tuning
4.4. CPU Frequency Governors
4.5. Enhancements to NUMA in Red Hat Enterprise Linux 6
4.5.1. Bare-metal and Scalability Optimizations
4.5.1.1. Enhancements in topology-awareness
4.5.1.2. Enhancements in Multi-processor Synchronization
4.5.2. Virtualization Optimizations
Chapter 5. Memory
5.1. Huge Translation Lookaside Buffer (HugeTLB)
5.2. Huge Pages and Transparent Huge Pages
5.3. Using Valgrind to Profile Memory Usage
5.3.1. Profiling Memory Usage with Memcheck
5.3.2. Profiling Cache Usage with Cachegrind
5.3.3. Profiling Heap and Stack Space with Massif
5.4. Capacity Tuning
5.5. Tuning Virtual Memory
Chapter 6. Input/Output
6.1. Features
6.2. Analysis
6.3. Tools
6.4. Configuration
6.4.1. Completely Fair Queuing (CFQ)
6.4.2. Deadline I/O Scheduler
6.4.3. Noop
Chapter 7. File Systems
7.1. Tuning Considerations for File Systems
7.1.1. Formatting Options
7.1.2. Mount Options
7.1.3. File system maintenance
7.1.4. Application Considerations
7.2. Profiles for file system performance
7.3. File Systems
7.3.1. The Ext4 File System
7.3.2. The XFS File System
7.3.2.1. Basic tuning for XFS
7.3.2.2.1. Optimizing for a large number of files
7.3.2.2.2. Optimizing for a large number of files in a single directory
7.3.2.2.3. Optimising for concurrency
7.3.2.2.4. Optimising for applications that use extended attributes
7.3.2.2.5. Optimising for sustained metadata modifications
7.4. Clustering
7.4.1. Global File System 2
Chapter 8. Networking
8.1. Network Performance Enhancements
Receive Packet Steering (RPS)
Receive Flow Steering
getsockopt support for TCP thin-streams
Transparent Proxy (TProxy) support
8.2. Optimized Network Settings
Socket receive buffer size
8.3. Overview of Packet Reception
CPU/cache affinity
8.4. Resolving Common Queuing/Frame Loss Issues
8.4.1. NIC Hardware Buffer
8.4.2. Socket Queue
8.5. Multicast Considerations
8.6. Receive-Side Scaling (RSS)
8.7. Receive Packet Steering (RPS)
8.8. Receive Flow Steering (RFS)
8.9. Accelerated RFS
Revision History
Preface
1. Document Conventions
This manual uses several conventions to highlight certain words and phrases and draw attention to specific pieces of information.
1.1. Typographic Conventions
Four typographic conventions are used to call attention to specific words and phrases. These conventions, and the circumstances they apply to, are as follows.
Mono-spaced Bold
Used to highlight system input, including shell commands, file names and paths. Also used to highlight keys and key combinations. For example:
To see the contents of the file my_next_bestselling_novel in your current working directory, enter the cat my_next_bestselling_novel command at the shell prompt and press Enter to execute the command.
The above includes a file name, a shell command and a key, all presented in mono-spaced bold and all distinguishable thanks to context.
Key combinations can be distinguished from an individual key by the plus sign that connects each part of a key combination. For example:
Press Enter to execute the command.
Press Ctrl+Alt+F2 to switch to a virtual terminal.
The first example highlights a particular key to press. The second example highlights a key combination: a set of three keys pressed simultaneously.
If source code is discussed, class names, methods, functions, variable names and returned values mentioned within a paragraph will be presented as above, in mono-spaced bold. For example:
File-related classes include filesystem for file systems, file for files, and dir for directories. Each class has its own associated set of permissions.
Proportional Bold
This denotes words or phrases encountered on a system, including application names; dialog-box text; labeled buttons; check-box and radio-button labels; menu titles and submenu titles. For example:
Choose System → Preferences → Mouse from the main menu bar to launch Mouse Preferences. In the Buttons tab, select the Left-handed mouse check box and click Close to switch the primary mouse button from the left to the right (making the mouse suitable for use in the left hand).
To insert a special character into a gedit file, choose Applications → Accessories → Character Map from the main menu bar. Next, choose Search → Find from the Character Map menu bar, type the name of the character in the Search field and click Next. The character you sought will be highlighted in the
Character Table. Double-click this highlighted character to place it in the Text to copy field and then click the Copy button. Now switch back to your document and choose Edit → Paste from the gedit menu bar.
The above text includes application names; system-wide menu names and items; application-specific menu names; and buttons and text found within a GUI interface, all presented in proportional bold and all distinguishable by context.
Mono-spaced Bold Italic or Proportional Bold Italic
Whether mono-spaced bold or proportional bold, the addition of italics indicates replaceable or variable text. Italics denotes text you do not input literally or displayed text that changes depending on circumstance. For example:
To connect to a remote machine using ssh, type ssh username@domain.name at a shell prompt. If the remote machine is example.com and your username on that machine is john, type ssh john@example.com.
The mount -o remount file-system command remounts the named file system. For example, to remount the /home file system, the command is mount -o remount /home.
To see the version of a currently installed package, use the rpm -q package command. It will return a result as follows: package-version-release.
Note the words in bold italics above: username, domain.name, file-system, package, version and release. Each word is a placeholder, either for text you enter when issuing a command or for text displayed by the system.
Aside from standard usage for presenting the title of a work, italics denotes the first use of a new and important term. For example:
Publican is a DocBook publishing system.
1.2. Pull-quote Conventions
Terminal output and source code listings are set off visually from the surrounding text.
Output sent to a terminal is set in mono-spaced roman and presented thus:
books Desktop documentation drafts mss photos stuff svn
books_tests Desktop1 downloads images notes scripts svgs
Source-code listings are also set in mono-spaced roman but add syntax highlighting as follows:
static int kvm_vm_ioctl_deassign_device(struct kvm *kvm,
		struct kvm_assigned_pci_dev *assigned_dev)
{
	int r = 0;
	struct kvm_assigned_dev_kernel *match;

	mutex_lock(&kvm->lock);

	match = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
				      assigned_dev->assigned_dev_id);
	if (!match) {
		printk(KERN_INFO "%s: device hasn't been assigned before, "
		       "so cannot be deassigned\n", __func__);
		r = -EINVAL;
		goto out;
	}

	kvm_deassign_device(kvm, match);
	kvm_free_assigned_device(kvm, match);
out:
	mutex_unlock(&kvm->lock);
	return r;
}
1.3. Notes and Warnings
Finally, we use three visual styles to draw attention to information that might otherwise be overlooked.
Note
Notes are tips, shortcuts or alternative approaches to the task at hand. Ignoring a note should have no negative consequences, but you might miss out on a trick that makes your life easier.
Important
Important boxes detail things that are easily missed: configuration changes that only apply to the current session, or services that need restarting before an update will apply. Ignoring a box labeled Important will not cause data loss but may cause irritation and frustration.
Warning
Warnings should not be ignored. Ignoring warnings will most likely cause data loss.
2. Getting Help and Giving Feedback
2.1. Do You Need Help?
If you experience difficulty with a procedure described in this documentation, visit the Red Hat Customer Portal at http://access.redhat.com. Through the customer portal, you can:
search or browse through a knowledgebase of technical support articles about Red Hat products.
submit a support case to Red Hat Global Support Services (GSS).
access other product documentation.
Red Hat also hosts a large number of electronic mailing lists for discussion of Red Hat software and technology. You can find a list of publicly available mailing lists at https://www.redhat.com/mailman/listinfo. Click on the name of any mailing list to subscribe to that list or to access the list archives.
2.2. We Need Feedback!
If you find a typographical error in this manual, or if you have thought of a way to make this manual better, we would love to hear from you! Please submit a report in Bugzilla: http://bugzilla.redhat.com/ against the product Red Hat Enterprise Linux 6.
When submitting a bug report, be sure to mention the manual's identifier: doc-Performance_Tuning_Guide
If you have a suggestion for improving the documentation, try to be as specific as possible when describing it. If you have found an error, please include the section number and some of the surrounding text so we can find it easily.
Chapter 1. Overview
The Performance Tuning Guide is a comprehensive reference on the configuration and optimization of Red Hat Enterprise Linux. While this release also contains information on Red Hat Enterprise Linux 5 performance capabilities, all instructions supplied herein are specific to Red Hat Enterprise Linux 6.
1.1. How to read this book
This book is divided into chapters discussing specific subsystems in Red Hat Enterprise Linux. The Performance Tuning Guide focuses on three major themes per subsystem:
Features
Each subsystem chapter describes performance features unique to (or implemented differently in) Red Hat Enterprise Linux 6. These chapters also discuss Red Hat Enterprise Linux 6 updates that significantly improved the performance of specific subsystems over Red Hat Enterprise Linux 5.
Analysis
The book also enumerates performance indicators for each specific subsystem. Typical values for these indicators are described in the context of specific services, helping you understand their significance in real-world, production systems.
In addition, the Performance Tuning Guide also shows different ways of retrieving performance data (that is, profiling) for a subsystem. Note that some of the profiling tools showcased here are documented in more detail elsewhere.
Configuration
Perhaps the most important information in this book is the set of instructions on how to adjust the performance of a specific subsystem in Red Hat Enterprise Linux 6. The Performance Tuning Guide explains how to fine-tune a Red Hat Enterprise Linux 6 subsystem for specific services.
Keep in mind that tweaking a specific subsystem's performance may affect the performance of another, sometimes adversely. The default configuration of Red Hat Enterprise Linux 6 is optimal for most services running under moderate loads.
The procedures enumerated in the Performance Tuning Guide were tested extensively by Red Hat engineers in both lab and field. However, Red Hat recommends that you properly test all planned configurations in a secure testing environment before applying them to your production servers. You should also back up all data and configuration information before you start tuning your system.
1.1.1. Audience
This book is suitable for two types of readers:
System/Business Analyst
This book enumerates and explains Red Hat Enterprise Linux 6 performance features at a high level, providing enough information on how subsystems perform for specific workloads (both by default and when optimized). The level of detail used in describing Red Hat Enterprise Linux 6 performance features helps potential customers and sales engineers understand the suitability of this platform in providing resource-intensive services at an acceptable level.
The Performance Tuning Guide also provides links to more detailed documentation on each feature whenever possible. At that detail level, readers can understand these performance features enough to form a high-level strategy in deploying and optimizing Red Hat Enterprise Linux 6. This allows readers to both develop and evaluate infrastructure proposals.
This feature-focused level of documentation is suitable for readers with a high-level understanding of Linux subsystems and enterprise-level networks.
System Administrator
The procedures enumerated in this book are suitable for system administrators with RHCE skill level (or its equivalent, that is, 3-5 years' experience in deploying and managing Linux). The Performance Tuning Guide aims to provide as much detail as possible about the effects of each configuration; this means describing any performance trade-offs that may occur.
The underlying skill in performance tuning lies not in knowing how to analyze and tune a subsystem. Rather, a system administrator adept at performance tuning knows how to balance and optimize a Red Hat Enterprise Linux 6 system for a specific purpose. This means also knowing which trade-offs and performance penalties are acceptable when attempting to implement a configuration designed to boost a specific subsystem's performance.
1.2. Release overview
1.2.1. New features in Red Hat Enterprise Linux 6
Read this section for a brief overview of the performance-related changes included in Red Hat Enterprise Linux 6.
1.2.1.1. New in 6.6
perf has been updated to version 3.12, which includes a number of new features, including:
New perf record options for statistically sampling consecutive taken branches, -j and -b. See the man page for further details: man perf-record.
Several new perf report parameters, including --group and --percent-limit, and additional options for sorting data collected with the perf record -j and -b options enabled. See the man page for further details: man perf-report.
New perf mem command for profiling load and store memory access.
Several new options in perf top, including --percent-limit and --obj-dump.
--force and --append options have been removed from perf record.
New --initial-delay option for perf stat.
New --output-filename option for perf trace.
New --group option for perf evlist.
Changes to the perf top -G and perf record -g options: these are no longer alternatives to the --call-graph option. When libunwind support is added to future versions of Red Hat Enterprise Linux, these options will enable the configured unwind method.
[1]
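The options above can be combined as follows. This is a minimal sketch, assuming perf 3.12 or later, sufficient privileges, and hardware support for branch sampling; my_workload is a placeholder for your own program:

```shell
# Record taken branches while the workload runs (-b together with
# -j any samples all branch types; needs hardware branch records):
perf record -b -j any -- ./my_workload

# Summarize, grouping events together and hiding symbols below 1%:
perf report --group --percent-limit 1

# Profile memory loads and stores with the new perf mem command:
perf mem record -- ./my_workload
perf mem report
```

Because these commands depend on the installed perf version and on CPU support, check man perf-record and man perf-report on your system before relying on any particular flag.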
1.2.1.2. New in 6.5
Updates to the kernel remove a bottleneck in memory management and improve performance by allowing I/O load to spread across multiple memory pools when the irq-table size is 1 GB or greater.
The cpupowerutils package has been updated to include the turbostat and x86_energy_perf_policy tools. These tools are newly documented in Section 4.1.4, turbostat and Section 4.1.3, Hardware performance policy (x86_energy_perf_policy).
CIFS now supports larger rsize options and asynchronous readpages, allowing for significant increases in throughput.
GFS2 now provides an Orlov block allocator to increase the speed at which block allocation takes place.
The virtual-hosts profile in tuned has been adjusted. The value for kernel.sched_migration_cost is now 5000000 nanoseconds (5 milliseconds) instead of the kernel default 500000 nanoseconds (0.5 milliseconds). This reduces contention at the run queue lock for large virtualization hosts.
The latency-performance profile in tuned has been adjusted. The value for the power management quality of service requirement, cpu_dma_latency, is now 1 instead of 0.
Several optimizations are now included in the kernel copy_from_user() and copy_to_user() functions, improving the performance of both.
The perf tool has received a number of updates and enhancements, including:
Perf can now use hardware counters provided by the System z CPU-measurement counter facility. There are four sets of hardware counters available: the basic counter set, the problem-state counter set, the crypto-activity counter set, and the extended counter set.
A new command, perf trace, is now available. This enables strace-like behavior using the perf infrastructure to allow additional targets to be traced. For further information, refer to Section 3.7.4, Perf.
A script browser has been added to enable users to view all available scripts for the current perf data file.
Several additional sample scripts are now available.
Ext3 has been updated to reduce lock contention, thereby improving the performance of multi-threaded write operations.
KSM has been updated to be aware of NUMA topology, allowing it to take NUMA locality into account while coalescing pages. This prevents performance drops related to pages being moved to a remote node. Red Hat recommends avoiding cross-node memory merging when KSM is in use.
This update introduces a new tunable, /sys/kernel/mm/ksm/merge_nodes, to control this behavior. The default value (1) merges pages across different NUMA nodes. Set merge_nodes to 0 to merge pages only on the same node.
hdparm has been updated with several new flags, including --fallocate, --offset, and -R (Write-Read-Verify enablement). Additionally, the --trim-sector-ranges and --trim-sector-ranges-stdin options replace the --trim-sectors option, allowing more than a single sector range to be specified. Refer to the man page for further information about these options.
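A minimal sketch of inspecting and setting the KSM tunable described above, assuming a kernel that exposes /sys/kernel/mm/ksm/merge_nodes; the write requires root and is best done before KSM has merged any pages:

```shell
# Show the current policy: 1 (the default) merges pages across
# NUMA nodes, 0 merges only pages that live on the same node.
cat /sys/kernel/mm/ksm/merge_nodes

# Restrict merging to within a node, avoiding remote-node access
# penalties on NUMA systems where KSM is in use:
echo 0 > /sys/kernel/mm/ksm/merge_nodes
```

On systems without this tunable (older kernels), the cat command simply fails; check for the file's existence before scripting against it.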
1.2.2. Horizontal Scalability
Red Hat's efforts in improving the performance of Red Hat Enterprise Linux 6 focus on scalability. Performance-boosting features are evaluated primarily based on how they affect the platform's performance in different areas of the workload spectrum, that is, from the lonely web server to the server farm mainframe.
Focusing on scalability allows Red Hat Enterprise Linux to maintain its versatility for different types of workloads and purposes. At the same time, this means that as your business grows and your workload scales up, re-configuring your server environment is less prohibitive (in terms of cost and man-hours) and more intuitive.
Red Hat makes improvements to Red Hat Enterprise Linux for both horizontal scalability and vertical scalability; however, horizontal scalability is the more generally applicable use case. The idea behind horizontal scalability is to use multiple standard computers to distribute heavy workloads in order to improve performance and reliability.
In a typical server farm, these standard computers come in the form of 1U rack-mounted servers and blade servers. Each standard computer may be as small as a simple two-socket system, although some server farms use large systems with more sockets. Some enterprise-grade networks mix large and small systems; in such cases, the large systems are high performance servers (for example, database servers) and the small ones are dedicated application servers (for example, web or mail servers).
This type of scalability simplifies the growth of your IT infrastructure: a medium-sized business with an appropriate load might only need two pizza box servers to suit all their needs. As the business hires more people, expands its operations, increases its sales volumes and so forth, its IT requirements increase in both volume and complexity. Horizontal scalability allows IT to simply deploy additional machines with (mostly) identical configurations as their predecessors.
To summarize, horizontal scalability adds a layer of abstraction that simplifies system hardware administration. By developing the Red Hat Enterprise Linux platform to scale horizontally, increasing the capacity and performance of IT services can be as simple as adding new, easily configured machines.
1.2.2.1. Parallel Computing
Users benefit from Red Hat Enterprise Linux's horizontal scalability not just because it simplifies system hardware administration, but also because horizontal scalability is a suitable development philosophy given the current trends in hardware advancement.
Consider this: most complex enterprise applications have thousands of tasks that must be performed simultaneously, with different coordination methods between tasks. While early computers had a single-core processor to juggle all these tasks, virtually all processors available today have multiple cores. Effectively, modern computers put multiple cores in a single socket, making even single-socket desktops or laptops multi-processor systems.
As of 2010, standard Intel and AMD processors were available with two to sixteen cores. Such processors are prevalent in pizza box or blade servers, which can now contain as many as 40 cores. These low-cost, high-performance systems bring large system capabilities and characteristics into the mainstream.
To achieve the best performance and utilization of a system, each core must be kept busy. This means that 32 separate tasks must be running to take advantage of a 32-core blade server. If a blade chassis contains ten of these 32-core blades, then the entire setup can process a minimum of 320 tasks simultaneously. If these tasks are part of a single job, they must be coordinated.
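The arithmetic above can be sketched in a few lines of shell, using the example figures from the text:

```shell
# Example figures from the text: 32 cores per blade, ten blades
# per chassis.
cores_per_blade=32
blades_per_chassis=10

# Minimum number of concurrent tasks needed to keep every core busy:
total_tasks=$(( cores_per_blade * blades_per_chassis ))
echo "$total_tasks"   # prints 320
```

On a real system, compare this figure against the core count reported by nproc to judge how many runnable tasks your own hardware can absorb.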
Red Hat Enterprise Linux was developed to adapt well to hardware development trends and ensure that businesses can fully benefit from them. Section 1.2.3, Distributed Systems explores the technologies that enable Red Hat Enterprise Linux's horizontal scalability in greater detail.
1.2.3. Distributed Systems
To fully realize horizontal scalability, Red Hat Enterprise Linux uses many components of distributed computing. The technologies that make up distributed computing are divided into three layers:
Communication
Horizontal scalability requires many tasks to be performed simultaneously (in parallel). As such, these tasks must have interprocess communication to coordinate their work. Further, a platform with horizontal scalability should be able to share tasks across multiple systems.
Storage
Storage via local disks is not sufficient in addressing the requirements of horizontal scalability. Some form of distributed or shared storage is needed, one with a layer of abstraction that allows a single storage volume's capacity to grow seamlessly with the addition of new storage hardware.
Management
The most important duty in distributed computing is the management layer. This management layer coordinates all software and hardware components, efficiently managing communication, storage, and the usage of shared resources.
The following sections describe the technologies within each layer in more detail.
1.2.3.1. Communication
The communication layer ensures the transport of data, and is composed of two parts:
Hardware
Software
The simplest (and fastest) way for multiple systems to communicate is through shared memory. This entails the usage of familiar memory read/write operations; shared memory has the high bandwidth, low latency, and low overhead of ordinary memory read/write operations.
Ethernet
The most common way of communicating between computers is over Ethernet. Today, Gigabit Ethernet (GbE) is provided by default on systems, and most servers include 2-4 ports of Gigabit Ethernet. GbE provides good bandwidth and latency. This is the foundation of most distributed systems in use today. Even when systems include faster network hardware, it is still common to use GbE for a dedicated management interface.
10GbE
Ten Gigabit Ethernet (10GbE) is rapidly growing in acceptance for high end and even mid-range servers. 10GbE provides ten times the bandwidth of GbE. One of its major advantages is with modern multi-core processors, where it restores the balance between communication and computing. You can compare a single core system using GbE to an eight core system using 10GbE. Used in this way, 10GbE is especially valuable for maintaining overall system performance and avoiding communication bottlenecks.
Unfortunately, 10GbE is expensive. While the cost of 10GbE NICs has come down, the price of interconnect (especially fibre optics) remains high, and 10GbE network switches are extremely expensive. We can expect these prices to decline over time, but 10GbE today is most heavily used in server room backbones and performance-critical applications.
Infiniband
Infiniband offers even higher performance than 10GbE. In addition to the TCP/IP and UDP network connections used with Ethernet, Infiniband also supports shared memory communication. This allows Infiniband to work between systems via remote direct memory access (RDMA).
The use of RDMA allows Infiniband to move data directly between systems without the overhead of TCP/IP or socket connections. In turn, this reduces latency, which is critical to some applications.
Infiniband is most commonly used in High Performance Technical Computing (HPTC) applications which require high bandwidth, low latency, and low overhead. Many supercomputing applications benefit from this, to the point that the best way to improve performance is by investing in Infiniband rather than faster processors or more memory.
RoCE
RDMA over Converged Ethernet (RoCE) implements Infiniband-style communications (including RDMA) over a 10GbE infrastructure. Given the cost improvements associated with the growing volume of 10GbE products, it is reasonable to expect wider usage of RDMA and RoCE in a wide range of systems and applications.
Each of these communication methods is fully supported by Red Hat for use with Red Hat Enterprise Linux 6.
1.2.3.2. Storage
An environment that uses distributed computing uses multiple instances of shared storage. This can mean one of two things:
Multiple systems storing data in a single location
A storage unit (e.g. a volume) composed of multiple storage appliances
The most familiar example of storage is the local disk drive mounted on a system. This is appropriate for IT operations where all applications are hosted on one host, or even a small number of hosts. However, as the infrastructure scales to dozens or even hundreds of systems, managing as many local storage disks becomes difficult and complicated.
Distributed storage adds a layer to ease and automate storage hardware administration as the business scales. Having multiple systems share a handful of storage instances reduces the number of devices the administrator needs to manage.
Consolidating the storage capabilities of multiple storage appliances into one volume helps both users and administrators. This type of distributed storage provides a layer of abstraction to storage pools: users see a single unit of storage, which an administrator can easily grow by adding more hardware. Some technologies that enable distributed storage also provide added benefits, such as failover and multipathing.
NFS
Network File System (NFS) allows multiple servers or users to mount and use the same instance of remote storage via TCP or UDP. NFS is commonly used to hold data shared by multiple applications. It is also convenient for bulk storage of large amounts of data.
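As a quick sketch, a client mounts an NFS export like this; the server name and export path below are hypothetical:

```shell
# Mount a (hypothetical) NFS export over TCP on a client system.
mkdir -p /mnt/shared
mount -t nfs -o tcp server1.example.com:/exports/data /mnt/shared

# To make the mount persistent across reboots, a matching line can be
# added to /etc/fstab:
# server1.example.com:/exports/data  /mnt/shared  nfs  defaults,tcp  0 0
```

Many clients can mount the same export simultaneously, which is what makes NFS suitable for shared application data.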
SAN
Storage Area Networks (SANs) use either the Fibre Channel or iSCSI protocol to provide remote access to storage. Fibre Channel infrastructure (such as Fibre Channel host bus adapters, switches, and storage arrays) combines high performance, high bandwidth, and massive storage. SANs separate storage from processing, providing considerable flexibility in system design.
The other major advantage of SANs is that they provide a management environment for performing major storage hardware administrative tasks. These tasks include:
Controlling access to storage
Managing large amounts of data
Provisioning systems
Backing up and replicating data
Taking snapshots
Supporting system failover
Ensuring data integrity
Migrating data
GFS2
The Red Hat Global File System 2 (GFS2) file system provides several specialized capabilities. The basic function of GFS2 is to provide a single file system, including concurrent read/write access, shared across multiple members of a cluster. This means that each member of the cluster sees exactly the same data "on disk" in the GFS2 file system.
GFS2 allows all systems to have concurrent access to the "disk". To maintain data integrity, GFS2 uses a Distributed Lock Manager (DLM), which only allows one system to write to a specific location at a time.
GFS2 is especially well-suited for failover applications that require high availability in storage.
For further information about GFS2, refer to the Global File System 2 guide. For further information about storage in general, refer to the Storage Administration Guide. Both are available from http://access.redhat.com/site/documentation/Red_Hat_Enterprise_Linux/.
1.2.3.3. Converged Networks
Communication over the network is normally done through Ethernet, with storage traffic using a dedicated Fibre Channel SAN environment. It is common to have a dedicated network or serial link for system management, and perhaps even heartbeat [2]. As a result, a single server is typically on multiple networks.
Providing multiple connections on each server is expensive, bulky, and complex to manage. This gave rise to the need for a way to consolidate all connections into one. Fibre Channel over Ethernet (FCoE) and Internet SCSI (iSCSI) address this need.
FCoE
With FCoE, standard Fibre Channel commands and data packets are transported over a 10GbE physical infrastructure via a single converged network adapter (CNA). Standard TCP/IP Ethernet traffic and Fibre Channel storage operations can be transported via the same link. FCoE uses one physical network interface card (and one cable) for multiple logical network/storage connections.
FCoE offers the following advantages:
Reduced number of connections
FCoE reduces the number of network connections to a server by half. You can still choose to have multiple connections for performance or availability; however, a single connection provides both storage and network connectivity. This is especially helpful for pizza box servers and blade servers, since they both have very limited space for components.
Lower cost
A reduced number of connections immediately means a reduced number of cables, switches, and other networking equipment. Ethernet's history also features great economies of scale; the cost of networks drops dramatically as the number of devices in the market goes from millions to billions, as was seen in the decline in the price of 100Mb Ethernet and gigabit Ethernet devices.
Similarly, 10GbE will also become cheaper as more businesses adapt to its use. Also, as CNA hardware is integrated into a single chip, widespread use will also increase its volume in the market, which will result in a significant price drop over time.
iSCSI
Internet SCSI (iSCSI) is another type of converged network protocol; it is an alternative to FCoE. Like Fibre Channel, iSCSI provides block-level storage over a network. However, iSCSI does not provide a complete management environment. The main advantage of iSCSI over FCoE is that iSCSI provides much of the capability and flexibility of Fibre Channel, but at a lower cost.
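A minimal sketch of attaching iSCSI storage on a client, using the iscsiadm tool from the iscsi-initiator-utils package; the portal address below is hypothetical:

```shell
# Install the initiator tools and start the iSCSI service.
yum install -y iscsi-initiator-utils
service iscsi start

# Discover the targets exported by a (hypothetical) storage portal.
iscsiadm -m discovery -t sendtargets -p 192.168.1.50

# Log in to the discovered targets; their LUNs then appear as ordinary
# local block devices (visible, for example, in `fdisk -l` output).
iscsiadm -m node --login
```

Once logged in, the remote LUNs are partitioned, formatted, and mounted exactly like local disks.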
[1] Red Hat Certified Engineer. For more information, refer to
http://www.redhat.com/training/certifications/rhce/.
[2] Heartbeat is the exchange of messages between systems to ensure that each system is still functioning. If a system "loses heartbeat" it is assumed to have failed and is shut down, with another system taking over for it.
Chapter 2. Red Hat Enterprise Linux 6 Performance Features
2.1. 64-Bit Support
Red Hat Enterprise Linux 6 supports 64-bit processors; these processors can theoretically use up to 16 exabytes of memory. As of general availability (GA), Red Hat Enterprise Linux 6.0 is tested and certified to support up to 8 TB of physical memory.
The size of memory supported by Red Hat Enterprise Linux 6 is expected to grow over several minor updates, as Red Hat continues to introduce and improve more features that enable the use of larger memory blocks. For current details, see https://access.redhat.com/site/articles/rhel-limits. Example improvements (as of Red Hat Enterprise Linux 6.0 GA) are:
Huge pages and transparent huge pages
Non-Uniform Memory Access improvements
These improvements are outlined in greater detail in the sections that follow.
Huge pages and transparent huge pages
The implementation of huge pages in Red Hat Enterprise Linux 6 allows the system to manage memory use efficiently across different memory workloads. Huge pages dynamically utilize 2 MB pages compared to the standard 4 KB page size, allowing applications to scale well while processing gigabytes and even terabytes of memory.
Huge pages are difficult to manually create, manage, and use. To address this, Red Hat Enterprise Linux 6 also features the use of transparent huge pages (THP). THP automatically manages many of the complexities involved in the use of huge pages.
For more information on huge pages and THP, refer to Section 5.2, Huge Pages and Transparent Huge Pages.
NUMA improvements
Many new systems now support Non-Uniform Memory Access (NUMA). NUMA simplifies the design and creation of hardware for large systems; however, it also adds a layer of complexity to application development. For example, NUMA implements both local and remote memory, where remote memory can take several times longer to access than local memory. This feature has performance implications for operating systems and applications, and should be configured carefully.
Red Hat Enterprise Linux 6 is better optimized for NUMA use, thanks to several additional features that help manage users and applications on NUMA systems. These features include CPU affinity, CPU pinning (cpusets), numactl and control groups, which allow a process (affinity) or application (pinning) to "bind" to a specific CPU or set of CPUs.
For more information about NUMA support in Red Hat Enterprise Linux 6, refer to Section 4.1.1, CPU and NUMA Topology.
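For illustration, the numactl utility (from the numactl package) can both report NUMA topology and bind a process to a node; the ./myapp binary below is a placeholder:

```shell
# Display the NUMA topology: nodes, their CPUs, memory sizes, and
# the relative access distances between nodes.
numactl --hardware

# Run a (placeholder) application with both its CPUs and its memory
# allocations confined to NUMA node 0, avoiding remote-memory accesses.
numactl --cpunodebind=0 --membind=0 ./myapp
```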
2.2. Ticket Spinlocks
A key part of any system design is ensuring that one process does not alter memory used by another process. Uncontrolled data change in memory can result in data corruption and system crashes. To prevent this, the operating system allows a process to lock a piece of memory, perform an operation, then unlock or "free" the memory.
One common implementation of memory locking is through spin locks, which allow a process to keep checking to see if a lock is available and take the lock as soon as it becomes available. If there are multiple processes competing for the same lock, the first one to request the lock after it has been freed gets it. When all processes have the same access to memory, this approach is "fair" and works quite well.
Unfortunately, on a NUMA system, not all processes have equal access to the locks. Processes on the same NUMA node as the lock have an unfair advantage in obtaining the lock. Processes on remote NUMA nodes experience lock starvation and degraded performance.
To address this, Red Hat Enterprise Linux implemented ticket spinlocks. This feature adds a reservation queue mechanism to the lock, allowing all processes to take a lock in the order that they requested it. This eliminates timing problems and unfair advantages in lock requests.
While a ticket spinlock has slightly more overhead than an ordinary spinlock, it scales better and
provides better performance on NUMA systems.
2.3. Dynamic List Structure
The operating system requires a set of information on each processor in the system. In Red Hat Enterprise Linux 5, this set of information was allocated to a fixed-size array in memory. Information on each individual processor was obtained by indexing into this array. This method was fast, easy, and straightforward for systems that contained relatively few processors.
However, as the number of processors for a system grows, this method produces significant overhead. Because the fixed-size array in memory is a single, shared resource, it can become a bottleneck as more processors attempt to access it at the same time.
To address this, Red Hat Enterprise Linux 6 uses a dynamic list structure for processor information. This allows the array used for processor information to be allocated dynamically: if there are only eight processors in the system, then only eight entries are created in the list. If there are 2048 processors, then 2048 entries are created as well.
A dynamic list structure allows more fine-grained locking. For example, if information needs to be updated at the same time for processors 6, 72, 183, 657, 931 and 1546, this can be done with greater parallelism. Situations like this obviously occur much more frequently on large, high-performance systems than on small systems.
2.4. Tickless Kernel
In previous versions of Red Hat Enterprise Linux, the kernel used a timer-based mechanism that continuously produced a system interrupt. During each interrupt, the system polled; that is, it checked to see if there was work to be done.
Depending on the setting, this system interrupt or timer tick could occur several hundred or several thousand times per second. This happened every second, regardless of the system's workload. On a lightly loaded system, this impacts power consumption by preventing the processor from effectively using sleep states. The system uses the least power when it is in a sleep state.
The most power-efficient way for a system to operate is to do work as quickly as possible, go into the deepest sleep state possible, and sleep as long as possible. To implement this, Red Hat Enterprise Linux 6 uses a tickless kernel. With this, the interrupt timer has been removed from the idle loop, transforming Red Hat Enterprise Linux 6 into a completely interrupt-driven environment.
The tickless kernel allows the system to go into deep sleep states during idle times, and respond quickly when there is work to be done.
For further information, refer to the Power Management Guide, available from
http://access.redhat.com/site/documentation/Red_Hat_Enterprise_Linux/ .
2.5. Control Groups
Red Hat Enterprise Linux provides many useful options for performance tuning. Large systems, scaling to hundreds of processors, can be tuned to deliver superb performance. But tuning these systems requires considerable expertise and a well-defined workload. When large systems were expensive and few in number, it was acceptable to give them special treatment. Now that these systems are mainstream, more effective tools are needed.
To further complicate things, more powerful systems are being used now for service consolidation. Workloads that may have been running on four to eight older servers are now placed into a single server. And as discussed earlier in Section 1.2.2.1, Parallel Computing, many mid-range systems nowadays contain more cores than yesterday's high-performance machines.
Many modern applications are designed for parallel processing, using multiple threads or processes to improve performance. However, few applications can make effective use of more than eight threads. Thus, multiple applications typically need to be installed on a 32-CPU system to maximize capacity.
Consider the situation: small, inexpensive mainstream systems are now at parity with the performance of yesterday's expensive, high-performance machines. Cheaper high-performance machines gave system architects the ability to consolidate more services to fewer machines.
However, some resources (such as I/O and network communications) are shared, and do not grow as fast as CPU count. As such, a system housing multiple applications can experience degraded overall performance when one application hogs too much of a single resource.
To address this, Red Hat Enterprise Linux 6 now supports control groups (cgroups). Cgroups allow administrators to allocate resources to specific tasks as needed. This means, for example, being able to allocate 80% of four CPUs, 60GB of memory, and 40% of disk I/O to a database application. A web application running on the same system could be given two CPUs, 2GB of memory, and 50% of available network bandwidth.
As a result, both database and web applications deliver good performance, as the system prevents both from excessively consuming system resources. In addition, many aspects of cgroups are self-tuning, allowing the system to respond accordingly to changes in workload.
A cgroup has two major components:
A list of tasks assigned to the cgroup
Resources allocated to those tasks
Tasks assigned to the cgroup run within the cgroup. Any child tasks they spawn also run within the cgroup. This allows an administrator to manage an entire application as a single unit. An administrator can also configure allocations for the following resources:
CPUsets
Memory
I/O
Network (bandwidth)
Within CPUsets, cgroups allow administrators to configure the number of CPUs, affinity for specific CPUs or nodes [3], and the amount of CPU time used by a set of tasks. Using cgroups to configure CPUsets is vital for ensuring good overall performance, preventing an application from consuming excessive resources at the cost of other tasks while simultaneously ensuring that the application is not starved for CPU time.
I/O bandwidth and network bandwidth are managed by other resource controllers. Again, the resource controllers allow you to determine how much bandwidth the tasks in a cgroup can consume, and ensure that the tasks in a cgroup neither consume excessive resources nor are starved of resources.
Cgroups allow the administrator to define and allocate, at a high level, the system resources that various applications need (and will) consume. The system then automatically manages and balances the various applications, delivering good, predictable performance and optimizing the performance of the overall system.
For more information on how to use control groups, refer to the Resource Management Guide, available from http://access.redhat.com/site/documentation/Red_Hat_Enterprise_Linux/.
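As a hedged sketch, the libcgroup tools shipped with Red Hat Enterprise Linux 6 can create such an allocation; the group name webapp and the ./webapp command below are hypothetical:

```shell
# Install the cgroup userspace tools and mount the default hierarchies.
yum install -y libcgroup
service cgconfig start

# Create a cgroup in the cpuset controller, then restrict it to
# CPUs 0-1 and memory node 0.
cgcreate -g cpuset:/webapp
cgset -r cpuset.cpus=0-1 cpuset:/webapp
cgset -r cpuset.mems=0 cpuset:/webapp

# Launch the (hypothetical) application inside the cgroup; any child
# processes it spawns inherit the same limits.
cgexec -g cpuset:/webapp ./webapp
```

Equivalent allocations can be made permanent in /etc/cgconfig.conf so they survive a reboot.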
2.6. Storage and File System Improvements
Red Hat Enterprise Linux 6 also features several improvements to storage and file system management. Two of the most notable advances in this version are ext4 and XFS support. For more comprehensive coverage of performance improvements relating to storage and file systems, refer to Chapter 7, File Systems.
Ext4
Ext4 is the default file system for Red Hat Enterprise Linux 6. It is the fourth generation version of the EXT file system family, supporting a theoretical maximum file system size of 1 exabyte, and a single file maximum size of 16TB. Red Hat Enterprise Linux 6 supports a maximum file system size of 16TB, and a single file maximum size of 16TB. Other than a much larger storage capacity, ext4 also includes several new features, such as:
Extent-based metadata
Delayed allocation
Journal check-summing
For more information about the ext4 file system, refer to Section 7.3.1, The Ext4 File System.
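Creating and mounting an ext4 file system is straightforward; the device name below is a placeholder for a real, unused partition:

```shell
# Create an ext4 file system on an empty partition and mount it.
mkfs.ext4 /dev/sdb1
mkdir -p /mnt/data
mount -t ext4 /dev/sdb1 /mnt/data

# Confirm the mount point and file system type.
df -hT /mnt/data
```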
XFS
XFS is a robust and mature 64-bit journaling file system that supports very large files and file systems on a single host. This file system was originally developed by SGI, and has a long history of running on extremely large servers and storage arrays. XFS features include:
Delayed allocation
Dynamically-allocated inodes
B-tree indexing for scalability of free space management
Online defragmentation and file system growing
Sophisticated metadata read-ahead algorithms
While XFS scales to exabytes, the maximum XFS file system size supported by Red Hat is 100TB. For more information about XFS, refer to Section 7.3.2, The XFS File System.
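The online growing and defragmentation features can be exercised with the xfsprogs tools; the device name below is a placeholder for a real, unused partition:

```shell
# Create and mount an XFS file system.
mkfs.xfs /dev/sdc1
mkdir -p /mnt/xfsdata
mount -t xfs /dev/sdc1 /mnt/xfsdata

# Grow the file system online after the underlying volume is enlarged.
xfs_growfs /mnt/xfsdata

# Defragment files in place while the file system remains mounted.
xfs_fsr /mnt/xfsdata
```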
Large Boot Drives
Traditional BIOS supports a maximum disk size of 2.2TB. Red Hat Enterprise Linux 6 systems using BIOS can support disks larger than 2.2TB by using a new disk structure called the GUID Partition Table (GPT). GPT can only be used for data disks; it cannot be used for boot drives with BIOS; therefore, boot drives can only be a maximum of 2.2TB in size. The BIOS was originally created for the IBM PC; while BIOS has evolved considerably to adapt to modern hardware, the Unified Extensible Firmware Interface (UEFI) is designed to support new and emerging hardware.
Red Hat Enterprise Linux 6 also supports UEFI, which can be used to replace BIOS (still supported). Systems with UEFI running Red Hat Enterprise Linux 6 allow the use of GPT and 2.2TB (and larger) partitions for both boot partition and data partition.
Important: UEFI for 32-bit x86 systems
Red Hat Enterprise Linux 6 does not support UEFI for 32-bi t x86 systems.
Important: UEFI for AMD64 and Intel 64
Note that the boot configurations of UEFI and BIOS differ significantly from each other. Therefore, the installed system must boot using the same firmware that was used during installation. You cannot install the operating system on a system that uses BIOS and then boot this installation on a system that uses UEFI.
Red Hat Enterprise Linux 6 supports version 2.2 of the UEFI specification. Hardware that supports version 2.3 of the UEFI specification or later should boot and operate with Red Hat Enterprise Linux 6, but the additional functionality defined by these later specifications will not be available. The UEFI specifications are available from http://www.uefi.org/specs/agreement/.
[3] A node is generally defined as a set of CPUs or cores within a socket.
Chapter 3. Monitoring and Analyzing System Performance
This chapter briefly introduces tools that can be used to monitor and analyze system and application performance, and points out the situations in which each tool is most useful. The data collected by each tool can reveal bottlenecks or other system problems that contribute to less-than-optimal performance.
3.1. The proc File System
The proc "file system" is a directory that contains a hierarchy of files that represent the current state of the Linux kernel. It allows applications and users to see the kernel's view of the system.
The proc directory also contains information about the hardware of the system, and any currently running processes. Most of these files are read-only, but some files (primarily those in /proc/sys) can be manipulated by users and applications to communicate configuration changes to the kernel.
For further information about viewing and editing files in the proc directory, refer to the Deployment Guide, available from http://access.redhat.com/site/documentation/Red_Hat_Enterprise_Linux/.
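For example, kernel state can be read from proc with ordinary file tools, and tunables under /proc/sys can be changed at runtime (as root):

```shell
# Read kernel state: processor details and the current swappiness value.
cat /proc/cpuinfo
cat /proc/sys/vm/swappiness

# Change a tunable until the next reboot; this lowers the kernel's
# tendency to swap out application memory.
echo 10 > /proc/sys/vm/swappiness
```

Changes made this way do not persist; permanent settings belong in /etc/sysctl.conf.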
3.2. GNOME and KDE System Monitors
The GNOME and KDE desktop environments both have graphical tools to assist you in monitoring and modifying the behavior of your system.
GNOME System Monitor
The GNOME System Monitor displays basic system information and allows you to monitor system processes, and resource or file system usage. Open it with the gnome-system-monitor command in the Terminal, or click on the Applications menu, and select System Tools > System Monitor.
GNOME System Monitor has four tabs:
System
Displays basic information about the computer's hardware and software.
Processes
Shows active processes, and the relationships between those processes, as well as detailed information about each process. It also lets you filter the processes displayed, and perform certain actions on those processes (start, stop, kill, change priority, etc.).
Resources
Displays the current CPU time usage, memory and swap space usage, and network usage.
File Systems
Lists all mounted file systems alongside some basic information about each, such as the file system type, mount point, and memory usage.
For further information about the GNOME System Monitor, refer to the Help menu in the application, or to the Deployment Guide, available from http://access.redhat.com/site/documentation/Red_Hat_Enterprise_Linux/.
KDE System Guard
The KDE System Guard allows you to monitor current system load and processes that are running. It also lets you perform actions on processes. Open it with the ksysguard command in the Terminal, or click on the Kickoff Application Launcher and select Applications > System > System Monitor.
There are two tabs to KDE System Guard:
Process Table
Displays a list of all running processes, alphabetically by default. You can also sort processes by a number of other properties, including total CPU usage, physical or shared memory usage, owner, and priority. You can also filter the visible results, search for specific processes, or perform certain actions on a process.
System Load
Displays historical graphs of CPU usage, memory and swap space usage, and network usage. Hover over the graphs for detailed analysis and graph keys.
For further information about the KDE System Guard, refer to the Help menu in the application.
3.3. Performance Co-Pilot (PCP)
Performance Co-Pilot (PCP) provides tools and infrastructure for monitoring, analysing, and responding to details of system performance. PCP has a fully distributed architecture, which means it can be run on a single host, or across any number of hosts, depending on your needs. PCP is also designed with a plug-in architecture, making it useful for monitoring and tuning a wide variety of subsystems.
To install PCP, run:
# yum install pcp
Red Hat also recommends the pcp-gui package, which provides the ability to create visualizations of collected data, and the pcp-doc package, which installs detailed PCP documentation to the /usr/share/doc/pcp-doc directory.
3.3.1. PCP Architecture
A PCP deployment is comprised of both collector and monitor systems. A single host can be both a collector and a monitor, or collectors and monitors can be distributed across a number of hosts.
Collector
A collector system collects performance data from one or more domains, and stores it for analysis. Collectors have a Performance Metrics Collector Daemon (pmcd), which passes requests for performance data to and from the appropriate Performance Metrics Domain Agent, and one or more Performance Metrics Domain Agents (PMDAs), which are responsible for responding to requests about their domain (a database, a server, an application, or similar). PMDAs are controlled by the pmcd running on the same collector.
Monitor
A monitor system uses monitoring tools like pmie or pmreport to display and analyse data from local or remote collectors.
3.3.2. PCP Setup
A collector requires a running Performance Metrics Collector Daemon (pmcd) and one or more Performance Metrics Domain Agents (PMDAs).
Procedure 3.1. Setting up a collector
1. Install PCP
Run the following command to install PCP on this system.
# yum install pcp
2. Start pmcd
# service pmcd start
3. Optionally, configure additional PMDAs
The kernel, pmcd, per-process, memory-mapped values, XFS, and JBD2 PMDAs are installed and configured in /etc/pcp/pmcd/pmcd.conf by default.
To configure an additional PMDA, change into its directory (for example, /var/lib/pcp/pmdas/pmdaname), and run the Install script. For example, to install the PMDA for proc:
# cd /var/lib/pcp/pmdas/proc
# ./Install
Then follow the prompts to configure the PMDA for a collector system, or a monitor and collector system.
4. Optionally, listen for remote monitors
To respond to remote monitor systems, pmcd must be able to listen for remote monitors through port 44321. Execute the following to open the appropriate port:
# iptables -I INPUT -p tcp --dport 44321 -j ACCEPT
# service iptables save
You will need to ensure that no other firewall rules block access through this port.
A monitor requires that pcp is installed and able to connect to at least one pmcd instance, remote or local. If your collector and your monitor are both on a single machine, and you have followed the instructions in Procedure 3.1, Setting up a collector, no further configuration is needed and you can go ahead and use the monitoring tools provided by PCP.
Procedure 3.2. Setting up a remote monitor
1. Install PCP
Run the following command to install PCP on the remote monitor system.
# yum install pcp
2. Connect to remote collectors
To connect to remote collector systems, PCP must be able to contact remote collectors through port 44321. Execute the following to open the appropriate port:
# iptables -I INPUT -p tcp --dport 44321 -j ACCEPT
# service iptables save
You will also need to ensure that no other firewall rules block access through this port.
You can now test that you can connect to the remote collector by running a performance monitoring tool with the -h option to specify the IP address of the collector you want to connect to, for example:
# pminfo -h 192.168.122.30
3.4. irqbalance
irqbalance is a command line tool that distributes hardware interrupts across processors to
improve system performance. It runs as a daemon by default, but can be run once only with the
--oneshot option.
The following parameters are useful for improving performance.
--powerthresh
Sets the number of CPUs that can idle before a CPU is placed into powersave mode. If more
CPUs than the threshold are more than 1 standard deviation below the average softirq
workload, no CPUs are more than one standard deviation above the average, and more than
one irq is assigned to them, a CPU is placed into powersave mode. In powersave mode, a
CPU is not part of irq balancing so that it is not woken unnecessarily.
--hintpolicy
Determines how irq kernel affinity hinting is handled. Valid values are exact (the irq affinity
hint is always applied), subset (the irq is balanced, but the assigned object is a subset of the
affinity hint), or ignore (the irq affinity hint is ignored completely).
--policyscript
Defines the location of a script to execute for each interrupt request, with the device path
and irq number passed as arguments, and a zero exit code expected by irqbalance. The
script defined can specify zero or more key-value pairs to guide irqbalance in managing
the passed irq.
The following are recognized as valid key-value pairs.
ban
Valid values are true (exclude the passed irq from balancing) or false (perform
balancing on this irq).
balance_level
Allows user override of the balance level of the passed irq. By default the balance
level is based on the PCI device class of the device that owns the irq. Valid values
are none, package, cache, or core.
numa_node
Allows user override of the NUMA node that is considered local to the passed irq.
If information about the local node is not specified in ACPI, devices are
considered equidistant from all nodes. Valid values are integers (starting from 0)
that identify a specific NUMA node, and -1, which specifies that an irq should be
considered equidistant from all nodes.
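A policy script along these lines can be sketched as follows. This is a minimal, hypothetical example: the device path, irq number, and decision logic are illustrative assumptions, not values taken from this guide. The logic is factored into a shell function so it can be exercised directly; in practice irqbalance invokes the script itself with the two arguments.

```shell
#!/bin/sh
# Hypothetical irqbalance policy logic: given a sysfs device path and an irq
# number, print zero or more key=value pairs on stdout and exit zero.
policy_for_irq() {
    devpath=$1
    irq=$2
    if [ "$irq" = "0" ]; then
        # Exclude the timer interrupt from balancing entirely.
        echo "ban=true"
    else
        echo "ban=false"
        # Balance this irq at core granularity rather than the PCI-class default.
        echo "balance_level=core"
    fi
}

# Example invocation with an assumed device path and irq number:
policy_for_irq /sys/devices/pci0000:00/0000:00:1f.2 19
```

A script like this would be passed to the daemon with --policyscript=/path/to/script.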
--banirq
The interrupt with the specified interrupt request number is added to the list of banned
interrupts.
You can also use the IRQBALANCE_BANNED_CPUS environment variable to specify a mask of CPUs
that are ignored by irqbalance.
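The mask can be built from individual CPU numbers. This is a hedged sketch; the choice of CPUs 0 and 2 is arbitrary and for illustration only.

```shell
# IRQBALANCE_BANNED_CPUS is a hexadecimal bitmask in which bit N set means
# CPU N is ignored by irqbalance. Build a mask covering CPUs 0 and 2:
mask=$(printf '%x' $(( (1 << 0) | (1 << 2) )))
echo "$mask"    # 5
export IRQBALANCE_BANNED_CPUS=$mask
```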
For further details, see the man page:
$ man irqbalance
3.5. Built-in Command-line Monitoring Tools
In addition to graphical monitoring tools, Red Hat Enterprise Linux provides several tools that can be
used to monitor a system from the command line. The advantage of these tools is that they can be
used outside run level 5. This section discusses each tool briefly, and suggests the purposes to
which each tool is best suited.
top
The top tool provides a dynamic, real-time view of the processes in a running system. It can display
a variety of information, including a system summary and the tasks currently being managed by the
Linux kernel. It also has a limited ability to manipulate processes. Both its operation and the
information it displays are highly configurable, and any configuration details can be made to persist
across restarts.
By default, the processes shown are ordered by the percentage of CPU usage, giving an easy view
into the processes that are consuming the most resources.
For detailed information about using top, refer to its man page: man top.
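Because top is interactive by default, scripted use relies on batch mode. The following sketch (guarded in case top is unavailable) captures a single snapshot:

```shell
# -b selects batch mode (plain text output, no screen control) and -n 1
# limits top to a single iteration, which suits logging and scripting.
if command -v top >/dev/null 2>&1; then
    top -b -n 1 | head -n 12
fi
```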
ps
The ps tool takes a snapshot of a select group of active processes. By default this group is limited to
processes owned by the current user and associated with the same terminal.
It can provide more detailed information about processes than top, but is not dynamic.
For detailed information about using ps, refer to its man page: man ps.
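The snapshot can be shaped with format and sort options. A small sketch (the column choice is arbitrary):

```shell
# -e selects every process, -o sets a custom column list, and --sort=-%cpu
# orders the snapshot by descending CPU usage; head keeps the top entries.
ps -eo pid,user,%cpu,%mem,comm --sort=-%cpu | head -n 6
```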
vmstat
vmstat (Virtual Memory Statistics) outputs instantaneous reports about your system's processes,
memory, paging, block I/O, interrupts and CPU activity.
Although it is not dynamic like top, you can specify a sampling interval, which lets you observe
system activity in near-real time.
For detailed information about using vmstat, refer to its man page: man vmstat.
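For near-real-time observation, pass an interval and a count. A minimal sketch (guarded in case vmstat is not installed):

```shell
# Sample every second, three times; the first report holds averages since
# boot, and each following line covers the preceding one-second interval.
if command -v vmstat >/dev/null 2>&1; then
    vmstat 1 3
fi
```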
sar
sar (System Activity Reporter) collects and reports information about today's system activity so far.
The default output covers today's CPU utilization at ten minute intervals from the beginning of the
day:
12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM     all      0.10      0.00      0.15      2.96      0.00     96.79
12:20:01 AM     all      0.09      0.00      0.13      3.16      0.00     96.61
12:30:01 AM     all      0.09      0.00      0.14      2.11      0.00     97.66
...
This tool is a useful alternative to attempting to create periodic reports on system activity through
top or similar tools.
For detailed information about using sar, refer to its man page: man sar.
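sar can also sample live rather than replaying today's records, by passing an interval and a count. A hedged sketch (sar belongs to the sysstat package and may not be installed):

```shell
# Report CPU utilization (-u) once per second for three samples; without an
# interval argument, sar instead reports today's collected records.
if command -v sar >/dev/null 2>&1; then
    sar -u 1 3
fi
```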
3.6. Tuned and ktune
Tuned is a daemon that monitors and collects data on the usage of various system components,
and uses that information to dynamically tune system settings as required. It can react to changes in
CPU and network use, and adjust settings to improve performance in active devices or reduce power
consumption in inactive devices.
The accompanying ktune partners with the tuned-adm tool to provide a number of tuning profiles
that are pre-configured to enhance performance and reduce power consumption in a number of
specific use cases. Edit these profiles or create new profiles to create performance solutions tailored
to your environment.
The profiles provided as part of tuned-adm include:
default
The default power-saving profile. This is the most basic power-saving profile. It enables
only the disk and CPU plug-ins. Note that this is not the same as turning tuned-adm off,
where both tuned and ktune are disabled.
latency-performance
A server profile for typical latency performance tuning. This profile disables dynamic tuning
mechanisms and transparent hugepages. It uses the performance governor for p-states
through cpuspeed, and sets the I/O scheduler to deadline. Additionally, in Red Hat
Enterprise Linux 6.5 and later, the profile requests a cpu_dma_latency value of 1. In Red
Hat Enterprise Linux 6.4 and earlier, cpu_dma_latency requested a value of 0.
throughput-performance
A server profile for typical throughput performance tuning. This profile is recommended if
the system does not have enterprise-class storage. throughput-performance disables power
saving mechanisms and enables the deadline I/O scheduler. The CPU governor is set to
performance. kernel.sched_min_granularity_ns (scheduler minimal preemption
granularity) is set to 10 milliseconds, kernel.sched_wakeup_granularity_ns
(scheduler wake-up granularity) is set to 15 milliseconds, vm.dirty_ratio (virtual
memory dirty ratio) is set to 40%, and transparent huge pages are enabled.
enterprise-storage
This profile is recommended for enterprise-sized server configurations with enterprise-class
storage, including battery-backed controller cache protection and management of on-disk
cache. It is the same as the throughput-performance profile, with one addition: filesystems
are re-mounted with barrier=0.
virtual-guest
This profile is optimized for virtual machines. It is based on the enterprise-storage
profile, but also decreases the swappiness of virtual memory. This profile is available in
Red Hat Enterprise Linux 6.3 and later.
virtual-host
Based on the enterprise-storage profile, virtual-host decreases the swappiness of
virtual memory and enables more aggressive writeback of dirty pages. Non-root and
non-boot file systems are mounted with barrier=0. Additionally, as of Red Hat Enterprise Linux
6.5, the kernel.sched_migration_cost parameter is set to 5 milliseconds. Prior to Red
Hat Enterprise Linux 6.5, kernel.sched_migration_cost used the default value of 0.5
milliseconds. This profile is available in Red Hat Enterprise Linux 6.3 and later.
Refer to the Red Hat Enterprise Linux 6 Power Management Guide, available from
http://access.redhat.com/site/documentation/Red_Hat_Enterprise_Linux/, for further information about
tuned and ktune.
3.7. Application Profilers
Profiling is the process of gathering information about a program's behavior as it executes. You
profile an application to determine which areas of a program can be optimized to increase the
program's overall speed, reduce its memory usage, etc. Application profiling tools help to simplify
this process.
There are three supported profiling tools for use with Red Hat Enterprise Linux 6: SystemTap,
OProfile and Valgrind. Documenting these profiling tools is outside the scope of this guide;
however, this section does provide links to further information and a brief overview of the tasks for
which each profiler is suitable.
3.7.1. SystemTap
SystemTap is a tracing and probing tool that lets users monitor and analyze operating system
activities (particularly kernel activities) in fine detail. It provides information similar to the output of
tools like netstat, top, ps and iostat, but includes additional filtering and analysis options for the
information that is collected.
SystemTap provides a deeper, more precise analysis of system activities and application behavior to
allow you to pinpoint system and application bottlenecks.
The Function Callgraph plug-in for Eclipse uses SystemTap as a back-end, allowing it to thoroughly
monitor the status of a program, including function calls, returns, times, and user-space variables,
and display the information visually for easy optimization.
The Red Hat Enterprise Linux 7 SystemTap Beginners Guide includes several sample scripts that are
useful for profiling and monitoring performance. By default they are installed to the
/usr/share/doc/systemtap-client-version/examples directory.
Network monitoring scripts (in examples/network)
nettop.stp
Every 5 seconds, prints a list of processes (process identifier and command) with the
number of packets sent and received and the amount of data sent and received by the
process during that interval.
socket-trace.stp
Instruments each of the functions in the Linux kernel's net/socket.c file, and prints trace
data.
tcp_connections.stp
Prints information for each new incoming TCP connection accepted by the system. The
information includes the UID, the command accepting the connection, the process identifier
of the command, the port the connection is on, and the IP address of the originator of the
request.
dropwatch.stp
Every 5 seconds, prints the number of socket buffers freed at locations in the kernel. Use the
--all-modules option to see symbolic names.
Storage monitoring scripts (in examples/io)
disktop.stp
Checks the status of reading/writing disk every 5 seconds and outputs the top ten entries
during that period.
iotime.stp
Prints the amount of time spent on read and write operations, and the number of bytes read
and written.
traceio.stp
Prints the top ten executables based on cumulative I/O traffic observed, every second.
traceio2.stp
Prints the executable name and process identifier as reads and writes to the specified
device occur.
inodewatch.stp
Prints the executable name and process identifier each time a read or write occurs to the
specified inode on the specified major/minor device.
inodewatch2.stp
Prints the executable name, process identifier, and attributes each time the attributes are
changed on the specified inode on the specified major/minor device.
The latencytap.stp script records the effect that different types of latency have on one or more
processes. It prints a list of latency types every 30 seconds, sorted in descending order by the total
time the process or processes spent waiting. This can be useful for identifying the cause of both
storage and network latency. Red Hat recommends using the --all-modules option with this
script to better enable the mapping of latency events. By default, this script is installed to the
/usr/share/doc/systemtap-client-version/examples/profiling directory.
For further information about SystemTap, refer to the SystemTap Beginners Guide, available from
http://access.redhat.com/site/documentation/Red_Hat_Enterprise_Linux/ .
3.7.2. OProfile
OProfile (oprofile) is a system-wide performance monitoring tool. It uses the processor's dedicated
performance monitoring hardware to retrieve information about the kernel and system executables,
such as when memory is referenced, the number of L2 cache requests, and the number of hardware
interrupts received. It can also be used to determine processor usage, and which applications and
services are used most.
OProfile can also be used with Eclipse via the Eclipse OProfile plug-in. This plug-in allows users to
easily determine the most time-consuming areas of their code, and perform all command-line
functions of OProfile with rich visualization of the results.
However, users should be aware of several OProfile limitations:
Performance monitoring samples may not be precise. Because the processor may execute
instructions out of order, a sample may be recorded from a nearby instruction, instead of the
instruction that triggered the interrupt.
Because OProfile is system-wide and expects processes to start and stop multiple times, samples
from multiple runs are allowed to accumulate. This means you may need to clear sample data
from previous runs.
It focuses on identifying problems with CPU-limited processes, and therefore does not identify
processes that are sleeping while they wait on locks for other events.
For further information about using OProfile, refer to the Deployment Guide, available from
http://access.redhat.com/site/documentation/Red_Hat_Enterprise_Linux/, or to the oprofile
documentation on your system, located in /usr/share/doc/oprofile-.
3.7.3. Valgrind
Valgrind provides a number of detection and profiling tools to help improve the performance and
correctness of your applications. These tools can detect memory and thread-related errors as well as
heap, stack and array overruns, allowing you to easily locate and correct errors in your application
code. They can also profile the cache, the heap, and branch-prediction to identify factors that may
increase application speed and minimize application memory use.
Valgrind analyzes your application by running it on a synthetic CPU and instrumenting the existing
application code as it is executed. It then prints "commentary" clearly identifying each process
involved in application execution to a user-specified file descriptor, file, or network socket. The level
of instrumentation varies depending on the Valgrind tool in use, and its settings, but it is important to
note that executing the instrumented code can take 4-50 times longer than normal execution.
Valgrind can be used on your application as-is, without recompiling. However, because Valgrind
uses debugging information to pinpoint issues in your code, if your application and support libraries
were not compiled with debugging information enabled, recompiling to include this information is
highly recommended.
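A typical invocation runs the unmodified binary under one of the tools. This sketch uses /bin/true as a stand-in for a real application binary, and is guarded in case Valgrind is not installed:

```shell
# Memcheck is the default tool; --leak-check=full reports the origin of each
# leaked block. Valgrind writes its commentary to stderr by default.
if command -v valgrind >/dev/null 2>&1; then
    valgrind --tool=memcheck --leak-check=full /bin/true
fi
```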
As of Red Hat Enterprise Linux 6.4, Valgrind integrates with gdb (GNU Project Debugger) to improve
debugging efficiency.
More information about Valgrind is available from the Developer Guide, available from
http://access.redhat.com/site/documentation/Red_Hat_Enterprise_Linux/, or by using the man
valgrind command when the valgrind package is installed. Accompanying documentation can also
be found in:
/usr/share/doc/valgrind-/valgrind_manual.pdf
/usr/share/doc/valgrind-/html/index.html
For information about how Valgrind can be used to profile system memory, refer to Section 5.3,
Using Valgrind to Profile Memory Usage.
3.7.4. Perf
The perf tool provides a number of useful performance counters that let the user assess the impact
of other commands on their system:
perf stat
This command provides overall statistics for common performance events, including
instructions executed and clock cycles consumed. You can use the option flags to gather
statistics on events other than the default measurement events. As of Red Hat Enterprise
Linux 6.4, it is possible to use perf stat to filter monitoring based on one or more
specified control groups (cgroups). For further information, read the man page: man perf-stat.
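A basic invocation counts events for a single command. This sketch is hedged: perf may be absent, and unprivileged counter access can be restricted by the kernel's perf_event_paranoid setting, so failures are tolerated here.

```shell
# perf stat runs the given command and prints counter totals (to stderr)
# when it exits; -e restricts counting to the named events.
if command -v perf >/dev/null 2>&1; then
    perf stat -e instructions,cycles /bin/true 2>&1 || true
fi
```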
perf record
This command records performance data into a file which can be later analyzed using
perf report. For further details, read the man page: man perf-record.
As of Red Hat Enterprise Linux 6.6, the -b and -j options are provided to allow statistical
sampling of taken branches. The -b option samples any branches taken, while the