(Xperf). Trace Capture, Processing, and Command-Line Analysis tool Xperf.exeCaptures traces,...

Post on 15-Dec-2015

229 views 6 download

Tags:

Transcript of (Xperf). Trace Capture, Processing, and Command-Line Analysis tool Xperf.exeCaptures traces,...

Introduction and Case Studies

Tate CalhounPlatforms Global Escalation ServicesCore Team

(Xperf)

Agenda

• What is Xperf?

• How does it work?

• What can it do?• Case Studies!

• Where can I get it?

What is Xperf?

No application or driver is an island! Triaging system wide performance issues via multiple trace streams introduces complexity. Data must be aggregated, visualized, and understood to determine root cause. Xperf is the long term investment Microsoft is making to meet this diagnostic need.

Support Engineer Trends/Challenges:

1.) Memory.dmps are not getting smaller! (yet)2.) Classic yet challenging application and system hangs, slow I/O, and

other system wide issues3.) User observation to code behavior. Stackwalking!4.) The visualization of complex ETW data

What is Xperf?

• An ETW controller and consumer• Created by the Windows Fundamentals

team• Designed for analyzing system and

application performance on Windows Vista, Windows Server 2008, and later.

  Trace Capture, Processing, and Command-Line Analysis tool

Xperf.exe Captures traces, post-processes them for use on any machine, and supports command-line (action-based) trace analysis.

Visual Trace Analysis tool

Xperfview.exe Presents trace content in the form of interactive graphs and summary tables.

On/Off Transition Trace Capture tool

Xbootmgr.exe Automates on/off state transitions and captures traces during these transitions.

What is Xperf?Capture and analysis model follows this general flow:

1. ETW tracing is enabled using xperf (or xbootmgr). 2. Scenario of interest is performed. 3. ETW tracing is stopped using xperf (or automatically by xbootmgr)

and the data is saved to an ETL trace file. 4. Trace files can then further be processed with xperf and/or viewed

with xperfview. 5. Traces can be processed on the machine on which they were taken,

or copied to another machine for analysis (including cross-architecture).  Everything needed for analysis is stored in the trace file.

How does this work?Performance Analyzer is built on top of the Event Tracing for Windows

(ETW)infrastructure. ETW enables Windows and applications to efficiently

generateevents, which can be enabled and disabled at any time without requiring

system orprocess restarts. ETW collects requested kernel events and saves to an

etl file.

Provides extensive details about the system. Context switches, interrupts, deferred procedure calls, process and thread creation and destruction, disk I/Os, hard faults, processor P-State transitions, and registry operations, etc.

One of the great features of ETW, supported in WPT, is the support of symbol decoding, sample profiling, and capture of call stacks on kernel events.

ETW OverviewETW, or Event Tracing for Windows, is a high performance kernel level tracing system that

made its firstappearance in Windows 2000 and has found widespread use since. Most operations

throughout the operating system that are interesting to a performance analyst are already instrumented.

ETW is comprised of four key components:

Providers: Components that generate events. They are instrumented with the ETW EventWrite() API.

Sessions: Kernel objects that collect events into buffers and send those events to a file or directly a consuming application.

Controllers: Programs that create and control sessions, and that control providers. Controlling providers consists of enabling and disabling events, or groups of events.

Consumer: Programs that consume event data from ETW sessions. The data can be consumed from a file, or in ‘real time’ streaming mode.

If you are interested in instrumenting your own applications, please see the ETW documentation in MSDN

ETW Components

ETW Advantages for Performance Tracing• Events can be enabled or disabled dynamically, no system or

process restarts

• By default all events are disabled and the call to EventWrite() incurs a very minimal overhead – just a flag check. This allows retail (shipping) code to be fully instrumented and enables tracing to be enabled in shipping components at any time.

• Even when enabled, event tracing has a very low run-time overhead, even when large numbers of events are generated per second.

• Once begun, an ETW logging session has a defined maximum memory footprint. The disk footprint can be similarly limited. CPU load is directly proportional to the event rate and scales very well.

• Events from multiple providers, sessions and the kernel logger can be merged seamlessly when collected simultaneously on the same machine. This allows different kinds of events and settings to be used on different logging sessions and then merged together after the fact.

• The Windows kernel and multiple components are already extensively instrumented.

ETW is the instrumentation method of choice on Windows.

Key Benefits to Support Engineers

1.) Lightweight/Non-Invasive, only about 2-3% CPU, 5% Memory, configurable.

2.) STACK TRACING!3.) Independent Data Gathering vs. Analysis(Easily understated…this is almost like being able to share a live debug

session)4.) We are trying to move away from "how do I reproduce this problem

internally?" to "What was the problem and how to fix it“5.) A different…temporal debugger?

More about stacks…

In Vista+, stack walking can be enabled for certain kernel events. When combined with symbols on the processing side this extends analysis using system events to correlate back into the code context that triggered the corresponding event, from up in kernel mode to deep inside applications. This is a very powerful tool!

What can it do?

More and more each release and in the upcoming Win7 beta releases of the tool you will see improved data capturing ability. For now, here are some scenarios that we have explored.

 1.) CPU Consumption and Scheduling Issues2.) Storage Delay, Disk I/O Issues3.) Power Management Events

Let's get started…

Download and install of the tools (preferably custom to c:\xperf) and an elevated CMD. http://msdn.microsoft.com/en-us/performance/default.aspx

Let's get started…Let's take a view of the advanced stackwalk options and other Kernel

groups(xperf -providers K)

Let's get started…Here are the stackwalk flags (xperf -help stackwalk)

Let's get started…Any Easy Syntax to start out!?!

Using the following is a good head start and we see that massive customization is possible with stackwalk, etc.

 

Xperf -on DiagEasy For profiling issues to get stacks one can use: 

Xperf -on Latency -stackwalk Profile

Case Study 1 – IE HangXperf -on latency -stackwalk profileRepro IE Hang…Xperf -d ielatency.etl

(-d MergedETL Merge the ETL files of stopped logging sessions into MergedETL; if no session is stopped explicitly, the "NT Kernel Logger" stopped implicitly.)

Case Study 1 – IE HangCase Study 1 – IE Hang

Case Study 1 – IE HangSo we have spent a significant portion of time in MSHTML.DLL!

CHtPvPv::Lookup I happened to have a dump of the process as well….

Case Study 1 – IE Hang

Summary: Searching around this mshtml code, this is being reviewed via a Windows 7 bug. Unfortunately, we didn’t have a repro at the time... I provided this etl trace to the team (with dumps and error reporting data) and the investigation continues!

A great example of the portability advantages of this tool!

Case Study 2 – Everything slow…Xperf -on latency -stackwalk profileRepro slowness…Xperf -d WUlatency.etlNote overlay of Disk Utilization in the middle...

Case Study 2 – Everything slow…So we have some activity here in cbscore.dll!

CCbsPublicPackage::EvaluateApplicability, etc.

Case Study 2 – Everything slow…What is cbscore.dll?

Component-Based ServicingComponent-Based Servicing (CBS) is part of the servicing stack. The

servicing stack is a set of files and resources that are required to service a Windows image or operating system. The servicing stack is available on all Windows Vista and Windows Server 2008 installations, as well as in the Windows Automated Installation Kit AIK (Windows AIK) and the Windows OEM Preinstallation Kit (Windows OPK). CBS provides various APIs (which are not publicly available) to its client installers to service the operating system components. Client installers such as Windows Update or Windows Installer work with CBS to enumerate, install, update, and uninstall component packages on the destination operating system. CBS interacts with the Component Servicing Infrastructure to perform the necessary system changes.

For more information about CBS, see http://go.microsoft.com/fwlink/?LinkId=91917.

Wow, there's even CBS logging as well if needed: http://technet.microsoft.com/en-us/library/cc732334.aspx#CBS

Case Study 2 – Everything slow…

Summary: My machine is working hard scanning for updates, so this seems "expected". However, it turns out there are actually performance improvements exactly here in cbscore for Windows 7 and Vista SP1 to mitigate this issue!

Case Study 3 – Outlook hangs on startup…

Xperf -on latency -stackwalk profileStart Outlook…Xperf -d outlookhang.etl

I've got a DPC issue on this machine to track down further with the NIC :)….15% DPC time as a constant is not good for performance!

Interesting to note that it matches the CPU Sampling by CPU.

Case Study 3 – Outlook hangs on startup…

However, we are ~41% idle per the above…

Case Study 3 – Outlook hangs on startup…

What is going on here? Half Idle, Half…

Case Study 3 – Outlook hangs on startup…

What is going on here? Half Idle, Half…

Case Study 3 – Outlook hangs on startup…Ah, my backup application at PID 496 is running doing it's thing…

Case Study 3 – Outlook hangs on startup…But what is Outlook doing during all this?

Case Study 3 – Outlook hangs on startup…

Summary: So I've found a taxing network DPC issue on my machine so I'll go follow up with my NIC configuration/driver. (turns out disable renable did the trick) Also, I've seen that I’ve got my backup application querying the file system here likely indexing by design. Also, I've got more digging to do into Outlook using the Cswitch stackwalk event (running the trace before the hang).

Addendum: Found two separate hang bugs in Outlook actually where the root cause of the hangs(with the debugger, I cheated, sorry, I feel bad). The write stack matched one though!

Case Study 4 – Run Scott Run…

The situation is that on two different platforms (physical vs. virtual) of stated same build are displaying two different scheduling behaviors when viewing CPU time for RunScottRun.exe. The exe “doesn't stay on a processor”… xperf -on PROFILE+DISPATCHER+PROC_THREAD+LOADER+HARD_FAULTS+INTERRUPT+DPC+CSWITCH -maxbuffers 1024 Here is what the "good" trace looks like…We have our Process level view to clone the selection….

Case Study 4 – Run Scott Run…

Now the CPU Scheduling section. At a high level the problem reported is that RunScottRun.exe is having trouble getting enough time on a single CPU and we see context switches...

Case Study 4 – Run Scott Run…

We can see this at a little bit lower level what is getting to run…

Case Study 4 – Run Scott Run…

At an even lower level RunScottRun.exe is lower priority, we see SomeService.exe sneak in here at Pri 9, then back to RunScottRun.exe after Delay Execution.

Case Study 4 – Run Scott Run…

If you are an Excel wiz :) the data can be exported for another view.  Here's the "good" trace summary/pivot

“Good” “Bad”

Case Study 5 – A Random Stackwalk - CreateProcessC:\xperf>xperf -on DiagEasy -stackwalk ProcessCreate+ProcessDelete

C:\xperf>notepadC:\xperf>xperf -d processcreate.etlMerged Etl: processcreate.etl(shows stacks on 7sdk Stack Counts by Type)

Case Study 5 – A Random Stackwalk - CreateProcess

Case Study 5 – A Random Stackwalk - CreateProcess

Case Study 5 – A Random Stackwalk - CreateProcess

Case Study 5 – A Random Stackwalk - CreateProcess

Case Study 6 – A Random Stackwalk – File I/OC:\xperf>xperf -on FileIO+FILENAME -stackwalk @file.txt

 C:\xperf>xperf -d file.etlMerged Etl: file.etlC:\xperf>xperf file.etl(shows stacks with 08 ver, in File IO Section)

Case Study 6 – A Random Stackwalk – File I/O

Case Study 7 – A Random Stackwalk – RegistryC:\xperf>xperf -on DiagEasy+REGISTRY -stackwalk @reg.txtC:\xperf>regeditC:\xperf>xperf -d reg.etlMerged Etl: reg.etl(shows stacks with 08 ver, in Registry section)

Case Study 7 – A Random Stackwalk – Registry

Case Study 8 – A Random Stackwalk – PowerC:\xperf>xperf -on DiagEasy+POWER -stackwalk @power.txt

C:\xperf>xperf -d power.etlMerged Etl: power.etlC:\xperf>xperf power.etlC:\xperf>c:\xperf7sdk\xperf power.etl(shows stacks with 7sdk Stack Counts by Type)

Case Study 8 – A Random Stackwalk – Power

Case Study 9 – SLOOOW I/O

Case Study 9 – SLOOOW I/O

What didn’t we cover?Xbootmgr.exe! It can trace boot, hibernate, standby, shutdown, rebootCycle performance.

Heap and VirtualAlloc tracing in user mode...

Where do I get this?

The toolkit: http://msdn.microsoft.com/en-us/performance/default.aspx The forum: The Windows Performance Toolkit Forum

Remember to check the Windows 7 SDK for the cool Stack Counts by Type section and other improvements!

Top QuestionsCan I use Performance Analyzer on Windows XP or Windows Server 2003?

Unfortunately, the answer is 'no'. While Windows XP and Windows Server 2003 do support collection of ETL traces, these OSes do not contain instrumentation for most of the events needed by Performance Analyzer (PA). You need Windows Vista or later OS to use PA. An example of crucial instrumentation added in Vista is stack walking. Performance analysis without stacks can be an extremely daunting task that only a true expert with access to source code can tackle.

With the Windows Vista release, Microsoft has really taken the OS to next level in terms of system diagnosibility and a lot of the analysis based on this instrumentation isn't feasible on Windows XP/Server 2003 ETL traces.

Top QuestionsHow do I find out the time span of the trace my customer collected to correlate with other data? (-a tracestats)

Top QuestionsHow to enable stackwalking on x64 systems?

x64 ETW stackwalking is only supported on Windows Vista SP1, Windows Server 2008, and above, and requires setting a certain registry value (see below) to 1 and rebooting the machine ( so that Windows kernel picks it up).

HKLM\System\CurrentControlSet\Control\Session Manager\Memory Management

DisablePagingExecutive 1

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions,

it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.