SGI® Altix™: Using The Intel VTune Performance Analyzer
Reiner Vogelsang, SGI GmbH (reiner@sgi.com)
January 19, 2005
Module Objectives
After completing the module you will be able
• to profile an application using VTune
• to run an experiment with multiple performance counters
• to generate a callgraph of your application
VTune – Purpose
Helps to identify and characterize performance issues by
• Collecting performance data
  – CPU cycles (time)
  – Micro-architectural events of the processor
  – Platform resource utilization
• Organizing and displaying the data
• Identifying performance ‘hotspots’
• Suggesting improvements (currently Windows only!)
VTune: Status
• Native: VTune for Linux 3.0
  – Any IA-32 or Itanium® system running a recent Linux version
  – Some kernel and GLIBC dependencies
  – Full Eclipse-based GUI only for IA-32 today, due to Eclipse issues with 64-bit
  – Simple GUIs for IA-64 available
  – Command-line version for Itanium® & EM64T, but graphical viewers for the results
  – Eclipse-based release for 64-bit systems later in 2005
• Remote Data Collection
  – Allows the full Windows GUI to be used for Linux too
VTune: Features
• Sampling of Execution Addresses
  – Profiling based on processor event counters
• Call Graph Profiling: instrumented analysis
  – Call tree, number of calls, timing information
  – Executing instrumented code
• Tracking of System Performance Counters
  – Performance Monitor (perfmon) style counters
  – Extended Performance DLL APIs, SDK available!
• Intel® Tuning Assistant: interprets the results (Windows or Remote Data Collection only)
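Event-based sampling can be pictured as follows: a counter is incremented for every occurrence of the chosen event, and each time it reaches a fixed interval the code location executing at that moment is recorded. The sketch below is a toy model of that idea, not VTune's implementation. The trace, the function names and the interval of 100 are invented for illustration.

```python
from collections import Counter
from itertools import chain, repeat


def sample_trace(trace, interval):
    """Record which function is active at every `interval`-th event."""
    samples = Counter()
    for count, function in enumerate(trace, start=1):
        if count % interval == 0:  # the event counter "overflows": take a sample
            samples[function] += 1
    return samples


# Hypothetical trace: one entry per event occurrence, naming the function
# that was executing when the event happened.
trace = list(chain(repeat("solve", 700), repeat("assemble", 250), repeat("io", 50)))

samples = sample_trace(trace, interval=100)
for function, hits in samples.most_common():
    print(f"{function:10s} {hits} samples")
```

Functions that account for most of the events collect most of the samples, which is why the sample histogram can be read as a hotspot ranking.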
VTune: Example 1
General status commands
    vtl query -lc
– Lists all collectors (sampling and callgraph for 2.0)
    vtl -help -c sampling
– Lists all events available for EBS (event-based sampling)
Compile the code with -g.
Create and run a sampling activity
    setenv OMP_NUM_THREADS 4
    vtl activity -c sampling -app dplace,"-c 4-7 ./untrim_elbe" run
– Creates and runs a single sampling collector activity with the application ‘dplace -x 2 -c 4-7 ./untrim_elbe’ and default settings (Instructions Retired and Cycles)
Invoke the viewer to display the collected data
    vtl view -gui
– Displays the last activity by default
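The sampling run above can also be scripted. The following is a minimal sketch that only assembles the command line from this slide and prints it. The helper name `sampling_command` is hypothetical, and no vtl options beyond those shown above are used.

```python
import os
import shlex


def sampling_command(app_args, launcher="dplace"):
    """Return the argv list for the sampling activity shown on the slide."""
    # In the shell form  -app dplace,"-c 4-7 ./untrim_elbe"  the launcher and
    # its quoted arguments are joined into one single argument.
    return ["vtl", "activity", "-c", "sampling",
            "-app", f"{launcher},{app_args}", "run"]


# Environment for the run: same effect as `setenv OMP_NUM_THREADS 4`.
env = dict(os.environ, OMP_NUM_THREADS="4")

cmd = sampling_command("-c 4-7 ./untrim_elbe")
print(shlex.join(cmd))

# To actually start the activity on a machine with VTune installed:
#   import subprocess
#   subprocess.run(cmd, env=env, check=True)
```

Passing the command as a list avoids a second round of shell quoting for the nested dplace arguments.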
VTune: Process Display
VTune: Module Display
VTune: Hotspots
VTune: Source Code Display
VTune: Multiple Performance Counter Events
• VTune lets you sample customized subsets of performance counter events.
• Important for stall cycle analysis and DEAR analysis (DEAR = Data Event Address Registers)
    vtl activity -d 600 -c sampling \
      -o "-cpu_mask 8-15 -ec en='L3_READS-ALL-MISS', \
      en='LOADS_RETIRED',en='STORES_RETIRED', \
      en='FP_OPS_RETIRED'" \
      -app dplace,"-x2 -c8-15 ./untrim_elbe" run
– The example collects all loads, stores, floating-point operations, and L3 misses due to reads.
– The application will be executed #-of-events times.
– The viewer lets you sort hot spots by each individual event.
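Once several events have been collected, the per-event hotspot rankings and simple derived ratios can be worked out as in the sketch below. It is only an illustration: the function names and all counts are invented, and "L3 read misses per 1000 loads" is one possible derived metric chosen here, not something prescribed by VTune.

```python
# Hypothetical per-function event counts (invented numbers, not measured data).
counts = {
    "solve":    {"L3_READS-ALL-MISS": 1200, "LOADS_RETIRED": 90000,
                 "STORES_RETIRED": 30000, "FP_OPS_RETIRED": 150000},
    "assemble": {"L3_READS-ALL-MISS": 4000, "LOADS_RETIRED": 50000,
                 "STORES_RETIRED": 20000, "FP_OPS_RETIRED": 20000},
    "io":       {"L3_READS-ALL-MISS": 100, "LOADS_RETIRED": 5000,
                 "STORES_RETIRED": 8000, "FP_OPS_RETIRED": 0},
}


def rank_by_event(counts, event):
    """Return function names sorted by one event, hottest first."""
    return sorted(counts, key=lambda fn: counts[fn][event], reverse=True)


def l3_misses_per_1000_loads(counts, fn):
    """Derived ratio: L3 read misses relative to retired loads."""
    row = counts[fn]
    return 1000.0 * row["L3_READS-ALL-MISS"] / row["LOADS_RETIRED"]


for event in ("L3_READS-ALL-MISS", "FP_OPS_RETIRED"):
    print(event, "->", rank_by_event(counts, event))

for fn in counts:
    print(f"{fn:10s} {l3_misses_per_1000_loads(counts, fn):6.1f} L3 read misses per 1000 loads")
```

In this made-up data the function with the most floating-point work is not the one with the most L3 read misses, which is the kind of difference a per-event sort makes visible.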
VTune: Module Display, Multiple Events
VTune: Hotspots, Multiple Events
VTune: Hotspots As Charts, Multiple Events
VTune: Source of L3_MISS Hotspot
VTune: Example 3, Callgraph
• Shows graphically the caller-callee relationship
• Highlights the hot path of an application
    vtl activity -d 600 -c callgraph -app ../src/adi \
      -moi ../src/adi run
– It is important to specify the path to the application and the module of interest (-moi) unambiguously.
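The idea of a hot path through a caller-callee graph can be sketched as follows. This is a simplified model under stated assumptions, not VTune's algorithm: starting at the root, it repeatedly steps into the callee with the largest inclusive time. The graph, the function names and the timings are all invented.

```python
def hot_path(callees, inclusive_time, root):
    """Follow the most time-consuming callee from `root` down to a leaf."""
    path = [root]
    visited = {root}
    node = root
    while True:
        # Skip already visited functions so recursion cannot loop forever.
        candidates = [c for c in callees.get(node, []) if c not in visited]
        if not candidates:
            return path
        node = max(candidates, key=lambda c: inclusive_time[c])
        path.append(node)
        visited.add(node)


# Hypothetical call graph: caller -> list of callees.
callees = {
    "main": ["init", "adi_sweep", "output"],
    "adi_sweep": ["tridiag_x", "tridiag_y"],
}

# Hypothetical inclusive times in seconds (function plus everything it calls).
inclusive_time = {
    "main": 100.0, "init": 5.0, "adi_sweep": 90.0, "output": 5.0,
    "tridiag_x": 30.0, "tridiag_y": 55.0,
}

print(" -> ".join(hot_path(callees, inclusive_time, "main")))
```

Picking the single heaviest callee at each level is a greedy choice. It is enough to illustrate what a highlighted hot path in a call graph view conveys.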
VTune: Callgraph + Hot Path
VTune: Summary
• VTune has its benefits for
  – Hot path detection within a caller-callee relationship
  – Collecting and displaying multiple performance counter events for stall cycle analysis or DEAR analysis