Profiling Tools Introduction to Computer System, Fall 2015. (PPI, FDU) Vtune & GProfile.

Profiling Tools

Introduction to Computer System, Fall 2015. (PPI, FDU)

Vtune & GProfile

Profiling

• In software engineering, profiling ("program profiling", "software profiling") is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. Most commonly, profiling information serves to aid program optimization.

• Profiling is achieved by instrumenting either the program source code or its binary executable form using a tool called a profiler (or code profiler). Profilers may use a number of different techniques, such as event-based, statistical, instrumented, and simulation methods.
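As a rough illustration of source-level instrumentation, the sketch below wraps a workload in timing calls by hand. The names `sum_squares` and `timed_sum_squares` are hypothetical and exist only for this example; a real profiler inserts comparable probes automatically and aggregates the results.

```c
#include <time.h>

/* Hypothetical workload: sum of i*i for i in [0, n). */
unsigned long long sum_squares(unsigned long long n) {
    unsigned long long total = 0;
    for (unsigned long long i = 0; i < n; i++) {
        total += i * i;
    }
    return total;
}

/* Manual instrumentation: read the CPU clock before and after the call.
 * Returns the elapsed CPU time in seconds and stores the workload's
 * result in *result. */
double timed_sum_squares(unsigned long long n, unsigned long long *result) {
    clock_t start = clock();
    *result = sum_squares(n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `timed_sum_squares` from `main` and printing the returned value gives a crude per-call measurement; the resolution of `clock()` is coarse, so short workloads may report zero.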

Performance Tuning for Intel® Xeon Phi™ Coprocessors

Visualizing Performance Opportunities using Intel® VTune™ Amplifier

Copyright © 2014, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.

Introduction

Can profile host, offload or native coprocessor applications

Host-based profiling may be sufficient to identify vectorization/parallelism/offload candidates

Call stacks currently available for host only

Start with representative/reasonable workloads!

Use Intel® VTune™ Amplifier XE to gather hot spot data

Tells what functions account for most of the run time

Often, this is enough

But it does not tell you much about program structure

Move on to more detailed analyses

Intel® VTune™ Amplifier XE
Tune Applications for Scalable Multicore Performance

Fast, Accurate Performance Profiles
• Hotspot (Statistical call tree)
• Hardware-Event Based Sampling

Thread Profiling
• Visualize thread interactions on timeline
• Balance workloads

Easy set-up
• Pre-defined performance profiles
• Use a normal production build

Compatible
• Microsoft*, GCC*, Intel compilers
• C/C++, Fortran, Assembly, .NET*
• Latest Intel processors and compatible processors (1)

Find Answers Fast
• Filter out extraneous data
• View results tied to source/assembly lines
• Event multiplexing

Windows* or Linux*
• Visual Studio* Integration (Windows)
• Standalone user interface and command line
• 32 and 64-bit

(1) IA-32 and Intel® 64 architectures. Many features work with compatible processors. Event based sampling requires a genuine Intel Processor.

A Quick Tour Through Intel® VTune™ Amplifier

Setting up a project

Execution file, command line arguments, working directory

Search directories (standard binary libraries for Intel MPSS 3)

Quick tour of advanced setup dialog

Selecting a collector

Host versus native event collection

Launching a collection

Viewing results, source and assembly

VTune™ Amplifier XE visualizes performance

VTune™ Amplifier XE visualizes performance

[Screenshot: toolbar, with callouts for Instructions, Navigator, New, New, Compare, Open Result, Open Project, and Properties]

VTune™ Amplifier XE visualizes performance

Grid Pane

VTune™ Amplifier XE visualizes performance

Grid Pane

Grouping pull-down

VTune™ Amplifier XE visualizes performance

Copyright © 2014, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. 6/29/2014

Source View / Per line localization

VTune™ Amplifier XE visualizes performance

Source View /

View / Hot spot navigation controls

Can also copy small data files onto the card, but they will need to be recopied after reboot.

Suggest creating /tmp/usrname as the working directory

VTune™ Amplifier XE visualizes performance

Assembly View /

View / Hot spot navigation controls

For event collection the coprocessor is treated as a special HW architecture

General Exploration runs a set of events to drive top-down analysis

VTune™ Amplifier

Advantages

• Both command line and GUI; easy to use

• Multiple predefined analysis suites

• Supports hardware events, e.g. for cache and memory access analysis

• Multithreaded profiling is well supported

VTune™ Amplifier

Limitations

• Expensive for enterprise use.

• Hardware event-based sampling requires a genuine Intel processor; many other features also work on compatible processors.

GPROF

• Gprof is a performance analysis tool for Unix applications. It uses a hybrid of instrumentation and sampling and was created as an extended version of the older "prof" tool. Unlike prof, gprof is capable of limited call graph collection and printing.

Usage

• Instrumentation code is automatically inserted into the program during compilation (for example, with the '-pg' option of the gcc compiler) to gather caller-function data. A call to the monitor function 'mcount' is inserted at the entry of each profiled function.

• gcc -Wall -g -pg -lc_p example.c -o example
• ./example will create gmon.out
• gprof -b example gmon.out
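A small target program makes the workflow easier to try out. The sketch below is one possible body for `example.c`; the function names are hypothetical and were chosen so that one function is clearly more expensive than the other and both share a single caller.

```c
#include <stddef.h>

/* Deliberately expensive: O(n^2) count of pairs that add up to target. */
unsigned long count_pairs_slow(const int *v, size_t n, int target) {
    unsigned long hits = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (v[i] + v[j] == target) {
                hits++;
            }
        }
    }
    return hits;
}

/* Cheap by comparison: a single linear pass. */
long sum_linear(const int *v, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++) {
        s += v[i];
    }
    return s;
}

/* Common parent, so the call graph shows one caller with two children. */
unsigned long run_workload(const int *v, size_t n, int target, long *sum_out) {
    *sum_out = sum_linear(v, n);
    return count_pairs_slow(v, n, target);
}
```

With a `main` that fills a large array and calls `run_workload`, building with `-pg` and running the program should produce `gmon.out`, and the profile would be expected to attribute most of the self time to `count_pairs_slow`.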

Result

• Gprof output consists of two parts: the flat profile and the call graph. The flat profile gives the total execution time spent in each function and its percentage of the total running time. Function call counts are also reported. Output is sorted by percentage, with hot spots at the top of the list.

Result

% time               the percentage of the total running time of the program used by this function.

cumulative seconds   a running sum of the number of seconds accounted for by this function and those listed above it.

self seconds         the number of seconds accounted for by this function alone. This is the major sort for this listing.

calls                the number of times this function was invoked, if this function is profiled, else blank.

self ms/call         the average number of milliseconds spent in this function per call, if this function is profiled, else blank.

total ms/call        the average number of milliseconds spent in this function and its descendents per call, if this function is profiled, else blank.

name                 the name of the function. This is the minor sort for this listing. The index shows the location of the function in the gprof listing. If the index is in parenthesis it shows where it would appear in the gprof listing if it were to be printed.
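The relationship between the columns can be made concrete with a little arithmetic. The sketch below is not gprof code; it assumes a hypothetical `struct flat_row` holding the measured self seconds and call count for each function, already sorted by self seconds in descending order, and derives the percentage, cumulative, and self ms/call columns from the definitions above.

```c
#include <stddef.h>

/* Hypothetical row of a flat profile. self_seconds and calls are inputs;
 * the remaining fields are derived by fill_flat_profile. */
struct flat_row {
    const char *name;
    double self_seconds;
    unsigned long calls;
    double pct_time;
    double cumulative_seconds;
    double self_ms_per_call;
};

/* rows must already be sorted by self_seconds, largest first. */
void fill_flat_profile(struct flat_row *rows, size_t n) {
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += rows[i].self_seconds;
    }

    double running = 0.0;
    for (size_t i = 0; i < n; i++) {
        running += rows[i].self_seconds;
        /* Running sum over this row and all rows listed above it. */
        rows[i].cumulative_seconds = running;
        /* Share of the total running time spent in this function alone. */
        rows[i].pct_time =
            total > 0.0 ? 100.0 * rows[i].self_seconds / total : 0.0;
        /* Average self time per call, in milliseconds. */
        rows[i].self_ms_per_call =
            rows[i].calls ? 1000.0 * rows[i].self_seconds / rows[i].calls : 0.0;
    }
}
```

For two functions with 3.0 s and 1.0 s of self time, the first row gets 75% and a cumulative value of 3.0 s, and the second gets 25% and 4.0 s.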

Advantages

• Supported by GNU (GNU's Not Unix)
• Not tied to specific hardware

Limitations

• Gprof cannot measure time spent in kernel mode (syscalls, waiting for CPU or I/O); only user-space code is profiled.

• Gprof profiles only the main thread of a multi-threaded application.

• Instrumentation code must be inserted at compile time.
• No hardware events.

More

• man gprof
• https://sourceware.org/binutils/docs/gprof/

Open topic

Introduction to Computer System, Fall 2015. (PPI, FDU)

Pwned

Attack: Stack Buffer Overflow

• A Typical Buffer Overflow Attack
– Inject malicious code into the buffer
– Overwrite the return address so that it points into the buffer
– Once the function returns, the malicious code runs

[Figure: stack layout, from higher to lower addresses: return addr, saved ebp (pointed to by ebp), buf]

void function(char *str) {
    char buf[16];
    strcpy(buf, str);  /* no bounds check: a long str overflows buf */
}
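For contrast, a bounded copy keeps the write inside `buf`. This is only a sketch of one common mitigation at the source level, not part of the original slides; `copy_bounded` and `function_checked` are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>

/* Copies str into dst without writing more than dst_size bytes.
 * Returns 0 if the whole string fit, -1 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *str) {
    /* snprintf never writes past dst_size and always NUL-terminates
     * when dst_size > 0; its return value is the length it needed. */
    int needed = snprintf(dst, dst_size, "%s", str);
    return (needed >= 0 && (size_t)needed < dst_size) ? 0 : -1;
}

/* Same shape as function() above, but oversized input is rejected
 * instead of overwriting the saved ebp and return address. */
int function_checked(const char *str) {
    char buf[16];
    return copy_bounded(buf, sizeof buf, str);
}
```

A caller can treat a return value of -1 as an error and refuse the input rather than continue with truncated data.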

Defense: DEP (Data Execution Prevention)

• Execute Code, not Data
• Data areas marked non-executable
– Stack marked non-executable
• Hardware enforced (NX)
• You can load your shellcode in the stack … but you can’t jump to it

How to pwn?

• Present ways of pwning other than buffer overflow.

• Focus on how to divert the program from its normal execution path.

Debugging

How to Debug?

• The program gets wrong results
• Run the program in debug mode
• Execute the code line by line to find the cause

Can this always work well in a multi-threaded program?

If not, why? What’s the difference between sequential bugs and parallel ones?

And how do you debug a tricky multi-threaded program?

Cache

Cache locality

• Cache locality is the key to achieving high levels of performance.

• We can improve cache locality either by optimizing our program or by changing the cache strategy or its implementation.

• You can introduce some methods for improving cache locality from a particular perspective and present how they work.
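One program-side example is loop order over a two-dimensional array. In the sketch below, which uses hypothetical function names and an arbitrary size `N`, both functions compute the same sum, but the first walks memory in the order C stores it (row by row) while the second jumps `N` elements between consecutive accesses.

```c
#include <stddef.h>

enum { N = 256 }; /* arbitrary size chosen for this sketch */

/* Row-major traversal: the inner loop touches consecutive addresses,
 * so each fetched cache line is fully used before moving on. */
long sum_row_major(int m[N][N]) {
    long s = 0;
    for (size_t i = 0; i < N; i++) {
        for (size_t j = 0; j < N; j++) {
            s += m[i][j];
        }
    }
    return s;
}

/* Column-major traversal: consecutive accesses are N ints apart,
 * so spatial locality is poor even though the result is identical. */
long sum_col_major(int m[N][N]) {
    long s = 0;
    for (size_t j = 0; j < N; j++) {
        for (size_t i = 0; i < N; i++) {
            s += m[i][j];
        }
    }
    return s;
}
```

Timing the two versions on a matrix much larger than the cache, or comparing their cache-miss counts in a profiler that supports hardware events, is one way to observe the difference; how large it is depends on the machine and the compiler's optimizations.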

Requirement

• Each student picks one topic and gives a presentation with PPT slides.

• Any techniques or methods are acceptable, as long as you can finish the presentation within 6 min.

• 2015/10/30, 6-7; the classroom will be announced later.

• PPT slides should be emailed to your TA before 2015/10/29 23:59.

How to score high?

• Illustrate your ideas clearly; you may refer to the Internet or give your own solution.
• Remember that time is limited; try to be precise and concise.
• Your presentation consists of three parts: the PPT, your oral delivery, and your content. All of these are important in grading.