Projections – A Performance Tool for Charm++ Applications Chee Wai Lee [email protected] Parallel Programming Laboratory Dept. of Computer Science University of Illinois at Urbana Champaign http://charm.cs.uiuc.edu


Page 1: Projections – A Performance Tool for Charm++ Applications · charm.cs.uiuc.edu/workshops/charmWorkshop2005/... · Dept. of Computer Science University of Illinois at Urbana Champaign ...

Projections – A Performance Tool for Charm++ Applications

Chee Wai Lee [email protected]

Parallel Programming Laboratory

Dept. of Computer Science

University of Illinois at Urbana Champaign

http://charm.cs.uiuc.edu


Outline

• General Introduction to Projections

• Projections Basics

• Advanced Features

• Features to aid Effective Analysis

• Extremely Large Datasets

• Tips and Notes


10/19/05 Projections Tutorial 3

Projections

• Projections is a performance tool designed for use with Charm++/AMPI.

• Trace-based, post-mortem analysis.

• Supports highly detailed traces, summary formats and a flexible user-level API.

• Java-based visualization tool for presenting performance information.


Charm++ Model

• Object Oriented. Chares (objects) encapsulate data, standard C++ methods and entry methods.

• Message Driven. Entry methods represent work units activated by an incoming message.

• Only one entry method may execute at any time in a Chare.

• A runtime schedules incoming messages.


What you will need

• A version of Charm++ built without the CMK_OPTIMIZE flag.

– User-built: download our release or check out a copy from anonymous CVS.

– Pre-built: check with your machine's sysadmin.

• Java Runtime 1.3.1 or higher.

• Projections visualization binary (projections.jar)

– User-built Charm++: located in charm/tools/projections/bin

– Pre-built: check with your sysadmin or acquire the binary separately.


Outline: Basics

• General Introduction to Projections

• Projections Basics

– Instrumentation

– Trace generation

– Visualization

• Advanced Features

• Features to aid Effective Analysis

• Extremely Large Datasets


Trace Generation: Basics

– Automatic trace instrumentation: no user code is required by default.

– Any Charm++ version built without the CMK_OPTIMIZE flag supports tracing.

– All Charm++ entry methods and messaging events are traced.


Trace Generation: Steps

• Link your application with the link-time options “-tracemode projections -tracemode summary”.

• Run your application normally.

• At the end of the run, you will see “.log”, “.sum” and “.sts” files generated in the same directory as your application binary.


Visualization: Basic Steps

• Run the script found at charm/tools/projections/bin as such:

– projections [<application>.sts]

• Or run the Java binary projections.jar from the command line, passing in the optional <application>.sts filename as an argument.


Visualization: Main Window

Visualization: Overview

Visualization: Usage Profile

Visualization: Time Profile

Visualization: Timeline

Visualization: Task Histogram

Visualization: Communication

Visualization: Tabulated Call Info

Projections Basic demo.

Outline: Advanced Features

• General Introduction to Projections

• Projections Basics

• Advanced Features

– Partial Tracing

– Tracing User Events

– Tracing AMPI Functions

• Features to aid Effective Analysis

• Extremely Large Datasets

• Tips and Notes


Partial Trace Generation

• The following API calls are provided by the tracing framework:

– void traceBegin()

– void traceEnd()

– The above calls turn tracing on/off for the processor on which the call was made.

– int traceIsOn() queries the tracing framework status.

• +traceoff runtime option

– Causes tracing (over all processors) to be turned off when the application is started.


Partial Tracing “watch-it”s

• traceBegin() and traceEnd() calls apply only to the processor on which they are invoked.

– Offers flexibility but is vulnerable to programmer error.

– Typically used in a collective manner (e.g. NAMD does this at specific load balancing operations).

• Partial trace calls are invoked in the context of an entry method. One should be prepared to drop initial performance data just after turning tracing on and just before turning tracing off.

• Appropriate use of the +traceoff runtime option is essential.


Partial Tracing Example

// In the case when trace is off at the beginning,
// only turn trace on from after the first LB to the first LdbStep after
// the second LB.
// 1 2 3 4 5 6 7
// off on Alg7 refine refine ... on
#if CHARM_VERSION >= 050606
if (traceAvailable()) {
  // Static so that the decision persists across calls.
  static int specialTracing = 0;
  // Tracing still off at load balancing cycle 1: the run started with tracing off.
  if (ldbCycleNum == 1 && traceIsOn() == 0) specialTracing = 1;
  if (specialTracing) {
    if (ldbCycleNum == 4) traceBegin();  // turn tracing on at cycle 4
    if (ldbCycleNum == 6) traceEnd();    // turn tracing off at cycle 6
  }
}
#endif


Tracing User Events

• The following APIs are provided for user event registration and tracing:

– int traceRegisterUserEvent(char *eventDesc, int EventNum=-1)

• Acquire or specify an event ID to be associated with the event name.

– void traceUserEvent(int eventNum)

• Use a valid event ID to record an event.

– void traceUserBracketEvent(int eventNum, double startTime, double endTime)

• Use a valid event ID to record an event interval.
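A minimal sketch of how the three calls above might be combined. To keep it self-contained, the functions below are stand-in stubs that only mirror the signatures listed above (with `const char *` instead of `char *` so that string literals compile); in a real application they are provided by the Charm++ tracing framework. The event name, the stub's rule for handing out IDs, and the timestamps are all illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in stubs (NOT the Charm++ implementation): they mirror the listed
// signatures so this sketch compiles and runs without Charm++.
struct RecordedEvent {
  int id;
  double start;
  double end;
};

static std::vector<std::string> g_eventNames;  // event ID -> description
static std::vector<RecordedEvent> g_recorded;  // events "traced" so far

int traceRegisterUserEvent(const char *eventDesc, int eventNum = -1) {
  // Assumption: with the default -1, the stub hands out the next free ID.
  if (eventNum < 0) eventNum = static_cast<int>(g_eventNames.size());
  const std::size_t slot = static_cast<std::size_t>(eventNum);
  if (slot >= g_eventNames.size()) g_eventNames.resize(slot + 1);
  g_eventNames[slot] = eventDesc;
  return eventNum;
}

void traceUserEvent(int eventNum) {
  g_recorded.push_back({eventNum, 0.0, 0.0});  // point event, no interval
}

void traceUserBracketEvent(int eventNum, double startTime, double endTime) {
  assert(startTime <= endTime);
  g_recorded.push_back({eventNum, startTime, endTime});
}

// Register once, then record a point event and a bracketed interval.
std::size_t recordExampleEvents() {
  const int forceEvent =
      traceRegisterUserEvent("force computation");  // hypothetical event name
  traceUserEvent(forceEvent);

  const double t0 = 1.0;  // hypothetical timestamps; a real application
  const double t1 = 1.5;  // would read a timer before and after the work
  traceUserBracketEvent(forceEvent, t0, t1);
  return g_recorded.size();
}
```

Registration comes first in the sketch on purpose: the FAQ at the end of this tutorial points to unregistered events as a cause of crashes in the visualization.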


AMPI Function Tracing API

• Works like User Events in Charm++ applications.

• Why? The MPI function abstraction is invisible to the Charm++ runtime.

• REGISTER_FUNCTION(<namestring>) to register <namestring> as a string-id to be traced.

• TRACEFUNC(<funcall>,<namestring>) to record a call to function <funcall> to be associated with the event registered as <namestring>.


Advanced Features Demo


Outline: Effective Analysis

• General Introduction to Projections

• Projections Basics

• Advanced Features

• Features to aid Effective Analysis

– How Tracing Works

– Memory Footprint Control

– Data Volume Control

– Visualization Controls

• Extremely Large Datasets

• Tips and Notes


Log Event Tracing

• Each Charm++ event and any registered user events are recorded into a pre-assigned memory buffer on each processor.

• Default buffer size is 10,000 trace log entries.

• When a buffer is full, a special flush event is logged and the buffer is flushed to disk.

• Flushing is done independently of other processors. There is no synchronization on a flush.
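The buffering behaviour described above can be sketched as a toy model. This is not the Charm++ implementation: the class name, the string-based entries, and the exact position of the flush marker in the output are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of one processor's log buffer; not the Charm++ implementation.
class LogBufferModel {
 public:
  explicit LogBufferModel(std::size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
    buffer_.reserve(capacity);
  }

  // Record one event; flush as soon as the buffer becomes full.
  void record(const std::string &event) {
    buffer_.push_back(event);
    if (buffer_.size() == capacity_) flush();
  }

  std::size_t flushCount() const { return flushCount_; }
  std::size_t buffered() const { return buffer_.size(); }
  const std::vector<std::string> &disk() const { return disk_; }

 private:
  void flush() {
    // Write the buffered entries out, then note that a flush happened.
    // Assumption: where the marker lands in the log is illustrative only.
    disk_.insert(disk_.end(), buffer_.begin(), buffer_.end());
    disk_.push_back("FLUSH");
    buffer_.clear();
    ++flushCount_;
  }

  std::size_t capacity_;
  std::size_t flushCount_ = 0;
  std::vector<std::string> buffer_;  // in-memory entries
  std::vector<std::string> disk_;    // stands in for the on-disk log file
};

// Feed `events` synthetic entries through a buffer holding `capacity` entries.
LogBufferModel simulateLog(std::size_t capacity, std::size_t events) {
  LogBufferModel log(capacity);
  for (std::size_t i = 0; i < events; ++i) {
    log.record("event " + std::to_string(i));
  }
  return log;
}
```

In this model each processor owns its own `LogBufferModel`, so one processor flushing never waits on another, which mirrors the lack of synchronization noted above.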


Summary Tracing

• Invoked by the same event set as in Log tracing. User events are, however, ignored.

• The memory buffer is organized into k bins, each initially representing 1 ms of time. Each event contributes data into the appropriate time bin.

• When the buffer is filled, the time represented by each bin is doubled and the data is packed into the first k/2 bins. Event contribution continues at the (k/2+1)th bin.
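The compaction step can be sketched as follows. This is a simplified model, not the Charm++ code: the class and method names are hypothetical, k is assumed to be even, and each event is reduced to a single (time, amount) contribution.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified model of summary-mode time bins; names are hypothetical.
class SummaryBinsModel {
 public:
  SummaryBinsModel(std::size_t k, double binWidthSeconds)
      : bins_(k, 0.0), binWidth_(binWidthSeconds) {
    assert(k >= 2 && k % 2 == 0);
    assert(binWidthSeconds > 0.0);
  }

  // Add `amount` to the bin covering `timeSeconds`, compacting first if the
  // time falls beyond the last bin.
  void add(double timeSeconds, double amount) {
    assert(timeSeconds >= 0.0);
    while (indexOf(timeSeconds) >= bins_.size()) compact();
    bins_[indexOf(timeSeconds)] += amount;
  }

  double bin(std::size_t i) const { return bins_.at(i); }
  double binWidth() const { return binWidth_; }

 private:
  std::size_t indexOf(double timeSeconds) const {
    return static_cast<std::size_t>(timeSeconds / binWidth_);
  }

  // Merge neighbouring bins pairwise into the first k/2 bins, clear the
  // second half, and double the time each bin represents.
  void compact() {
    const std::size_t half = bins_.size() / 2;
    for (std::size_t i = 0; i < half; ++i) {
      bins_[i] = bins_[2 * i] + bins_[2 * i + 1];
    }
    std::fill(bins_.begin() + static_cast<std::ptrdiff_t>(half), bins_.end(), 0.0);
    binWidth_ *= 2.0;
  }

  std::vector<double> bins_;
  double binWidth_;  // seconds represented by one bin
};
```

With 4 bins of 1 s holding 1, 2, 3 and 4, a contribution at 4.5 s forces one compaction: the bins become 3, 7, 0, 0 at 2 s per bin, and the new contribution lands in the third bin, that is, the (k/2+1)th.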


Memory Footprint Control

• Tracing Memory Footprint

– Event Log tracemode (-tracemode projections)

• Default 10,000 event entries.

• Controlled by runtime flag “+logsize <size>”.

– Summary tracemode (-tracemode summary)

• Default 10,000 time bins of 1 ms.

• Controlled by runtime flags “+bincount <#bins>” and “+binsize <seconds>”.


Memory Footprint Control (2)

• Trade-offs to consider

– Log Buffers

• Flush overhead vs Memory usage

• Frequency of flushes vs Size of flushes

– Summary Buffers

• Frequency of compaction + Data Granularity vs Memory usage


Controlling Data Volume

• [Code-time] User API for partial tracing.

• [Link-time] Generating only summary data.

• [Run-time] Writing compressed output (runtime flag): +gz-trace

• [Post-run] Deleting a subset of the generated logs.

• [Visualization] Parameter range control, analysis “Memory”.


Visualization Parameter Control


Specific Visualization Tool Issues

• Memory usage constraints

– Timeline: dependent on the event density of the selected time range of the application. Typical workable range is 10 ms to 10 s for 10 to 20 processors.

– Processor-based tools (Overview, Communications, Usage Profile): limit to 2000 processors or fewer.

– Interval-based tools (Graph, Time Profile, Communication vs Time, Animation): limit to 1500 time intervals.

– Histogram tools: limit to 1000 bins or fewer.
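As a worked example of the interval limit above, the helper below computes the smallest interval size that keeps an interval-based view within a given number of intervals. The function name and the choice of microseconds as the unit are assumptions for illustration; only the figure of 1500 intervals comes from the guidance above.

```cpp
#include <cassert>
#include <cstdint>

// Smallest interval size (in microseconds) such that the number of intervals
// needed to cover `rangeUs` does not exceed `maxIntervals`.
std::int64_t minIntervalSizeUs(std::int64_t rangeUs, std::int64_t maxIntervals) {
  assert(rangeUs > 0 && maxIntervals > 0);
  return (rangeUs + maxIntervals - 1) / maxIntervals;  // ceiling division
}
```

For a 60 s range (60,000,000 microseconds) and a 1500-interval budget, `minIntervalSizeUs(60000000, 1500)` gives 40,000 microseconds, so intervals of 40 ms or larger stay within the suggested limit.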


Analysis Techniques

• Zoom in from wide-ranged low-detail views to a detailed look at problem spots.

• Make use of range histories.

• Control data volume.

• Use effective colors.

• Make effective use of specific tool features (see manual).


Analysis Techniques (2)

• Load Imbalance: Overview, Usage Profile.

• Where's my work going?: Time Profile.

• How is my communication behaving?: Communication, Communication vs Time.

• Are there critical paths? I need details!: Timeline.


Putting it all together


Outline: Extremely Large Datasets

• General Introduction to Projections

• Projections Basics

• How Tracing Works

• Features to aid Effective Analysis

• Extremely Large Datasets

• Tips and Notes


Considerations for Extremely Large Datasets

• Large number of files – beware of the filesystem.

• Huge amounts of data – control!


Demo – NAMD on 8192 processors


Outline: Miscellany

• General Introduction to Projections

• Projections Basics

• How Tracing Works

• Features to aid Effective Analysis

• Extremely Large Datasets

• Tips and Notes


Conveniently Placing Logs

• Specifying a user-defined output location (runtime option):

+traceroot <desired log directory>

– It is important to note that <desired log directory> must be available on a machine's compute nodes.


Perturbation Issues

• Perturbation of application.

– Tracing overhead

• Timer overheads (rdtsc, machine wallclock).

• Acquisition of performance data and storage.

– Observed in the case of NAMD with timesteps below 10 ms with many compute objects in the microsecond range.
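To get a rough feel for the timer component of this overhead, one can time a run of back-to-back clock reads. The sketch below uses the standard `std::chrono::steady_clock` rather than rdtsc or the timers Charm++ itself uses, so it only illustrates the measurement idea; the function name is an assumption.

```cpp
#include <cassert>
#include <chrono>

// Average cost, in nanoseconds, of one steady_clock::now() call, estimated
// from `calls` back-to-back reads. This measures the standard library clock,
// not the timer used by the Charm++ tracing framework.
double averageTimerCostNs(int calls) {
  assert(calls > 0);
  using Clock = std::chrono::steady_clock;
  const Clock::time_point begin = Clock::now();
  Clock::time_point last = begin;
  for (int i = 0; i < calls; ++i) {
    last = Clock::now();
  }
  const std::chrono::duration<double, std::nano> elapsed = last - begin;
  return elapsed.count() / calls;
}
```

Comparing the per-read cost against the duration of the shortest traced events gives a first indication of whether tracing is likely to perturb them; events in the microsecond range are the ones most at risk.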


Look Closely!

• Do not be fooled! Projections does not handle some visualization artifacts well:

– Fine-grained details can sometimes look like one big solid block on the timeline.

– It is hard to mouse-over items that represent fine-grained events.

– Other times, tiny slivers of activity become too small to be drawn.


Answering my questions (again!)

• Load Imbalance: Overview, Usage Profile.

• Where's my work going?: Time Profile.

• How is my communication behaving?: Communication, Communication vs Time.

• Are there critical paths? I need details!: Timeline.


Frequently Asked Questions

Q: I tried user events and projections visualization crashes!

A: Did you register the events before using them?

Q: There are giant stretched event(s) in my run!

A: Did you set a large enough log buffer size?

Q: Projections visualization crashes!

A: Are your logs corrupted? This can happen with bad I/O (experienced on NFS).