A New Parallel Debugger for Franklin: DDT

Katie Antypas
User Services Group
kantypas@lbl.gov

NERSC User Group Meeting
September 17, 2007


Outline

• Parallel debugger usage at NERSC
• Comparison of Totalview and Allinea DDT
• Selecting a parallel debugger for NERSC: Allinea DDT
  – Functionality
  – License model and price
• Current status
  – Acceptance testing
  – User availability


Motivation

Because parallel debuggers are valuable yet expensive tools for HPC centers, survey actual debugger usage at NERSC on Seaborg and Bassi to see whether resources can be better optimized.


Totalview Usage on Seaborg and Bassi

[Chart: number of times users have run Totalview on Seaborg in the past year. Annotation: 27 users ran Totalview fewer than 5 times.]

[Chart: number of times users have run Totalview on Bassi in the past 18 months. Annotation: 23 users ran Totalview between 10 and 25 times.]

Number of Totalview runs: 507; number of unique users: 47; average usage/user: 10.8

Number of Totalview runs: 1268; number of unique users: 81; average usage/user: 15.7
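The average usage per user above is the number of Totalview runs divided by the number of unique users. A minimal Python check of that arithmetic follows, using only the figures listed on the slide. The labels and the function name are illustrative, and because the extracted text does not say which set of figures belongs to which machine, the pairs are kept in the order listed.

```python
# Usage figures as listed on the slide: (Totalview runs, unique users).
# The text does not attribute each pair to a machine, so the labels are generic.
usage = {
    "first listing": (507, 47),
    "second listing": (1268, 81),
}


def average_usage(runs: int, users: int) -> float:
    """Average number of Totalview runs per unique user, to one decimal place."""
    return round(runs / users, 1)


averages = {name: average_usage(runs, users) for name, (runs, users) in usage.items()}
for name, avg in averages.items():
    print(f"{name}: {avg} runs per user")
```

Both results agree with the averages quoted on the slide (10.8 and 15.7).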


Totalview usage

• Very roughly 15-20% of active users have run Totalview
• Functionality requested is basic
  – Find cause for crashes and code hangs
  – Examine variables across processors
  – Users typically aren’t using Totalview for analysis
• Users are running at lower concurrencies than we expected
  – Many users debug codes locally and run in production mode at NERSC
  – In many codes an error at 512 processors can be detected at 32 processors
  – Totalview runs interactively and users must wait a longer time for more nodes
  – Debuggers can run slowly at 256+ processors
• Rarely were all licenses checked out


Another Debugger in the Market: Allinea Software’s DDT

• DDT (Distributed Debugging Tool)
  – Some HPC customers
    • Lawrence Livermore National Lab (LLNL)
    • Texas Advanced Computing Center (TACC)
    • Barcelona Supercomputing Center (BSC)
    • Leibniz Computing Center (LRZ)
    • HPC Center Stuttgart (HLRS)
    • CEA, IPGP, ONERA - France
    • CINECA, CASPUR - Italy
    • AWE, RAL - UK
• Tested DDT on NERSC platforms in spring 2007
  – Low learning curve for Totalview users
  – Basic debugging functionality worked as expected
  – Found some bugs, all on AIX
  – Responsive developers
  – Viable alternative to Totalview
• Created an RFP to get the best response from vendors


Weighing the Debuggers ...

Totalview

• Established company and technology with large market share
• Totalview debugger ported to most platforms and tested on many codes
• Full-featured parallel debugger with advanced features such as debugging with multiple executables, GAS languages, sophisticated analysis tools
• Inflexible license server model
• Expensive

Allinea DDT

• Younger company, established market in Europe but smaller American presence
• Basic parallel debugging functionality
• Linux is the strongest supported operating system (increasing support for AIX)
• Responsive developers
• Flexible license model
• Lower price


DDT Licensing Model and Price

• Flexible model
  – 1024 processors
  – Can be divided any way
    • One 1024 processor job
    • Two 512 processor jobs
    • One 512, one 256, four 64 processor jobs
• Significantly cheaper than Totalview
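The flexible model amounts to a shared pool of 1024 processors that concurrent debugging jobs draw from. The Python sketch below only illustrates that arithmetic. It does not describe how DDT's license manager is implemented, and the names `LICENSE_PROCESSORS` and `fits_license` are hypothetical.

```python
LICENSE_PROCESSORS = 1024  # pool size stated on the slide


def fits_license(job_sizes, pool=LICENSE_PROCESSORS):
    """Return True if the concurrent debug jobs fit within the shared processor pool."""
    if any(size <= 0 for size in job_sizes):
        raise ValueError("job sizes must be positive")
    return sum(job_sizes) <= pool


# The three splits from the slide, plus one hypothetical split that exceeds the pool.
examples = {
    "one 1024": [1024],
    "two 512": [512, 512],
    "one 512, one 256, four 64": [512, 256, 64, 64, 64, 64],
    "three 512 (hypothetical, too large)": [512, 512, 512],
}
for label, jobs in examples.items():
    print(f"{label}: {sum(jobs)} processors, fits = {fits_license(jobs)}")
```

Each split listed on the slide sums to exactly 1024 processors, so all three use the full pool.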


DDT Functionality

• Parallel debugger
  – Support for MPI, OpenMP, pthreads
  – Fortran, C, C++
• Typical serial debugging features
  – Set breakpoints and watches, step through program, dive into arrays, evaluate expressions, analyze core files
• Parallel debugging features
  – Step through processors
  – View variables across processors
  – Grouping processors
  – Parallel Stack View
• Other features
  – Memory debugging
  – Visualization tools


User Interface


Parallel Stack View

• Allows the user to see the position of each processor in the code in the same window
• Essentially groups processors by location in code, the only reasonable strategy at high concurrencies
• Can easily find a stray processor
• Can create sub-groups of processors

[Figure: Parallel Stack View screenshot]
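The grouping idea behind a parallel stack view can be sketched in a few lines of Python. This is an illustration of the concept only, not DDT's implementation. The function names, the call stacks, and the rank numbers are all hypothetical.

```python
from collections import defaultdict


def group_by_location(stacks):
    """Map each distinct call stack to the sorted list of ranks currently in it."""
    groups = defaultdict(list)
    for rank, stack in stacks.items():
        groups[tuple(stack)].append(rank)
    return {stack: sorted(ranks) for stack, ranks in groups.items()}


def stray_ranks(groups):
    """Ranks that are alone at their code location: candidates for a hang or divergence."""
    return sorted(r for ranks in groups.values() if len(ranks) == 1 for r in ranks)


# Hypothetical snapshot: 8 ranks, 7 waiting in a barrier, rank 5 still computing.
stacks = {rank: ["main", "solve", "MPI_Barrier"] for rank in range(8)}
stacks[5] = ["main", "solve", "compute_flux"]

groups = group_by_location(stacks)
for stack, ranks in groups.items():
    print(f"{len(ranks):3d} ranks at {' > '.join(stack)}: {ranks}")
print("stray:", stray_ranks(groups))
```

Grouping by location keeps the display size proportional to the number of distinct code positions rather than the number of processors, which is why it scales to high concurrencies.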


Current Status

• Acceptance testing DDT on Franklin
  – Running 5-6 codes with DDT at various concurrencies
  – Testing MPI, OpenMP, Fortran, C, C++, mixed-mode applications
• Demo on Thursday
• Available for users to try
• Please let us know if you have any problems
• Excited to have DDT on Franklin and think it is good for the HPC community to have options in parallel debugging


Questions?