Transcript of "TeraGrid: Past, Present, and Future" - Open Grid Forum

Page 1:

TeraGrid: Past, Present, and Future

Phil Andrews, Director, National Institute for Computational Sciences, Oak Ridge, Tennessee

Page 2:


TeraGrid Abstract:

The TeraGrid is the amalgamation of the major U.S. National Science Foundation HPC centers, currently with approximately a dozen Resource Provider sites and a total computational capacity of over 1 Petaflop. This is scheduled to increase to about 2 PF in 2009 and 3 PF in 2010. The TeraGrid itself will experience a major transformation in 2010, with a new organizational model expected to replace the current approach. This reorganization is presently in the proposal stage. We will discuss the history of the TeraGrid, the past changes in the funding and strategic plans, and any available information on future directions, including technology, standards, and international collaboration.

EGEE/OGF Meeting, Catania, March’09

Page 3:


TeraGrid: Beginnings

• 2001: NSF sent a "Dear Colleagues" letter to SDSC and NCSA to provide coordinated systems
• For complex reasons, CACR (Caltech) and Argonne National Laboratory (ANL) were added
• Significant funding was earmarked for the connecting network: initially intended as a mesh, then a more flexible "Grid"
• Named: Distributed TeraScale Facility


Page 4:


Technical Details (2001, DTF)

• Intended to be highly homogeneous (IA-64)
• Every node GbE connected
• 40 Gb/s backbone, 30 Gb/s to each site
• Standardized on Globus middleware
• Initial award: $53M
• Initial capability: 13.6 TF
• High point of standardization for TeraGrid
• Intended use: distributed MPI system!


Page 5:

[Slide: 2001 DTF system and network diagram. Recoverable details below.]

Site resources:
• NCSA: 6+2 TF, 4 TB memory, 240 TB disk
• SDSC: 4.1 TF, 2 TB memory, 225 TB SAN
• Caltech: 0.5 TF, 0.4 TB memory, 86 TB disk
• ANL: 1 TF, 0.25 TB memory, 25 TB disk
• TeraGrid total: 13.6 TF, 6.8 TB memory, 79 TB internal disk, 576 TB network disk

Diagram elements: Chicago & LA DTF core switch/routers (Cisco 65xx Catalyst switch, 256 Gb/s crossbar); Juniper M160 and Extreme Black Diamond routers; Myrinet cluster interconnects; GbE; HPSS (300 TB) and UniTree archives; systems including a 1176p IBM SP (Blue Horizon, 1.7 TFLOPs), 574p IA-32 Chiba City, 128p Origin, 15xxp Origins, 256p HP X-Class, 128p HP V2500, 92p IA-32, 1024p IA-32 and 320p IA-64 clusters, 2 x Sun E10K, a Sun server, and HR display & VR facilities; wide-area links (OC-3, OC-12, OC-12 ATM, OC-48) to vBNS, Abilene, MREN, Calren, ESnet, HSCC, NTON, and Starlight.

Page 6:


Extended TeraScale Facility (ETF)

• Added Pittsburgh Supercomputing Center; no longer homogeneous
• Site specialization (compute, data, visualization, etc.)
• Distributed independent sub-tasks, not a single task
• Began to relax standardization


Page 7:

The TeraGrid (including ETF)

[Slide: ETF system diagram; 40 Gb/s extensible backplane network with LA and Chicago hubs. Legend: storage server, online disk storage, clusters, shared memory, visualization cluster.]

• NCSA (compute-intensive): 10 TF Intel IA-64 (128 large-memory nodes), 230 TB online disk
• SDSC (data-intensive): 4 TF Intel IA-64, 1.1 TF IBM Power4, 500 TB online disk, Sun disk server, IBM database server
• PSC (heterogeneity): 6 TF Compaq EV68, 0.3 TF EV7, 221 TB online disk, disk cache server
• ANL (visualization): 1.25 TF Intel IA-64, 20 TB online disk, 96 visualization nodes
• Caltech (data collection analysis): 0.4 TF Intel IA-64, 80 TB online disk, IA-32 Datawulf, Sun storage server

Page 8:


TeraGrid: more sites, less network

• Added Purdue, Indiana, TACC
• Introduced Coordinated TeraGrid Software and Services (CTSS)
• Network "minimum" reduced to 10 Gb/s
• Began work on parallel global file systems


Page 9:


TeraGrid Network


Page 10:


Tracks 1 and 2, 2005: Big Iron gets Bigger

• Four yearly Track 2 RFPs announced: $30M for capital, operations funded separately; production in 2007, 2008, 2009, and 2010
• Single Track 1 RFP: $200M for a "sustained PetaScale system", production in 2011
• TeraGrid connectivity and consistency important


Page 11:


Track2A award

• Recommended to TACC (U.T. Austin) in 2006, awarded in 2007
• Sun system, AMD processors, CBBW network interconnect, custom (Magnum) switches (2)
• Over 60,000 cores, 500+ TF, ~125 TB, ~3.4 MW
• Began production Q1 2008


Page 12:


Track2A, Texas Advanced Computing Center, Austin, Texas

EGEE/OGF Meeting, Catania, March’09

Page 13:


Track2B: Kraken Cray XT5 at NICS


Page 14:


Kraken XT5 Overview

• Processor: quad-core 2.3 GHz AMD Opteron
• Number of nodes/cores: 8,256 nodes / 66,048 cores
• Peak performance: 607.64 TF
• Interconnect: SeaStar2 (3D torus)
• Memory: 100 TB
• Disk: 3.3 PB, 30 GB/s I/O
• Upgrade to ~100,000 cores, ~1 PF later this year
• Providing over 400M CPU hours/year
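The quoted peak is just cores x clock x flops per cycle. A quick sanity check in Python (the 4 double-precision flops per core per cycle and the dual-socket node layout are assumptions about the quad-core Opteron, not stated on the slide):

```python
# Sanity-check the quoted Kraken XT5 peak performance.
nodes = 8256
cores_per_node = 8        # assumed: dual-socket, quad-core => 66,048 cores
clock_hz = 2.3e9
flops_per_cycle = 4       # assumed: 2 x 128-bit SSE pipes per core

cores = nodes * cores_per_node
peak_tf = cores * clock_hz * flops_per_cycle / 1e12

print(cores)              # 66048
print(round(peak_tf, 2))  # 607.64
```

This reproduces both the 66,048-core count and the 607.64 TF figure on the slide.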


Page 15:


Current TeraGrid

• Very heterogeneous, in both size and architecture
• CTSS now in "kits"; sites can opt in or out
• Added NCAR, LONI; lost CACR
• Split services into "common" (provided by TeraGrid) and "specific" (provided by RPs)
• Common allocation process and accounting
• Almost 1.5 PF aggregate


Page 16:


Current TeraGrid Map

[Slide: U.S. map of Resource Provider (RP) sites and network hubs]

Resource Providers (RPs): SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, NICS, LONI


Page 17:


Future TeraGrid

• Track2C: ~1 PF SGI system at PSC (2009)
• Track2D: data-intensive, experimental, throughput, and test-bed systems (2010)
• Track1: ~10 PF IBM system at NCSA (2011)
• Current TeraGrid expires in 2011
• Follow-on: "eXtreme Digital" (XD) proposals; two competing partnerships


Page 18:

Early XT5 Users: Mathieu Luisier, Purdue University

Accelerating Nanoscale Transistor Innovation through Petascale Simulation

~3M hours allocated on Kraken (with Gerhard Klimeck)

Objectives:
• Test the nanoelectronics simulator OMEN: a full-band, atomistic quantum transport simulator for 1-D, 2-D, and 3-D nanoelectronics devices
• Full-machine scaling runs
• Achieve new science (model larger devices)

Results:
• Almost perfect scaling up to 64K cores
• Tuning of structure dimensions: 210 Tflops
• Realistic circular nanowire transistor structures with diameters up to 10 nm

Impacts:
• Impressively short startup time
• High reliability on 64K cores: ~15 jobs started, no failures
• Straightforward migration from XT4 to XT5
• Strong support from NICS through website and [email protected]


[Figure: OMEN domain decomposition over momentum, energy, and bias points for a 3-D nanowire; 173 TFLOPs for matrices of size N = 55,440 and 208 TFLOPs for N = 129,360]

Page 19:

Early XT5 Users: Diego Donzis, University of Maryland

Direct Numerical Simulation (DNS): compute turbulent fluctuations at all scales according to the exact, unsteady, 3-D conservation equations

~15M hours allocated on Kraken (with PK Yeung)

• Computationally very demanding: simple geometry, efficient codes
• Computational power today (Kraken XT4, XT5) allows simulations at conditions not possible earlier
• 4096^3 grid (current record, Kaneda '03, but short simulations and no passive scalars)
• Understanding of mixing; resolve long-standing issues
• Strong scaling from 32K to 64K cores: 80% for 4096^3; 85% for 8192^3
• All-to-alls are the bottleneck
• Better performance with 4 cores/node (solid blue symbol): strong scaling ~95% from 32K to 64K cores
• Science: production 4096^3 simulation reached stationary state; ready to include Lagrangian particles to study dispersion at record Reynolds number


[Figure: strong-scaling plot of time per step vs. number of cores, with a 1/M reference; circles: 4096^3, squares: 8192^3]

• 3-D FFTs: N^3 log2(N) work for N^3 grid points
• 2-D domain decomposition: best processor grid
• Perfect scaling: N^3 log2(N) / M (M: number of cores)
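The slide's cost model and scaling figures can be checked numerically. A minimal sketch (the timing values are illustrative; only the N^3 log2(N)/M model and the 80% efficiency figure come from the slide):

```python
import math

def fft_work(n):
    # Per-step pseudo-spectral DNS cost model from the slide:
    # N^3 log2(N) for an N^3 grid.
    return n**3 * math.log2(n)

def strong_scaling_efficiency(t_small, t_big, m_small, m_big):
    # Measured speedup divided by ideal speedup when scaling
    # from m_small to m_big cores.
    return (t_small / t_big) / (m_big / m_small)

# Perfect scaling (time = work / M) gives 100% efficiency by construction.
ideal = strong_scaling_efficiency(fft_work(4096) / 32768,
                                  fft_work(4096) / 65536,
                                  32768, 65536)
print(round(ideal, 2))   # 1.0

# The slide's 80% figure for 4096^3 corresponds to a 1.6x (not 2x)
# speedup when doubling from 32K to 64K cores.
eff = strong_scaling_efficiency(100.0, 100.0 / 1.6, 32768, 65536)
print(round(eff, 2))     # 0.8

# Going from 4096^3 to 8192^3 multiplies the work by 8 * (13/12).
print(round(fft_work(8192) / fft_work(4096), 2))   # 8.67
```

This makes the slide's point concrete: doubling the core count at fixed problem size only pays off in proportion to the efficiency, while doubling N raises the cost almost ninefold.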

Page 20:

Early XT5 Users: Yifeng Cui, Southern California Earthquake Center

Largest and most detailed earthquake simulation of the San Andreas fault, to improve public awareness and readiness for the 'Big One'

~8M hours allocated on Kraken (with Tom Jordan)

• 600 x 300 x 80 km domain, 100 m resolution, 14.4 billion grid points, upper frequency limit of 1 Hz, 3 minutes simulated, 50K time steps, minimum surface velocity 500 m/s, dynamic source, velocity properties from SCEC CVM4.0, 1 TB of inputs, 5 TB of output
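The headline numbers above follow directly from the stated domain, resolution, and duration. A quick check (all values taken from the slide):

```python
# Domain: 600 x 300 x 80 km at 100 m grid spacing.
dx_m = 100
nx = 600_000 // dx_m   # 6000 points along strike
ny = 300_000 // dx_m   # 3000 points across
nz = 80_000 // dx_m    # 800 points in depth

grid_points = nx * ny * nz
print(grid_points / 1e9)   # 14.4 (billion grid points, as quoted)

# 3 simulated minutes over 50,000 steps gives the time step in ms.
dt_ms = 3 * 60 * 1000 / 50_000
print(dt_ms)               # 3.6
```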

• Structured 3-D grid with 4th-order staggered-grid finite differences for velocity and stress; Fortran 90; message passing with MPI using domain decomposition; I/O with MPI-IO
• Simulation completed within 5 hours on Kraken XT5 during Friendly User week: a milestone in earthquake simulation
• PetaShake: a petascale cyberfacility for physics-based seismic hazard analysis, funded by NSF

Page 21:

Early XT5 Users: Robert Harkness, UC San Diego

~6M hours allocated on Kraken (with Mike Norman)

• ENZO cosmology simulations: Lyman-alpha forest
– 100 Mpc/h box, 2048^3 mesh, 8 billion dark matter particles
– H and He ionization, heating & cooling; 15 baryon fields, 8 particle properties (position, velocity, mass, and index)
– Restarted from a Kraken XT4 simulation at redshift z = 1.3
– Hybrid MPI/OpenMP code: 2048 MPI tasks, 4 and 8 OpenMP threads
– Input dataset ~1 TB; output datasets ~1 TB per dump
– Over 100 TB written by Feb 10th, with NO ERRORS
– Results archived in NICS HPSS at rates above 300 MB/s
• Strong-scaling & hybrid-scaling test
– 1024^3 mesh, 1 billion dark matter particles
– 128, 256, 512 & 1024 MPI tasks with 1, 2, 4, or 8 OpenMP threads
• AMR code testing
– 1024^3 L10 restart: 512 MPI tasks, 4 threads, 443,802 subgrids!


Page 22:

Intermittency in Interstellar Turbulence: Alexei Kritsuk, UC San Diego

• Developed and tested a new numerical method for ideal MHD
• Compared its performance with that of ZEUS, FLASH, RAMSES, and other ideal MHD codes
• Grid resolutions for the decaying turbulence problem: 256^3 and 512^3
• Carried out a parameter survey of supersonic MHD turbulence at a resolution of 512^3 with different degrees of magnetization
• Largest runs used 1,024 cores for simulations on 512^3 grids, up to 75,000 integration time steps per run

S. D. Ustyugov, M. V. Popov, A. G. Kritsuk, and M. L. Norman, "Piecewise Parabolic Method on a Local Stencil for Magnetized Supersonic Turbulence Simulation", submitted to J. Comput. Phys., Dec. 2008.


Figure: a PPML simulation of supersonic, super-Alfvénic ideal MHD turbulence. Flow evolution in an MHD turbulence model, showing the density field on three faces of the cubic domain for a Mach 10 simulation with a weak magnetic field.

Page 23:

Using computer simulations to investigate conformational dynamics of HIV-1 protease

Carlos Simmerling, SUNY Stony Brook

• HIV-1 protease
– This enzyme is responsible for an essential step in the viral maturation and lifecycle of HIV-1; inhibiting it results in noninfectious, immature virus particles.
– However, the enzyme mutates to elude drugs that bind to its active site and inhibit its action.
– How do mutations of HIV-1 protease affect inhibitor binding?
• Conformational dynamics of HIV-1 protease
– Mutations occur both inside and outside the active site; it is not well understood how mutations outside the active site affect inhibitor binding.
– Mutations exterior to the active site are thought to affect the mobility of two gate-keeping molecular flaps near the active site.
– Both experimental and computational methods were used and compared to examine the effect of mutations on HIV-1 protease.


Galiano, L., et al. J. Am. Chem. Soc. 2009, 131, 430-431; Chemical & Engineering News 2009, 87(1), 26.