Page 1:

An Introduction to the TeraGrid

Jeffrey P. Gardner
Pittsburgh Supercomputing Center
gardnerj@psc.edu
Page 2:

National Science Foundation TeraGrid

The world’s largest collection of supercomputers

Page 3:

Pittsburgh Supercomputing Center

Founded in 1986
A joint venture between Carnegie Mellon University, the University of Pittsburgh, and Westinghouse Electric Co.
Funded by several federal agencies as well as private industry
Main source of support is the National Science Foundation

Page 4:

Pittsburgh Supercomputing Center

PSC is the third-largest NSF-sponsored supercomputing center

BUT we provide over 60% of the computer time used by NSF research

AND PSC most recently had the most powerful supercomputer in the world (for unclassified research)

Page 5:

Pittsburgh Supercomputing Center

SCALE: 3000 processors
SIZE: 1 basketball court
COMPUTING POWER: 6 TeraFlops (6 trillion floating-point operations per second)
Will do in 3 hours what a PC will do in a year (see the sanity check below)

The Terascale Computing System (TCS) at the Pittsburgh Supercomputing Center. Upon entering production in October 2001, the TCS was the most powerful computer in the world for unclassified research.
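That last comparison is easy to sanity-check. A minimal back-of-envelope sketch in Python; the ~2 GFlop/s sustained rate for a circa-2001 PC is my assumption, not a figure from the talk:

```python
# Back-of-envelope check of "3 hours on the TCS = 1 year on a PC".
# Assumption (not from the slide): a circa-2001 PC sustains ~2 GFlop/s.
tcs_flops_per_s = 6e12                    # TCS: 6 TeraFlops
pc_flops_per_s = 2e9                      # assumed PC rate (hypothetical)

ops = tcs_flops_per_s * 3 * 3600          # operations the TCS does in 3 hours
pc_years = ops / pc_flops_per_s / (365 * 24 * 3600)
print(round(pc_years, 2))                 # ~1.03 years, matching the claim
```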

Page 6:

Pittsburgh Supercomputing Center

HEAT GENERATED: 2.5 million BTUs per hour (169 lbs of coal per hour; see the conversions below)
AIR CONDITIONING: 900 gallons of water per minute (375 room air conditioners)
BOOT TIME: ~3 hours

The Terascale Computing System (TCS) at the Pittsburgh Supercomputing Center. Upon entering production in October 2001, the TCS was the most powerful computer in the world for unclassified research.
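For readers who want the heat figures in more familiar units, a rough conversion sketch; the coal energy density is my assumption of a typical value, not a number from the talk:

```python
# Rough unit conversions for the TCS heat output.
btu_per_hour = 2.5e6
kilowatts = btu_per_hour * 0.293 / 1e3     # 1 BTU/hr is about 0.293 W
print(round(kilowatts))                    # ~732 kW of waste heat

# Assumption (not from the slide): coal at ~14,600 BTU per pound.
coal_lbs_per_hour = btu_per_hour / 14_600
print(round(coal_lbs_per_hour))            # ~171 lb/hr, close to the slide's 169
```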

Page 7:

Pittsburgh Supercomputing Center

Page 8:

NCSA: National Center for Supercomputing Applications

SCALE: 1774 processors
ARCHITECTURE: Intel Itanium2

COMPUTING POWER: 10 TeraFlops

The TeraGrid cluster “Mercury” at NCSA

Page 9:

TACC: Texas Advanced Computing Center

SCALE: 1024 processors
ARCHITECTURE: Intel Xeon

COMPUTING POWER: 6 TeraFlops

The TeraGrid cluster “LoneStar” at TACC

Page 10:

Before the TeraGrid: Supercomputing “The Old-Fashioned Way”

Each supercomputing center was its own independent entity.
Users applied for time at a specific supercomputing center.
Each center supplied its own: compute resources, archival resources, accounting, and user support.

Page 11:

The TeraGrid Strategy

Creating a unified user environment…
Single user support resources
Single authentication point
Common software functionality
Common job management infrastructure
Globally accessible data storage

…across heterogeneous resources:
7+ computing architectures
5+ visualization resources
Diverse storage technologies

Create a unified national HPC infrastructure that is both heterogeneous and extensible

Page 12:

The TeraGrid Strategy

A major paradigm shift for HPC resource providers: make NSF resources useful to a wider community.

Strength through uniformity! Strength through diversity!

TeraGrid Resource Partners

Page 13:

TeraGrid Components

Compute hardware:
Intel/Linux clusters
Alpha SMP clusters
IBM POWER3 and POWER4 clusters
SGI Altix SMPs
Sun visualization systems
Cray XT3 (PSC, July 20)
IBM Blue Gene/L (SDSC, Oct 1)

Page 14:

TeraGrid Components

Large-scale storage systems: hundreds of terabytes for secondary storage
Very high-speed network backbone: 40 Gb/s of bandwidth for rich interaction and tight coupling
Grid middleware: Globus, data management, …
Next-generation applications

Page 15:

Building a System of Unprecedented Scale

40+ teraflops compute

1+ petabyte online storage

10-40 Gb/s networking
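The network numbers are what make globally accessible storage plausible. A small sketch of idealized transfer times, assuming the full link rate with no protocol overhead (real transfers will be slower):

```python
# Idealized transfer time at TeraGrid backbone speeds.
def transfer_hours(num_bytes, gbits_per_s):
    """Hours to move num_bytes at a given link rate, ignoring overhead."""
    return num_bytes * 8 / (gbits_per_s * 1e9) / 3600

print(transfer_hours(1e12, 40) * 60)   # 1 TB at 40 Gb/s: ~3.3 minutes
print(transfer_hours(1e12, 10) * 60)   # 1 TB at 10 Gb/s: ~13.3 minutes
print(transfer_hours(1e15, 40))        # a full petabyte at 40 Gb/s: ~56 hours
```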

Page 16:

TeraGrid Resources

ANL/UC: Compute: Itanium2 (0.5 TF), IA-32 (0.5 TF). Online storage: 20 TB. Network: 30 Gb/s (Chicago hub).
Caltech CACR: Compute: Itanium2 (0.8 TF). Online storage: 155 TB. Network: 30 Gb/s (LA hub).
IU: Compute: Itanium2 (0.2 TF), IA-32 (2.0 TF). Online storage: 32 TB. Mass storage: 1.2 PB. Network: 10 Gb/s (Chicago hub).
NCSA: Compute: Itanium2 (10 TF), SGI SMP (6.5 TF). Online storage: 600 TB. Mass storage: 3 PB. Network: 30 Gb/s (Chicago hub).
ORNL: Compute: IA-32 (0.3 TF). Online storage: 1 TB. Network: 10 Gb/s (Atlanta hub).
PSC: Compute: XT3 (10 TF), TCS (6 TF), Marvel (0.3 TF). Online storage: 150 TB. Mass storage: 2.4 PB. Network: 30 Gb/s (Chicago hub).
Purdue: Compute: Hetero (1.7 TF). Network: 10 Gb/s (Chicago hub).
SDSC: Compute: Itanium2 (4.4 TF), Power4 (1.1 TF). Online storage: 540 TB. Mass storage: 6 PB. Network: 30 Gb/s (LA hub).
TACC: Compute: IA-32 (6.3 TF), Sun (Vis). Online storage: 50 TB. Mass storage: 2 PB. Network: 10 Gb/s (Chicago hub).

Data collections are hosted at five of the sites, visualization resources at five, and instruments at three.

Page 17:

“Grid-Like” Usage Scenarios Currently Enabled by the TeraGrid

“Traditional” massively parallel jobs: tightly coupled interprocessor communication, storing vast amounts of data remotely, remote visualization (see the MPI sketch below)
Thousands of independent jobs: automatically scheduled amongst many TeraGrid machines, using data from a distributed data collection
Multi-site parallel jobs: computing at many TeraGrid sites simultaneously

TeraGrid is working to enable more!
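For the first scenario, “tightly coupled” in practice usually means MPI. A minimal illustration using mpi4py (my choice of binding; the talk does not name a specific library), of the kind a site's batch system would launch across hundreds or thousands of processors:

```python
# Minimal tightly-coupled MPI job: every rank contributes to a global sum.
# Launch with something like "mpiexec -n 64 python allreduce_demo.py"
# (launcher and batch-script details vary by TeraGrid site).
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

local = float(rank)                        # each rank's partial result
total = comm.allreduce(local, op=MPI.SUM)  # tightly coupled communication

if rank == 0:
    print(f"global sum across {comm.Get_size()} ranks: {total}")
```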

Page 18:

Allocations Policies

Any US researcher can request an allocation

Policies/procedures posted at: http://www.paci.org/Allocations.html

Online proposal submission https://pops-submit.paci.org/

Page 19:

Allocations Policies

Different levels of review for different allocation sizes:

DAC: Development Allocation Committee. Up to 30,000 Service Units (SUs; 1 SU ≈ 1 CPU-hour). Only a one-paragraph abstract is required. Must focus on developing an MRAC or NRAC application. Accepted continuously!
MRAC: Medium Resource Allocation Committee. Under 200,000 SUs/year. Reviewed every 3 months; next deadline July 15, 2005 (then October 21).
NRAC: National Resource Allocation Committee. Over 200,000 SUs/year. Reviewed every 6 months; next deadline July 15, 2005 (then January 2006).
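To put the tiers in perspective, a quick calculation using the slide's own definition of 1 SU ≈ 1 CPU-hour (the processor counts are arbitrary examples, not from the talk):

```python
# What an allocation buys, assuming 1 SU ~= 1 CPU-hour.
def days_of_continuous_use(sus, processors):
    """Days a job on `processors` CPUs can run before exhausting `sus`."""
    return sus / processors / 24

print(round(days_of_continuous_use(30_000, 128), 1))    # DAC on 128 CPUs: ~9.8 days
print(round(days_of_continuous_use(200_000, 1024), 1))  # MRAC cap on 1024 CPUs: ~8.1 days
```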

Page 20:

Accounts and Account Management

Once a project is approved, the PI can add any number of users by filling out a simple online form.

User account creation usually takes 2-3 weeks.

TG accounts are created on ALL TG systems for every user:
A single US mail packet arrives for the user
Accounts and usage are synched through a centralized database

Page 21:

Roaming and Specific Allocations

R-Type: “roaming” allocations
Can be used on any TG resource
Usage is debited to a single (global) allocation of resource maintained in a central database

S-Type: “specific” allocations
Can only be used on the specified resource
(All S-only awards come with 30,000 roaming SUs to encourage roaming usage of the TG)
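A toy sketch of the debiting behavior described above; all names and the specific-before-roaming ordering are my illustration, not TeraGrid's actual accounting code:

```python
# Toy model of roaming (R-type) vs. specific (S-type) allocation debiting.
# The real accounting lives in TeraGrid's central database; this is
# purely illustrative.
class Allocation:
    def __init__(self, roaming_sus, specific_sus):
        self.roaming = roaming_sus          # one global roaming balance
        self.specific = dict(specific_sus)  # per-resource balances

    def debit(self, resource, sus):
        # Assumed ordering: spend the resource-specific balance first,
        # then fall back to the single global roaming balance.
        if self.specific.get(resource, 0) >= sus:
            self.specific[resource] -= sus
        else:
            self.roaming -= sus

# An S-type award on one machine (hypothetical names), plus the
# 30,000 roaming SUs that come with it.
alloc = Allocation(30_000, {"psc-xt3": 100_000})
alloc.debit("psc-xt3", 5_000)         # uses the specific balance
alloc.debit("ncsa-mercury", 5_000)    # roams: uses the global balance
print(alloc.roaming, alloc.specific)  # 25000 {'psc-xt3': 95000}
```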

Page 22:

Useful links

TeraGrid website http://www.teragrid.org

Policies/procedures posted at: http://www.paci.org/Allocations.html

TeraGrid user information overview http://www.teragrid.org/userinfo/index.html

Summary of TG Resources http://www.teragrid.org/userinfo/guide_hardware_table.html

Summary of machines with links to site-specific user guides (just click on the name of each site)

http://www.teragrid.org/userinfo/guide_hardware_specs.html

Email: [email protected]