Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What...

25
Challenges in High Performance Computing William Gropp www.cs.illinois.edu/~wgropp

Transcript of Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What...

Page 1: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

Challenges in High Performance Computing

William Gropp www.cs.illinois.edu/~wgropp

Page 2: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

2

What is HPC?

• High Performance Computing is the use of computing to solve challenging problems that require significant computing resources ♦ Supercomputers ♦ Clusters (Large and Small) ♦ Accelerator-equipped workstations ♦ Custom hardware (e.g., Anton)

• Traditionally FLOPS ♦ But could be memory, data, bandwidth,

real-time,…

Page 3: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

3

What Sort of Problems are Solved with HPC?

• Engineering everything from consumer products to spacecraft ♦ Getting Pringles into a can (3-D, time-

dependent CFD) ♦ Optimizing bottles for minimum weight with

required robustness ♦ Fuel-efficient aircraft with novel materials

• Insights in science ♦ Formation of galaxies; effects of different

theories of dark matter and energy ♦ Weather and climate forecasting ♦ Formation (and points of attack) of viruses

such as HIV

Page 4: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

4

Advancing Science and Engineering

Molecular Science Weather & Climate Forecasting

Earth Science Astronomy Health

Advances in a broad range of science and engineering disciplines will be enabled by sustained petascale computers:

Page 5: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

5

Messages

• Big is big ♦ Data driven is an important area, but not all data

driven problems are big data (despite current hype). The distinction is important

♦ There are different measures of big, but a TB of data that can be processed by a linear algorithm is not big

• Key feature of an extreme computing system is a fast interconnect ♦ Low latency, high link bandwidth, high bisection

bandwidth ♦ Provides fast access to data everywhere in system,

particularly with one-sided access models • Think map(r1,r2, … ) – function that requires more

than one record, where the specific input records are unpredictable (e.g., data dependent on previous result)

Page 6: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

6

HPC and Clouds and Big Data

• Clouds can provide some HPC services ♦ Esp. where each “experiment” runs on one

node • Single nodes can do a lot!

• But clouds very poor at applications that require tightly coordinated computations across tens or hundreds of thousands of nodes ♦ The one thing that makes a supercomputer

super today is a high-bandwidth, low-latency interconnect, and the software to match

• Big Data also requires big compute ♦ Some applications only need independent

computation (e.g., clouds)

Page 7: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

7

Extrapolation is Risky

• 1989 – T – 24 years ♦ Intel introduces 486DX ♦ Eugene Brooks writes “Attack of the Killer

Micros” ♦ 4 years before TOP500 ♦ Top systems at about 2 GF Peak

• 1999 – T – 14 years ♦ NVIDIA introduces its GPU (GeForce 256)

• Programming GPUs still a challenge 14 years later ♦ Top system – ASCI Red, 9632 cores, 3.2 TF

Peak (about 3 GPUs in 2013) ♦ MPI is 7 years old

Page 8: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

8

HPC Today

• High(est)-End systems ♦ 1 PF (1015 Ops/s) achieved on a few “peak friendly” by

applications 2011; Blue Waters demonstrated > 1PF end-to-end on a larger set this year

♦ Much worry about scalability, how we’re going to get to an ExaFLOPS

♦ Systems are all oversubscribed • DOE INCITE awarded almost 900M processor hours in 2009;

1600M-1700M hours in 2010-2012; over 5B hours in 2013 • NSF PRAC awards for Blue Waters similarly competitive

• Widespread use of clusters, many with accelerators; cloud computing services ♦ These are transforming the low and midrange

• Laptops (far) more powerful than the supercomputers I used as a graduate student

Page 9: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

9

HPC in 2011

• Sustained PF systems ♦ K Computer (Fujitsu) at RIKEN, Kobe, Japan (2011) ♦ “Sequoia” Blue Gene/Q at LLNL ♦ NSF Track 1 “Blue Waters” at Illinois ♦ “Milky Way-2” (TH-2) in China (apps yet to be shown)

• Still programmed with MPI and MPI+other (e.g., MPI+OpenMP or MPI+OpenCL/CUDA or MPI+OpenACC) ♦ But in many cases using toolkits, libraries, etc.

• And not so bad – applications will be able to run when the system is turned on

♦ Replacing MPI will require some compromise – e.g., domain specific (higher-level but less general)

• Lots of evidence that fully automatic solutions won’t work

Page 10: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

10

Blue Waters and Sequoia Computing Systems

NCSA LLNL System Attribute Blue Waters Sequoia Vendors Cray/AMD/NVIDIA IBM Processors Interlagos/Kepler PowerPCA2 variant

Total Peak Performance (PF) 11.9 20.1 Total Peak Performance (CPU/GPU) 7.6/4.3 20.1/0.0

Number of CPU Chips (8, 16 cores/chip) 48,576 98,304 Number of GPU Chips 3,072 0

Amount of CPU Memory (TB) 1,510 1,572

Interconnect 3D Torus 5D Torus

Amount of On-line Disk Storage (PB) 26 50(?) Sustained Disk Transfer (TB/sec) >1 0.5-1.0 Amount of Archival Storage >300 ? Sustained Tape Transfer (GB/sec) >100 ?

Page 11: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

11

Blue Waters and MilkyWay-2 Computing Systems

NCSA NUDT System Attribute Blue Waters Milky Way 2 Vendors Cray/AMD/NVIDIA NUDT/Inspur Processors Interlagos/Kepler IvyBridge/Phi

Total Peak Performance (PF) 11.9 54.9 Total Peak Performance (CPU/GPU) 7.6/4.3 6.8/48.1

Number of CPU Chips (8, 16 cores/chip) 48,576 (12 cores) 32,000 Number of GPU Chips 3,072 48,000

Amount of CPU Memory (TB) 1,510 1,404

Interconnect 3D Torus Fat Tree

Amount of On-line Disk Storage (PB) 26 12(?) Sustained Disk Transfer (TB/sec) >1 ? Amount of Archival Storage >300 ? Sustained Tape Transfer (GB/sec) >100 ?

Page 12: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

12

HPC in 2018-2020

• Exascale systems are likely to have ♦ Extreme power constraints, leading to

• Clock Rates similar to today’s systems • A wide-diversity of simple computing elements (simple for

hardware but complex for software) • Memory per core and per FLOP will be much smaller • Moving data anywhere will be expensive (time and power)

♦ Faults that will need to be detected and managed • Some detection may be the job of the programmer, as

hardware detection takes power ♦ Extreme scalability and performance irregularity

• Performance will require enormous concurrency • Performance is likely to be variable

− Simple, static decompositions will not scale ♦ A need for latency tolerant algorithms and

programming • Memory, processors will be 100s to 10000s of cycles

away. Waiting for operations to complete will cripple performance

2020-2023

Page 13: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

13

Algorithms and Applications Will Change

• Applications need to become more dynamic, more integrated

• System software must work with application: ♦ Code complexity (Autotuning) ♦ Dynamic resources (no simple PGAS) ♦ Latency hiding (Nonblocking algorithms,

interfaces, futures) ♦ Resource sharing (more performance

information, performance asserts, runtime coordination)

Page 14: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

14

How Do We Make Effective Use of These Systems?

• Better use of our existing systems ♦ Blue Waters provides a sustained PF, but that typically

requires ~10PF peak • Improve node performance

♦ Make the compiler better ♦ Give better code to the compiler ♦ Get realistic with algorithms/data structures

• Improve parallel performance/scalability • Improve productivity of applications

♦ Better tools and interoperable languages, not a (single) new programming language

• Improve algorithms ♦ Optimize for the real issues – data movement, power,

resilience, …

Page 15: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

15

Common Themes

• Multiple operations must be pending at any time ♦ Asynchronous I/O, communication, even

computation ♦ “Split” computations and communication

• Complex systems require adaptive approaches ♦ “Autotuning” for likely choices, runtime optimization

• Operations must be on aggregates ♦ CPU: “vectors” (GPU gangs/workers/vectors) ♦ I/O: Collective, parallel I/O

• Example: Parallel collective I/O for a distributed data structure ♦ mesh distributed across all nodes

Page 16: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

16

Four Levels of Collective I/O

16 Processes 3 2 1 0

Level 0

Level 1

Level 2

Level 3

Page 17: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

17

Distributed Array Access: Write Bandwidth

17

128 procs 256 procs 256procs 128 procs 32 procs

Array size: 512 x 512 x 512

Thanks to Weikuan Yu, Wei-keng Liao, Bill Loewe, and Anthony Chan for these results.

Not

e:Lo

g Sca

le!

1GB data

Page 18: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

18

What’s Different at Peta/Exascale

• Performance Focus ♦ Only a little – basically, the resource is expensive, so a

premium placed on making good use of resource ♦ Quite a bit – node is more complex, has more features

that must be exploited, interconnect performs operations • Scalability

♦ Solutions that work at 100-1000 way often inefficient at 100,000-way

♦ Some algorithms scale well • Explicit time marching in 3D

♦ Some don’t • Direct implicit methods

♦ Some scale well for a while • FFTs (communication volume in Alltoall)

♦ Load balance, latency are critical issues • Fault Tolerance becoming important

♦ Now: Reduce time spent in checkpoints ♦ Soon: Lightweight recovery from transient errors

Page 19: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

19

Preparing for the Next Generation of HPC Systems

• Better use of existing resources ♦ Performance-oriented programming ♦ Dynamic management of resources at all levels ♦ Embrace hybrid programming models (you have

already if you use SSE/VSX/OpenMP/…) • Focus on results

♦ Adapt to available network bandwidth and latency ♦ Exploit I/O capability (available space grew faster

than processor performance!) • Prepare for the future

♦ Fault tolerance ♦ Hybrid processor architectures ♦ Latency tolerant algorithms ♦ Data-driven systems

Page 20: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

20

Research Directions

• Integrated, interoperable, component oriented languages ♦ Generalization of so-called domain-specific language

• Really data-structure-specific languages

• Performance modeling and tuning ♦ Performance info in language; performance

considered as part of correctness • Fault tolerance at the high end

♦ Fault tolerance features in the language, working with hardware and algorithms

• Correctness ♦ Correctness features for testing in the language ♦ Support for special cases (e.g., provably

deterministic expression of deterministic algorithms)

Page 21: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

21

Recommended Reading

• Bit reversal on uniprocessors (Alan Karp, SIAM Review, 1996)

• Achieving high sustained performance in an unstructured mesh CFD application (W. K. Anderson, W. D. Gropp, D. K. Kaushik, D. E. Keyes, B. F. Smith, Proceedings of Supercomputing, 1999)

• Experimental Analysis of Algorithms (Catherine McGeoch, Notices of the American Mathematical Society, March 2001)

• Reflections on the Memory Wall (Sally McKee, ACM Conference on Computing Frontiers, 2004)

Page 22: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

22

Still open – a once a year opportunity to work with high-end networks sc13.supercomputing.org/content/scinet-network-research-program

Page 23: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

23

Six Questions

1. What is the appropriate balance between HPC needs at the extreme scale (fundamental research that can be done in no other way) and the needs of the “long tail” (research that needs more than a desktop computer)? How do you support all computing needs for research?

2. Applications have needs and wants. These may not be the same. E.g., applications want to use their existing algorithms and code, but it may not be possible to run those fast enough. The application may need to change approach. How do you get application scientists to separate their needs and wants?

3. HPC is about performance. How do you support both the research and especially the engineering work to make codes efficient? If you don’t do this, where do you find the funds to buy the additional computational capacity required to meet the additional need created by running less than optimized codes?

Page 24: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

24

Six Questions (con’t)

4. Data and HPC should be closely connected. Truly big data (much greater than 10 PB) requires significant compute, for example. How do you change the perception that big data and big compute are antagonistic? What Big Data problems would be best solved on big compute platforms such as Blue Waters?

5. Current computer Architecture is reaching its limits. Where are the new architectures? How do you solve the chicken and egg problem – do new architectures require a demand from new applications, and applications require a well-established, dependable architecture? Should NSF only consume architectures created by others (whether industry or other agencies) or should it have some control of its core computational technology?

6. How do we get past the “usual suspects” of applications (computational fluid dynamics, n-body problems, etc.)? How do we extend the use of HPC into new areas in science, engineering, and the humanities?

Page 25: Challenges in High Performance Computingcyberbridges.org/2013/presentations/BillGropp.pdf · What is HPC? • High Performance Computing is the use of computing to solve challenging

25

Six Questions: The Short Form

1. What is the right balance between HPC and other computing infrastructure?

2. How can we encourage applications to try new approaches?

3. How do we ensure applications make efficient use of infrastructure?

4. What Big Data problems need HPC? 5. How can we support innovative

computer architecture research? 6. How do we bring computing to new

areas of science?