Taking Supercomputer Power to the End of Moore’s Law and...


Transcript of Taking Supercomputer Power to the End of Moore’s Law and...

Page 1: Taking Supercomputer Power to the End of Moore’s Law and ...debenedictis.org/...Above-Moores-Law-DeBenedictis.pdf · Erik P. DeBenedictis Sandia National Laboratories. December

Review & Approval System - Search Detail
https://cfwebprod.sandia.gov/cfdocs/RAA/templates/index.cfm

Submittal Details

Document Info
Title: Copy of Taking Supercomputer Power to the End of Moore’s Law and Beyond
Document Number: 5228960
SAND Number: 2005-0454 P
Review Type: Electronic
Status: Approved
Sandia Contact: DEBENEDICTIS,ERIK P.
Submittal Type: Viewgraph/Presentation
Requestor: DEBENEDICTIS,ERIK P.
Submit Date: 01/18/2005
Peer Reviewed?: N

Author(s)
DEBENEDICTIS,ERIK P.

Event (Conference/Journal/Book) Info
Name: Meetings with various visitors (NSA, Lockheed-Martin) at Sandia
City: Albuquerque
State: NM
Country: USA
Start Date: 12/16/2004
End Date: 12/31/2005

Partnership Info
Partnership Involved: No
Partner Approval:
Agreement Number:

Patent Info
Scientific or Technical in Content: Yes
Technical Advance: No
TA Form Filed: No
SD Number:

Classification and Sensitivity Info
Title: Unclassified-Unlimited
Abstract:
Document: Unclassified-Unlimited
Additional Limited Release Info: None.
DUSA: None.

Routing Details
Role | Routed To | Approved By | Approval Date | Conditions
Derivative Classifier Approver | SUMMERS,RANDALL M. | SUMMERS,RANDALL M. | 01/18/2005 |
Preliminary Manager Approver | PUNDIT,NEIL D. | PUNDIT,NEIL D. | 01/18/2005 |
Classification Approver | WILLIAMS,RONALD L. | WILLIAMS,RONALD L. | 01/24/2005 |
Manager Approver | PUNDIT,NEIL D. | Auto-Approved | 01/24/2005 |
Administrator Approver | LUCERO,ARLENE M. | KRAMER,SAMUEL | 04/18/2007 |


Erik P. DeBenedictis
Sandia National Laboratories

December 16, 2004

Taking Supercomputer Power to the End of Moore’s Law and Beyond

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

SAND2005-0454P


Applications & Hardware

[Figure: System Performance (100 Teraflops to 1 Zettaflops, one decade per tick) versus Year (2000 to 2030).
Technology: Red Storm; Cluster/MPP; Architecture: IBM Cyclops, FPGA, PIM; μP (green) and best-case logic (red); Nanotech + Reversible Logic; Quantum Computing (Not Obviously Measured in FLOPS).
Applications: Plasma Fusion Simulation [Jardin 03]; Compute as fast as the engineer can think [NASA 99]; 100× ↑ 1000× [SCaLeS 03]; Geodata Earth Station Range [NASA 02]; Full Global Climate [Malone 03].
Chart note: No schedule provided by source.]

[Jardin 03] S.C. Jardin, “Plasma Science Contribution to the SCaLeS Report,” Princeton Plasma Physics Laboratory, PPPL-3879 UC-70, available on Internet.
[Malone 03] Robert C. Malone, John B. Drake, Philip W. Jones, Douglas A. Rotman, “High-End Computing in Climate Modeling,” contribution to SCaLeS report.
[NASA 99] R. T. Biedron, P. Mehrotra, M. L. Nelson, F. S. Preston, J. J. Rehder, J. L. Rogers, D. H. Rudy, J. Sobieski, and O. O. Storaasli, “Compute as Fast as the Engineers Can Think!” NASA/TM-1999-209715, available on Internet.
[NASA 02] NASA Goddard Space Flight Center, “Advanced Weather Prediction Technologies: NASA’s Contribution to the Operational Agencies,” available on Internet.
[SCaLeS 03] Workshop on the Science Case for Large-scale Simulation, June 24-25, proceedings on Internet at http://www.pnl.gov/scales/.
[DeBenedictis 04] Erik P. DeBenedictis, “Matching Supercomputing to Progress in Science,” July 2004. Presentation at Lawrence Berkeley National Laboratory, also published as Sandia National Laboratories SAND report SAND2004-3333P. Sandia technical reports are available by going to http://www.sandia.gov and accessing the technical library.


Outline

• Architecture Advances to Complete the Run of Moore’s Law

• Nanotech and Reversible Logic to Solve the Most Ambitious Problems

• Quantum Computing Alternative


FPGA, PIM, and ASIC

• Sandia-based FPGA accelerator work
  – Arithmetic IP with interface to supercomputers and algorithms
  – Keith Underwood, Scott Hemmert

• In-house work and collaborations on PIM
  – Collaboration with Notre Dame and Caltech/JPL; instruction sets for PIMs
  – BG/Cyclops with NASA/NSA (see subsequent slides)
  – AMD collaboration re. PIM/DIMM

• 9200 manages ASIC development
  – Cray SeaStar for supercomputer communications

• Sandia has microelectronic fab capability (1700/CINT)

FPGA = Field Programmable Gate Array; IP = Intellectual Property (logic designs); PIM = Processor in Memory; DIMM = memory chip; 9200 = Bill Camp’s Organization; ASIC = Application Specific Integrated Circuit; 1700 = Sandia’s rad-hard fab line; CINT = Center for Integrated NanoTechnologies


1 Petaflops Cyclops Sept. 2005 $15-18M


Cyclops Collaboration

• Cyclops POC is Baron Mills

• Project Objectives
  – Put systems software onto Cyclops to make it a “production” supercomputer
  – Demonstrate 4 Science Applications @ 100× FLOPS/$ over conventional supercomputer

• Funding status
  – Proposal to NASA, currently under consideration
  – Seeking other funding

• Possible Value of a Collaboration
  – Special-purpose machine becomes general purpose, increasing utility
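
The 100× FLOPS/$ objective can be put beside the price and performance figures quoted elsewhere in the deck. The sketch below is a rough comparison, not the author's calculation: it assumes the "1 Petaflops Cyclops Sept. 2005 $15-18M" slide gives a peak rate and a system cost, and it takes the Red Storm figures (40 Teraflops, US$100M budget) from the backup slide on supercomputer limits. The function and constant names are illustrative.

```python
def flops_per_dollar(peak_flops, cost_dollars):
    """Peak floating-point rate bought per dollar of system cost."""
    return peak_flops / cost_dollars


def advantage(candidate, reference):
    """Ratio of two FLOPS/$ figures."""
    return candidate / reference


# Figures quoted in the deck; treating them as directly comparable is an assumption.
RED_STORM = flops_per_dollar(40e12, 100e6)    # 40 Teraflops, US$100M budget
CYCLOPS_LOW = flops_per_dollar(1e15, 18e6)    # 1 Petaflops at the $18M end
CYCLOPS_HIGH = flops_per_dollar(1e15, 15e6)   # 1 Petaflops at the $15M end

if __name__ == "__main__":
    low = advantage(CYCLOPS_LOW, RED_STORM)
    high = advantage(CYCLOPS_HIGH, RED_STORM)
    print(f"Peak FLOPS/$ advantage over Red Storm: {low:.0f}x to {high:.0f}x")
```

On these peak figures the ratio comes out between roughly 140× and 170×. The project objective is stated for science applications, so sustained performance, which this sketch does not model, is what would actually have to be demonstrated.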


Outline

• Architecture Advances to Complete the Run of Moore’s Law

• Nanotech and Reversible Logic to Solve the Most Ambitious Problems

• Quantum Computing Alternative


Global Warming Requires Zettaflops

[Chart: performance required, by Issue and Scaling]

• Clusters Now In Use (100 nodes, 5% efficient): 10 Gigaflops
• Spatial Resolution (Resolution): 10^4× (10^3× to 10^5×), reaching 100 Teraflops
• Model Completeness (More Complex Physics): 100×, reaching 10 Petaflops
• New parameterizations (More Complex Physics): 100×, reaching 1 Exaflops
• Run length (Longer Running Time): 100×, reaching 100 Exaflops
• Ensembles, scenarios (Embarrassingly Parallel): 10×, reaching 1 Zettaflops

Ref. “High-End Computing in Climate Modeling,” Robert C. Malone, LANL, John B. Drake, ORNL, Philip W. Jones, LANL, and Douglas A. Rotman, LLNL (2004)
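
The scaling argument on this slide is a chain of multiplications, which a few lines of Python can reproduce. This is a sketch under stated assumptions: the 10 Gigaflops starting point and the factors are read from the slide, the order in which the factors are applied is an assumption (the product does not depend on it), and all names are illustrative.

```python
# Sustained performance of clusters now in use, as read from the slide (FLOPS).
BASELINE_FLOPS = 10 * 10**9

# (issue, factor) pairs from the slide; the application order is an assumption.
SCALING_FACTORS = [
    ("Spatial resolution", 10**4),
    ("Model completeness", 100),
    ("New parameterizations", 100),
    ("Run length", 100),
    ("Ensembles, scenarios", 10),
]


def cumulative_requirements(baseline, factors):
    """Return (issue, cumulative FLOPS) after applying each factor in turn."""
    levels = []
    flops = baseline
    for issue, factor in factors:
        flops *= factor
        levels.append((issue, flops))
    return levels


if __name__ == "__main__":
    for issue, flops in cumulative_requirements(BASELINE_FLOPS, SCALING_FACTORS):
        print(f"{issue:<24} {flops:.0e} FLOPS")
```

The final level is 10^21 FLOPS, which is the 1 Zettaflops in the slide title. Choosing the low or high end of the spatial-resolution range (10^3× or 10^5×) moves the result by a factor of ten in either direction.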


Nanotech + Reversible Logic

• Leadership
  – Extreme Computing/Zettaflops workshop
    • www.zettaflops.org
  – Conference on Extreme Computing in planning stage
    • Similar in theme to Petaflops workshops of the 1990s

• Technology
  – Sandia work on
    • architecture
    • performance modeling of science apps.
  – Nanotech
    • Notre Dame Quantum Dots
    • Sandia/LANL CINT
  – Reversible Logic
    • Mike Frank, Florida State University

CINT = Center for Integrated NanoTechnologies


An Exemplary Device: Quantum Dots

• Pairs of molecules create a memory cell or a logic gate

Ref. “Clocked Molecular Quantum-Dot Cellular Automata,” Craig S. Lent and Beth Isaksen, IEEE Transactions on Electron Devices, Vol. 50, No. 9, September 2003


Atmosphere Simulation at a Zettaflops

Supercomputer is 211K chips, each with 70.7K nodes of 5.77K cells of 240 bytes; solves 86T = 44.1K × 44.1K × 44.1K cell problem.
System dissipates 332 KW from the faces of a cube 1.53 m on a side, for a power density of 47.3 KW/m².
Power: 332 KW active components; 1.33 MW refrigeration; 3.32 MW wall power; 6.65 MW from power company.
System has been inflated by 2.57 over minimum size to provide enough surface area to avoid overheating.
Chips are at 99.22% full, comprised of 7.07G logic, 101M memory decoder, and 6.44T memory transistors.
Gate cell edge is 34.4 nm (logic) 34.4 nm (decoder); memory cell edge is 4.5 nm (memory).
Compute power is 768 EFLOPS, completing an iteration in 224 µs and a run in 9.88 s.
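
Several of the figures on this slide can be cross-checked against each other with simple arithmetic. The sketch below is illustrative only: the constants are the rounded values quoted on the slide, the function names are hypothetical, and the reading that a run takes about one iteration per cell along a grid edge is an observation from the numbers, not something the slide states.

```python
# Figures quoted on the slide (K = 1e3, T = 1e12); all are rounded values.
CHIPS = 211e3
NODES_PER_CHIP = 70.7e3
CELLS_PER_NODE = 5.77e3
BYTES_PER_CELL = 240
GRID_EDGE = 44.1e3            # cells along each edge of the cubic problem
ITERATION_SECONDS = 224e-6
RUN_SECONDS = 9.88


def machine_cells():
    """Cells held by the machine: chips x nodes per chip x cells per node."""
    return CHIPS * NODES_PER_CHIP * CELLS_PER_NODE


def problem_cells():
    """Cells in the 44.1K x 44.1K x 44.1K problem."""
    return GRID_EDGE ** 3


def iterations_per_run():
    """Iterations implied by the quoted run time and iteration time."""
    return RUN_SECONDS / ITERATION_SECONDS


def total_memory_bytes():
    """Total cell storage across the machine."""
    return machine_cells() * BYTES_PER_CELL


if __name__ == "__main__":
    print(f"machine cells:      {machine_cells():.3e}")
    print(f"problem cells:      {problem_cells():.3e}")
    print(f"iterations per run: {iterations_per_run():.0f}")
    print(f"cell storage:       {total_memory_bytes():.3e} bytes")
```

Both cell counts agree with the quoted 86T to within rounding, and the implied iteration count is close to the grid edge of 44.1K.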


Outline

• Architecture Advances to Complete the Run of Moore’s Law

• Nanotech and Reversible Logic to Solve the Most Ambitious Problems

• Quantum Computing Alternative


Quantum Computing Alternative

• Quantum Computing algorithms for “physical simulation” headed toward addressing our mission space

• Sandia has top notch physicists and mathematicians

• Action (across Sandia)
  – LDRD on architecture below
  – Ion trap research (1700)
  – DES algorithm (5600)

[Diagram: Quantum Core; Future Red Storm; Visualization I/O]

Differentiator
• Sandia has mfg. capability (1700/CINT)
• Track record assembling software and tools for production supercomputers (9200)

9200 = Bill Camp’s Organization; 1700 = Sandia’s rad-hard fab line; 5600 = Sandia Information Operations Organization; CINT = Center for Integrated NanoTechnologies; LDRD = Lab Directed Research and Development


Summary

• Architectures
  – 100× over Moore’s Law
  – Physics limit = 10 Exaflops
  – Action: Blue Gene/Cyclops proposal to NASA, etc.
    • For science applications with legacy code, etc.
    • BG/C = Baron Mills
  – Action: FPGA + PIM

• Nanotech + Reversible Logic
  – Most ambitious problems in science peak at 1 Zettaflops (today)
    • Global Warming, Whole Cell Simulation, …
    • Collaboration opportunity
  – Action: R&D, algorithms
  – Action: Workshops
    • Can you participate?

• Quantum Computing
  – Plan for alternative “Quantum Red Storm” & use for ambitious science and engineering problem
  – Action: R&D, planning

FPGA = Field Programmable Gate Array; PIM = Processor in Memory


Backup


Scientific Supercomputer Limits

Assumption: Supercomputer is size & cost of Red Storm: US$100M budget; consumes 2 MW wall power; 750 KW to active components

Physical Factor | Source of Authority | Best-Case Logic | Microprocessor Architecture
Reliability limit 750 KW/(80 k_B T) | Esteemed physicists (T = 60°C junction temperature) | 2×10^24 logic ops/s |
Derate 20,000 convert logic ops to floating point | Floating point engineering (64 bit precision) | 100 Exaflops | 800 Petaflops
Derate for manufacturing margin (4×) | Estimate | 25 Exaflops | 200 Petaflops
Uncertainty (6×) | Gap in chart | 4 Exaflops | 32 Petaflops
Improved devices (4×) | Estimate | 1 Exaflops | 8 Petaflops
Projected ITRS improvement to 22 nm (100×) | ITRS committee of experts | | 80 Teraflops
Lower supply voltage (2×) | ITRS committee of experts | | 40 Teraflops (Red Storm contract)

Each row shows the level that remains after its factor is applied, reading down from the physical limit to Red Storm.
Other labels on the slide: 125:1 (ratio of the Best-Case Logic column to the Microprocessor Architecture column); Expert Opinion; Estimate.
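
The reliability limit and the best-case derating chain on this slide can be reproduced numerically. This is a minimal sketch, assuming the limit means 750 KW of active power divided by an energy of 80 k_B T per logic operation at a 60°C junction temperature, and that the derating factors are applied in the order 20,000, 4×, 6×, 4×. The function names are illustrative and the Boltzmann constant is the exact SI value.

```python
BOLTZMANN_J_PER_K = 1.380649e-23   # exact SI value of k_B

ACTIVE_POWER_W = 750e3             # power to active components (slide assumption)
JUNCTION_TEMP_K = 60 + 273.15      # 60 degrees C junction temperature
KT_PER_LOGIC_OP = 80               # reliability limit: 80 k_B T per logic op

# Derating factors from the slide, in the assumed order of application.
BEST_CASE_DERATES = [20_000, 4, 6, 4]


def logic_ops_per_second(power_w, temp_k, kt_per_op):
    """Logic operations per second if each one dissipates kt_per_op * k_B * T."""
    return power_w / (kt_per_op * BOLTZMANN_J_PER_K * temp_k)


def derate(start, factors):
    """Divide start by each factor in turn and return every intermediate level."""
    levels = [start]
    for factor in factors:
        levels.append(levels[-1] / factor)
    return levels


if __name__ == "__main__":
    limit = logic_ops_per_second(ACTIVE_POWER_W, JUNCTION_TEMP_K, KT_PER_LOGIC_OP)
    print(f"reliability limit: {limit:.2e} logic ops/s")
    # The slide rounds the limit to 2e24 before derating.
    for level in derate(2e24, BEST_CASE_DERATES)[1:]:
        print(f"{level:.2e} FLOPS")
```

The computed limit is close to the 2×10^24 logic ops/s on the slide. The chain ends near 1 Exaflops, and the intermediate levels differ slightly from the slide because the slide rounds at each step (for example 25/6 is shown as 4 Exaflops).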


Cyclops Project

• Concept
  – Scientific applications can run with Cyclops’ mix of features and will benefit from its performance
    • We proposed 4 (next slides)
  – New ideas on threaded programming are OK, but can be compatible with current methods
    • So we do both

• Technical Effort Required
  – Start by creating a software environment that is nearly Red Storm compliant
    • Run existing apps, tools, file system
    • MPI+OpenMP
  – Add multi-threaded programming model from within full featured RS-like environment


Application 1: DSMC

• Direct Simulation Monte Carlo (DSMC)
  – Ran fine on nCUBE with 500K memory/node; ought to run on Cyclops with internal memory only
  – Needs FLOPS for simulating spacecraft at lower altitude
    • Earth
    • Mars


Application 2: Solar System Orbital Planning

• Mathematically, the solar system is full of tunnels that spacecraft can follow between planets at very low fuel consumption

• Basic calculation is just an orbital integration

[Figure labels: Halo Orbit Around Earth L2, Portal to the IPS; Earth; Moon; Lunar L1 Halo Orbit; Lunar L2 Halo Orbit; A Piece of Earth’s IPS; Earth’s IPS Approaching the Halo Orbit Portal; Tunnels of the Lunar IPS]
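
To make "orbital integration" concrete, the sketch below advances a test particle around a single central mass with a kick-drift-kick leapfrog scheme. It is a two-body stand-in chosen for brevity, not the multi-body dynamics behind the tunnels on the slide and not the method of the proposed application. The normalized units (GM = 1), the integrator choice, and all names are assumptions.

```python
import math

GM = 1.0  # gravitational parameter in normalized units (assumption)


def acceleration(x, y):
    """Acceleration of a test particle due to a point mass at the origin."""
    r_cubed = (x * x + y * y) ** 1.5
    return -GM * x / r_cubed, -GM * y / r_cubed


def specific_energy(x, y, vx, vy):
    """Kinetic plus potential energy per unit mass."""
    return 0.5 * (vx * vx + vy * vy) - GM / math.hypot(x, y)


def integrate_orbit(x, y, vx, vy, dt, steps):
    """Advance the state with kick-drift-kick leapfrog and return the final state."""
    ax, ay = acceleration(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax          # half kick
        vy += 0.5 * dt * ay
        x += dt * vx                 # full drift
        y += dt * vy
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax          # half kick with the updated acceleration
        vy += 0.5 * dt * ay
    return x, y, vx, vy


if __name__ == "__main__":
    # Circular orbit of radius 1: the period is 2*pi in these units.
    steps = 1000
    state = integrate_orbit(1.0, 0.0, 0.0, 1.0, 2 * math.pi / steps, steps)
    print("state after one period:", state)
    print("energy drift:", specific_energy(*state) - specific_energy(1.0, 0.0, 0.0, 1.0))
```

Each trajectory is independent of the others, so a search over many candidate trajectories would parallelize naturally across processors. That is an inference about why the workload suits such a machine, not a claim made on the slide.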


Application 3: Structural Simulation

• Salinas structural simulation
  – Finite element code
  – Multi physics

Figure 8: Payload simulated for vibrational modes by Salinas at 500K degrees of freedom


Application 4: Climate Modeling

• Model Earth’s atmosphere
  – Runs a multi-physics cloud-resolving model
  – One processor in each chip runs global atmospheric dynamics
  – Other 64 processors run a cloud-resolving sub-model

[Figure labels: Cyclops Chip: 1. Cell of dynamical core, 2. Local threads of (10x8) cloud resolving model; Neighbor Chip (three shown)]

Figure 6: Atmospheric Dynamics and Cloud Resolving Simulation on Cyclops


Quantum Computing for Physical Science

Classical 8D integration: 2^64 function evaluations

    /* E is the integrand, defined elsewhere */
    double sum = 0.0;
    for (int x1 = 0; x1 < 256; x1++)
    for (int x2 = 0; x2 < 256; x2++)
    for (int x3 = 0; x3 < 256; x3++)
    for (int x4 = 0; x4 < 256; x4++)
    for (int x5 = 0; x5 < 256; x5++)
    for (int x6 = 0; x6 < 256; x6++)
    for (int x7 = 0; x7 < 256; x7++)
    for (int x8 = 0; x8 < 256; x8++)
        sum += E(x1, x2, x3, x4, x5, x6, x7, x8);

Quantum 8D integration: 64 function evaluations

    double sum = 0.0;
    quantum int x1 = x2 = x3 = x4 = x5 = x6 = x7 = x8 = 1/√8(|00000000> + |11111111>);
    sum = SUM E(x1, x2, x3, x4, x5, x6, x7, x8);

Problem: Perform a numerical integration over an 8 dimensional space, with 256 mesh points in each dimension (total 2^64 points).

[Callouts on the quantum pseudocode: “wildcard” quantum value; Evaluate E for all 2^64 values of argument; Sort of a “global sum”]
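
The cost of the classical approach can be seen by running the same nested sum on a much smaller mesh and counting evaluations. The sketch below covers only the classical side and does not model the quantum pseudocode. The integrand is a hypothetical stand-in for E, and `grid_sum` is an illustrative name.

```python
import itertools


def grid_sum(integrand, dims, points):
    """Sum integrand over a dims-dimensional mesh with `points` values per axis.

    Returns (total, number of function evaluations).
    """
    total = 0.0
    evaluations = 0
    for index in itertools.product(range(points), repeat=dims):
        total += integrand(*index)
        evaluations += 1
    return total, evaluations


def example_integrand(*xs):
    """Hypothetical stand-in for E: the sum of the mesh coordinates."""
    return float(sum(xs))


if __name__ == "__main__":
    # Two mesh points per dimension keeps the run small: 2**8 = 256 evaluations.
    total, evaluations = grid_sum(example_integrand, dims=8, points=2)
    print(total, evaluations)
    # The slide's mesh has 256 points per dimension.
    print("full problem evaluations:", 256 ** 8, "=", 2 ** 64)
```

The evaluation count is always `points ** dims`, so the 256-point, 8-dimensional mesh on the slide needs 256^8 = 2^64 calls to E, which is why the brute-force loop is impractical at full size.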