National Energy Research Scientific Computing Center (NERSC)
NERSC
• National Energy Research Scientific Computing Center
  – Established 1974; first unclassified supercomputer center
  – Original mission: to enable computational science as a complement to magnetically controlled plasma experiments
• Today's mission: Accelerate scientific discovery at the DOE Office of Science through high performance computing and extreme data analysis
• Diverse workload:
  – 4,500 users, 600 projects
  – 700 codes; hundreds of users daily
• Allocations controlled primarily by DOE
  – 80% DOE Annual Production awards (ERCAP):
    • From 10K hours to ~10M hours
    • Proposal-based; DOE chooses
  – 10% DOE ASCR Leadership Computing Challenge
  – 10% NERSC reserve ("NISE")
NERSC: Production Computing for the DOE Office of Science
DOE View of Workload — NERSC 2013 Allocations by DOE Office:
• ASCR (Advanced Scientific Computing Research): 6%
• BER (Biological & Environmental Research): 18%
• BES (Basic Energy Sciences): 31%
• FES (Fusion Energy Sciences): 20%
• HEP (High Energy Physics): 13%
• NP (Nuclear Physics): 12%

Science View of Workload — NERSC 2013 Allocations by Science Area:
• Fusion 20%, Materials Science 15%, QCD 13%, Climate 12%, Chemistry 11%, Astrophysics 6%, Biosciences 5%, Accelerator Science 3%, Computer Science 3%, Geoscience 3%, Applied Math 2%, High Energy Physics 2%, Combustion 1%, Engineering 1%, Environmental Science 1%, Nuclear Physics 1%
ASCR Facilities
NERSC High End Computing and Storage Capabilities

Large-Scale Computing Systems
• Hopper (NERSC-6): Early Cray Gemini system
  – 6,384 compute nodes, 153,216 cores
  – 144 Tflop/s on applications; 1.3 Pflop/s peak
• Edison (NERSC-7): Early Cray Aries system (2013)
  – Over 200 Tflop/s on applications; 2 Pflop/s peak
  – 333 TB of memory, 6.4 PB of disk

HPSS Archival Storage
• 240 PB capacity
• 5 tape libraries
• 200 TB disk cache

NERSC Global Filesystem (NGF)
• Uses IBM's GPFS
• 8.5 PB capacity
• 15 GB/s of bandwidth
Midrange (275 Tflop/s peak)
• Carver: IBM iDataPlex cluster; 10,740 cores; 132 TF
• PDSF (HEP/NP): ~2,300-core cluster; 30 TF
• GenePool (JGI): ~8,200-core cluster; 113 TF; 2.1 PB Isilon file system

Analytics & Testbeds
• IBM x3850: 1 TB and 2 TB memory nodes
• Dirac: 50 NVIDIA GPU nodes
• Jesup: IBM iDataPlex for data analytics and HTC
JGI Historical Usage

Repo: m342 (PI: Eddy Rubin)
Year   Initial Allocation   Usage          % charged*
2007   769                  0.5            0.1%
2008   1,500                100            4%
2009   1 million            1.6 million    81%
2010   280,000              451,000        95%
2011   5.5 million          5.6 million    86%
2012   10 million           10.9 million   92%
2013   10 million           1.4 million    < 60%; 40% taken away thus far

Repo: m1045 (PI: Victor Markowitz)
Year   Initial Allocation   Usage          % charged*
2010   104,000              127,265        99%
2011   9.5 million          10.4 million   97%
2012   14.8 million         20.7 million   94%
2013   10 million           1.1 million    < 40%; 60% taken away thus far

*Note: % charged may differ from usage/allocation due to refunds
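The footnote matters: raw usage divided by allocation can exceed 100% even when the reported "% charged" is lower, because refunded hours are not charged. A small sketch using the 2012 rows from the tables above (refund amounts are not given on the slide, so only raw utilization is computed here):

```python
# Compare raw usage/allocation against the reported "% charged"
# for the 2012 rows of the JGI usage tables. Values are from the slide;
# the difference is explained by refunds, whose amounts are not shown.

rows_2012 = [
    # (repo, initial_allocation_hours, usage_hours, reported_pct_charged)
    ("m342", 10_000_000, 10_900_000, 92),
    ("m1045", 14_800_000, 20_700_000, 94),
]

for repo, alloc, usage, charged in rows_2012:
    raw = 100 * usage / alloc
    print(f"{repo}: raw utilization {raw:.0f}% vs {charged}% charged")
```

Both repos used more hours than initially allocated in 2012 (109% and 140% raw utilization), yet were charged for less than 100% after refunds.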
Why are we giving hours back?

2012
• System instability: JGI clusters were consolidated into Genepool; the system went through a period of instability (June–Sept 2012), and users relied more heavily on other systems and on checkpointing.
• Some analysis jobs couldn't run on Genepool: Hadoop-style jobs and MPP work (RAxML, Ray) needed to be run on Carver/Hopper.

2013
• System expansion / configuration: Genepool doubled in compute power at the end of 2012; nodes are configured for both MPP and traditional JGI workloads (e.g. Hadoop jobs now run here).
• System stability improvements: Major improvements to the scheduler, file systems, and user workload have improved Genepool stability/availability; fewer jobs are being rerun.
JGI's available hours: 31,536,000 (GP1) + 30,835,200 (GP2) + 6,937,920 (highmem) + 20,000,000 (NERSC) = 89,309,120
JGI's available hours: 31,536,000 (GP1) + 4,905,600 (highmem) + 20,000,000 (NERSC) = 56,441,600
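The per-cluster figures are consistent with (core count) × (8,760 hours/year); e.g. 31,536,000 / 8,760 = 3,600 cores. A quick sketch reproducing the larger total — note the core counts are inferred from the slide's numbers, not stated on it:

```python
# Sketch: each cluster's "available hours" equals cores x 8,760 hours/year.
# Core counts below are inferred from the slide's totals (an assumption);
# the 20M NERSC figure is a separate allocation, not cores x hours.

HOURS_PER_YEAR = 24 * 365  # 8,760

inferred_cores = {"GP1": 3600, "GP2": 3520, "highmem": 792}
nersc_allocation = 20_000_000

total = sum(c * HOURS_PER_YEAR for c in inferred_cores.values()) + nersc_allocation
print(total)  # → 89309120, matching the slide
```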
Not everything can run on Hopper/Edison/Carver

CAN be (and has been) run on Hopper or Carver — single large analysis runs, codes that can run at scale, jobs that tolerate long queue waits:
• Phylogenetic tree reconstruction
• USEARCH
• HMMER
• BLAST (inefficient)
• Metagenome assembly research (MPI-based assemblers like Ray)
• Hadoop

CANNOT currently run on Hopper or Carver, but is critical to JGI's mission — high-throughput, automated, slot-scheduled jobs/pipelines that require large-memory nodes, local disk, and external database access:
• Illumina pipeline
• SMRTPortal
• Fungal annotation pipeline
• RQC pipeline
• Jigsaw
• Large-memory assemblies
• IMG production runs

Goals:
• Determine which workflows could migrate to Hopper/Edison (e.g. Jigsaw)
• Improve efficiency (particularly I/O) in existing workflows
Genepool is sufficient today, but what about next year?

Step 1: Collect data (IN PROGRESS)
• On Genepool: jobs run by program, type of analysis, time to complete, queue wait time — procmon
• On file systems: amount of data created per job, access patterns of the job — NGF scripts

Step 2: Analyze the data (2013–14)
• Predict compute time needed per sample sequenced; define acceptable queue wait and turnaround times — MATH
• Predict space needed per project — MATH

Step 3: Sanity-check predictions (2014)
• Add columns to the LIMS system giving PMs the ability to enter predictions for compute and disk-space needs

Goal: Accurately predict that, given X compute nodes, we can analyze Y samples over the course of Z months.
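The slide does not spell out the "MATH" in Step 2; one plausible minimal approach is an ordinary least-squares fit of compute hours against samples sequenced, then extrapolation. A sketch with hypothetical data (the sample counts and hours below are illustrative, not JGI measurements):

```python
# Sketch of Step 2 (assumed approach: least-squares fit of compute hours
# vs. samples sequenced). All data points here are hypothetical.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

samples = [10, 20, 40, 80]           # samples sequenced per month
hours = [1200, 2300, 4700, 9100]     # compute hours consumed (hypothetical)

a, b = fit_line(samples, hours)
# Extrapolate: hours needed to analyze 150 samples
print(round(a * 150 + b))  # → 17052
```

The same fit, run per analysis type using the procmon data from Step 1, would give the per-sample compute predictions the slide calls for; queue-wait and disk-space predictions follow the same pattern with different inputs.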
Edison Quick Facts

First petaflop system with Intel "Ivy Bridge" processors and the Cray Aries high-speed network.

Nodes: 5,200 dual-socket with 64 GB memory
Processors: Intel "Ivy Bridge" 12-core, 2.4 GHz
Network: Cray Aries dragonfly topology
Scratch disk: 6.4 PB with > 140 GB/s bandwidth
Peak / sustained: 2.4 PF / 260 TF
Global network bandwidth: > 11 TB/s
Node memory bandwidth: > 100 GB/s
High-Impact Results on Day One
NERSC's users started running production codes immediately on Edison. Edison is very similar to Hopper, but with 2–5 times the performance per core on most codes. 408 M MPP hours were delivered in 2013 through Oct. 16.
NERSC 8 Benchmark Performance
Top projects: carbon sequestration, artificial photosynthesis, complex novel materials, cosmic background radiation analysis
Edison is the premier production computing platform for the DOE Office of Science
• New Cray XC30 with Intel Ivy Bridge processors and Aries interconnect
• Designed to support HPC and data-intensive work
• Performs 2–4x faster than Hopper per node on real applications
• Outstanding scalability for massively parallel apps
• Easy adoption for users – runs current apps unmodified
• Ambient cooled for extreme energy efficiency
• 5,200 compute nodes; 124.5K processing cores; 333 TB memory
• 2.4 petaflops peak; 530 TB/s aggregate memory bandwidth
• 11 TB/s global bandwidth; 1.3 MW per PF
• 6.4 PB storage at 140 GB/s
                            Hopper    Edison    Mira      Titan
Peak Flops (PF)             1.29      2.4       10.0      5.26 (CPU) / 21.8 (GPU)
CPU cores                   152,408   124,800   786,432   299,008 (CPU); 18,688 GPUs
Frequency (GHz)             2.1       2.4       1.6       2.2 (CPU) / 0.7 (GPU)
Memory (TB) total / GB per node
                            217 / 32  333 / 64  786 / 16  598 / 32 (CPU); 112 / 6 (GPU)
Memory BW* (TB/s)           331       530.4     1,406     614 (CPU); 3,270 (GPU)
Memory BW/node* (GB/s)      52        102       29        33 (CPU); 175 (GPU)
Filesystem                  2 PB,     6.4 PB,   35 PB,    10 PB,
                            70 GB/s   140 GB/s  240 GB/s  240 GB/s
Peak bisection BW (TB/s)    5.1       11.0      24.6      11.2
Sq ft                       1,956     1,200     ~1,500    4,352
Power (MW Linpack)          2.91      1.9       3.95      8.21

* STREAM benchmark
Supported Programming Languages, Models and Compilers

Supported Languages     Supported Programming Models   Supported Compilers
Fortran                 MPI                            Intel (default)
C, C++                  OpenMP                         Cray
UPC                     Cray Shmem                     GNU
Python, Perl, shells    POSIX Threads
Java                    POSIX Shared Memory
Chapel                  UPC
                        Coarray Fortran
                        Chapel
How to compile
• Hopper and Edison are Cray supercomputers and have specialized compilers/compiler wrappers that are optimized for these systems (demo)
How to compile
• Use modules to find available software (same as Genepool)
• For Cray and GNU programming environments all Cray scientific and math libraries are available (compile as you would on Genepool)
• For Intel programming environment, some libraries are different (contact [email protected] if you have trouble with your builds)
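As a concrete illustration of the module and wrapper workflow, a typical session on a Cray XC30 like Edison might look as follows. This is a sketch of standard Cray conventions; exact module names and versions vary, so check `module avail` on the system itself:

```shell
# Hypothetical compile session on Edison; module names are examples only.
module avail PrgEnv                   # list available programming environments
module swap PrgEnv-intel PrgEnv-gnu   # switch from the default Intel env to GNU
module load cray-libsci               # Cray scientific/math libraries

# The Cray wrappers (ftn, cc, CC) invoke the compiler selected by the
# loaded PrgEnv and automatically link system and MPI libraries:
ftn -O2 -o mycode mycode.f90   # Fortran
cc  -O2 -o mytool mytool.c     # C
```

The key difference from Genepool is that you always compile through the wrappers rather than calling `gfortran`/`icc` directly, so the same build command works across the supported programming environments.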