
Michael Bender, SUNY Stony Brook

David Bunde, Knox College

Vitus Leung, Sandia National Laboratories

Kevin Pedretti, Sandia National Laboratories

Cynthia Phillips, Sandia National Laboratories

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.

New Experimental Results in Communication-Aware Processor Allocation for Supercomputers

• Commodity-based supercomputers at Sandia National Laboratories (off-the-shelf components)

• Up to 2048 processors

• Production computing environment

• Our Job: Improve parallel node allocation on Cplant to optimize performance.

Computational Plant (Cplant)

The Cplant System

• DEC alpha processors

• Myrinet interconnect (Sandia modified)

• MPI

• Different sizes/topologies: usually 2D or 3D grid with toroidal wraps

– Ross = 2048 proc, 3D mesh

– Zermatt = 128-proc 2D mesh

– Alaska = ~600, heavily-augmented 2D mesh (cannibalized).

• Modified Linux OS (now public domain)

• Four processors/switch (compute, I/O, service nodes)

Scheduling Environment

• Users submit jobs to queue (online)

• Users specify number of processors and runtime estimate

– If a job runs past this estimate by 5 min, it is killed

• No preemption, no migration, no multitasking (security)

• Actual runtime depends on set of processors allocated and placement of other jobs

Goals:

• User - minimum response time

• Bureaucracy (GAO) - high utilization

Scheduler/Allocator Association

Scheduler and allocator affect each other's performance.

[Diagram: Scheduler and Allocator, linked by performance dependencies]

Scheduler/Allocator Dissociation

• Scheduler enforces policy

– Management sets priorities for access, utilization policy

• Allocator can optimize performance

[Diagram: User (executable, # processors, requested time) → PBS Scheduler → Node Allocator → Cplant]

What’s a Good Allocation?

Objective: Allocate jobs to processors to minimize network contention, which calls for processor locality.

• Especially important for commodity networks

Good allocation for 2D mesh

Bad allocation for 2D mesh

Quantitative Effect of Processor Locality

But, speed-up anomaly

[Figure: two allocations compared, one faster than the other; legend: = empty processor]

Communication Hops on a 2D grid

• L1 distance = # hops (~ # switches) between 2 processors on grid

Allocation Problem

• Given n available points on grid (some unavailable)

• Find a set of k available points with minimum average (or total) L1 distance.

• Example: green allocation: 3(2) + 3(1) = 9
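The objective above can be checked by brute force on small instances. The sketch below is only illustrative: `l1`, `total_l1`, and `best_allocation` are hypothetical names, the 3x3 grid is an assumed example, and exhaustive search is exponential in k, so this is not one of the algorithms discussed in the talk.

```python
from itertools import combinations

def l1(p, q):
    """L1 (Manhattan) distance = number of hops between two grid points."""
    return sum(abs(a - b) for a, b in zip(p, q))

def total_l1(points):
    """Sum of L1 distances over all unordered pairs in an allocation."""
    return sum(l1(p, q) for p, q in combinations(points, 2))

def best_allocation(free, k):
    """Exhaustively pick k free processors with minimum total pairwise L1 distance."""
    return min(combinations(sorted(free), k), key=total_l1)

# Assumed example: 3x3 grid with the centre processor unavailable.
free = {(x, y) for x in range(3) for y in range(3)} - {(1, 1)}
alloc = best_allocation(free, 3)
print(alloc, total_l1(alloc))
```

As a check on the arithmetic form used in the example bullet, a four-processor T shape such as (0,0), (1,0), (2,0), (1,1) has three pairs at distance 2 and three at distance 1, for a total of 9. Whether that is the shape of the green allocation in the original figure is not known from the transcript.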

Empirical Correlation

Leung et al, 2002

Related support: Mache and Lo, 1996

Previous Work

• Various work forcing a convex set

– Insufficient processor utilization

• Mache, Lo, Windisch MC algorithm

• Krumke et al 2-approximation, NP-hard w/general metric

• Complexity open for grids

• Dispersion problem (max distance) linear time for fixed k (Fekete and Meijer)

Optimal Unconstrained Shape [Bender, Bender, Demaine, Fekete 2004]

Almost a circle but not quite.

Only .05 percent difference in area.

0.650 245 952 951

Previous Results (Bender et al 2005)

• 7/4-approximation (2 − 1/(2d) in d dimensions)

• PTAS ((1+ε)-approximation in poly time for fixed ε)

• MC is a 4-approximation

• Linear-time exact dynamic program 1D

• O(n log n) time for k=3

• Simulations (performance on job streams)

Experiments: Placement Algorithm MC

• Search in shell from minimum-size region of preferred shape.

• Weight processors by shells

• Return processor set with minimum weight.
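The three steps above can be sketched for the 1x1 preferred shape. This is a minimal sketch under stated assumptions, not the implementation used on Cplant: `mc1x1` is a hypothetical name, shell i is assumed to be the ring of processors at Chebyshev distance i from the candidate centre, and ties are broken by coordinate order.

```python
def mc1x1(free, k):
    """MC1x1-style search on a 2D grid (assumed details, see the text above).

    For every free processor taken as centre, collect the k free processors
    in the innermost shells and weight each by its shell number. Return the
    candidate set with minimum total weight, together with that weight.
    """
    if len(free) < k:
        raise ValueError("not enough free processors")
    best, best_weight = None, None
    for cx, cy in sorted(free):
        def shell(p):
            # Shell number = Chebyshev distance from the current centre.
            return max(abs(p[0] - cx), abs(p[1] - cy))
        chosen = sorted(free, key=lambda p: (shell(p), p))[:k]
        weight = sum(shell(p) for p in chosen)
        if best_weight is None or weight < best_weight:
            best, best_weight = chosen, weight
    return best, best_weight

# Assumed example: three adjacent free processors plus one far away.
print(mc1x1({(0, 0), (0, 1), (1, 0), (3, 3)}, 3))
```

The search evaluates every free processor as a centre, so the sketch costs roughly O(n² log n) for n free processors; it favours compact sets because distant processors fall in high-numbered shells.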

Alternative: One-Dimensional Reduction

• Order processors so that close in linear order ⇒ close in physical processor graph

• Consider one-dimensional processor allocation

– Bin packing (best fit, first fit, sum of squares)

– Pack jobs onto the line (or ring), allowing fragmentation
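A minimal sketch of the one-dimensional reduction follows, assuming a boustrophedon ("S") ordering of a 2D grid and only the first-fit and best-fit policies (sum of squares is omitted). All function names are hypothetical, and the fallback that fragments a job across free runs from left to right is a simplifying assumption rather than the policy of the production allocator.

```python
def snake_order(width, height):
    """Boustrophedon ("S" curve) ordering of a width x height grid."""
    order = []
    for y in range(height):
        xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
        order.extend((x, y) for x in xs)
    return order

def free_intervals(busy):
    """Maximal runs of free slots along the line, as (start, length) pairs."""
    runs, start = [], None
    for i, b in enumerate(list(busy) + [True]):  # sentinel closes the last run
        if not b and start is None:
            start = i
        elif b and start is not None:
            runs.append((start, i - start))
            start = None
    return runs

def allocate_1d(busy, k, policy="best"):
    """Return k line positions for a job, or None if fewer than k are free."""
    runs = free_intervals(busy)
    fits = [r for r in runs if r[1] >= k]
    if fits:
        # Best fit: smallest run that holds the job; first fit: leftmost such run.
        start, _ = min(fits, key=lambda r: r[1]) if policy == "best" else fits[0]
        return list(range(start, start + k))
    # No single run is large enough: fragment across runs (assumed fallback).
    slots = [i for s, n in runs for i in range(s, s + n)]
    return slots[:k] if len(slots) >= k else None

busy = [False] * 4 + [True] + [False] * 2 + [True] + [False] * 3
order = snake_order(4, 3)
print([order[i] for i in allocate_1d(busy, 2)])
```

Positions returned by `allocate_1d` index into the curve ordering, so mapping them back through `order` gives the physical processors.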

New System Red Storm

• 12,960 Dual-Core AMD Opteron 2.4 GHz

• 39.19 TB Memory, 340 TB disk

• 124 TF peak performance

• 3D Mesh

Impact

• Changed the node allocator on Cplant

– 1D default allocator

– 2D algorithms implemented

– Carried over to Red Storm system software

• 1D and 2D algorithms implemented

• Selectable at compilation

• R&D 100 winner (Leung, Bender, Bunde, Pedretti, Phillips 2006)

Red Storm Development Machine

1 Cray XT3/4 Cabinet

I/O node Compute node

Does Bandwidth Make a Difference?

                     Real time (seconds)   User time (seconds)   Sys time (seconds)

1/4 link bandwidth   15623.353             1012.302              50.298

Full bandwidth       6314.818              1010.752              50.003

• Yes!

YZ S Curve

ZY S Curve

Hilbert (Space-Filling) Curves

• For 2D and 3D grids

• Previous applications

– I/O efficient and cache-oblivious computation

– Compression (images)

– Domain decomposition
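For illustration, the textbook conversion from a position along a 2D Hilbert curve to grid coordinates is sketched below. It assumes a square grid whose side `n` is a power of two, and `hilbert_d2xy` is a hypothetical name; it is not the Zoltan or spliced variant named on the following slides.

```python
def hilbert_d2xy(n, d):
    """Map position d along a Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Reflect the sub-square before rotating it.
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x  # rotate by swapping coordinates
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# Processor ordering for an assumed 8 x 8 mesh.
order = [hilbert_d2xy(8, d) for d in range(64)]
print(order[:4])
```

Consecutive positions on the curve are always one hop apart on the grid, which is the locality property the one-dimensional reduction relies on.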

Zoltan Hilbert-Space-Filling Curve

Spliced Hilbert-Space-Filling Curve

Results (Makespan in Seconds)

         YZ       ZY       random   Zoltan   spliced

MC1x1    5807.1

SS       5830.6   7003.2   6610.1   6699.6   6021.1

FF       5868.6   7039.5   6639.6   6758.7   6052.3

BF       5826.2   7022.6   6631.9   6739.1   6023.4

simple   6102.4

• Consistent with simulations (Bender et al 2005)

Results (Makespan Normalized)

         YZ       ZY       random   Zoltan   spliced

MC1x1    1

SS       1.0040   1.206    1.1383   1.1537   1.0369

FF       1.0106   1.2122   1.1434   1.1639   1.0422

BF       1.0033   1.2093   1.1420   1.1605   1.0372

simple   1.0509
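The normalized figures appear to be each makespan divided by the MC1x1 makespan (5807.1 s). The short check below uses only values from the two tables; the dictionary layout and the name `normalized` are illustrative assumptions.

```python
# Makespans in seconds, copied from the first results table.
baseline = 5807.1  # MC1x1
makespan = {
    ("SS", "YZ"): 5830.6,
    ("FF", "YZ"): 5868.6,
    ("BF", "YZ"): 5826.2,
    ("SS", "spliced"): 6021.1,
    ("simple", None): 6102.4,
}

# Divide every makespan by the MC1x1 baseline and round to four places.
normalized = {key: round(value / baseline, 4) for key, value in makespan.items()}

for key, value in normalized.items():
    print(key, value)
```

Under this reading, the best curve-based result (BF on the YZ curve) is within about 0.3 percent of MC1x1.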

Is it I/O or interprocess communication?

Results (Makespan Normalized)

         YZ       ZY       random   Zoltan   spliced

BF       1        1.2053   1.1383   1.1567   1.0338

BF2      1        1.2398   1.176    1.1828   1.0443

• Not I/O

• Consistent with Cplant experiments (Leung et al 2002)

• Consistent with Pittsburgh Supercomputing Center experiments (Weisser et al 2006)

Experiments: Test Set

• All-to-All Communications

Job Size   Number of Jobs

2          1820

15         620

20         660

• High communication, best-case for runtime improvements

• Small number of repetitions (3)

Questions

• What’s the right allocation for a stream (online)?

• Scheduling + Allocation
