Load Balancing Techniques for Asynchronous Spacetime Discontinuous Galerkin Methods

Aaron K. Becker (abecker3@illinois.edu), Robert B. Haber, Laxmikant V. Kalé

University of Illinois, Urbana-Champaign
Parallel Programming Lab, Center for Process Simulation and Design

UNSCCM ’09
Tuesday, July 21, 2009

NSF: ITR/AP DMR 01-21695, ITR/AP DMR 03-25939

Slides: charm.cs.illinois.edu/newPapers/09-23/talk.pdf

Fixed Timestep 1D Algorithm

[Figure: spacetime diagram with axes Space and Time; the time step is labeled Δt]

Tentpitcher: Causal Spacetime Mesh, Advancing-Front Solution Strategy

[Figure: spacetime mesh with axes Space and Time]

Tentpitcher: patch by patch solution & meshing

[Figure: four panels, numbered 1 to 4]

Crack-tip Wave Scattering

Parallelizing Tentpitcher

• Approach
  • take advantage of the local decision-making algorithm to avoid global communication and promote scalability
  • build in latency tolerance to support large grain sizes
• Decompose and distribute space mesh
• All non-boundary operations are purely local
• Perform boundary communication on demand using a message-driven approach

Message-driven SDG

• Over-decomposition and virtualization
  • multiple mesh partitions per processor
  • computation on one partition can be overlapped with blocking communication on another local partition
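The overlap described above can be illustrated with a toy scheduler. This is a minimal sketch, not Charm++ code: `partition_task` and `run_processor` are hypothetical names, Python generators stand in for migratable partitions, and a `yield` models a partition blocking on a boundary message that is assumed to have arrived by its next turn.

```python
from collections import deque

def partition_task(name, steps, log):
    """Toy partition: alternate local work with a wait for boundary data."""
    for step in range(steps):
        log.append(f"{name}: compute step {step}")
        # Yielding models a pending boundary message; the scheduler
        # is free to run another partition in the meantime.
        yield "waiting-for-boundary"
    log.append(f"{name}: done")

def run_processor(partitions):
    """Round-robin over ready partitions, a stand-in for a message-driven scheduler."""
    log = []
    ready = deque(partition_task(name, steps, log) for name, steps in partitions)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run until the partition blocks on communication
            ready.append(task)  # assumption: its message arrives before its next turn
        except StopIteration:
            pass                # partition finished all of its steps
    return log

log = run_processor([("P0", 2), ("P1", 2)])
```

With two partitions on one processor, the work of `P1` is interleaved with the waits of `P0`, so the processor is never idle while a partition is blocked.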

[Figure: two panels, System View and User View]

System Overview

[Diagram: software components]
• Tentpitcher Algorithm
• ParFUM: Incremental Adaptivity, Partitioning, Ghost Layer Maintenance, Element Migration
• Charm++ Runtime System: Virtualization, Migration, Scheduling

Code Structure

[Diagram: Partition and Distribute Mesh; then, on each virtual processor, repeated Local Adaptivity and Pitch Local Vertex steps; then Combine Results]

Performance Effects of Virtualization

[Plot: Pitches/s (axis ticks 50 to 250) versus Processors (1 to 8). Series: Virtualized, Non-adaptive; Virtualized, Adaptive; Non-virtualized, Non-adaptive; Non-virtualized, Adaptive; Perfect Scaling, Non-adaptive; Perfect Scaling, Adaptive]

SDG Cluster Performance (Abe)

[Plot: Pitches/s (axis ticks 100 to 1000) versus Processors (axis ticks 10 to 100). Series: Non-adaptive, Weak scaling; Adaptive, Weak scaling; Perfect Scaling]

Dealing with Load Imbalance

Aside from load imbalance, few barriers to scalability

This method naturally tolerates small imbalances

But for some problems, we expect large imbalances

Partition Migration

• Idea: take advantage of virtualization. There are multiple partitions per processor, so they can be moved between processors to improve load balance

• Standard approach in virtualized environments: Charm++ supports a variety of algorithms for relocating partitions

Advantages

• built-in support, requires little modification of application

• effective for moderate imbalances

Disadvantages

• global, synchronous approach is a poor fit for Tentpitcher

• very large imbalances may not be fixable: dramatically overloaded partitions cannot be compensated for without unacceptable overhead
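To make the trade-off concrete, here is a minimal sketch of rearranging whole partitions across processors. It uses a generic greedy heuristic (heaviest partition first, onto the least-loaded processor) purely for illustration; it is not one of the Charm++ strategies, and `greedy_assign` and `max_proc_load` are hypothetical names.

```python
import heapq

def greedy_assign(partition_loads, num_procs):
    """Assign partitions heaviest first, always to the least-loaded processor."""
    heap = [(0.0, p) for p in range(num_procs)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = {}
    for part, load in sorted(partition_loads.items(), key=lambda kv: -kv[1]):
        proc_load, proc = heapq.heappop(heap)
        assignment[part] = proc
        heapq.heappush(heap, (proc_load + load, proc))
    return assignment

def max_proc_load(partition_loads, assignment, num_procs):
    """Load of the busiest processor under the given assignment."""
    totals = [0.0] * num_procs
    for part, proc in assignment.items():
        totals[proc] += partition_loads[part]
    return max(totals)

# Moderate imbalance: whole partitions can be rearranged to even things out.
moderate = {"a": 4.0, "b": 3.0, "c": 3.0, "d": 2.0}
moderate_max = max_proc_load(moderate, greedy_assign(moderate, 2), 2)

# One dramatically overloaded partition: no rearrangement can push the
# busiest processor below that partition's own load.
skewed = {"a": 20.0, "b": 1.0, "c": 1.0, "d": 1.0}
skewed_max = max_proc_load(skewed, greedy_assign(skewed, 2), 2)
```

In the first case both processors end up with a load of 6.0, the ideal split. In the second, the busiest processor is stuck at 20.0 against an average of 11.5, because a partition is the smallest unit that can move.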

Diffusion Load Balancing

• Idea: apply a purely local decision-making process to load balancing by migrating individual mesh elements across partition boundaries once the load imbalance crosses a threshold value

• If neighboring partitions i and j have loads λi and λj, choose r > 1 and migrate elements from i to j when λi > r λj

• Advantages: requires only local synchronization and communication
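The threshold rule can be sketched for a single pair of neighboring partitions. This is an illustration under simplifying assumptions, not the talk's implementation: loads are plain numbers, every migrated element has the same cost (`element_cost`, a hypothetical parameter), and the function names are invented for the example.

```python
def should_migrate(load_i, load_j, r):
    """True when partition i exceeds neighbor j's load by more than the factor r."""
    if r <= 1:
        raise ValueError("r must be greater than 1")
    return load_i > r * load_j

def diffuse_pair(load_i, load_j, r, element_cost=1.0):
    """Move equal-cost elements from i to j until the threshold is no longer exceeded."""
    if element_cost <= 0:
        raise ValueError("element_cost must be positive")
    moved = 0
    while should_migrate(load_i, load_j, r):
        load_i -= element_cost
        load_j += element_cost
        moved += 1
    return load_i, load_j, moved

balanced = diffuse_pair(10.0, 10.0, r=1.5)  # equal loads: nothing moves
refined = diffuse_pair(30.0, 10.0, r=1.5)   # i was refined and is now overloaded
```

Only the two loads are consulted, so the decision needs no global information. Note that the loop stops as soon as λi ≤ r λj, which leaves a residual imbalance of up to the factor r.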

Code Structure

[Diagram: Partition and Distribute Mesh; then, on each virtual processor, repeated Local Adaptivity, Pitch Local Vertex, and Load Balancing steps; then Combine Results]

Diffusion Load Balancing

[Figure: mesh divided into Partition i and Partition j]

Initially, λi ≈ λj, so no load balancing is needed.

[Figure: Partition i and Partition j after local refinement]

After local refinement, λi > r λj, so boundary elements will move from i to j.

We attempt to migrate elements in a way that maintains or improves boundary quality.
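The slides do not say how boundary quality is measured. As one possible reading, the sketch below assumes quality means a short partition boundary, counted as the number of element adjacencies that cross it, and picks the element whose move changes that count the least. The mesh, the `adjacency` and `owner` dictionaries, and both function names are hypothetical.

```python
def cut_change(elem, adjacency, owner, src, dst):
    """Change in the number of src/dst cut edges if elem moves from src to dst."""
    stay = sum(1 for n in adjacency[elem] if owner[n] == src)
    cross = sum(1 for n in adjacency[elem] if owner[n] == dst)
    return stay - cross

def pick_element(adjacency, owner, src, dst):
    """Pick the src element bordering dst whose move lengthens the boundary least."""
    candidates = [
        e for e, part in owner.items()
        if part == src and any(owner[n] == dst for n in adjacency[e])
    ]
    if not candidates:
        return None
    # Ties are broken by element id so the choice is deterministic.
    return min(candidates, key=lambda e: (cut_change(e, adjacency, owner, src, dst), e))

# Tiny made-up mesh: element 3 sticks out of partition i into partition j.
adjacency = {0: {1}, 1: {0, 2, 4}, 2: {1, 3}, 3: {2, 4, 5}, 4: {1, 3, 5}, 5: {3, 4}}
owner = {0: "i", 1: "i", 2: "i", 3: "i", 4: "j", 5: "j"}
choice = pick_element(adjacency, owner, "i", "j")
```

Here element 3 is chosen: moving it removes two cut edges and adds one, whereas moving element 1 would add two and remove one.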

Diffusion Load Balancing Issues

• Maintaining boundary quality

• Maintaining accurate load estimates

• Choosing r to avoid unneeded transfers while still avoiding serious imbalance

• Determining the right termination condition for the load balancing step

• Minimizing lock contention on boundary elements
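The tension between the choice of r and the termination condition can be seen in a small experiment. This is a toy model under stated assumptions, not the talk's algorithm: partitions form a 1D chain, each transfer moves one unit of load, and the balancing step ends when a full sweep makes no transfer or when a hypothetical `max_sweeps` cap is reached.

```python
def diffuse_chain(loads, r, max_sweeps=100):
    """Sweep neighbor pairs in a 1D chain, moving one unit of load per imbalanced pair."""
    loads = list(loads)
    transfers = 0
    for _ in range(max_sweeps):      # hard cap guards against endless back-and-forth
        moved = False
        for k in range(len(loads) - 1):
            if loads[k] > r * loads[k + 1]:
                loads[k] -= 1
                loads[k + 1] += 1
            elif loads[k + 1] > r * loads[k]:
                loads[k + 1] -= 1
                loads[k] += 1
            else:
                continue             # this pair is within the threshold
            transfers += 1
            moved = True
        if not moved:                # quiescent: no pair exceeds the threshold
            break
    return loads, transfers

tight = diffuse_chain([12, 4, 4], r=1.25)
loose = diffuse_chain([12, 4, 4], r=2.0)
```

In this example, r = 1.25 reaches loads of 7, 7, 6 after seven transfers, while r = 2.0 stops at 10, 6, 4 after only two. A smaller r buys better balance at the cost of more element migrations.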
