Scalable Algorithms for Constructing Balanced Spanning...

34
Scalable Algorithms for Constructing Balanced Spanning Trees on System-ranked Process Groups Akhil Langer, Ramprasad Venkataraman, Laxmikant V. Kale Parallel Programming Laboratory University of Illinois September 25 2012 Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 1 / 26

Transcript of Scalable Algorithms for Constructing Balanced Spanning...

Page 1: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Scalable Algorithms for Constructing Balanced SpanningTrees on System-ranked Process Groups

Akhil Langer, Ramprasad Venkataraman, Laxmikant V. Kale

Parallel Programming LaboratoryUniversity of Illinois

September 25 2012

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 1 / 26

Page 2: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Pitch

Not all messaging needs fully-capable communicators

Its worthwhile to consider cheaper constructs

We propose:

Unranked or System-ranked Process Groups

User cannot choose member ranks

Cheap and Scalable Creation Mechanisms

I Shrink-and-Balance

I Rank-and-Hash

∼ 100X faster than MPI Comm split on 32K cores of IBM BG/P

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 2 / 26

Page 3: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Is process group creation/management scalable?

Memory capacity is growing slower than available concurrency

Runtime systems have to adopt resource-conserving mechanisms

Typical Process Group ImplementationsI Each member can id everyone elseI Storage: O(n) (on each member process)I Time for creation: O(n log n)

Applications can create many such groups simultaneously

How can we use less than O(n) memory?

Distributed enrollmentDistributed storage

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 3 / 26

Page 4: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Is process group creation/management scalable?

Memory capacity is growing slower than available concurrency

Runtime systems have to adopt resource-conserving mechanisms

Typical Process Group ImplementationsI Each member can id everyone elseI Storage: O(n) (on each member process)I Time for creation: O(n log n)

Applications can create many such groups simultaneously

How can we use less than O(n) memory?

Distributed enrollmentDistributed storage

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 3 / 26

Page 5: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Achieving distributed enrollment

Central specification of membership

MPI Group incl(MPI Group group,int n,int *ranks, ← not scalableMPI Group *newgroup)

Distributed enrollment

MPI Comm split(MPI Comm comm,int color,int key,MPI Comm *newcomm)

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 4 / 26

Page 6: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Achieving distributed storage

Distributed Tables

EuroMPI 2010

A Scalable MPI Comm split Algorithm for Exascale Computing.Sack, P., Gropp, W.In: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Process Chains

EuroMPI 2011

Exascale algorithms for generalized MPI comm split.Moody, A., Ahn, D., Supinski, B.In: Recent Advances in the Message Passing Interface. pp. 9-18. EuroMPI11 (2011)

Unranked / System-Ranked Process Groups

EuroMPI 2012

Scalable Algorithms for Constructing Balanced Spanning Trees on System-RankedProcess Groups Langer, A., Venkataraman, R., Kale L.In: Recent Advances in the Message Passing Interface. pp. 9-18. EuroMPI12 (2012)

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 5 / 26

Page 7: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Achieving distributed storage

Distributed Tables

EuroMPI 2010

A Scalable MPI Comm split Algorithm for Exascale Computing.Sack, P., Gropp, W.In: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Process Chains

EuroMPI 2011

Exascale algorithms for generalized MPI comm split.Moody, A., Ahn, D., Supinski, B.In: Recent Advances in the Message Passing Interface. pp. 9-18. EuroMPI11 (2011)

Unranked / System-Ranked Process Groups

EuroMPI 2012

Scalable Algorithms for Constructing Balanced Spanning Trees on System-RankedProcess Groups Langer, A., Venkataraman, R., Kale L.In: Recent Advances in the Message Passing Interface. pp. 9-18. EuroMPI12 (2012)

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 5 / 26

Page 8: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Achieving distributed storage

Distributed Tables

EuroMPI 2010

A Scalable MPI Comm split Algorithm for Exascale Computing.Sack, P., Gropp, W.In: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Process Chains

EuroMPI 2011

Exascale algorithms for generalized MPI comm split.Moody, A., Ahn, D., Supinski, B.In: Recent Advances in the Message Passing Interface. pp. 9-18. EuroMPI11 (2011)

Unranked / System-Ranked Process Groups

EuroMPI 2012

Scalable Algorithms for Constructing Balanced Spanning Trees on System-RankedProcess Groups Langer, A., Venkataraman, R., Kale L.In: Recent Advances in the Message Passing Interface. pp. 9-18. EuroMPI12 (2012)

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 5 / 26

Page 9: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

System-Ranked Process Group

User cannot specify or influence ranks of members

MPI Comm split(MPI Comm comm,int color,int key,MPI Comm *newcomm)

Ranks are assigned by runtime system

Hence, any mapping of application logic / data to ranks has to behandled manually after creation

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 6 / 26

Page 10: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Are user-supplied ranks needed all the time?

barrier, broadcast, reduce, allreduce

Input / output not dependent on ranks

Assume commutative operators

Sizeable fraction of collective communication in applications involvethese operations 1, 2, 3

Several algorithms can be expressed with just these collectives

1NERSC6 Workload Analysis and Benchmark Selection Process.Antypas, K., Shalf, J., Wasserman, H.Tech. Rep. LBNL-1014E, Lawrence Berkeley National Lab(2008)

2Automatic MPI Counter Profiling.Rabenseifner, R.In: 42nd CUG Conference(2000)

3Parallel Scaling Characteristics of Selected NERSC User Project CodesSkinner, D., Verdier, F., Anand, H., Carter, J., Durst, M., Gerber, R.Tech. Rep. LBNL/PUB-904, Lawrence Berkeley National Lab (2005)

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 7 / 26

Page 11: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Problem Statement

Represent process groups using spanning trees

Low memory footprint(distributed storage)

Recursive, splitting of original tree(distributed enrollment)

Immediate availability of efficientsynchronization / housekeeping

Can use spanning tree for the targetcollectives too

0

1

3

7 8

4

9 10

2

5

11 12

6

13 14

To support system-ranked groups

Starting from a parent tree, construct balanced spanning tree over enrolledmembers only

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 8 / 26

Page 12: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

The Reference Centralized Algorithm

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 9 / 26

Page 13: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

The Reference Centralized Algorithm

Upward pass: gathervMembers of new group contribute their process ids

Downward passpick immediate children and split the remaining list

O(m+ log n) time and O(m) memory 4

4m is size of the new subtreeLanger, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 10 / 26

Page 14: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

The Shrink-and-Balance Algorithm

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 11 / 26

Page 15: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Algorithm: Shrink-and-BalanceUpward Pass

Use enrollment data to shrink original spanning tree by excludingnon-participating processes

“fill” holes with member processesI leaf processI immediate child process

leaf process - send min(di,k , subtree size(v)) = O(log n) candidatefillers to the parent

O(log n) space and O(log2 n) time

4di,k is depth of a rank i process in a balanced spanning tree of branching factor kdi,k = blogk(i(k − 1) + 1)c

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 12 / 26

Page 16: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Algorithm: Shrink-and-BalanceUpward pass

0

1

3

7 8

4

9 10

2

5

11 12

6

13 14

(a)

0

1

7 4

9

2

5 6

13 14

(b)

0

1

7 4

9

13

6

14

(c)

9

1

7 4

13

6

14

(d)

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 13 / 26

Page 17: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Algorithm: Shrink-and-BalanceDownward Pass

Upward pass yields participants-only spanning tree that need not bebalanced

Balance tree while minimizing vertex migrationsI compute ideal height of a perfectly balanced spanning treeI target height yields max size of subtrees5

I based on current size, mark subtrees as vertex suppliers and consumers,respectively

I request supplier for vertex if child is missing (takes O(log n) time)I “matchmaking” step to assign suppliers to one or more consumersI vertex concludes its role by calling balancing step on its children

O(log2 n) time, as a child could be missing at each level

5max size = kh−1k−1

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 14 / 26

Page 18: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

The Rank-and-Hash Algorithm

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 15 / 26

Page 19: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Algorithm: Rank-and-HashUpward Pass

Reduction

Store size of each subtree

07

14

31

7

1

8

0

42

9

1

10

0

23

50

11

0

12

0

63

13

1

14

1

(a) Subtree sizes after the upward pass

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 16 / 26

Page 20: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Algorithm: Rank-and-HashUpward Pass

Reduction

Store size of each subtree

07

14

31

7

1

8

0

42

9

1

10

0

23

50

11

0

12

0

63

13

1

14

1

(a) Subtree sizes after the upward pass

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 16 / 26

Page 21: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Algorithm: Rank-and-HashDownward Pass

Size of tree determines available ranks [0,m)

Range split amongst subtrees based on their sizes

Splitting continues down the original spanning tree until all availableranks divided

Non-participating processes not assigned any ranks

10

7

1

42

9

3

64

13

5

14

6

(b) Ranks after the downward pass

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 17 / 26

Page 22: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Algorithm: Rank-and-HashDownward Pass

Size of tree determines available ranks [0,m)

Range split amongst subtrees based on their sizes

Splitting continues down the original spanning tree until all availableranks divided

Non-participating processes not assigned any ranks

10

7

1

42

9

3

64

13

5

14

6

(c) Ranks after the downward pass

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 17 / 26

Page 23: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Algorithm: Rank-and-HashIdentifying Tree Neighbors

Process ids of parent and children discovered through intermediary processes

id of intermediary process (Hi), for rank i computed via a hash function

Each rank i, sends its id to Hi and Hp (where, p is rank of its parent)

Receive msgs from Hi and Hp with ids of children and parent, respectively

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 18 / 26

Page 24: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Algorithm: Rank-and-HashIdentifying Tree Neighbors

Process ids of parent and children discovered through intermediary processes

id of intermediary process (Hi), for rank i computed via a hash function

Each rank i, sends its id to Hi and Hp (where, p is rank of its parent)

Receive msgs from Hi and Hp with ids of children and parent, respectively

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 18 / 26

Page 25: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Experimental Setup

Time measured between broadcast on original spanning tree andreduction on the newly constructed tree

Sample from a uniform distribution u(0, 1) and use participationprobability p to determine participation of a process in the group.

Repeatable seeds to ensure identical groups across runs

Algorithms implemented in Charm++

Runs on BG/P “Intrepid”

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 19 / 26

Page 26: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

ResultsPerformance Comparison on up to 128k cores of BG/P

4k 8k 16k 32k 64k 128k28

29

210

211

212

213 p=0.01

4k 8k 16k 32k 64k 128k28

29

210

211

212

213 p=0.1

4k 8k 16k 32k 64k 128k28

29

210

211

212

213 p=0.3

4k 8k 16k 32k 64k 128k28

29

210

211

212

213 p=0.6

4k 8k 16k 32k 64k 128k28

29

210

211

212

213 p=0.9

0.0 0.2 0.4 0.6 0.8 1.00.0

0.2

0.4

0.6

0.8

1.0

Legend

centralized

Rank-and-Hash

Shrink-and-Balance

tim

e (

in m

icro

seco

nds)

tim

e (

in m

icro

seco

nds)

number of processes

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 20 / 26

Page 27: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

ResultsPerformance Comparison on up to 128k cores of BG/P

Distributed schemes outperform the centralized scheme at modestprocess counts (except for very small p)

Shrink-and-Balance slower than Rank-and-HashI longer critical path

Both schemes attain the goal of reduced memory footprint!

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 21 / 26

Page 28: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

ResultsPerformance Comparison on up to 128k cores of BG/P

Distributed schemes outperform the centralized scheme at modestprocess counts (except for very small p)

Shrink-and-Balance slower than Rank-and-HashI longer critical path

Both schemes attain the goal of reduced memory footprint!

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 21 / 26

Page 29: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

ResultsNormalized message counts w.r.t. the centralized scheme on 128k cores of BG/P

0.001 0.01 0.1 0.3 0.6 0.9 0.99participation probability

0.0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

Norm

aliz

ed M

ess

age C

ount

Shrink-and-BalanceRank-and-Hash

Shrink-and-Balance has far fewer messagesthan Rank-and-Hash

at p = 0.6, number of messages sent byCentralized, Shrink-and-Balance andRank-and-Hash were 2.1, 2.6 and4.9× 105, respectively

Shrink-and-Balance may perform better whenI multiple groups are being formed simultaneouslyI group formation occurs simultaneous with other communication in the application

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 22 / 26

Page 30: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

ResultsNormalized message counts w.r.t. the centralized scheme on 128k cores of BG/P

0.001 0.01 0.1 0.3 0.6 0.9 0.99participation probability

0.0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

Norm

aliz

ed M

ess

age C

ount

Shrink-and-BalanceRank-and-Hash

Shrink-and-Balance has far fewer messagesthan Rank-and-Hash

at p = 0.6, number of messages sent byCentralized, Shrink-and-Balance andRank-and-Hash were 2.1, 2.6 and4.9× 105, respectively

Shrink-and-Balance may perform better whenI multiple groups are being formed simultaneouslyI group formation occurs simultaneous with other communication in the application

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 22 / 26

Page 31: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

ResultsComparison with MPI Comm split on 32k cores of BG/P

MPI Comm split comparison with multi-color Rank-and-Hash

Group Creation Time (in milliseconds)

# splits MPI-Comm-split Rank-and-Hash

1 134.968 0.7082 106.573 0.7134 96.989 0.7608 93.536 0.785

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 23 / 26

Page 32: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Related Work

Moody et al6 proposed generalized MPI Comm split

I process groups as chain - O(1) space and O(logn) timeI requires O(n) messaging to exchange process ids during collective callI does collective communication using binary spanning trees

Several differences

I lesser dependencies on remote information for progress of collectivehence, more prominent for one-sided transfer calls supported by some networkmessaging APIs

I construct spanning trees of arbitrary branching factors

2 3 4 5 6 7branching factor

200

400

600

800

1000

1200

1400

1600

1800

time

(mic

rose

cond

s)

Message size (bytes)32256409616384

Broadcast time on 32k core of BG/P

6Moody, A., Ahn, D., de Supinski, B.: Exascale algorithms for generalized MPI comm split. In: Recent Advances in the

Message Passing Interface. pp. 9 - 18. EuroMPI11 (2011)

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 24 / 26

Page 33: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Summary

Space and time complexities for different group creation schemes

MPI(typical) Centralized Shrink-&-Balance Rank-&-Hash

Space O(n) O(m) O(logn) O(1)Time O(n + m logm) O(m + logn) O(log2 n) O(logn)Msg Count n logn n + m Ω(n + m) n + 4m + m

k

Max Msg Size O(n) O(m) O(logn) O(1)

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 25 / 26

Page 34: Scalable Algorithms for Constructing Balanced Spanning ...charm.cs.illinois.edu/newPapers/12-40/paper.pdfIn: Recent Advances in the Message Passing Interface. pp. 110. EuroMPI10 (2010)

Summary

System assigned ranks eliminate sorting of user-supplied keys

Spanning-Tree based groupsI BalancedI k-aryI Low memory usageI Outperforms traditional creation mechanisms

Evaluate performance in the presence of other computation andcommunication akin to real application execution scenarios

Account for network-topology by executing these algorithmshierarchically

Langer, Venkataraman, Kale (PPL, UIUC) Spanning Trees on Unranked Process Groups September 25 2012 26 / 26