Two allocations of a 16 × 16 array to 16 processes: (a) 2-dimensional blocks; (b) rows.
Copyright © 2009 Pearson Education, Inc. Publishing as Pearson Addison-Wesley 5-1
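The two allocations in the caption can be expressed as index-to-owner mappings. The following is a minimal sketch, assuming processes are numbered 0 to 15 in row-major order and arranged as a 4 × 4 grid for the block case; the function names are illustrative and not taken from the slides.

```python
N = 16  # the array is N x N
P = 16  # number of processes

def owner_2d_blocks(i, j, n=N, p_rows=4, p_cols=4):
    """Owner of element (i, j) when the array is cut into a p_rows x p_cols grid of blocks.

    Assumption: processes are numbered in row-major order over the grid.
    """
    block_h = n // p_rows  # rows per block (4 here)
    block_w = n // p_cols  # columns per block (4 here)
    return (i // block_h) * p_cols + (j // block_w)

def owner_rows(i, j, n=N, p=P):
    """Owner of element (i, j) when each process holds n // p whole rows."""
    rows_per_proc = n // p  # 1 row per process here
    return i // rows_per_proc

if __name__ == "__main__":
    print(owner_2d_blocks(5, 9), owner_rows(5, 9))
```

Both mappings give every process 16 elements; they differ in shape (a 4 × 4 square versus a 1 × 16 strip), which changes how many of a process's neighbours in a stencil are non-local.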
Overlap regions
• If values from one processor must be communicated to another, then those values are “duplicated” on each processor
• It is as if that region exists on both processors
  – Hence, overlap
Overlap regions (gray) show the non-local values; once the overlap regions are filled, the stencil computation is local.
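The idea in the caption can be sketched in a few lines. This is a simplified single-program simulation, not a message-passing implementation: it assumes a hypothetical 8 × 8 array on a 2 × 2 process grid, a 4-point averaging stencil, and zero values outside the array. Copying from the global array stands in for the communication that would fill the overlap regions.

```python
N = 8            # global array is N x N (hypothetical size)
PR, PC = 2, 2    # hypothetical 2 x 2 process grid
B = N // PR      # edge length of each process's block

glob = [[float(i * N + j) for j in range(N)] for i in range(N)]

def get(a, i, j):
    """Global read; assumption: values outside the array are 0."""
    return a[i][j] if 0 <= i < N and 0 <= j < N else 0.0

def local_block(pr, pc):
    """(B+2) x (B+2) local array: owned interior plus an empty one-cell border."""
    loc = [[0.0] * (B + 2) for _ in range(B + 2)]
    for li in range(1, B + 1):
        for lj in range(1, B + 1):
            loc[li][lj] = glob[pr * B + li - 1][pc * B + lj - 1]
    return loc

def fill_overlap(loc, pr, pc):
    """Fill the border with neighbours' values (stands in for communication)."""
    for li in range(B + 2):
        for lj in range(B + 2):
            if li in (0, B + 1) or lj in (0, B + 1):
                loc[li][lj] = get(glob, pr * B + li - 1, pc * B + lj - 1)

def stencil_local(loc):
    """4-point average over the interior, using local indices only."""
    return [[(loc[i - 1][j] + loc[i + 1][j] + loc[i][j - 1] + loc[i][j + 1]) / 4.0
             for j in range(1, B + 1)] for i in range(1, B + 1)]

def stencil_global():
    """Reference result computed on the undistributed array."""
    return [[(get(glob, i - 1, j) + get(glob, i + 1, j)
              + get(glob, i, j - 1) + get(glob, i, j + 1)) / 4.0
             for j in range(N)] for i in range(N)]

if __name__ == "__main__":
    loc = local_block(0, 1)
    fill_overlap(loc, 0, 1)
    print(stencil_local(loc)[0])
```

After `fill_overlap` runs, `stencil_local` never touches anything outside the process's own array, which is the point of keeping the duplicated border.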
Cyclic and Block Allocations
• Some algorithms will cause some processors to finish before others when using trivial data mappings.
• For example, Gaussian elimination:
  – After the first pass, the first column and row are done, and so on.
  – Row, column, or block assignment will leave some processors idle while just a few are working at the end of the process.
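The load imbalance described above can be made concrete with a small count. This sketch uses a deliberately simplified model: rows only, where at elimination step k the rows k+1 to n-1 still need updating, with a hypothetical 16-row matrix on 4 processes. It compares a block assignment of rows with a cyclic one.

```python
def block_owner(i, n, p):
    """Row i belongs to the process holding its contiguous chunk of n // p rows."""
    return i // (n // p)

def cyclic_owner(i, n, p):
    """Rows are dealt to processes round-robin."""
    return i % p

def active_processes(n, p, owner):
    """For each step k, count the processes owning at least one row in k+1 .. n-1."""
    return [len({owner(i, n, p) for i in range(k + 1, n)}) for k in range(n - 1)]

if __name__ == "__main__":
    print("block :", active_processes(16, 4, block_owner))
    print("cyclic:", active_processes(16, 4, cyclic_owner))
```

Under this model the block assignment drops to three busy processes after only three steps, while the cyclic assignment keeps all four busy until fewer than four rows remain.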
(a) LU decomposition algorithm; (b) 16 processes arranged in a grid; (c) the allocation of the array elements to processes.
Illustration of a cyclic distribution of an 8 × 8 array onto five processes.
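One way to realize such a distribution is to deal the elements out round-robin. The sketch below assumes the elements are visited in row-major order, which may differ from the ordering used in the figure; `cyclic_owner_2d` is a hypothetical name.

```python
def cyclic_owner_2d(i, j, n_cols=8, p=5):
    """Owner of element (i, j) when elements are dealt round-robin in row-major order."""
    return (i * n_cols + j) % p

def elements_per_process(n_rows=8, n_cols=8, p=5):
    """Number of elements each process receives."""
    counts = [0] * p
    for i in range(n_rows):
        for j in range(n_cols):
            counts[cyclic_owner_2d(i, j, n_cols, p)] += 1
    return counts

if __name__ == "__main__":
    print(elements_per_process())
```

Because 64 is not a multiple of 5, the counts cannot be equal, but they differ by at most one element.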
Block-cyclic allocation of 3 × 2 blocks to a 14 × 14 array distributed to four processes (colors).
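A block-cyclic mapping of this kind can be sketched as follows. The assumptions are that blocks are 3 rows tall and 2 columns wide, that the four processes form a 2 × 2 grid numbered in row-major order, and that blocks are dealt cyclically along each dimension; the figure may orient or number things differently.

```python
def block_cyclic_owner(i, j, block_h=3, block_w=2, p_rows=2, p_cols=2):
    """Owner of element (i, j) under a 2-D block-cyclic distribution."""
    block_row = i // block_h  # which row of blocks the element falls in
    block_col = j // block_w  # which column of blocks the element falls in
    return (block_row % p_rows) * p_cols + (block_col % p_cols)

def elements_per_process(n=14, p=4):
    """Number of elements of an n x n array held by each process."""
    counts = [0] * p
    for i in range(n):
        for j in range(n):
            counts[block_cyclic_owner(i, j)] += 1
    return counts

if __name__ == "__main__":
    print(elements_per_process())
```

Since 14 is not a multiple of the block dimensions times the grid dimensions, the blocks at the edges are truncated or unevenly dealt, so the per-process counts are not equal.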
The block-cyclic allocation midway through the computation; the blocks to the right summarize the active values for each process.
Example of an unstructured grid representing the pressure distribution on two airfoils. Image from http://fun3d.larc.nasa.gov/example-24.html.
Cap allocation for a binary tree on P = 8 processes. Each process is allocated one of the leaf subtrees, along with a copy of the cap (shaded).
Logical tree representations: (a) a binary tree where P = 8; (b) a binary tree where P = 6. This is a useful solution when the tree is known at the beginning of the computation.
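For the P = 8 case, a cap allocation can be sketched with heap-style node numbering. The numbering (root = 1, children of v are 2v and 2v + 1) and the choice of the top log2(P) levels as the cap are assumptions made for illustration; the sketch handles only powers of two, so it does not cover the P = 6 tree.

```python
def cap_owner(v, p=8):
    """Owning process of node v, or None if v lies in the replicated cap.

    Assumptions: heap numbering with root = 1, and p is a power of two.
    """
    depth = v.bit_length() - 1      # depth of node v (root has depth 0)
    cap_depth = p.bit_length() - 1  # log2(p): depth of the leaf-subtree roots
    if depth < cap_depth:
        return None                 # cap node: every process keeps a copy
    # Walk up to the ancestor at depth log2(p); those nodes are numbered p .. 2p-1.
    ancestor = v >> (depth - cap_depth)
    return ancestor - p

if __name__ == "__main__":
    for v in (1, 7, 8, 15, 23):
        print(v, cap_owner(v))
```

Nodes 1 to 7 form the cap and are replicated, while each of the eight subtrees rooted at nodes 8 to 15 belongs to exactly one process.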