A Framework for Particle Advection for Very Large Data
![Page 1: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/1.jpg)
A Framework for Particle Advection for Very Large Data
Hank Childs, LBNL/UCDavis
David Pugmire, ORNL
Christoph Garth, Kaiserslautern
David Camp, LBNL/UCDavis
Sean Ahern, ORNL
Gunther Weber, LBNL
Allen Sanderson, Univ. of Utah
![Page 2: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/2.jpg)
Particle advection basics
• Advecting particles creates integral curves
• Streamlines: display particle path (instantaneous velocities)
• Pathlines: display particle path (velocity field evolves as particle moves)
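
The difference between the two curve types can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the field `velocity`, the helper `advect`, and the forward-Euler integrator are all assumptions chosen for brevity.

```python
def velocity(x, y, t):
    """Hypothetical unsteady 2D field: constant in x, growing in y over time."""
    return 1.0, t


def advect(seed, t0, duration, steps, freeze_time):
    """Forward-Euler advection of one particle.

    freeze_time=True  -> streamline (field sampled at t0 for every step)
    freeze_time=False -> pathline   (field sampled at the particle's current time)
    """
    x, y = seed
    h = duration / steps
    path = [(x, y)]
    for k in range(steps):
        t = t0 if freeze_time else t0 + k * h
        vx, vy = velocity(x, y, t)
        x, y = x + h * vx, y + h * vy
        path.append((x, y))
    return path


streamline = advect((0.0, 0.0), t0=0.0, duration=1.0, steps=1000, freeze_time=True)
pathline = advect((0.0, 0.0), t0=0.0, duration=1.0, steps=1000, freeze_time=False)
```

In this toy field the streamline stays on the x-axis because the field at t0 = 0 has no y component, while the pathline curves upward as the field evolves.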
![Page 3: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/3.jpg)
Particle advection is the duct tape of the visualization world
Many thanks to Christoph Garth
Advecting particles is essential to understanding flow!
![Page 4: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/4.jpg)
Outline
• Efficient advection of particles
• A general system for particle-advection based analysis.
![Page 5: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/5.jpg)
Advecting particles
![Page 6: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/6.jpg)
Particle Advection Performance
• N particles (P1, P2, …, Pn), M MPI tasks (T1, …, Tm)
• Each particle takes a variable number of steps: X1, X2, …, Xn
• Total number of steps is ΣXi
  - We cannot do less work than this (ΣXi)
• Goal: distribute the ΣXi steps over M MPI tasks such that the problem finishes in minimal time
  - Sounds sort of like a bin-packing problem
  - … but particles don’t need to be tied to processors
• But: big data significantly complicates this picture…
  - … data may not be readily available, introducing busywait.
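
As a rough illustration of the scheduling view, the sketch below computes a lower bound on the finishing time (in steps) and a simple greedy assignment. It rests on assumptions not stated on the slide: I/O and communication are free, a single particle's steps must run one after another, and every Xi is known up front (which is not the case in practice). The step counts are hypothetical.

```python
import heapq


def lower_bound(steps, num_tasks):
    """No schedule can beat the average load per task or the longest single particle."""
    average = -(-sum(steps) // num_tasks)  # ceiling division
    return max(average, max(steps))


def greedy_makespan(steps, num_tasks):
    """Longest-first greedy assignment; requires knowing every Xi in advance."""
    loads = [0] * num_tasks
    heapq.heapify(loads)
    for x in sorted(steps, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + x)
    return max(loads)


steps = [900, 50, 40, 30, 20, 10]   # hypothetical X1..X6
bound = lower_bound(steps, num_tasks=3)
makespan = greedy_makespan(steps, num_tasks=3)
```

With these numbers one long particle dominates: the average load is 350 steps per task, yet no schedule can finish in fewer than 900.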
![Page 7: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/7.jpg)
Advecting particles
Decomposition of large data set into blocks on filesystem
What is the right strategy for getting particle and data together?
![Page 8: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/8.jpg)
Strategy: load blocks necessary for advection
Decomposition of large data set into blocks on filesystem
Go to filesystem and read block
![Page 9: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/9.jpg)
Decomposition of large data set into blocks on filesystem
Strategy: load blocks necessary for advection
This strategy has multiple benefits:
1) Indifferent to data size: a serial program can process data of any size
2) Trivial parallelization (partition particles over processors)
BUT: redundant I/O (both over MPI tasks and within a task) is a significant problem.
![Page 10: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/10.jpg)
“Parallelize over Particles”
• “Parallelize over Particles”: particles are partitioned over processors, blocks of data are loaded as needed.
• Some additional complexities:
  - Work for a given particle (i.e. Xi) is variable and not known a priori: how to share load between processors dynamically?
  - More blocks than can be stored in memory: what is the best caching/purging strategy?
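
One candidate answer to the caching question is a least-recently-used (LRU) policy. The sketch below is an assumption for illustration only; the slides do not say which policy is best or which one VisIt uses. `BlockCache` and `read_block` are hypothetical names.

```python
from collections import OrderedDict


class BlockCache:
    """Keeps at most `capacity` blocks in memory, purging the least recently used."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block      # callable: block_id -> block data
        self.blocks = OrderedDict()
        self.reads = 0                    # number of trips to the filesystem

    def get(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)   # mark as most recently used
            return self.blocks[block_id]
        self.reads += 1
        data = self.read_block(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # purge least recently used
        return data


cache = BlockCache(capacity=2, read_block=lambda block_id: f"data-{block_id}")
for block_id in [0, 1, 0, 2, 1, 0]:
    cache.get(block_id)
```

With room for only two blocks, this access pattern touches three distinct blocks but goes to the filesystem five times, which is the redundant I/O within a task that the slides warn about.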
![Page 11: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/11.jpg)
Strategy: parallelize over blocks and dynamically assign particles
[Figure: data blocks assigned to processors P1, P2, P3, P4]
This strategy has multiple benefits:
1) Ideal for in situ processing.
2) Only load data once.
BUT: busywait is a significant problem.
![Page 12: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/12.jpg)
Both parallelization schemes have serious flaws.
Two extremes:

| Parallelizing Over | I/O | Efficiency |
| --- | --- | --- |
| Data | Good | Bad |
| Particles | Bad | Good |

[Figure: spectrum from "Parallelize over particles" to "Parallelize over data", with "Hybrid algorithms" in between]
![Page 13: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/13.jpg)
The master-slave algorithm is an example of a hybrid technique.
• Algorithm adapts during runtime to avoid pitfalls of parallelize-over-data and parallelize-over-particles.
  - Nice property for production visualization tools.
• Implemented inside VisIt visualization and analysis package.
D. Pugmire, H. Childs, C. Garth, S. Ahern, G. Weber, “Scalable Computation of Streamlines on Very Large Datasets.” SC09, Portland, OR, November, 2009
![Page 14: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/14.jpg)
Master-Slave Hybrid Algorithm
• Divide processors into groups of N
• Uniformly distribute seed points to each group
Master:
- Monitor workload
- Make decisions to optimize resource utilization
Slaves:
- Respond to commands from Master
- Report status when work complete
[Figure: processors P0 to P15 in four groups (P0 to P3, P4 to P7, P8 to P11, P12 to P15), each group with one master and three slaves]
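
The grouping step can be sketched as follows. Two details are assumptions made for this illustration and are not taken from the slides: the first rank of each group acts as master, and "uniformly" is read as a round-robin assignment of seeds to groups. `make_groups` and `distribute_seeds` are hypothetical helpers.

```python
def make_groups(num_procs, group_size):
    """Split ranks 0..num_procs-1 into consecutive groups of `group_size`.

    Assumption: the first rank in each group acts as master.
    """
    groups = []
    for start in range(0, num_procs, group_size):
        ranks = list(range(start, min(start + group_size, num_procs)))
        groups.append({"master": ranks[0], "slaves": ranks[1:], "seeds": []})
    return groups


def distribute_seeds(seeds, groups):
    """Round-robin so that group seed counts differ by at most one."""
    for i, seed in enumerate(seeds):
        groups[i % len(groups)]["seeds"].append(seed)


groups = make_groups(num_procs=16, group_size=4)
distribute_seeds(list(range(10)), groups)
```

With 16 processors and N = 4 this yields four groups of one master and three slaves each.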
![Page 15: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/15.jpg)
Master Process Pseudocode

    Master()
    {
        while ( ! done )
        {
            if ( NewStatusFromAnySlave() )
            {
                commands = DetermineMostEfficientCommand()
                for cmd in commands
                    SendCommandToSlaves( cmd )
            }
        }
    }

What are the possible commands?
![Page 16: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/16.jpg)
Commands that can be issued by master
1. Assign / Loaded Block
2. Assign / Unloaded Block
3. Handle OOB / Load
4. Handle OOB / Send
OOB = out of bounds
[Figure: Master sends a command to Slave]
Slave is given a streamline that is contained in a block that is already loaded
![Page 17: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/17.jpg)
Commands that can be issued by master
1. Assign / Loaded Block
2. Assign / Unloaded Block
3. Handle OOB / Load
4. Handle OOB / Send
OOB = out of bounds
[Figure: Master sends a command to Slave]
Slave is given a streamline and loads the block
![Page 18: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/18.jpg)
Commands that can be issued by master
1. Assign / Loaded Block
2. Assign / Unloaded Block
3. Handle OOB / Load
4. Handle OOB / Send
OOB = out of bounds
[Figure: Master sends "Load" to Slave]
Slave is instructed to load a block. The streamline in that block can then be computed.
![Page 19: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/19.jpg)
Commands that can be issued by master
1. Assign / Loaded Block
2. Assign / Unloaded Block
3. Handle OOB / Load
4. Handle OOB / Send
OOB = out of bounds
[Figure: Master sends "Send to J" to Slave; Slave sends the streamline to Slave J]
Slave is instructed to send a streamline to another slave that has loaded the block
![Page 20: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/20.jpg)
Master Process Pseudocode

    Master()
    {
        while ( ! done )
        {
            if ( NewStatusFromAnySlave() )
            {
                commands = DetermineMostEfficientCommand() *
                for cmd in commands
                    SendCommandToSlaves( cmd )
            }
        }
    }

* See SC09 paper for details
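
To make the control flow concrete, here is a single-process Python sketch of the master loop and the four commands. The decision rule in `choose_command` is a toy assumption for illustration; it is not the heuristic from the SC09 paper, and the status-report tuples, the `loaded` bookkeeping, and all function names are hypothetical.

```python
from collections import deque
from enum import Enum


class Command(Enum):
    ASSIGN_LOADED = "Assign / Loaded Block"
    ASSIGN_UNLOADED = "Assign / Unloaded Block"
    OOB_LOAD = "Handle OOB / Load"
    OOB_SEND = "Handle OOB / Send"


def choose_command(slave, block, loaded, out_of_bounds):
    """Toy rule (an assumption, not the SC09 heuristic).

    loaded maps slave id -> set of block ids currently in that slave's memory.
    Returns (command, target_slave).
    """
    if not out_of_bounds:
        if block in loaded[slave]:
            return Command.ASSIGN_LOADED, slave
        return Command.ASSIGN_UNLOADED, slave
    owners = [s for s, blocks in loaded.items() if block in blocks and s != slave]
    if owners:
        return Command.OOB_SEND, owners[0]
    return Command.OOB_LOAD, slave


def master(status_queue, loaded):
    """Drain status reports and issue one command per report."""
    log = []
    while status_queue:                       # stands in for "while ( ! done )"
        slave, block, oob = status_queue.popleft()
        cmd, target = choose_command(slave, block, loaded, oob)
        if cmd in (Command.ASSIGN_UNLOADED, Command.OOB_LOAD):
            loaded[slave].add(block)          # the slave reads the block
        log.append((slave, cmd, target))
    return log


loaded = {1: {"B0"}, 2: set(), 3: set()}
reports = deque([
    (1, "B0", False),   # new streamline, block already in memory
    (2, "B1", False),   # new streamline, block must be read
    (3, "B0", True),    # streamline left slave 3's data; slave 1 holds B0
    (3, "B2", True),    # streamline left slave 3's data; nobody holds B2
])
log = master(reports, loaded)
```

A real implementation would weigh workload and I/O cost when choosing between sending and loading; this rule simply prefers sending whenever another slave already holds the block.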
![Page 21: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/21.jpg)
Master-slave in action
[Figure: notional streamline example across processors P0 to P4, annotated "0: Read", "1: Pass", and "1: Read"]

| Iteration | Action |
| --- | --- |
| 0 | P0 reads B0, P3 reads B1 |
| 1 | P1 passes points to P0, P4 passes points to P3, P2 reads B0 |

- When to pass and when to read?
- How to coordinate communication? Status? Efficiently?
![Page 22: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/22.jpg)
Algorithm Test Cases
- Core collapse supernova simulation
- Magnetic confinement fusion simulation
- Hydraulic flow simulation
![Page 23: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/23.jpg)
Workload distribution in parallelize-over-data
Starvation
![Page 24: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/24.jpg)
Workload distribution in parallelize-over-particles
Too much I/O
![Page 25: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/25.jpg)
Workload distribution in hybrid algorithm
Just right
![Page 26: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/26.jpg)
Workload distribution in supernova simulation
Parallelization by: Particles, Data, Hybrid
Colored by processor doing integration
![Page 27: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/27.jpg)
Astrophysics Test Case: Total time to compute 20,000 Streamlines
[Figure: two panels, Uniform Seeding and Non-uniform Seeding; x-axis: number of procs; y-axis: seconds; curves: Data, Particles, Hybrid]
![Page 28: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/28.jpg)
Astrophysics Test Case: Number of blocks loaded
[Figure: two panels, Uniform Seeding and Non-uniform Seeding; x-axis: number of procs; y-axis: blocks loaded; curves: Data, Particles, Hybrid]
![Page 29: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/29.jpg)
Summary: Master-Slave Algorithm
First ever attempt at a hybrid parallelization algorithm for particle advection
Algorithm adapts during runtime to avoid pitfalls of parallelize-over-data and parallelize-over-particles. Nice property for production visualization tools.
Implemented inside VisIt visualization and analysis package.
![Page 30: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/30.jpg)
Outline
• Efficient advection of particles
• A general system for particle-advection based analysis.
![Page 31: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/31.jpg)
Goal
• Efficient code for a variety of particle advection based techniques
• Cognizant of use cases with >>10K particles.
  - Need handling of every particle, every evaluation to be efficient.
• Want to support diverse flow techniques: flexibility/extensibility is key.
![Page 32: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/32.jpg)
Motivating examples of PICS
• FTLE
• Stream surfaces
• Streamline
• Poincare
• Statistics based analysis
• + more
![Page 33: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/33.jpg)
Design
• PICS filter: parallel integral curve system
• Execution:
  - Instantiate particles at seed locations
  - Step particles to form integral curves
    · Analysis performed at each step
    · Termination criteria evaluated for each step
  - When all integral curves have completed, create final output
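
The execution steps above can be sketched as a small serial loop. This is a Python analogue written for illustration, not VisIt code: `run_pics` and its callback parameters are hypothetical, and parallelism is left out entirely.

```python
def run_pics(seeds, step, analyze, terminated, create_output):
    """Instantiate particles, step until every curve terminates, then build output."""
    curves = [{"pos": seed, "steps": 0, "data": []} for seed in seeds]
    active = list(curves)
    while active:
        still_active = []
        for curve in active:
            curve["pos"] = step(curve["pos"])
            curve["steps"] += 1
            analyze(curve)                      # analysis performed at each step
            if not terminated(curve):           # termination checked at each step
                still_active.append(curve)
        active = still_active
    return create_output(curves)


# Hypothetical 1D example: particles drift right until they reach or pass x = 1.
output = run_pics(
    seeds=[0.0, 0.5, 0.9],
    step=lambda x: x + 0.25,
    analyze=lambda c: c["data"].append(c["pos"]),
    terminated=lambda c: c["pos"] >= 1.0,
    create_output=lambda curves: [c["data"] for c in curves],
)
```

Passing the stepping, analysis, termination, and output stages as callbacks mirrors the idea that each stage is a separate extension point.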
![Page 34: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/34.jpg)
Design
• Five major types of extensibility:
  - Initial particle locations?
  - How do you evaluate velocity field?
  - How do you advect particles?
  - How to parallelize?
  - How do you analyze the particle paths?
![Page 35: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/35.jpg)
Inheritance hierarchy
avtPICSFilter
  - Streamline Filter
  - Your derived type of PICS filter
avtIntegralCurve
  - avtStreamlineIC
  - Your derived type of integral curve
We disliked the “matching inheritance” scheme, but this achieved all of our design goals cleanly.
![Page 36: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/36.jpg)
#1: Initial particle locations
avtPICSFilter::GetInitialLocations() = 0;
![Page 37: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/37.jpg)
#2: Evaluating velocity field
avtIVPField
  - avtIVPVTKField
  - avtIVPVTKTimeVaryingField
  - avtIVPM3DC1Field
  - avtIVPHigherOrderField (doesn’t exist)
IVP = initial value problem
![Page 38: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/38.jpg)
#3: How do you advect particles?
avtIVPSolver
  - avtIVPDopri5
  - avtIVPEuler
  - avtIVPLeapfrog
  - avtIVPM3DC1Integrator
IVP = initial value problem
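
The value of a swappable solver can be illustrated with a Python analogue of this hierarchy. None of the names below are VisIt's API: `Solver`, `Euler`, `RK4`, `rotation`, and `integrate` are hypothetical, and fixed-step classical RK4 is used as a simpler stand-in for the adaptive Dopri5 scheme.

```python
import math
from abc import ABC, abstractmethod


class Solver(ABC):
    """Hypothetical stand-in for a solver base class: advance one step of size h."""

    @abstractmethod
    def step(self, field, p, h):
        ...


class Euler(Solver):
    def step(self, field, p, h):
        v = field(p)
        return (p[0] + h * v[0], p[1] + h * v[1])


class RK4(Solver):
    """Classical fourth-order Runge-Kutta (fixed step, not the adaptive Dopri5)."""

    def step(self, field, p, h):
        def shifted(k, scale):
            return (p[0] + scale * k[0], p[1] + scale * k[1])

        k1 = field(p)
        k2 = field(shifted(k1, h / 2))
        k3 = field(shifted(k2, h / 2))
        k4 = field(shifted(k3, h))
        return (
            p[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            p[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
        )


def rotation(p):
    """Steady field whose exact integral curves are circles about the origin."""
    return (-p[1], p[0])


def integrate(solver, field, seed, h, steps):
    p = seed
    for _ in range(steps):
        p = solver.step(field, p, h)
    return p


euler_radius = math.hypot(*integrate(Euler(), rotation, (1.0, 0.0), h=0.1, steps=100))
rk4_radius = math.hypot(*integrate(RK4(), rotation, (1.0, 0.0), h=0.1, steps=100))
```

A particle seeded on the unit circle should stay on it. The Euler result drifts well outside the circle while the RK4 result stays very close to radius 1, which is why the choice of solver is worth exposing as an extension point.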
![Page 39: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/39.jpg)
#4: How to parallelize?
avtICAlgorithm
  - avtParDomICAlgorithm (parallel over data)
  - avtSerialICAlgorithm (parallel over seeds)
  - avtMasterSlaveICAlgorithm
![Page 40: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/40.jpg)
#5: How do you analyze particle path?
• avtIntegralCurve::AnalyzeStep() = 0;
  - All AnalyzeStep will evaluate termination criteria
• avtPICSFilter::CreateIntegralCurveOutput( std::vector<avtIntegralCurve*> &) = 0;
• Examples:
  - Streamline: store location and scalars for current step in data members
  - Poincare: store location for current step in data members
  - FTLE: only store location of final step, no-op for preceding steps
• NOTE: these derived types create very different types of outputs.
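
The per-step analysis hook can be mimicked in Python as below. This is a sketch of the pattern only: the class names, the `record` method, and the fixed `max_steps` termination criterion are assumptions for illustration and do not reproduce the C++ classes.

```python
from abc import ABC, abstractmethod


class IntegralCurve(ABC):
    """Hypothetical Python analogue of an integral-curve base class."""

    def __init__(self, max_steps):
        self.max_steps = max_steps
        self.num_steps = 0
        self.finished = False

    def analyze_step(self, position):
        """Record the step, then evaluate the termination criterion."""
        self.num_steps += 1
        self.record(position)
        self.finished = self.num_steps >= self.max_steps

    @abstractmethod
    def record(self, position):
        ...


class StreamlineCurve(IntegralCurve):
    def __init__(self, max_steps):
        super().__init__(max_steps)
        self.points = []

    def record(self, position):
        self.points.append(position)       # keep every location


class FTLECurve(IntegralCurve):
    def __init__(self, max_steps):
        super().__init__(max_steps)
        self.final = None

    def record(self, position):
        if self.num_steps >= self.max_steps:
            self.final = position          # only the last location matters
        # earlier steps are a no-op


streamline, ftle = StreamlineCurve(3), FTLECurve(3)
for x in [0.1, 0.2, 0.3]:
    streamline.analyze_step(x)
    ftle.analyze_step(x)
```

Both curve types see the same sequence of steps, yet one ends up holding the whole path and the other a single point, which matches the note that derived types produce very different outputs.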
![Page 41: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/41.jpg)
Putting it all together
PICS Filter
  - avtICAlgorithm
  - avtIVPSolver
  - avtIVPField
  - Vector<avtIntegralCurve>
Integral curves sent to other processors with some derived types of avtICAlgorithm.
::CreateInitialLocations() = 0;
::AnalyzeStep() = 0;
![Page 42: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/42.jpg)
VisIt is an open source, richly featured, turn-key application for large data.
• Used by:
  - Visualization experts
  - Simulation code developers
  - Simulation code consumers
• Popular
  - R&D 100 award in 2005
  - Used on many of the Top500
  - >>>100K downloads
[Figure: 217 pin reactor cooling simulation, run on ¼ of Argonne BG/P; 1 billion grid points / time slice. Image credit: Paul Fischer, ANL]
![Page 43: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/43.jpg)
Final thoughts…
• Summary:
  - Particle advection is important for understanding flow and efficiently parallelizing this computation is difficult.
  - We have developed a freely available system for doing this analysis for large data.
• Documentation:
  - (PICS) http://www.visitusers.org/index.php?title=Pics_dev
  - (VisIt) http://www.llnl.gov/visit
• Future work:
  - UI extensions, including Python
  - Additional analysis techniques (FTLE & more)
![Page 44: A Framework for Particle Advection for Very Large Data](https://reader035.fdocuments.us/reader035/viewer/2022081517/56816258550346895dd2a978/html5/thumbnails/44.jpg)
Acknowledgements
• Funding: This work was supported by the Director, Office of Science, Office of Advanced Scientific Computing Research, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 through the Scientific Discovery through Advanced Computing (SciDAC) program's Visualization and Analytics Center for Enabling Technologies (VACET).
• Program Manager: Lucy Nowell
• Master-Slave Algorithm: Dave Pugmire (ORNL), Hank Childs (LBNL/UCD), Christoph Garth (Kaiserslautern), Sean Ahern (ORNL), and Gunther Weber (LBNL)
• PICS framework: Hank Childs (LBNL/UCD), Dave Pugmire (ORNL), Christoph Garth (Kaiserslautern), David Camp (LBNL/UCD), Allen Sanderson (Univ. of Utah)