Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or-...
![Page 1: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/1.jpg)
Dan Bennett
Oct 26, 2005
Steering and Visualization of Batch Style Distributed Computations
-or-
What I did on my Summer Vacation
![Page 2: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/2.jpg)
Overview
● Review of Parallel and Distributed Computing
● STV middleware
● Applications of STV middleware
– Steering and visualization of an MD simulation
– Model coupling, weather simulation
– Checkpointing, liquid crystal simulation
● My ultimate project
● Conclusions
![Page 3: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/3.jpg)
Types of Parallel Computers
● Shared Memory Multiprocessor
● Cluster
● MPP
● Vector Processors
● Hybrid
● Supercomputer - anything on the Top 500 list
Rachel and Jonas at PSC, SMPs with 64 processors
Cray XT3 MPP at PSC, 2068 processors, custom interconnect, #33 Top 500 (July 05)
Lemieux at PSC, 750 4-processor node cluster, #68 Top 500
Retired T3E at PSC
![Page 4: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/4.jpg)
What I Work With
● Cluster, but perhaps of SMPs
● Message Passing (MPI, PVM)
● SPMD – Single Program Multiple Data
● Braveheart
● Interprocess communications dominate the computation
● CUMULVS
Braveheart, MCS Cluster
![Page 5: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/5.jpg)
Supercomputer Center
Most large machines require batch submission
– Allows for priority scheduling
– And maximum utilization of resources
PSC allows for interactive jobs, but they must be submitted to the batch queue
OSC states: “The [fill in computer name here] has fixed usage limits for any interactive execution”
![Page 6: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/6.jpg)
Parallel Simulations in Batch
● This has the effect of removing the scientist from “the loop”
– Set up parameters
– Submit job
– Post-process output
● The computation becomes a black box
– No way to observe what is happening inside
![Page 7: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/7.jpg)
Distributed Programs/Data
● Collecting data to a central location takes time.
● You need to collect it for visualization/inspection of the computation
● You don't always want to collect it for processing
In a weather simulation, data at processor borders (indicated in yellow) needs to be shared with the neighbors.
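The border sharing described for the weather simulation can be illustrated with a small serial sketch. It is not taken from the talk: the function name `exchange_halos` and the one-dimensional layout (one ghost cell on each side of every subdomain) are assumptions made only for illustration. A real code would do this with message passing between processors.

```python
def exchange_halos(subdomains):
    """Copy border values between neighboring subdomains (serial stand-in).

    Each subdomain is a list laid out as [ghost_left, interior..., ghost_right].
    Only ghost cells are written and only interior cells are read, so the
    order in which the subdomains are visited does not matter.
    """
    last = len(subdomains) - 1
    for rank, local in enumerate(subdomains):
        if rank > 0:
            # Left ghost cell receives the left neighbor's last interior cell.
            local[0] = subdomains[rank - 1][-2]
        if rank < last:
            # Right ghost cell receives the right neighbor's first interior cell.
            local[-1] = subdomains[rank + 1][1]
    return subdomains


# Three "processors", each owning three interior cells.
domains = [[0, 1, 2, 3, 0], [0, 4, 5, 6, 0], [0, 7, 8, 9, 0]]
exchange_halos(domains)
```

Only the border cells move between neighbors here; nothing is gathered to a central location, which is the distinction the slide draws between processing and visualization.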
![Page 8: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/8.jpg)
Steering and Visualization Software
● Extracts data from distributed computation
● Minimally invasive (few lines of added code)
● Minimal impact on performance
● CUMULVS and others.
● CUMULVS is from ORNL
● It is part of the ACTS toolkit
● It is part of an ongoing research project
● It runs under PVM, but works with MPI
![Page 9: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/9.jpg)
CUMULVS
● Extracts the data from the distributed computation
● Synchronizes this data
● Delivers this data to a visualization client
● Receives data from a steering client
● Delivers this data to the distributed computation.
![Page 10: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/10.jpg)
Case 1, Steering and Visualization of ALCMD
● ALCMD - Ames Lab Classical Molecular Dynamics simulation
– MPI-based FORTRAN code, ~6K lines
– Spatial decomposition of data, irregularly distributed
An MD computation spread over 8 nodes. The data from node four is shown in the exploded view.
![Page 11: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/11.jpg)
Case 1, Instrument the Code
● Insert commands to
– Initialize the STV environment
    call stvfinit(FSIMPLGRP, TAG, nprocs, myproc, info)
– Define parameters
  ● Class 1, global scalar steering parameters (e.g. temperature)
    call stvfparamdefine(temp,'Temp',STVDOUBLE,STVVIZONLY,parID(2))
  ● Class 2, distributed data (e.g. ID of each molecule)
    call stvfpfielddefine('id',STVINT,1,getid,1,0,0,STVVISONLY,ipd)
![Page 12: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/12.jpg)
Results
Conditions
● 16K atoms
● 4K time steps
● 6 processors
4 Runs
● No instrumentation
● Instrumentation, no extraction
● Extract a scalar value
● Extract 3 scalar values

| Run | I | II | III | IV |
|---|---|---|---|---|
| MFLOPS | 2448 | 2453 | 2043 | 2037 |
| Time | 100 | 100 | 120 | 120 |
A screen shot of ALCMD running on 8 nodes, along with visualizations of extracted data.
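The cost of extraction can be read off the MFLOPS and Time figures with a little arithmetic. The sketch below assumes that columns I to IV correspond to the four runs in the order listed and that run I is the baseline. The slide gives no unit for Time, so only relative changes are computed. The names `RUNS`, `percent_change` and `overhead_table` are illustrative.

```python
# Figures copied from the results table; the column-to-run mapping is assumed.
RUNS = {
    "I": {"mflops": 2448, "time": 100},
    "II": {"mflops": 2453, "time": 100},
    "III": {"mflops": 2043, "time": 120},
    "IV": {"mflops": 2037, "time": 120},
}


def percent_change(baseline, value):
    """Relative change of value with respect to baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


def overhead_table(runs, baseline="I"):
    """Map each run to (MFLOPS change %, time change %) versus the baseline."""
    base = runs[baseline]
    return {
        name: (
            round(percent_change(base["mflops"], r["mflops"]), 1),
            round(percent_change(base["time"], r["time"]), 1),
        )
        for name, r in runs.items()
    }


for name, (d_mflops, d_time) in overhead_table(RUNS).items():
    print(f"Run {name}: MFLOPS {d_mflops:+.1f}%, time {d_time:+.1f}%")
```

Under these assumptions, instrumentation alone leaves the figures essentially unchanged, while extracting data costs about 17% in MFLOPS and 20% in time, whether one scalar or three are extracted.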
![Page 13: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/13.jpg)
Further Results
● 6413 total lines of code
● 200 lines added
● 3% increase
● Found problems with CUMULVS
– Poor documentation
– Missing commands
– Missing functionality
● But the package is usable for our purposes.
![Page 14: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/14.jpg)
Case 1, Work to Do
● Re-do steering of a scalar variable
– This was working once, but never on a cluster
● Produce a number of steering clients
– Scalar Change Widget
– Delta Widget
– Molecule Browser
● Improve visualization
● Add checkpointing
● Extract additional data
● Work on STV toolkit
![Page 15: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/15.jpg)
Why Did I Do This?
● To gain experience with CUMULVS
– Look for areas of improvement
– Understand how CUMULVS works
● Basis for future experimentation
● Plan to experiment with other STV packages; this is a “control” package.
![Page 16: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/16.jpg)
Case 2, Vorticity Budget of an MCV
● An opportunity for work is with Jim Kirk
● Jim studies Mesoscale Convective Vortices
● He wishes to perform Vorticity Budgets and Thermodynamic Budgets
● Data comes from MM5, which can run as a distributed computation.
![Page 17: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/17.jpg)
Case 2, MM5
● MM5 produces huge amounts of data each timestep (90 seconds of simulated time)
● Normally this is only saved every half hour of simulated time (at most)
● We don’t know when to save more often (i.e. when the MCV starts)
● Solution: use CUMULVS to extract the data, detect the MCV, and save data.
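The saving policy sketched in this slide can be written down in a few lines. With a 90-second timestep and a half-hour save interval, only one timestep in 20 is normally kept. The code below keeps that routine schedule and additionally saves every timestep on which a detector fires. The detector is hypothetical: here it is just a scalar indicator compared against a threshold, standing in for whatever test would actually recognize an MCV. None of the names come from MM5 or CUMULVS.

```python
TIMESTEP_S = 90                      # simulated seconds per timestep (from the slide)
ROUTINE_SAVE_INTERVAL_S = 30 * 60    # half an hour of simulated time (from the slide)
STEPS_PER_ROUTINE_SAVE = ROUTINE_SAVE_INTERVAL_S // TIMESTEP_S  # 20 timesteps


def steps_to_save(indicator, threshold):
    """Return the timestep indices whose data should be written out.

    indicator: one hypothetical "MCV strength" value per timestep.
    A step is saved if it falls on the routine schedule or if the
    indicator reaches the threshold on that step.
    """
    saved = []
    for step, value in enumerate(indicator):
        routine = step % STEPS_PER_ROUTINE_SAVE == 0
        event = value >= threshold
        if routine or event:
            saved.append(step)
    return saved


# 50 timesteps with a short event on steps 25 to 27.
signal = [0.0] * 50
signal[25:28] = [1.0, 1.0, 1.0]
print(steps_to_save(signal, threshold=0.5))
```

The point of the design is that the decision to save is made while the data is still in the running computation, so the interesting timesteps are never discarded.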
![Page 18: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/18.jpg)
Case 2, Predicted Work
● This will involve:
– Intelligent viewers to detect the MCV
– Intelligent viewers to perform the various budgets
● Detecting an MCV may be a parallel computation as well.
● This is model coupling
![Page 19: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/19.jpg)
Model Coupling
● Build a viewer that is a parallel computation.
● Each node only extracts the data that it needs.
● This is a natural extension of STV software
● And is an active area of research
![Page 20: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/20.jpg)
Checkpointing● Save a copy of a process so if it fails it can be
restarted.
● The state includes registers and memory.
● To do this: Save the state of a process in execution
– Save local states
– Save messages in communication channels
– We must be able to recreate a state of the computation
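A minimal application-level sketch of these steps for a single process follows. It is an illustration under stated assumptions, not the mechanism of any STV package: the state is whatever the program chooses to describe as a dictionary (registers are not captured), messages still in the communication channels are represented as a plain list, and the function names are invented for the example.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, local_state, in_flight_messages):
    """Write local state plus undelivered messages to a checkpoint file."""
    snapshot = {"local_state": local_state, "in_flight": in_flight_messages}
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as fh:
        pickle.dump(snapshot, fh)
    # Rename only after the write finished, so a crash never leaves a
    # half-written file under the final name.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Recreate the saved state: returns (local_state, in_flight_messages)."""
    with open(path, "rb") as fh:
        snapshot = pickle.load(fh)
    return snapshot["local_state"], snapshot["in_flight"]


# Usage: checkpoint, then restart from the file.
workdir = tempfile.mkdtemp()
ckpt = os.path.join(workdir, "rank0.ckpt")
save_checkpoint(ckpt, {"step": 1200, "temp": 0.75}, [("rank1", "border data")])
state, pending = load_checkpoint(ckpt)
```

In a distributed run every process would write such a file, and the hard part is making the set of files describe one consistent state of the whole computation.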
![Page 21: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/21.jpg)
Checkpointing and STV Software
● Checkpointing is tough because
– You must describe what values are to be saved
– You must collect the save file to a central location
● STV packages allow you to do this for visualization already
● This is a natural extension to these packages.
– They need software to manage checkpoint files
– They need software to restart a computation
![Page 22: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/22.jpg)
Case 3, Liquid Crystal Modelling
● Liquid crystalline material behaves both like a liquid and a crystal
– Molecules are free to fit the container
– But have orientational and some positional order
Visualization of a liquid crystal simulation
Relative energy is indicated by color
The eigenvalues of the Q tensor determine shape
The eigenvectors of the Q tensor determine orientation
![Page 23: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/23.jpg)
The Model
● Represented on a regular 3D grid by a 3x3 symmetric traceless tensor Q.
● The desire is to minimize the free energy F(Q,τ).
– τ is a parameter equivalent to temperature.
Set τ → find equilibrium → calculate new τ
● The model encounters bifurcation points during this procedure.
![Page 24: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/24.jpg)
Bifurcation
As I don't know it!
● Some systems of equations can experience a dramatic change in behavior from the slightest change of a parameter.
● The following represent the behavior of a continuously stirred tank reactor
![Page 25: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/25.jpg)
Small Changes in Lambda
Lead to large changes in the behavior of the model.
I used Runge-Kutta to solve the previous equations, varying lambda as shown.
Lambda = 0.128225884245
Lambda = 0.128225884244
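The reactor equations themselves are only on the slide images, so the sketch below uses a stand-in: the one-variable equation dx/dt = lambda + x^2, which has a bifurcation at lambda = 0. It shows the same kind of experiment with a classical fourth-order Runge-Kutta step: integrate from the same starting point for two nearby values of lambda and compare the outcome. The function names, step size and escape bound are illustrative choices, not values from the talk.

```python
def rk4_step(f, t, x, dt):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + dt / 2, x + dt / 2 * k1)
    k3 = f(t + dt / 2, x + dt / 2 * k2)
    k4 = f(t + dt, x + dt * k3)
    return x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(lam, x0=-1.0, dt=0.01, t_end=50.0, escape=1.0e3):
    """Integrate the stand-in model dx/dt = lam + x*x from x0.

    Returns (t, x, escaped). Integration stops early once |x| exceeds
    the escape bound, which marks a solution running off to infinity.
    """
    def rhs(t, x):
        return lam + x * x

    t, x = 0.0, x0
    for _ in range(int(round(t_end / dt))):
        x = rk4_step(rhs, t, x, dt)
        t += dt
        if abs(x) > escape:
            return t, x, True
    return t, x, False


# Same start, parameter on either side of the bifurcation at lam = 0.
below = integrate(-0.01)   # settles near the fixed point x = -0.1
above = integrate(+0.01)   # no fixed point exists: the solution escapes
```

For this stand-in the change in behavior happens across lambda = 0. The slide's two lambda values play the same role for the reactor model, differing only in the twelfth decimal place.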
![Page 26: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/26.jpg)
Bifurcation Points
● Occur when the Jacobian becomes singular
● Can be detected, but it seems a real bear to do so.
– Probably involves some human intervention (now)
– Read: computational steering
● Can apparently cause multiple “paths” that can be followed, which can lead to physically unrealistic situations in simulations.
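One simple way to look for such a point, on a problem small enough to do by hand, is to watch the determinant of the Jacobian along a known solution branch and bisect where it changes sign, since a zero determinant means a singular Jacobian. The two-equation system below is a toy chosen for illustration (it is not the liquid crystal model), and the finite-difference Jacobian and function names are assumptions of the sketch. For systems with hundreds of thousands of unknowns a determinant is not a practical test.

```python
def residual(u, lam):
    """Toy system F(u; lam) = 0 with the trivial solution u = (0, 0)."""
    x, y = u
    return (lam * x - x**3 + y, -y + x * y)


def jacobian_det(u, lam, h=1e-6):
    """Determinant of the 2x2 Jacobian dF/du, by central differences."""
    cols = []  # cols[j][i] approximates dF_i / du_j
    for j in range(2):
        up, dn = list(u), list(u)
        up[j] += h
        dn[j] -= h
        f_up, f_dn = residual(up, lam), residual(dn, lam)
        cols.append([(a - b) / (2 * h) for a, b in zip(f_up, f_dn)])
    return cols[0][0] * cols[1][1] - cols[1][0] * cols[0][1]


def locate_singular_point(u, lam_lo, lam_hi, tol=1e-10):
    """Bisect on lam for a sign change of det(J) along the branch u."""
    d_lo = jacobian_det(u, lam_lo)
    if d_lo * jacobian_det(u, lam_hi) > 0:
        raise ValueError("determinant does not change sign on this interval")
    while lam_hi - lam_lo > tol:
        mid = 0.5 * (lam_lo + lam_hi)
        d_mid = jacobian_det(u, mid)
        if d_lo * d_mid <= 0:
            lam_hi = mid
        else:
            lam_lo, d_lo = mid, d_mid
    return 0.5 * (lam_lo + lam_hi)


lam_star = locate_singular_point((0.0, 0.0), -0.7, 0.5)
```

On the trivial branch of this toy system the determinant is -lambda, so the search converges to lambda = 0, where new solution paths branch off.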
![Page 27: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/27.jpg)
Bifurcation, and Liquid Crystals
● It happens, and we don’t like it!
● The solution to date: pick some path out of the bifurcation point and follow it
● This leads to “useful” but not necessarily correct simulations.
![Page 28: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/28.jpg)
Finding Bifurcation Points in the Model
● The system yields a system of n nonlinear equations (n > 3 × 10^5)
● Solve this using Newton’s method.
● Things get really touchy near the bifurcation point
– May need to switch numeric methods used.
● After the paths out of the bifurcation point are found, there is no way to predict the correct path without taking a few steps down the path.
● This leads to the need for checkpointing/rollback
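For reference, Newton's method on a system small enough to solve by hand looks like the sketch below. It is a two-equation illustration only: the real problem has hundreds of thousands of unknowns and would need a sparse linear solver rather than Cramer's rule. The example system, the function names and the tolerances are assumptions. The determinant check marks the place where the iteration becomes "touchy", since the linear solve breaks down as the Jacobian approaches singularity.

```python
def newton_2d(F, J, u0, tol=1e-12, max_iter=50):
    """Newton's method for two equations in two unknowns.

    F(x, y) returns (f1, f2); J(x, y) returns ((a, b), (c, d)), the Jacobian.
    Each iteration solves J * (dx, dy) = -F by Cramer's rule.
    Returns ((x, y), iterations_used).
    """
    x, y = u0
    for iteration in range(max_iter):
        f1, f2 = F(x, y)
        if max(abs(f1), abs(f2)) < tol:
            return (x, y), iteration
        (a, b), (c, d) = J(x, y)
        det = a * d - b * c
        if abs(det) < 1e-14:
            # Near a bifurcation point the Jacobian is (nearly) singular.
            raise ZeroDivisionError("Jacobian is singular to working precision")
        dx = (-f1 * d + b * f2) / det
        dy = (-a * f2 + c * f1) / det
        x, y = x + dx, y + dy
    raise RuntimeError("Newton iteration did not converge")


def circle_and_line(x, y):
    """Example system: x^2 + y^2 = 4 and x = y."""
    return (x * x + y * y - 4.0, x - y)


def circle_and_line_jac(x, y):
    return ((2.0 * x, 2.0 * y), (1.0, -1.0))


root, iterations = newton_2d(circle_and_line, circle_and_line_jac, (1.0, 1.0))
```

Starting from (1, 1) the iteration converges to x = y = sqrt(2). Started from (0, 0), where the Jacobian of this example is singular, it raises instead of returning a meaningless step.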
![Page 29: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/29.jpg)
Final Project
● Full steering and visualization for the liquid crystal code
– Including numerical methods
● Model coupling to determine bifurcation points
● Rollback/restart when the wrong path is chosen.
● Steering through bifurcation points
● Distributed rendering
![Page 30: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/30.jpg)
And …
● Improve some STV package
– Make it more bifurcation friendly
  ● Specialized checkpointing/rollback features
  ● Other necessary improvements
– Add other needed features
  ● The ability to document extracted data better
● Create an STV client toolkit
– New steering clients
– New visualization clients
● Whatever else I am told to do!
![Page 31: Dan Bennett Oct 26, 2005 Steering and Visualization of Batch Style Distributed Computations -or- What I did on my Summer Vacation.](https://reader035.fdocuments.us/reader035/viewer/2022062720/56649efe5503460f94c13a84/html5/thumbnails/31.jpg)
The End
Questions?
Thanks!