Digital Human: Simulation of the Heart and other...
![Page 1: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/1.jpg)
Digital Human: Simulation of the Heart and other Organs
Kathy Yelick
EECS Department
U.C. Berkeley
![Page 2: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/2.jpg)
The 20+ Year Vision
• Imagine a “digital body double”
– 3D image-based medical record
– Includes diagnostic, pathologic, and other information
• Used for:
– Diagnosis
– Less invasive surgery-by-robot
– Experimental treatments
• Digital Human Effort
– Led by the Federation of American Scientists
![Page 3: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/3.jpg)
Digital Human Today: Imaging
• The Visible Human Project
– 18,000 digitized sections of the body
• Male: 1 mm sections, released in 1994
• Female: 0.33 mm sections, released in 1995
– Goals
• Study of human anatomy
• Testing medical imaging algorithms
– Current applications:
• Educational, diagnostic, treatment planning, virtual reality, artistic, mathematical, and industrial
• Used by > 1,400 licensees in 42 countries
Image Source: www.madsci.org
![Page 4: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/4.jpg)
Digital Human Roadmap
[Roadmap figure, timeline 1995 to 2010]
• Milestones: 1 organ 1 model → multiple models → multiple organs → organ system → digital human
• Computer systems: scalable parallel implementations, faster computers, improved programmability
• Algorithms & mathematics: 3D model construction, faster algorithms, coupled models
![Page 5: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/5.jpg)
Organ Simulation
Cardiac cells/muscles
– SDSC, Auckland, UW, Utah
Cardiac flow
– NYU,…
Lung transport
– Vanderbilt
Lung flow
– ORNL
Cochlea
– Caltech, UM, UCB
Kidney mesh generation
– Dartmouth
Electrocardiography
– Johns Hopkins,…
Skeletal mesh generation
Brain
– Ellisman
Just a few of the efforts at understanding and simulating parts of the human body
![Page 6: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/6.jpg)
Immersed Boundaries within the Body
• Fluid flow within the body is one of the major challenges, e.g.,
– Blood through the heart
– Coagulation of platelets in clots
– Effect of sound waves on the inner ear
– Movement of bacteria
• A key problem is modeling an elastic structure immersed in a fluid
– Irregular moving boundaries
– Wide range of scales
– Vary by structure, connectivity, viscosity, external forces, internally-generated forces, etc.
![Page 7: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/7.jpg)
Heart Simulation
Developed by Peskin and McQueen at NYU
– Ran on vector and shared memory machines
– 100 CPU hours on a Cray C90
– Models blood flow in the heart
– Immersed boundaries are individual muscle fibers
– Rules for contraction, valves, etc. included
– Applications:
• Understanding structural abnormalities
• Evaluating artificial heart valves
• Eventually, artificial hearts
Source: www.psc.org
![Page 8: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/8.jpg)
Platelet Coagulation
• Developed by Fogelson and Peskin
– Simulation of blood clotting in 2D
– Immersed boundaries are
• Cell walls, represented by polygons
• Artery walls
– Rules added to simulate adhesion
– For vector and shared memory machines
– We did earlier work on this 2D problem in Split-C
![Page 9: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/9.jpg)
Cochlea Simulation
– Simulates fluid-structure interactions due to incoming sound waves
– Potential applications: design of cochlear implants
• Simulation by Givelberg and Bunn
– In OpenMP
– 18 hours on HP Superdome
• Embedded 2D structures are
– Elastic membranes and shells
![Page 10: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/10.jpg)
Insect Flight Simulation
• Work on insect flight
– Wings are immersed 2D structures
• Under development
– UW (Wang) and NYU (Miller)
• Applications to
– Insect robot design
Source: Dickinson, UCB
![Page 11: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/11.jpg)
Small Animal Motion
• Simulation of small animal motion by Fauci, Dillon and others
– Swimming of eels, sperm, and bacteria
– Crawling motion of amoeba
– Biofilms, such as plaque, with multiple micro-organisms
• Applications at a smaller scale
– Molecular motors, fluctuations in DNA
– Thermal properties may become important
• Brownian motion extension by Kramer, RPI
![Page 12: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/12.jpg)
Other Applications
• The immersed boundary method has also been used, or is being applied, to
– Flags and parachutes
– Flagella
– Embryo growth
– Valveless pumping (E. Jung)
– Paper making
– Whirling instability of an elastic filament (S. Lim)
– Flow in collapsible tubes (M. Rosar)
– Flapping of a flexible filament in a flowing soap film (L. Zhu)
– Deformation of red blood cells in shear flow (Eggleton and Popel)
![Page 13: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/13.jpg)
Immersed Boundary Simulation Framework
[Framework figure: three stages]
• Model Builder: C++, workstation
• Immersed Boundary Simulation: Titanium; vector machines, shared and distributed memory parallel machines, PC clusters
• Visualization / Data Analysis: C++/OpenGL, Java3D; workstation, PC
![Page 14: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/14.jpg)
Building 3D Models from Images
Image data from
• Visible Human
• MRI
• Laboratory experiments
Automatic construction
• Surface mesh
• Volume mesh
• John Sullivan et al, WPI
Image source: John Sullivan et al, WPI
![Page 15: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/15.jpg)
Heart Structure Model
• Current model is based on three types of cones to construct ventricles
– Outer/Inner layer
– Right-Inner/Left-Outer
– Left Cylinder layer
• Advantages: simple model
• Disadvantages: unrealistic and time-consuming to compute
![Page 16: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/16.jpg)
Old Heart Model
• Full structure shows cone shape
• Includes atria, ventricles, valves, and some arteries
• The rest of the circulatory system is modeled by sources and sinks
![Page 17: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/17.jpg)
New Heart Model
• New model replaces the geodesics with triangulated surfaces
• Based on CT scans from a healthy human
• Triangulated surface of left ventricle is shown
Work by:
• Peskin & McQueen, NYU
• Paragios & O’Donnell, Siemens
• Setserr, Cleveland Clinic
![Page 18: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/18.jpg)
![Page 19: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/19.jpg)
Structure of the Middle Ear
Transmission of sound wave energy by the ossicles from the ear drum into the cochlear canal
The ossicles: malleus, incus, stapes
[Figure labels: ear drum, cochlear canal]
Sound Energy → Cochlear Waves
![Page 20: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/20.jpg)
Cochlea and Semi-circular Canals
• The inner ear is a fluid-filled cavity containing the cochlea and the semi-circular canals
• Semi-circular canals are responsible for balance
• The fluid is incompressible and viscous
• Input is from the stapes knocking on the oval window; the round window is covered by a membrane to conserve volume
[Figure scale bar: 1 cm]
![Page 21: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/21.jpg)
Schematic Description of the Cochlea
[Figure: schematic of the cochlea. Labels: scala vestibuli, scala tympani, the cochlear partition (bony shelf, basilar membrane), oval window, round window, helicotrema. Dimensions shown: 3.5 cm, 0.52 mm, 0.15 mm]
![Page 22: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/22.jpg)
Apical Turn of the Cochlea
![Page 23: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/23.jpg)
Geometry of the Cochlea Model
![Page 24: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/24.jpg)
Immersed Boundary Equations
![Page 25: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/25.jpg)
First Order Immersed Boundary Method
Compute the force f the immersed material applies to the fluid.
Compute the force applied to the fluid grid:
Solve the Navier-Stokes equations:
Move the material:
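The equations on this slide did not survive extraction. As a hedged reconstruction, the four steps can be written in the standard first-order immersed boundary form; the notation here is an assumption and may differ from the slide. X^n(q) is the position of material point q at step n, F^n the material force, S a force operator defined by the material model, f^n the force density on the fluid grid, u and p the fluid velocity and pressure, δ_h a regularized delta function on a grid of spacing h, and ρ, μ the fluid density and viscosity.

```latex
\begin{align*}
\mathbf{F}^n(q) &= \mathcal{S}\bigl(\mathbf{X}^n\bigr)(q) \\
\mathbf{f}^n(\mathbf{x}) &= \sum_q \mathbf{F}^n(q)\,
    \delta_h\bigl(\mathbf{x}-\mathbf{X}^n(q)\bigr)\,\Delta q \\
\rho\left(\frac{\mathbf{u}^{n+1}-\mathbf{u}^n}{\Delta t}
    + \mathbf{u}^n\cdot\nabla_h\mathbf{u}^n\right)
  &= -\nabla_h p^{n+1} + \mu\,\Delta_h\mathbf{u}^{n+1} + \mathbf{f}^n,
  \qquad \nabla_h\cdot\mathbf{u}^{n+1} = 0 \\
\mathbf{X}^{n+1}(q) &= \mathbf{X}^n(q) + \Delta t \sum_{\mathbf{x}}
    \mathbf{u}^{n+1}(\mathbf{x})\,
    \delta_h\bigl(\mathbf{x}-\mathbf{X}^n(q)\bigr)\,h^3
\end{align*}
```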
![Page 26: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/26.jpg)
Immersed Boundary Method
Combines Lagrangian and Eulerian components
• Viscous incompressible fluid: Navier-Stokes equations discretized on a periodic rectangular 3-d grid
• Fourth order PDE discretized on a 2-d grid
• Hooke’s spring law discretized on a 1-d grid
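As an illustration of Hooke's spring law on a 1-d grid of material points, here is a minimal sketch. It assumes an open chain of points joined by identical linear springs; the function name `fiber_forces` and the parameters `stiffness` and `rest_length` are hypothetical and not taken from the simulation code.

```python
import numpy as np

def fiber_forces(X, stiffness, rest_length):
    """Net Hooke's-law force on each point of an open chain X with shape (n, d)."""
    d = X[1:] - X[:-1]                            # link vectors between neighbours
    length = np.linalg.norm(d, axis=1)            # current link lengths
    tension = stiffness * (length - rest_length)  # positive when stretched
    tangent = d / length[:, None]                 # unit vector along each link
    link_force = tension[:, None] * tangent       # force on the left endpoint of each link
    F = np.zeros_like(X)
    F[:-1] += link_force                          # left endpoint is pulled toward the right one
    F[1:] -= link_force                           # equal and opposite on the right endpoint
    return F
```

Because each link contributes equal and opposite forces, the forces on the chain sum to zero, which is a useful sanity check for any material model plugged into the method.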
![Page 27: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/27.jpg)
Immersed Boundary Method Structure
4 steps in each timestep:
• Material activation & force calculation (material points)
• Spread force (interaction)
• Navier-Stokes solver (fluid lattice)
• Interpolate velocity (interaction)
[Figure: 2D Dirac delta function]
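The two interaction steps, spreading force and interpolating velocity, can be sketched in 2D as follows. This is an illustrative sketch, not the simulation's code: it assumes a periodic N×N grid with spacing h and N ≥ 4, and uses Peskin's 4-point regularized delta function; the function names are hypothetical.

```python
import numpy as np

def delta4(r):
    """Peskin's 4-point regularized delta kernel, r in grid units."""
    r = np.abs(np.asarray(r, dtype=float))
    out = np.zeros_like(r)
    inner = r < 1
    outer = (r >= 1) & (r < 2)
    out[inner] = (3 - 2 * r[inner] + np.sqrt(1 + 4 * r[inner] - 4 * r[inner] ** 2)) / 8
    out[outer] = (5 - 2 * r[outer] - np.sqrt(-7 + 12 * r[outer] - 4 * r[outer] ** 2)) / 8
    return out

def weights(X, h, N):
    """Periodic grid indices and tensor-product weights for one point X = (x, y)."""
    base = np.floor(X / h).astype(int)
    offs = np.arange(-1, 3)              # the 4 grid points of support per dimension
    ix, iy = base[0] + offs, base[1] + offs
    wx = delta4(X[0] / h - ix)
    wy = delta4(X[1] / h - iy)
    return ix % N, iy % N, np.outer(wx, wy)

def spread_force(Xs, Fs, h, N):
    """Spread point forces Fs at positions Xs onto a grid force density."""
    f = np.zeros((N, N, 2))
    for X, F in zip(Xs, Fs):
        ix, iy, w = weights(X, h, N)
        f[np.ix_(ix, iy)] += w[:, :, None] * F / h**2
    return f

def interpolate_velocity(Xs, u, h, N):
    """Interpolate the grid velocity u, shape (N, N, 2), to the material points."""
    U = np.zeros((len(Xs), 2))
    for k, X in enumerate(Xs):
        ix, iy, w = weights(X, h, N)
        U[k] = np.tensordot(w, u[np.ix_(ix, iy)], axes=([0, 1], [0, 1]))
    return U
```

Each material point touches only a 4×4 patch of the fluid lattice, which is why the interaction becomes a scatter-gather problem once the lattice is distributed across processors.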
![Page 28: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/28.jpg)
Challenges to Parallelization
• Irregular material points need to interact with regular fluid lattice
– Trade-off between load balancing of material and minimizing communication
– Efficient “scatter-gather” across processors
• Need a scalable fluid solver
– Currently based on 3D FFT
• Requires an all-to-all “transpose”
– May try to use multigrid in the future
• Adaptive Mesh Refinement would help
![Page 29: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/29.jpg)
Parallel Algorithm
• Immersed materials are described by a hierarchy of 1d or 2d arrays
• Grids in current code
– Reside on a single processor
– Previous (and possibly future) versions may split them
• The 3D fluid grid uses a 1D distribution (slabs)
• Interactions between the fluid and material structures require inter-processor communication
• Special data structures are maintained for efficient communication
![Page 30: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/30.jpg)
Fluid Solver
• Incompressibility requires an elliptic solver
– High communication demand
– Information propagates across domain
• FFT-based solver divides domain into slabs
– Transposes before last direction of FFT
– Would like to use finer decomposition
[Figure label: 1D FFTs]
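The slab-plus-transpose structure of the FFT can be illustrated on a single machine. The sketch below only emulates the data movement with NumPy array splits (no real message passing), and the function name `slab_fft3d` is hypothetical: each "processor" transforms the two directions that are local to its slab, the data is redistributed (the all-to-all transpose), and the last direction is transformed.

```python
import numpy as np

def slab_fft3d(u, nprocs):
    """3D FFT of u, emulating a 1D slab decomposition over nprocs ranks."""
    # Each rank owns a slab of contiguous x-planes.
    slabs = np.array_split(u, nprocs, axis=0)
    # Local 2D FFTs over the y and z directions, which are complete within a slab.
    slabs = [np.fft.fft2(s, axes=(1, 2)) for s in slabs]
    # All-to-all "transpose": redistribute so each rank owns full x-lines.
    full = np.concatenate(slabs, axis=0)
    columns = np.array_split(full, nprocs, axis=1)
    # 1D FFTs along the remaining x direction.
    columns = [np.fft.fft(c, axis=0) for c in columns]
    return np.concatenate(columns, axis=1)
```

The result matches a direct `np.fft.fftn` of the same array. In a distributed setting, the redistribution step moves data between every pair of ranks, which is the communication cost the slide refers to.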
![Page 31: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/31.jpg)
Load Balancing
[Figure: “Egg slicer” and “Pizza cutter” partitionings]
[Chart: MFLOPS (0–450) vs. no. of processors (0–20) for Spread Force and Interpolate Velocity, each under pizza cutter and egg slicer partitioning]
Fluid grid is divided in slabs for 3D FFT
![Page 32: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/32.jpg)
Data Structures for Interaction
• Experimented with several methods
• Bounding box is the best (although it sends significantly more data than necessary)
[Chart: Cost of Interaction Methods. Time (msec, 0–2000), split into communication and setup, for the original, bounding box, hash, and boolean grid methods]
![Page 33: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/33.jpg)
Data Structures for Interaction
• Bounding box computes only low/high
• Logical grid of 4x4x4 cubes used to balance cost of communication and setup
• Communication aggregation also done
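A minimal sketch of the bounding-box idea, under stated assumptions: the fluid grid is N³ with spacing h and is split into equal x-slabs, periodic wrap-around is ignored (the box is clamped to the grid), and `pad` cells are added for the delta-function support. The function names are hypothetical, and the 4x4x4 logical grid and the aggregation mentioned above are not modeled.

```python
import numpy as np

def bounding_box(points, h, N, pad=2):
    """Low/high grid indices (inclusive) covering all points plus the delta support."""
    idx = np.floor(points / h).astype(int)
    lo = np.maximum(idx.min(axis=0) - pad, 0)
    hi = np.minimum(idx.max(axis=0) + pad, N - 1)
    return lo, hi

def slabs_touched(lo, hi, N, nprocs):
    """Ranks whose x-slabs overlap the box, assuming N is divisible by nprocs."""
    width = N // nprocs
    return list(range(int(lo[0]) // width, int(hi[0]) // width + 1))
```

Only the two corner indices need to be computed and exchanged, so setup is cheap; the cost is that every cell inside the box is sent, including cells that no material point touches.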
![Page 34: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/34.jpg)
Software Architecture
[Figure: layered architecture]
• Application Models: Heart (Titanium), Cochlea (Titanium+C), Flagellate Swimming, …
• Generic Immersed Boundary Method (Titanium)
• Solvers: Spectral (Titanium), Multigrid (Titanium), Multigrid MLC (KeLP), AMR
Extensible Simulation
– Can add new models by extending material points
– Uses Java inheritance for simplicity
![Page 35: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/35.jpg)
Performance Analysis
[Stacked bar chart: time breakdown (0–100%) for 256 on 2, 4, 8, 16, 32, and 64 processors and 512 on 32, 64, and 128 processors]
Phases in the breakdown: Compute F, Spread F, Pack F, Send F, Set F = 0, Unpack F, Exchange ghost, Upwind, Forward FFTs, SolveXformXEqns, Inverse FFTs, Copy fluid U, Pack U, Send velocities, Unpack U, move
![Page 36: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/36.jpg)
Tools for High Performance
Challenges to parallel simulation of a digital human are generic
• Parallel machines are too hard to program
– Users “left behind” with each new major generation
• Efficiency is too low
– Even after a large programming effort
– Single digit efficiency numbers are common
• Approach
– Titanium: A modern (Java-based) language that provides performance transparency
– BeBOP: Self-tuning scientific kernels
– GASNet: Portable fast communication
![Page 37: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/37.jpg)
Titanium Overview
Object-oriented language based on Java with:
• Scalable parallelism
– Single Program Multiple Data (SPMD) model of parallelism, 1 thread per processor
• Global address space
– Processors can read/write memory on other processors
– Pointer dereference can cause communication
• Intermediate point between message passing and shared memory
![Page 38: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/38.jpg)
Language Support for Performance
• Multidimensional arrays
– Contiguous storage
– Support for sub-array operations without copying
• Support for small objects
– E.g., complex numbers
– Called “immutables” in Titanium
– Sometimes called “value” classes
• Semi-automatic memory management
– Create named “regions” for new and delete
– Avoids distributed garbage collection
• Optimizations on parallel code
– Communication and memory hierarchies
![Page 39: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/39.jpg)
Global Address Space Languages
• Static parallelism (like MPI)
• Titanium is similar to UPC and Co-Array Fortran
• Globally shared address space is partitioned
• References (pointers) are either local or global (meaning possibly remote)
• Distributed arrays and pointer-based structures
[Figure: global address space across processors p0, p1, …, pn. Object heaps are shared; program stacks are private. Each processor holds local (l:) and global (g:) references to objects with fields such as x: 1, y: 2]
![Page 40: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/40.jpg)
Performance of Titanium Compiler
Performance on a Pentium IV (1.5GHz)
[Bar chart: MFlops (0–450) on the Overall, FFT, SOR, MC, Sparse, and LU benchmarks for java, C (gcc -O6), Ti, and Ti -nobc]
![Page 41: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/41.jpg)
Titanium Research Problems
• Analysis of explicitly parallel code
• Optimizations for
– Memory hierarchies
– Communication (overlap and aggregation)
– Synchronization
• Dynamic as well as static optimizations
– For sparse and unstructured data
– Extensible language (compiler support for scientific data structures)
• Lightweight one-sided communication
– Joint with UPC group
– GASNet layer
![Page 42: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/42.jpg)
Semantics: Sequential Consistency
• When compiling sequential programs, the reordering
x = expr1; y = expr2;   →   y = expr2; x = expr1;
is valid if y is not in expr1 and x is not in expr2 (roughly)
• When compiling parallel code, this test is not sufficient:
Initially flag = data = 0
Proc A: data = 1; flag = 1;
Proc B: while (flag!=1); ... = ...data...;
![Page 43: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/43.jpg)
Cycle Detection: Analysis Problem
• Processors define a “program order” on accesses from the same thread
– P is the union of these total orders
• Memory system defines an “access order” on accesses to the same variable
– A is the access order (read/write & write/write pairs)
• A violation of sequential consistency is a cycle in P ∪ A
• Intuition: time cannot flow backwards
[Figure: write data → write flag (Proc A); read flag → read data (Proc B)]
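To make the cycle criterion concrete, here is a small sketch that checks one observed execution of the flag/data example for a cycle in P ∪ A. It is only an illustration: the compiler analysis described above reasons statically about possible cycles, whereas this code inspects a single, hand-written access order. The event names and the `has_cycle` helper are hypothetical.

```python
def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}

    def visit(n):
        color[n] = GREY                      # n is on the current DFS path
        for m in graph[n]:
            if color[m] == GREY or (color[m] == WHITE and visit(m)):
                return True                  # back edge found
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

# Program order P: each processor's accesses in source order.
P = [("write data", "write flag"), ("read flag", "read data")]

# Access order in which B sees flag == 1 but still reads the old data.
A_violating = [("write flag", "read flag"), ("read data", "write data")]

# Access order in which B sees both writes.
A_consistent = [("write flag", "read flag"), ("write data", "read data")]
```

With the violating access order, the four events form the cycle write data → write flag → read flag → read data → write data, so no single global ordering of the accesses can explain what Proc B observed.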
![Page 44: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/44.jpg)
Automatic Performance Tuning
• Problem: low single-processor performance
– 100s of arithmetic operations per memory operation
– Complex processors and memory systems are challenging
– Techniques like tiling help, but parameters are hard to find
• Solution: let computers do automatic tuning
– FFTW, ATLAS (dense linear algebra), Titanium for multigrid
– BeBOP: sparse matrix kernels, optimizations depend on matrix
For sparse matrix-vector multiply and related kernels:
[Figure labels: machine profiler, Machine Profile, Representative Matrix, Maximum # vectors, optimizer, Data Structure Definition & Code, Matrix Conversion routine]
![Page 45: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/45.jpg)
Summary of Optimizations
• Optimizations for sparse matrix-vector multiply
– Register blocking: up to 4x
– Variable block splitting: 2.1x
– Diagonals: 2x
– Reordering to create dense structure + splitting: 2x
– Symmetry: 2.8x
– Cache blocking: 2.2x
– Multiple vectors (SpMM): 7x
– And combinations…
• Sparse triangular solve
– Hybrid sparse/dense data structure: 1.8x
• Higher-level kernels
– A·Aᵀ·x, Aᵀ·A·x: 4x
– A²·x: 2x
![Page 46: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/46.jpg)
Example: The Difficulty of Tuning
• Sparse matrix-vector multiply
• Register blocking: store each block contiguously with a single index
• Use 8x8 blocks to match structure, right?
• Source: NASA structural analysis problem
• Matrix:
– n = 21216
– nnz = 1.5 M
![Page 47: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/47.jpg)
Speedups from Blocking on Itanium 2
[Figure: Mflop/s for each register block size, with the reference implementation and the best block size, 4x2, marked]
The “natural” block size is far from optimal: search for best.
![Page 48: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/48.jpg)
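Register Profiles
[Figure: register-blocking profiles on four machines, with panels labeled Ultra 3 - 5%, Pentium III - 21%, Power4 - 16%, and Itanium 2 - 33%. Scale endpoints in the extracted text: 90 and 50 Mflop/s, 108 and 42 Mflop/s, 122 and 58 Mflop/s, 1.2 Gflop/s and 190 Mflop/s, 820 and 459 Mflop/s]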
![Page 49: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/49.jpg)
Extra Work Can Improve Efficiency!
• More complicated non-zero structure in general
• Example: 3x3 blocking
– Logical grid of 3x3 cells
– Fill-in explicit zeros
– Unroll 3x3 block multiplies
– “Fill ratio” = 1.5
• On Pentium III: 1.5x speedup!
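The fill ratio can be computed directly from a matrix's non-zero pattern. The sketch below is an illustration with hypothetical function names; it uses a dense NumPy array for clarity, and `blocked_spmv` assumes the matrix dimensions are divisible by the block size. A real register-blocked kernel would store the blocks in a compressed format and unroll the block multiply.

```python
import numpy as np

def nonzero_blocks(A, r, c):
    """Sorted (block row, block column) pairs of r x c blocks holding a non-zero."""
    rows, cols = np.nonzero(A)
    return sorted(set(zip((rows // r).tolist(), (cols // c).tolist())))

def fill_ratio(A, r, c):
    """Stored entries after r x c blocking (with explicit zeros) / true non-zeros."""
    return len(nonzero_blocks(A, r, c)) * r * c / np.count_nonzero(A)

def blocked_spmv(A, x, r, c):
    """y = A @ x computed one dense r x c block at a time."""
    y = np.zeros(A.shape[0])
    for bi, bj in nonzero_blocks(A, r, c):
        block = A[bi * r:(bi + 1) * r, bj * c:(bj + 1) * c]  # zeros filled in
        y[bi * r:(bi + 1) * r] += block @ x[bj * c:(bj + 1) * c]
    return y

# Synthetic example (an assumption, not the slide's matrix): two 3x3 diagonal
# blocks, each upper triangular, so 12 non-zeros are stored as 18 entries.
A = np.zeros((6, 6))
A[:3, :3] = np.triu(np.ones((3, 3)))
A[3:, 3:] = np.triu(np.ones((3, 3)))
```

For this synthetic matrix, 3x3 blocking gives a fill ratio of 18/12 = 1.5: half again as many stored values and multiplies, in exchange for one index per block and a fixed-size inner loop.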
![Page 50: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/50.jpg)
Network Performance Tuning
• Two-sided message passing (MPI) is not the fastest form of communication
• Low latency/overhead allows for easier implementations
[Stacked bar chart: time in usec (0–25), broken into Send Overhead (alone), Send & Rec Overhead, Rec Overhead (alone), and Added Latency, for T3E/MPI, T3E/Shmem, T3E/E-Reg, IBM/MPI, IBM/LAPI, Quadrics/MPI, Quadrics/Put, Quadrics/Get, M2K/MPI, M2K/GM, Dolphin/MPI, and Giganet/VIPL]
![Page 51: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/51.jpg)
Putting it all together
Performance of the Immersed Boundary code using:
• Titanium
• GASNet
• Automatically tuned FFTs (FFTW)
(No sparse matrices yet)
![Page 52: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/52.jpg)
Scaling Behavior (Synthetic Problem)
• Measured on the IBM SP at NERSC
• Also run on Itanium/Myrinet clusters and elsewhere
[Chart: Time per timestep. Time (secs, 0–60) vs. # procs (1–128) for the 256^3 and 512^3 problems]
![Page 53: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/53.jpg)
A Performance Model
• 512^3 in < 1 second per timestep is not possible
• A 10x increase in bisection bandwidth would fix this
[Chart: Performance Model Validation. Total time (secs, log scale 0.1–1000) vs. # procs (2–2048); series: 256 model, 256 actual, 512 model, 512 actual]
![Page 54: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/54.jpg)
Heart Simulation
Source: www.psc.org
Animation of lower portion of the heart
![Page 55: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/55.jpg)
Traveling Wave in the Cochlea
[Figure: the basilar membrane between the stapes (high frequencies) and the helicotrema (low frequencies), showing 4 successive wave snapshots (1–4), the wave envelope, and the characteristic frequency location]
![Page 56: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/56.jpg)
Sound Wave Propagation in Cochlea
• Centerline of the basilar membrane
• Response to 10 kHz frequency input
![Page 57: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/57.jpg)
Future Research
• Variable timestepping
• Improved scalability
– Finer decomposition of materials and fluid
– Multigrid or other solvers
• Second-order accurate method
• Adaptive Mesh Refinement
• Multi-physics models
– Incorporating diffusion, electrical models, etc.
– Needed for cardiac muscle contraction
– Needed for Outer Hair Cell in cochlea
![Page 58: Digital Human: Simulation of the Heart and other Organsaustin/ense622.d/lecture-resources/Digital...Digital Human: Simulation of the Heart and other Organs ... • Imagine a “digital](https://reader031.fdocuments.us/reader031/viewer/2022030505/5ab2a8a07f8b9aea528d83af/html5/thumbnails/58.jpg)
Collaborators
Titanium Faculty:
• Susan Graham
• Paul Hilfinger
• Alex Aiken
• Phillip Colella, LBNL
BeBOP Faculty:
• Jim Demmel
• Eun-Jin Im, Kookmin
NYU IB Method:
• Charlie Peskin
• Dave McQueen
Students, Postdocs, Staff:
• Christian Bell
• Wei Chen
• Greg Balls, SDSC
• Dan Bonachea
• Ed Givelberg*
• Peter McCorquodale, LBNL
• Tong Wen, LBNL
• Mike Welcome, LBNL
• Jason Duell, LBNL
• Paul Hargrove, LBNL
• Sabrina Merchant
• Kaushik Datta
• Rich Vuduc
• Amir Kamil
• Omair Kamil
• Ben Liblit
• Meling Ngo
• Geoff Pike, ISI
• Jimmy Su
• Siu Man Yau
• Shoaib Kamil
• Benjamin Lee
• Rajesh Nishtala
• Costin Iancu, LBNL
• David Gay, Intel
• Armando Solar-Lezama
* Primary researchers on the IB simulation
http://titanium.cs.berkeley.edu/
http://upc.lbl.gov
http://bebop.cs.berkeley.edu