Post on 19-Jan-2016
Dog Heart
Human Heart
Heart Problems
• Atrial fibrillation
• Atrial flutter
• Atrial tachycardia
• AV nodal reentrant tachycardia
• AV reentrant tachycardia
• Bigeminy
• Premature ventricular contraction
• Ventricular tachycardia
• Ventricular fibrillation
• Bundle branch block
• Heart block
Mesh with millions of nodes for good spatial resolution
Bidomain
GPU
CNC
CUBLAS
Reaction
Diffusion
Serial Implementation
Reaction Phase
[Figure: Mesh Nodes, GPU Threads]
• 33 state + 1 voltage + 1 current variables updated per node per time step
• Transmembrane voltage (Vm) and current (Itot) output for diffusion phase
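The per-node structure of the reaction phase can be sketched as follows. This is a minimal illustration only: the two-variable FitzHugh-Nagumo model is a stand-in assumed here in place of the 33-state ionic model, and the names `reaction_step` and `reaction_phase` and all parameter values are hypothetical. The point it shows is that each node's update reads and writes only that node's data, so one GPU thread per mesh node needs no synchronization during this phase.

```python
# Sketch of the reaction phase: every mesh node is updated independently.
# ASSUMPTION: FitzHugh-Nagumo (2 variables) stands in for the 33-state
# ionic model; parameters a, b, eps are illustrative, not from the slides.

def reaction_step(vm, w, dt, a=0.7, b=0.8, eps=0.08, i_stim=0.0):
    """Advance one node by one forward-Euler step.

    Returns the new voltage, the new recovery state, and i_tot, the net
    term driving dVm/dt in this toy model (handed to the diffusion phase).
    """
    i_tot = vm - vm ** 3 / 3.0 - w + i_stim
    dw = eps * (vm + a - b * w)
    return vm + dt * i_tot, w + dt * dw, i_tot


def reaction_phase(vm_all, w_all, dt):
    """Update all nodes in place and return the per-node currents.

    On a GPU, each loop iteration would be executed by its own thread.
    """
    i_tot_all = [0.0] * len(vm_all)
    for node in range(len(vm_all)):  # iterations do not depend on each other
        vm_all[node], w_all[node], i_tot_all[node] = reaction_step(
            vm_all[node], w_all[node], dt)
    return i_tot_all


vm = [-1.0, 0.0, 1.0]  # three nodes with different starting voltages
w = [0.0, 0.0, 0.0]
i_tot = reaction_phase(vm, w, dt=0.01)
```

After the call, `vm` and `i_tot` hold the two per-node outputs that the diffusion phase consumes.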
Interfacing With Third-Party Tools
OpenNL Sparse Matrix interface with CNC ELL support
Exposed Sparse Matrix Vector Multiply in CNC
Conjugate Gradient (CG) handled by CNC
CUBLAS for vector operations
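To make the roles of these pieces concrete, here is a small sketch of ELL storage, a sparse matrix-vector multiply over it, and a conjugate gradient loop built on that multiply. It is plain Python written for illustration and does not use or imitate the CNC, OpenNL, or CUBLAS APIs; the function names `to_ell`, `ell_spmv`, and `conjugate_gradient` are hypothetical. The `dot` helper and the vector updates are the kind of operations a BLAS library would handle.

```python
# ELL stores a sparse n x n matrix as two dense n x k arrays (values and
# column indices), where k is the largest number of nonzeros in any row.
# Shorter rows are padded with value 0.0 and column 0.

def to_ell(dense):
    n = len(dense)
    k = max(sum(1 for x in row if x != 0.0) for row in dense)
    vals = [[0.0] * k for _ in range(n)]
    cols = [[0] * k for _ in range(n)]
    for i, row in enumerate(dense):
        slot = 0
        for j, x in enumerate(row):
            if x != 0.0:
                vals[i][slot] = x
                cols[i][slot] = j
                slot += 1
    return vals, cols


def ell_spmv(vals, cols, x):
    """y = A x; padded slots multiply by 0.0 and contribute nothing."""
    return [sum(v * x[c] for v, c in zip(vrow, crow))
            for vrow, crow in zip(vals, cols)]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def conjugate_gradient(vals, cols, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A in ELL form."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x with x = 0
    p = list(r)          # search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = ell_spmv(vals, cols, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small 1-D Laplacian-like test matrix (symmetric positive definite).
A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
vals, cols = to_ell(A)
x = conjugate_gradient(vals, cols, [1.0, 0.0, 1.0])  # exact solution is [1, 1, 1]
```

ELL suits GPUs because every row has the same length, so threads assigned to consecutive rows perform identical work and read memory in a regular pattern.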
Results
[Figure panels: Time Step 0 ms, Time Step 40 ms]
8000 iterations
Timing
Limitations:
• Large number of registers needed per thread per calculation
• Frequent use of slow math functions
• Limited data storage per thread
Solutions:
• Tesla architecture with more registers
• MPI to better distribute data load
• Hardware math functions at a cost of precision
Size of the Laplacian matrices: stored densely at 4 bytes per entry, the 32K x 32K matrix for 32K nodes takes about 4 GB
Sparse matrix libraries compress the 32K x 32K matrix to 16 MB of storage
Loss of precision and numerical stability from using single-precision floating-point numbers
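The storage figures above can be checked with a few lines of arithmetic. The dense size follows directly from 32K x 32K entries at 4 bytes each. The sparse figure depends on details the slides do not give, so the sketch assumes an ELL layout with a 4-byte value plus a 4-byte column index per stored entry and 64 stored entries per row; the 64 is chosen only because it reproduces the quoted 16 MB, not because the source states it.

```python
NODES = 32 * 1024          # "32K" nodes
BYTES_PER_FLOAT32 = 4      # single-precision value
BYTES_PER_INT32 = 4        # ASSUMPTION: 32-bit column index per stored entry

# Dense storage: one float32 for every (row, column) pair.
dense_bytes = NODES * NODES * BYTES_PER_FLOAT32


def ell_bytes(n_rows, entries_per_row):
    """Bytes for an ELL matrix: a value and a column index per slot."""
    return n_rows * entries_per_row * (BYTES_PER_FLOAT32 + BYTES_PER_INT32)


# ASSUMPTION: 64 slots per row, picked to match the 16 MB quoted above.
sparse_bytes = ell_bytes(NODES, 64)

print(dense_bytes / 2**30)   # size in GiB -> 4.0
print(sparse_bytes / 2**20)  # size in MiB -> 16.0
```

Under these assumptions the sparse layout is 256 times smaller than the dense one, which is the difference between fitting in GPU memory and not.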
Related Work
Rabbit model:
• 50 ms activity, time step 20 μs, 425,000 nodes
• run time: 3316 seconds
CARP (MPI, PETSc)
Rabbit model:
• 30 ms of activity, 3 million nodes
• run time: 13 hours
Human model:
• 600 ms activity, time step 10 μs, 55 million nodes
• run time: 2 days
Mouse, rabbit model, GPU and GPU/MPI hybrid version:
• membrane current computation, time step 25 μs, 10e4 - 10e6 nodes
• run time: 104 - 105 seconds
CHASTE (MPI, PETSc)
Potse, Dubé, Vinet, Cardinal (OpenMP)