EAVL Extreme-scale Analysis and Visualization Library
description
Transcript of EAVL Extreme-scale Analysis and Visualization Library
EAVL
EXTREME-SCALE ANALYSIS AND VISUALIZATION LIBRARY
Jeremy Meredith
SDAV Next-Gen Library MeetingSeptember, 2012
History
• Originally ORNL LDRD– Jeremy Meredith, Sean Ahern, Dave Pugmire– plus Rob Sisneros joined as a postdoc
• Many hours sitting in conference rooms arguing over things like “what does it mean to have one of your dimensions be unstructured?”– then determine what to do that’s practical without
falling off the data modeling deep end . . . .
• Exascale focus
Approaching the Exascale Problems
• Update traditional data model to handle modern simulation codes and a wider range of data.
• Investigate how an updated data and execution model can achieve the necessary computational, I/O, and memory efficiency.
• Explore methods for visualization algorithm developers to achieve these efficiency gains and better support exascale architectures.
DATA MODELING CHALLENGES
Connectivity3D Point CoordinatesCell FieldsPoint Fields
Dimensions3D Point CoordinatesCell FieldsPoint Fields
Dimensions3D Axis CoordinatesCell FieldsPoint Fields
A Traditional Data Set Model
Data Set
Rectilinear Structured Unstructured
Challenge: Non-Physical Data Analysis
• Graph Data– topologically 0D vertices, 1D edges– non-spatial; storing X/Y/Z values is wasted space
• Pure Parameter Studies– e.g. reaction rate of combustion• FOUR “spatial” dimensions
– e.g. methane concentration vs oxygen concentration vs temperature vs pressure
• more complex reaction higher dimensionality
oxyg
en
met
hane
temperature
pressure
Challenge: Molecular Data(e.g., LAMMPS, VASP)
• To represent using vtkPolyData or vtkUnstructuredGrid:– VTK_VERTEX cells for the atoms– VTK_LINE cells for the bonds
• Any field data must exist on both element types– Not only inefficient:
• dummy bond strengths on the atoms?• dummy atomic numbers on the bonds?
– But also incorrect:• e.g. average(BondStrength) uses dummy values from atoms?
H
C
H
C
H
H
BondStr11112
AtomicNum661111
Challenge: Side Sets(e.g. Exodus, flux surfaces)
• The flow from A to B is defined on a set of faces• The flux variable is defined only on those faces– do you combine them into a single mesh?• waste space on dummy values, potentially introducing errors
– or create a separate mesh and lose the mapping info?• horribly expensive and error-prone to recalculate mapping
A Bflux surface lives inside the volumetric mesh
Challenge: Dimensionality, Refinement(e.g. GenASiS)
• (a) seven (or eight) dimensional mesh– f(x,y,z,ϴ,ϕ,λ,F)=E, plus time
• (b) refinement occurs on a per-cell basis– can’t assume per-block refinement– sometimes referred to as “unstructured AMR”
www.vacet.org
Challenge: Unique Mesh Topologies(e.g. MADNESS)
• MADNESS does not have a traditional mesh– Just a quad-tree with polynomial coefficients– Up to 30 refinement levels / tree depth
31 2 4
5
root
1
2
345
spatial structure internal tree representation
www.vacet.org
Challenge: Very High Order Fields(e.g. MADNESS)
• Legendre polynomial series at each tree node– Each tree node has Kdim coefficients– K can be up to approx. 20• i.e. 400 coeffs per tree node in 2D, 8000 in 3D
(example with K=3, dim=2)
0.834 0.592 0.0030.592 0.003 0.0100.003 0.010 0.007
THE EAVL DATA MODEL
Connectivity3D Point CoordinatesCell FieldsPoint Fields
Dimensions3D Point CoordinatesCell FieldsPoint Fields
Dimensions3D Axis CoordinatesCell FieldsPoint Fields
A Traditional Data Set Model (again)
Data Set
Rectilinear Structured Unstructured
TreeConnectivity Dimensions
FieldNameComponent
NameAssociationValues
Cells[]Points[]Fields[]
The EAVL Data Set Model
Data Set
CellSet
Explicit Structured
CoordsField
QuadTreeCellList
Subset
Connectivity: (a bunch of cells)
FieldName: “c” “c” “c”Component: 0 1 2
Name: “c”Association: PointsValues[3*npts]
Cells[1]Points[1]Fields[1]
Example: An Unstructured Grid(with interleaved coordinates)
eavlDataSet
eavlExplicitCellSet
eavlCoordinates
eavlField
Connectivity: (a bunch of cells)
FieldName: “x” “y” “z”Component: 0 0 0
Name: “x”Association: PointsValues[npts]
Cells[1]Points[1]Fields[3]
Example: An Unstructured Grid(with separated coordinates)
eavlDataSet
eavlExplicitCellSet
eavlCoordinates
eavlField #0Name: “y”Association: PointsValues[npts]
eavlField #1Name: “z”Association: PointsValues[npts]
eavlField #2
RegularStructure: 30 40 30
FieldName: “x” “y” “z”Component: 0 0 0
Name: “x”Association: PointsValues[npts]
Cells[1]Points[1]Fields[3]
Example: A Curvilinear Grid
eavlDataSet
eavlStructuredCellSet
eavlCoordinates
eavlField #0Name: “y”Association: PointsValues[npts]
eavlField #1Name: “z”Association: PointsValues[npts]
eavlField #2
RegularStructure: 30 40 30
FieldName: “x” “y” “z”Component: 0 0 0
Name: “x”Association: LogicalDim0Values[ni]
Cells[1]Points[1]Fields[3]
Example: A Rectilinear Grid
eavlDataSet
eavlStructuredCellSet
eavlCoordinates
eavlField #0Name: “y”Association: LogicalDim1Values[nj]
eavlField #1Name: “z”Association: LogicalDim2Values[nk]
eavlField #2
RegularStructure: 30 40 30 4 360
FieldName: “x” “y” “z” “μ” “ϴ”Component: 0 0 0 0 0
Name: “x”Association: LogicalDim0Values[ni]
Cells[1]Points[1]Fields[5]
Example: High-Dimensional Grid
eavlDataSet
eavlStructuredCellSet
eavlCoordinates
eavlField #0Name: “y”Association: LogicalDim1Values[nj]
eavlField #1Name: “z”Association: LogicalDim2Values[nk]
eavlField #2Name: “μ”Association: LogicalDim3Values[nμ]
eavlField #3Name: “ϴ”Association: LogicalDim4Values[nϴ]
eavlField #4
RegularStructure: 30 40
FieldName: “lat” “lon”Component: 0 0
Name: “lat”Association: LogicalDim0Values[ni]
Cells[1]Points[2]Fields[3]
Example: Geospatial Data
eavlDataSet
eavlStructuredCellSet
eavlCoordinates
eavlField #0Name: “lon”Association: LogicalDim1Values[nj]
eavlField #1Name: “c”Association: PointsValues[3*npts]
eavlField #2
FieldName: “c” “c” “c”Component: 0 1 2
eavlCoordinates
Example: Molecular Data
Connectivity: the atoms
FieldName: “c” “c” “c”Component: 0 1 2
Name:”atomic number”Association: Cell Set #0Values[ncells #0]
Cells[2]Points[1]Fields[3]
eavlDataSet
eavlExplicitCellSet #0
eavlCoordinates
eavlField #1
Connectivity: the bonds
eavlExplicitCellSet #1
Name: “c”Association: PointsValues[3*npts]
eavlField #0Name: “bond strength”Association: Cell Set #1Values[ncells #1]
eavlField #2
Example: Face-centered Data
Connectivity: volumetric
FieldName: “c” “c” “c”Component: 0 1 2
Cells[2]Points[1]Fields[2]
eavlDataSet
eavlExplicitCellSet
eavlCoordinates
Parent: ( )
eavlAllFacesOfExplicit
Name: “c”Association: PointsValues[3*npts]
eavlField #0Name: “facevariable”Association: Cell Set #2Values[nfaces]
eavlField #1
FILTERING IN EAVL
Data flow networks in EAVL (or not)
• A “Filter” is a stage in a data flow network– Creates a new data set from an old one
• Many operations do not change a mesh structure (assuming data model is sufficiently descriptive)– Arithmetic expressions: only modifies fields– External facelist: points and structure remain– Feature edges: just a new cell set with old points– Smooth, displace, elevate: only modify coordinates
• So: eavlMutator is an alternative to eavlFilter– Modifies a data set in-place
eavlMutator• In-place data set modification• Support for destructive in-place operation– free memory as you go
• Execute multiple mutators simultaneously on the same data set (barring conflicts)– e.g. displace (coords) + threshold (cells) concurrently
• How about data flow network support?– encapsulate an eavlMutator through a
eavlFilterFromMutator facade • Of course, some operations are natively eavlFilters– can facade through eavlMutatorFromFilter (?)
FieldName: “x” “y” “z”Component: 0 0 0
• Explicit cells can be combined with structured coordinates.
Example: Thresholding an RGrid (a)
eavlCoordinates
Name: “x”
Association: LogicalDim0Values[ni]
eavlField#0Name: “y”
Association: LogicalDim1Values[nj]
eavlField#1Name: “z”
Association: LogicalDim2Values[nk]
eavlField#2
RegularStructure: 30 40 30
eavlStructuredCellSet
FieldName: “x” “y” “z”Component: 0 0 0
eavlCoordinates
Name: “x”
Association: LogicalDim0Values[ni]
eavlField#0Name: “y”
Association: LogicalDim1Values[nj]
eavlField#1Name: “z”
Association: LogicalDim2Values[nk]
eavlField#2
Connectivity: (a bunch of cells)
eavlExplicitCellSet
Cells: (…)
Parent: ( )
• A second Cell Set can be added which refers to the first one
Example: Thresholding an RGrid (b)
RegularStructure: 30 40 30
eavlStructuredCellSet eavlSubset
FieldName: “x” “y” “z”Component: 0 0 0
eavlCoordinates
Name: “x”
Association: LogicalDim0Values[ni]
eavlField#0Name: “y”
Association: LogicalDim1Values[nj]
eavlField#1Name: “z”
Association: LogicalDim2Values[nk]
eavlField#2
RegularStructure: 30 40 30
eavlStructuredCellSet
Name: “x”
Association: LogicalDim0Values[ni]
eavlField#0Name: “y”
Association: LogicalDim1Values[nj]
eavlField#1Name: “z”
Association: LogicalDim2Values[nk]
eavlField#2
FieldName: “x” “y” “z”Component: 0 0 0
eavlCoordinates
30x30 or 30x40
eavlStructSubset
30x30 or 30x40
eavlStructSubset
• Add six new subset-cell sets to original mesh
Example: Structured External Facelist
FieldName: “x” “y” “z”Component: 0 0 0
eavlCoordinates
Name: “x”
Association: PointsValues[npts]
eavlField#0Name: “y”
Association: PointsValues[npts]
eavlField#1Name: “z”
Association: PointsValues[npts]
eavlField#2
FieldName: “x” “y” “z”Component: 0 0 0
eavlCoordinates
Name: “x”
Association: PointsValues[npts]
eavlField#0Name: “y”
Association: PointsValues[npts]
eavlField#1Name: “z”
Association: PointsValues[npts]
eavlField#2
RegularStructure: 30 40 30
eavlStructuredCellSet30 40 30
eavlStructCellSet30x30 or 30x40
eavlStructSubset
x6
• No problem-sized data modifications.– Interleaved and separated coordinates can be used simultaneously.
Example: Elevating a Structured Grid
FieldName: “c” “c”Component: 0 1
eavlCoordinates
Name: “c”Association:PointsValue[2*npts]
eavlField#0Name: “val”Association:PointsValues[npts]
eavlField#1
RegularStructure: 30 40
eavlStructuredCellSet
FieldName: “c” “c” “val”Component: 0 1 0
eavlCoordinates
RegularStructure: 30 40
eavlStructuredCellSet
Name: “c”Association:PointsValue[2*npts]
eavlField#0Name: “val”Association:PointsValues[npts]
eavlField#1
• No problem-sized data modifications.– Some axes on logical dims, with others on the points.
Example: Elevating a Regular Grid
FieldName: “x” “y”Component: 0 0
eavlCoordinates
Name: “x”
Association: LogicalDim0Values[ni]
eavlField#0Name: “y”
Association: LogicalDim1Values[nj]
eavlField#1Name: “val”
Association: PointsValues[npts]
eavlField#2
RegularStructure: 30 40
eavlStructuredCellSet
Name: “x”
Association: LogicalDim0Values[ni]
eavlField#0Name: “y”
Association: LogicalDim1Values[nj]
eavlField#1Name: “val”
Association: PointsValues[npts]
eavlField#2
FieldName: “x” “y” “val”Component: 0 0 0
eavlCoordinates
RegularStructure: 30 40
eavlStructuredCellSet
DEALING WITH CONCURRENCY
Concurrency at Multiple Levels
• Distributed Parallelism– Message passing still works well– Avoid global communication• local domain interconnectivity information
– Hybrid (e.g. spatiotemporal) parallelism• Task Parallelism– Fine-grain dependency tracking• e.g. displace (coords) + threshold (cells) concurrently
– eavlMutator helps– single eavlDataSet container class helps
• Thread Parallelism– Fine-grain data parallelism; CUDA, OpenMP
Data Parallelism for Developers
• Functor + iterator paradigm
• Iteration patterns for mesh topologies
• CUDA + OpenMP execution back-ends
A Simple Data-Parallel Operation
void CellToCellDivide(Field &a, Field &b, Field &c){ for_each(i) c[i] = a[i] / b[i];}
void CalculateDensity(...){ //... CellToCellDivide(mass, volume, density);}
Internal Library API Provides ThisAlgorithm Developer Writes This
Functor + Iterator Approach
void CalculateDensity(...){ //... CellToCellBinaryOp(mass, volume, density, Divide());}
template <class T> void CellToCellBinaryOp<T>(Field &a, Field &b, Field &c T &f){ for_each(i) f(a[i],b[i],c[i]);}
struct Divide{ void operator()(float &a, float &b, float &c) { c = a / b; }};
Internal Library API Provides ThisAlgorithm Developer Writes This
Custom Functor
void CalculateDensity(...){ //... CellToCellBinaryOp(mass, volume, density, MyFunctor());}
template <class T> void CellToCellBinaryOp<T>(Field &a, Field &b, Field &c T &f){ for_each(i) f(a[i],b[i],c[i]);}
struct MyFunctor{ void operator()(float &a, float &b, float &c) { c = a + 2*log(b); }};
Algorithm DeveloperWrites These
Internal Library API Provides This
Functor Efficiency on CPU and GPU
• Data: noise.silo• Surface normal
0 μs
100 μs
200 μs
300 μs
400 μs
500 μs
600 μs
unoptimized optimized unoptimized optimized
Intel Xeon L5420 NVIDIA GeForce 8800GTS
Functor Inline
Binding Values to Functorsstruct ScaleByConst{ float scale; ScaleByConst(float s) : scale(s) { } void operator()(float &a, float &b) { b = a * scale; }};
void CalculateDensity(...){ //... cell_volume = mesh_volume / mesh_numcells; CellToCellUnaryOp(mass, density, ScaleByConst(1.0/cell_volume));}
DATA PARALLELISM BASICS
Map with 1 input, 1 output
Simplest data-parallel operation. Each result item can be calculated from its corresponding input item alone.
x 3 7 0 1 4 0 0 4 5 3 1 0
0 1 2 3 4 5 6 7 8 9 10 11
6 14 0 2 8 0 0 8 10 6 2 0
struct f { float operator()(float x) { return x*2; }};
result
Map with 2 inputs, 1 output
With two input arrays, the functor takes two inputs. You can also have multiple outputs.
x 3 7 0 1 4 0 0 4 5 3 1 0
0 1 2 3 4 5 6 7 8 9 10 11
5 11 2 2 12 3 9 9 10 4 3 1
struct f { float operator()(float a, float b) { return a+b; }};
result
y 2 4 2 1 8 3 9 5 5 1 2 1
Scatter with 1 input (and thus 1 output)
Possibly inefficient, risks of race conditions and uninitialized results. (Can also scatter to larger array if desired.)Often used in a scatter_if–type construct.
x 3 7 0 1 4 0 0 4 5 3 1 0
0 1 2 3 4 5 6 7 8 9 10 11
0 1 3 0
No functor
result
indices
4
2 4 1 5 5 0 4 2 1 2 1 4
Gather with 1 input (and thus 1 output)
Unlike scatter, no risk of uninitialized data or race condition. Plus, parallelization is over a shorter indices array, and caching helps more, so can be more efficient.
x 3 7 0 1 4 0 0 4 5 3 1 0
0 1 2 3 4 5 6 7 8 9 10 11
7 3 0 3 1
No functor
result
indices 1 9 6 9 3
Reduction with 1 input (and thus 1 output)
Example: max-reduction. Sum is also common.Often a fat-tree-based implementation.
x 3 7 0 1 4 0 0 4 5 3 1 0
0 1 2 3 4 5 6 7 8 9 10 11
7result
struct f { float operator()(float a, float b) { return a>b ? a : b; }};
Inclusive Prefix Sum (a.k.a. Scan)with 1 input/output
Value at result[i] is sum of values x[0]..x[i].Surprisingly efficient parallel implementation.Basis for many more complex algorithms.
x 3 7 0 1 4 0 0 4 5 3 1 0
0 1 2 3 4 5 6 7 8 9 10 11
3 10 10 11 15 15 15 19 24 27 28 28
No functor.
result
+ + + + + + + + + + +
Exclusive Prefix Sum (a.k.a. Scan)with 1 input/output
Initialize with zero, value is sum of only up to x[i-1].May be more commonly used than inclusive scan.
x 3 7 0 1 4 0 0 4 5 3 1 0
0 1 2 3 4 5 6 7 8 9 10 11
0 10 10 11 15 15 15 19 24 27 28
No functor.
result 3
+ + + + + + + + + + +0
DATA PARALLELISM ON MESHES
Example: Surface Normal
• For each 2D cell(i.e. each polygon):– Get three adjacent points– Pair-wise vector subtract– Cross product
• Data-parallel:– Repeat for all cells
Example: Surface Normal
• INPUT:– 3-dimensional
coordinates array on the mesh NODES
– example: length = 9
• OUTPUT:– 3-component surface
normals array onthe mesh CELLS
– example: length = 4
Under the Covers: Node-to-Cell on CPUvoid NodeToCellOp3::ExecuteCPU(){#pragma omp parallel for for (int i=0; i<input->NumCells(); i++) { // get cell node indices
int nNodes, nodeIds[8]; float nodeValues[3][8]; conn.GetCellNodes(index, nNodes, nodeIds);
// get coordinates for nodes for (int i=0; i<nNodes; i++) { nodeValues[0][i] = array0[nodeIds[i]]; nodeValues[1][i] = array1[nodeIds[i]]; nodeValues[2][i] = array2[nodeIds[i]]; }
// call functor functor(nodeValues[0], nodeValues[1], nodeValues[2], &out0[i], &out1[i], &out2[i]); } }
Under the Covers: Node-to-Cell on GPU
void NodeToCellOp3::ExecuteGPU(){ float *d_arr0 = (float*)array0->GetCUDAArray(); float *d_arr1 = (float*)array1->GetCUDAArray(); float *d_arr2 = (float*)array2->GetCUDAArray(); float *d_out = (float*)output0->GetCUDAArray();
// calculate CUDA thread grid num blocks & threads // based on input->NumCells()
nodeToCellKernel3<<<nb,nt>>>(d_arr0, d_arr1, d_arr2, d_out0, d_out1, d_out2, conn, functor);}
Under the Covers: The CUDA Kerneltemplate <class F>__global__ void NodeToCellKernel3(float *array0, float *array1, float *array2, float *out0, float *out1, float *out2, ExplicitConnectivity conn, F functor){ const int index = blockIdx.x * blockDim.x + threadIdx.x; // get cell node indices int nNodes, nodeIds[8]; float nodeValues[3][8]; conn.GetCellNodes(index, nNodes, nodeIds); // get coordinates for nodes for (int i=0; i<nNodes; i++) { nodeValues[0][i] = array0[nodeIds[i]]; nodeValues[1][i] = array1[nodeIds[i]]; nodeValues[2][i] = array2[nodeIds[i]]; } // call functor functor(nodeValues[0], nodeValues[1], nodeValues[2], &out0[i], &out1[i], &out2[i]);}
WRITING ALGORITHMS IN EAVL
Face Surface Normal Functor+Iteratorstruct PolyNormalFunctor{ void operator()(int nvals, int shapetype, float x[], float y[], float z[], float *nx, float *ny, float *nz) { // get two adjacent edge vectors float ax = x[1]-x[0], ay = y[1]-y[0], az = z[1]-z[0]; float bx = x[2]-x[1], by = y[2]-y[1], bz = z[2]-z[1]; // calculate their cross product *nx = ay*bz - az*by; *ny = az*bx - ax*bz; *nz = ax*by - ay*bx; }};
void CalculateFaceNormals(...){ eavlExecutor::AddOperation( new eavlTopologyMap_3_3(inputcells, EAVL_NODES_OF_CELLS, xcoord, ycoord, zcoord, xnormal, ynormal, xnormal, PolyNormalFunctor()));}
Making it Easy for Developers
• Single functor code works on either CPU or GPU• Iteration patterns have multiple back-ends• CPU vs GPU execution choice is made at runtime• EAVL array class automatically supports
heterogeneous memory spaces for GPUs• Internal APIs support CUDA if developers want to
write their own CUDA kernels
Algorithms are Sequences of Operations(a) calculate face surface normals
with node-to-cell operation
(b) average to point normals via cell-to-node operation
EXAMPLE: THRESHOLD
47 52 63
32 49
31
Starting Mesh
We want to threshold a mesh based on its density values (shown here).
43 47 52 63
32 38 42 49
31 37 41 38
density 43 47 52 63 32 38 42 49 31 37 41 38
0 1 2 3 4 5 6 7 8 9 10 11
If we threshold 35 < density < 45, we want this result:
43
38 42
37 41 38
Which Cells to Include?
Evaluate a Map operation with this functor:struct InRange { float lo, hi; InRange(float l, float h) : lo(l), hi(h) { } int operator()(float x) { return x>lo && x<hi; }}
1 0 0 0
0 1 1 0
0 1 1 1
density 43 47 52 63 32 38 42 49 31 37 41 38
0 1 2 3 4 5 6 7 8 9 10 11
1 0 0 0 0 1 1 0 0 1 1 1inrange
InRange()
How Many Cells in Output?
Evaluate a Reduce operation using the Add<> functor.We can use this to create output cell length arrays.
1 0 0 0
0 1 1 0
0 1 1 1
0 1 2 3 4 5 6 7 8 9 10 11
1 0 0 0 0 1 1 0 0 1 1 1inrange
6result
plus
Where Do the Output Cells Go?
Inputindices
Outputindices
0 1 2 3
4 5 6 7
8 9 10 11
0 1 2 3 4 5 6 7 8 9 10 11
0 1 2 3 4 5output cell
0
1 2
3 4 5
input cell
How do we create this mapping?
Create Input-to-Output Indexing?
Exclusive Scan (exclusive prefix sum) gives us the output index positions.
0
1 2
3 4 5
0 1 2 3 4 5 6 7 8 9 10 11
1 0 0 0 0 1 1 0 0 1 1 1inrange
0 1 1 1 1 1 2 3 3 3 4 5startidx
+ + + + + + + + + + +0
startidx
Create Output-to-Input Indexing?
We want to work in the shorter output-length arrays and use gathers. A specialized scatter in EAVL creates this reverse index.
0 1 2 3 4 5 6 7 8 9 10 11
0 5 6 9 10 11revindex
0 1 1 1 1 1 2 3 3 3 4 5
0
5 6
9 10 11
density
Gather Input Mesh Arrays to Output?
We can now use simple gathers to pull input arrays (density, pressure) into the output mesh.
0 1 2 3 4 5 6 7 8 9 10 11
43 47 52 63 32 38 42 49 31 37 41 38
0 5 6 9 10 11revindex
43 38 42 37 41 38
43
38 42
37 41 38
output_density
PLANS AND STATUS
Plans and Status• Now Open Source (BSD “2-clause”)– Project website at http://ft.ornl.gov/eavl – Source code, docs at https://github.com/jsmeredith/EAVL
• Data Model– Most infrastructure in place– Still a few areas not fully fleshed out– The client-facing API needs work
• Execution Model– Currently imperative style– Exploring pipeline paths, contract/metadata needs
• Productization Status– More iterators– More optimization– Lots of internal cleanup