
Friedrich-Alexander-Universität Erlangen-Nürnberg
Hardware-Software-Co-Design

27.01.2009

Lattice Boltzmann with CUDA

Lan Shi, Li Yi & Liyuan Zhang

Hauptseminar: Multicore Architectures and Programming



Outline

- Overview of LBM
- An application of the LBM algorithm
- Implementation in CUDA and Optimization
- Performance
- Demo



Overview of LBM

The Lattice Boltzmann Method (LBM) is a class of computational fluid dynamics (CFD) methods for fluid simulation.

CFD methods:
Volume mesh (irregular/regular):
- Euler equations
- Navier-Stokes equations
Smoothed particle hydrodynamics (SPH):
- Lagrangian method
Spectral methods:
- spherical harmonics
- Chebyshev polynomials
LBM:
- simulates an equivalent mesoscopic system on a Cartesian grid



Overview of LBM

From macroscopic to mesoscopic to microscopic

[Figure: macroscopic, mesoscopic, and microscopic descriptions; recoverable labels: ρ, T]



Overview of LBM

Lattice structure: D2Q9, D3Q19, ...



Overview of LBM

Boundary condition:
Domain boundary:
- the outermost surrounding lattice nodes
Obstacle boundary:
- objects inside the lattice grid that act as obstacles and block the fluid flow
Solution:
- no change
- bounce-back



Overview of LBM

LBM is resource intensive! For grids of more than 100x100x100 points:
- not practical due to slow memory access and long processing time

LBM is explicit in nature and requires only nearest-neighbor interaction:
- very suitable for implementation on GPUs

- Parallel computing
- Single-Program Multiple-Data (SPMD) model
- Within-processor memory



Target Model

Lid Driven Cavity



Reformulating the LBM Equation

Discrete lattice Boltzmann equation

Collide Step:

Stream Step:
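For reference, the common single-relaxation-time (BGK) form of the two steps can be written as follows. This is an assumption, not necessarily the exact form used in the talk. Here f_i is the distribution function along lattice direction e_i, f_i^eq its local equilibrium, ρ and u the density and velocity, and ω the relaxation parameter (the kernel launch later in the deck passes a d_omega argument).

```latex
% Collide step: relax each population toward its local equilibrium
f_i^{*}(\mathbf{x}, t) = f_i(\mathbf{x}, t)
  - \omega \left[ f_i(\mathbf{x}, t) - f_i^{\mathrm{eq}}(\rho, \mathbf{u}) \right]

% Stream step: move the post-collision population to the neighboring cell
f_i(\mathbf{x} + \mathbf{e}_i \Delta t,\; t + \Delta t) = f_i^{*}(\mathbf{x}, t)
```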



Stream Step

Fluid particles propagate to neighboring cells
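A minimal sketch of the propagation, assuming a D2Q9 lattice, two separate buffers (the deck's launch code passes d_cell and d_temp_cell), and one plane of w*h values per direction. The names ex, ey, idx, stream_cell, and StreamSketch are hypothetical, and the axis orientation (E = +x, N = +y) is an assumption.

```cuda
#define Q 9  // direction order from the deck's cell layout: C, N, S, W, E, NW, NE, SW, SE

// Lattice velocity components (assumed orientation: E = +x, N = +y).
__host__ __device__ inline int ex(int i) { const int v[Q] = {0, 0, 0, -1, 1, -1, 1, -1, 1}; return v[i]; }
__host__ __device__ inline int ey(int i) { const int v[Q] = {0, 1, -1, 0, 0, 1, 1, -1, -1}; return v[i]; }

// Hypothetical layout: one w*h plane per direction.
__host__ __device__ inline int idx(int i, int x, int y, int w, int h) { return (i * h + y) * w + x; }

// Push every population of cell (x, y) to the neighbor it points at.
// Populations that would leave the grid are dropped here; handling them
// is left to the boundary treatment.
__host__ __device__ inline void stream_cell(const float* src, float* dst,
                                            int x, int y, int w, int h) {
    for (int i = 0; i < Q; ++i) {
        int xn = x + ex(i);
        int yn = y + ey(i);
        if (xn >= 0 && xn < w && yn >= 0 && yn < h)
            dst[idx(i, xn, yn, w, h)] = src[idx(i, x, y, w, h)];
    }
}

// One thread per cell; src and dst must be different buffers.
__global__ void StreamSketch(const float* src, float* dst, int w, int h) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h) stream_cell(src, dst, x, y, w, h);
}
```

Each destination entry (direction, cell) is written by exactly one source cell, so the threads do not race as long as the two buffers are distinct.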



Collide Step

Lattice weights: 4/9, 1/9, 1/36

[Figure: lattice velocity vectors with components -1, 0, 1]
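A minimal sketch of a collide step, assuming the standard D2Q9 second-order equilibrium and single-relaxation-time (BGK) collision; the deck's own formula is not recoverable, so this form is an assumption. The weights 4/9, 1/9, 1/36 come from the slide; the names ex, ey, weight, feq, collide_cell, and CollideSketch, the axis orientation, and the one-plane-per-direction layout are hypothetical.

```cuda
#define Q 9  // direction order from the deck's cell layout: C, N, S, W, E, NW, NE, SW, SE

// Lattice velocity components (assumed orientation: E = +x, N = +y).
__host__ __device__ inline int ex(int i) { const int v[Q] = {0, 0, 0, -1, 1, -1, 1, -1, 1}; return v[i]; }
__host__ __device__ inline int ey(int i) { const int v[Q] = {0, 1, -1, 0, 0, 1, 1, -1, -1}; return v[i]; }

// 4/9 for the rest direction, 1/9 for axis directions, 1/36 for diagonals.
__host__ __device__ inline float weight(int i) {
    return i == 0 ? 4.0f / 9.0f : (i <= 4 ? 1.0f / 9.0f : 1.0f / 36.0f);
}

// Standard second-order equilibrium distribution (assumed form).
__host__ __device__ inline float feq(int i, float rho, float ux, float uy) {
    float eu = ex(i) * ux + ey(i) * uy;
    float uu = ux * ux + uy * uy;
    return weight(i) * rho * (1.0f + 3.0f * eu + 4.5f * eu * eu - 1.5f * uu);
}

// Relax the Q populations of one cell toward equilibrium; returns the density.
__host__ __device__ inline float collide_cell(float* f, float omega) {
    float rho = 0.0f, jx = 0.0f, jy = 0.0f;
    for (int i = 0; i < Q; ++i) {
        rho += f[i];
        jx += ex(i) * f[i];
        jy += ey(i) * f[i];
    }
    float ux = jx / rho, uy = jy / rho;
    for (int i = 0; i < Q; ++i)
        f[i] -= omega * (f[i] - feq(i, rho, ux, uy));
    return rho;
}

// One thread per cell; populations stored as Q planes of ncells values each.
__global__ void CollideSketch(float* cells, int ncells, float omega) {
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c >= ncells) return;
    float f[Q];
    for (int i = 0; i < Q; ++i) f[i] = cells[i * ncells + c];
    collide_cell(f, omega);
    for (int i = 0; i < Q; ++i) cells[i * ncells + c] = f[i];
}
```

The collision only redistributes the populations within a cell, so density and momentum of the cell are unchanged by it.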



Boundary Condition (BC) Treatment

For non-moving walls:

For moving wall:

The moving-wall formula depends on the velocity of the moving wall.

[Figure: lattice velocity vectors with components -1, 0, 1]
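A minimal sketch of a bounce-back rule, assuming the commonly used form in which a population hitting a wall is reflected into the opposite direction and, for a moving wall, corrected by a momentum term proportional to the wall velocity. The deck's own formulas are not recoverable, so this form is an assumption; the names opposite and bounce_back and the axis orientation are hypothetical.

```cuda
#define Q 9  // direction order from the deck's cell layout: C, N, S, W, E, NW, NE, SW, SE

// Lattice velocity components (assumed orientation: E = +x, N = +y).
__host__ __device__ inline int ex(int i) { const int v[Q] = {0, 0, 0, -1, 1, -1, 1, -1, 1}; return v[i]; }
__host__ __device__ inline int ey(int i) { const int v[Q] = {0, 1, -1, 0, 0, 1, 1, -1, -1}; return v[i]; }

__host__ __device__ inline float weight(int i) {
    return i == 0 ? 4.0f / 9.0f : (i <= 4 ? 1.0f / 9.0f : 1.0f / 36.0f);
}

// Index of the direction pointing the opposite way: N<->S, W<->E, NW<->SE, NE<->SW.
__host__ __device__ inline int opposite(int i) {
    const int o[Q] = {0, 2, 1, 4, 3, 8, 7, 6, 5};
    return o[i];
}

// Value to store in direction opposite(i) when population f_i hits a wall
// moving with velocity (uwx, uwy). For a wall at rest this returns f_i unchanged.
__host__ __device__ inline float bounce_back(int i, float f_i, float rho,
                                             float uwx, float uwy) {
    float eu = ex(i) * uwx + ey(i) * uwy;
    return f_i - 6.0f * weight(i) * rho * eu;
}
```

For a lid moving in +x, a population arriving along NW comes back along SE slightly larger and one arriving along NE comes back along SW slightly smaller, which is how the wall passes +x momentum to the fluid.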



Algorithm

1. Initialize the distribution functions, density, and velocity for each cell
2. Set the initial time (t0)
3. Treat boundary cells
4. Perform the Stream operation
5. Perform the Collide operation
6. Increment the time by one time step
7. Go to step 3 unless the end time is reached

[Flowchart: Initialization -> Boundary Condition Treatment -> Stream -> Collide -> end time reached? False: increment the time by one step and return to the boundary treatment. True: End.]
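The time loop in steps 3 to 7 can be sketched as a small host-side driver. This is an illustration only: run_time_loop and its callable parameters are hypothetical names, and the three callables stand in for whatever launches the boundary, stream, and collide kernels.

```cuda
// Runs boundary treatment, stream, and collide once per time step, in that
// order, and returns the number of steps taken. The end-time check happens
// before each step, so no step runs if t0 is already at or past t_end.
template <class BoundaryFn, class StreamFn, class CollideFn>
int run_time_loop(double t0, double t_end, double dt,
                  BoundaryFn treat_boundary, StreamFn stream, CollideFn collide) {
    int step = 0;
    // Recompute the time from the step index to avoid accumulating rounding error.
    while (t0 + step * dt < t_end) {
        treat_boundary();  // step 3
        stream();          // step 4
        collide();         // step 5
        ++step;            // step 6
    }                      // step 7: loop until the end time is reached
    return step;
}
```

In a CUDA program each callable would typically be a lambda that wraps one kernel launch, with initialization (steps 1 and 2) done before the call.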



Outline

- Overview of LBM
- An application of the LBM algorithm
- Implementation in CUDA and Optimization
- Performance
- Demo



Implementation in CUDA and Optimization

Kernels

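#define BLOCK_SIZE 16
…
// Each block holds BLOCK_SIZE x BLOCK_SIZE threads (16x16 = 256 cells per block).
dim3 dimBlock( BLOCK_SIZE, BLOCK_SIZE );
// Integer division: exact only if (size + 2) is a multiple of BLOCK_SIZE.
dim3 dimGrid( (cmd.sizex+2) / BLOCK_SIZE, (cmd.sizey+2) / BLOCK_SIZE );
…
// One time step: boundary treatment, then stream, then collide.
BC<<<dimGrid,dimBlock>>>( d_cell, d_rho, d_wall_velocity, d_sizex, d_sizey );
Stream<<<dimGrid,dimBlock>>>( d_cell, d_temp_cell, d_sizex );
Collide<<<dimGrid,dimBlock>>>( d_cell, d_rho, d_u, d_omega, d_sizex, d_sizey );
…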



Implementation in CUDA and Optimization
Coalesce

Block: 16x16 = 256 cells
Cell: elements 0..9 mean (C, N, S, W, E, NW, NE, SW, SE, Flag)

Uncoalesced access: 256 consecutive 10-vectors, one per cell
0..9 | 0..9 | 0..9 | 0..9 | 0..9 | … | 0..9 (all 256 cells)

Coalesced access: 10 consecutive 256-vectors, one per element
0,0,…,0 | 1,1,…,1 | 2,2,…,2 | 3,3,…,3 | 4,4,…,4 | … | 9,9,…,9 (all 10 elements)
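The difference between the two layouts comes down to index arithmetic, sketched here. The function and kernel names (aos_index, soa_index, ToCoalescedLayout) are hypothetical, as is the assumption that the data of one block is stored contiguously.

```cuda
#define CELLS_PER_BLOCK 256  // 16x16 cells per block
#define ELEMS_PER_CELL  10   // C, N, S, W, E, NW, NE, SW, SE, Flag

// Uncoalesced layout: 256 consecutive 10-vectors.
// Neighboring threads (cells) reading the same element are 10 floats apart.
__host__ __device__ inline int aos_index(int cell, int elem) {
    return cell * ELEMS_PER_CELL + elem;
}

// Coalesced layout: 10 consecutive 256-vectors.
// Neighboring threads reading the same element touch adjacent addresses.
__host__ __device__ inline int soa_index(int cell, int elem) {
    return elem * CELLS_PER_BLOCK + cell;
}

// Converts the data from the first layout to the second, block by block.
// Launch with 16x16 threads per block; the reads are strided, the writes contiguous.
__global__ void ToCoalescedLayout(const float* aos, float* soa) {
    int cell  = threadIdx.y * blockDim.x + threadIdx.x;  // 0..255 within the block
    int block = blockIdx.y * gridDim.x + blockIdx.x;
    int base  = block * CELLS_PER_BLOCK * ELEMS_PER_CELL;
    for (int e = 0; e < ELEMS_PER_CELL; ++e)
        soa[base + soa_index(cell, e)] = aos[base + aos_index(cell, e)];
}
```

With soa_index, the 16 threads of a half-warp that read element e of cells c, c+1, ..., c+15 access 16 adjacent floats, which the hardware can serve in one memory transaction.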



Implementation in CUDA and Optimization
Ghost Cell

[Figure: neighboring blocks Block(i, j) and Block(i+1, j); one cell row is labeled 0,0 1,0 2,0 … 15,0 16,0 and the other 0,0 1,0 2,0 … 15,0]



Implementation in CUDA and Optimization
Ghost Cell

How it works

……



Implementation in CUDA and Optimization
Matrix vs. Standard Block

Matrix completion:
- the matrix is decomposed into blocks
- every block must be 16x16 cells
- if a block on the edge is smaller than 16x16, it is padded with "0"

[Figure: original matrix and padded standard matrix; dimension labels a, b, x, y]
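The padding rule can be sketched on the host as follows, assuming a row-major matrix. The names round_up and pad_matrix are hypothetical; the block size of 16 from the slide is passed in as a parameter.

```cuda
#include <vector>

// Smallest multiple of block that is >= n (n > 0, block > 0).
__host__ __device__ inline int round_up(int n, int block) {
    return (n + block - 1) / block * block;
}

// Copies a rows x cols row-major matrix into a zero-filled matrix whose
// dimensions are rounded up to multiples of block.
std::vector<float> pad_matrix(const std::vector<float>& src,
                              int rows, int cols, int block) {
    int prows = round_up(rows, block);
    int pcols = round_up(cols, block);
    std::vector<float> dst(static_cast<size_t>(prows) * pcols, 0.0f);
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            dst[static_cast<size_t>(r) * pcols + c] = src[static_cast<size_t>(r) * cols + c];
    return dst;
}
```

For example, a 100x100 matrix with block = 16 becomes 112x112, so the grid dimensions divide evenly by the block size.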



Chart : optimization



Chart : GPU vs GPU



