GPU-based Visualization Algorithms
Han-Wei Shen
Associate Professor
Department of Computer Science and Engineering
The Ohio State University
Scientific Visualization
• A process of converting numerical data into visual images
• The images should contain useful information that helps scientists understand their data
Applications
Large-Scale Time-Dependent Simulations
• Richtmyer-Meshkov Turbulent Simulation (LLNL)
• 2048x2048x1920 grid per time step (7.7 GB)
• Run 27,000 time steps
• Output size > 2 TB
• LLNL IBM ASCI system
Applications
• Oak Ridge Terascale Supernova Initiative (TSI): 640x640x640 floats, > 1000 time steps, total size > 1 TB
• NASA's turbo pump simulation: multi-zones, moving meshes, 300+ time steps, total size > 100 GB
[Figures: ORNL TSI data; NASA turbo pump]
Current Research Projects
• Time-Varying Data Visualization
• Flow Visualization
• View Dependent Algorithms
• Parallel Rendering
Time-Varying Data Visualization
Key: data are huge (~100 TB)
Research:
• Spatio-temporal multiresolution hierarchy
• Feature tracking
• High-dimensional rendering
Flow Visualization
Key: visualize the dynamics
Research:
• Texture synthesis and animation
• Streamline placement
View Dependent Algorithms
Key: give the user the best view with minimal effort
Research:
• Occlusion culling
• Automatic view selection
Parallel Rendering
Key: make optimal use of computational resources (CPU and storage)
Research:
• Large format display
• Dynamic load balancing
Computer Graphics Technology
Has advanced at an amazing speed
The Programmable GPU
GPU = vertex shader (vertex program) + fragment shader (fragment program, pixel program)
• Vertex shader replaces per-vertex transform & lighting
• Fragment shader replaces texture stages
• Fragment testing after the fragment shader
• Flexibility to do framebuffer pixel blending
[Figure: graphics pipeline diagram. Labels: vertices, primitives, Transform and Lighting, Vertex Shader, Clipping, Primitive Assembly and Rasterization, Texture Stages, Fragment Shader, Fragment Testing]
GPU-based Wavelet Reconstruction
Wavelets are useful for multiresolution analysis and compression of 3D volumetric datasets.
Previous 3D wavelet solutions are mostly implemented by convolution operators or by software.
Our work performs 3D wavelet reconstruction on the GPU.
Wavelet Theory
Wavelets are defined on basis functions that filter a set of original values (A values) into low-frequency coefficients (L values) and high-frequency coefficients (H values).
L values are also known as averages, and H values as details.
A0 A1 A2 A3 A4 A5 ...
L0 L1 L2 ...
H0 H1 H2 ...
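The filtering step can be illustrated with the Haar wavelet, the simplest case. The following is a minimal sketch, not code from the talk: the function names `haar_forward` and `haar_inverse` are hypothetical, and the sign convention H = (A_even - A_odd) / sqrt(2) is an assumption chosen to agree with the reconstruction pseudocode later in the talk.

```python
import math

def haar_forward(a):
    """Split an even-length list of A values into averages (L) and details (H)."""
    s = math.sqrt(2.0)
    half = len(a) // 2
    low = [(a[2 * i] + a[2 * i + 1]) / s for i in range(half)]
    high = [(a[2 * i] - a[2 * i + 1]) / s for i in range(half)]
    return low, high

def haar_inverse(low, high):
    """Rebuild the A values: even samples use L + H, odd samples use L - H."""
    s = math.sqrt(2.0)
    out = []
    for l, h in zip(low, high):
        out.append((l + h) / s)
        out.append((l - h) / s)
    return out

if __name__ == "__main__":
    low, high = haar_forward([4.0, 2.0, 5.0, 5.0, 1.0, 7.0])
    print(low, high)
    print(haar_inverse(low, high))
```

Six A values produce three L values and three H values, so the transform neither adds nor removes data; a pair of equal neighbours yields a zero H value, which is what makes the details compressible.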
2D Wavelet Transform
For two- or three-dimensional data, wavelets are applied successively on each dimension, which creates 4 or 8 coefficient bricks respectively
[Figure: a 2d x 2d image is split by the X transform into 2 blocks of 2d x d (H, L), then by the Y transform into 4 blocks of d x d (HH, LH, HL, LL)]
3D Wavelet Transform
A volume of (2d)^3 voxels will be transformed into 8 bricks of d^3 coefficients each
[Figure: with axes x, y, z, the X transform yields H and L; the Y transform yields HH, LH, HL, LL; the Z transform yields HHH, HHL, HLH, HLL, LHH, LHL, LLH, LLL]
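The separable transform can be sketched for the Haar case as follows. This is an illustrative sketch under stated assumptions, not the implementation from the talk: `haar3d_forward` is a hypothetical name, the volume is a nested list indexed `vol[z][y][x]`, and the three letters of each brick name are assumed to refer to the (z, y, x) axes in that order, which is how the reconstruction slides pair the bricks.

```python
import math
from itertools import product

SCALE = math.sqrt(2.0) ** 3  # one 1/sqrt(2) factor per axis

def haar3d_forward(vol):
    """Transform vol[z][y][x] of side 2d into 8 named bricks of side d."""
    d = len(vol) // 2
    bricks = {}
    for name in product("LH", repeat=3):  # letters ordered (z, y, x)
        # An H letter flips the sign of the odd sample along that axis.
        sz, sy, sx = (1 if c == "L" else -1 for c in name)
        brick = [[[0.0] * d for _ in range(d)] for _ in range(d)]
        for z, y, x in product(range(d), repeat=3):
            total = 0.0
            for dz, dy, dx in product((0, 1), repeat=3):
                sign = (sz if dz else 1) * (sy if dy else 1) * (sx if dx else 1)
                total += sign * vol[2 * z + dz][2 * y + dy][2 * x + dx]
            brick[z][y][x] = total / SCALE
        bricks["".join(name)] = brick
    return bricks

if __name__ == "__main__":
    flat = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
    print({name: b[0][0][0] for name, b in haar3d_forward(flat).items()})
```

For a constant volume only the LLL brick is non-zero, which matches the reading of L values as averages and H values as details.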
3D Wavelet Reconstruction
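Reconstruct the original volume of (2d)^3 voxels from the 8 bricks of d^3 coefficients
[Figure: with axes x, y, z, the Z reconstruction turns the 8 bricks (HHH, HHL, HLH, HLL, LHH, LHL, LLH, LLL) into HH, LH, HL, LL; the Y reconstruction yields H and L; the X reconstruction yields the volume]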
3D Wavelet Reconstruction
A straightforward implementation of 3D wavelet reconstruction involves a large number of texture copies
The render-to-texture feature is not available for 3D textures
A more efficient algorithm is needed to take advantage of GPUs
Tileboards
Tileboard: flatten a 3D brick into 2D tiles
[Figure: the LLL brick (axes x, y, z) shown next to its tileboard, a 2D grid of LLL tiles]
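The flattening can be expressed as a pair of coordinate mappings. The sketch below assumes a row-major tile layout with `cols` tiles per row and integer texel coordinates; the talk does not specify the layout, and the function names are hypothetical stand-ins for the `Coords3DtoTile2D` and `CoordsTile2Dto3D` helpers used in the pseudocode.

```python
def coords_3d_to_tile_2d(x, y, z, n, cols):
    """Map voxel (x, y, z) of an n^3 brick to texel (u, v) on the tileboard."""
    u = (z % cols) * n + x   # tile column, then offset inside the tile
    v = (z // cols) * n + y  # tile row, then offset inside the tile
    return u, v

def coords_tile_2d_to_3d(u, v, n, cols):
    """Inverse mapping: recover (x, y, z) from texel (u, v)."""
    x, y = u % n, v % n
    z = (v // n) * cols + (u // n)
    return x, y, z

if __name__ == "__main__":
    print(coords_3d_to_tile_2d(1, 2, 3, n=4, cols=2))
    print(coords_tile_2d_to_3d(5, 6, n=4, cols=2))
```

Each z slice becomes one n x n tile, so a brick of side n occupies n tiles; the number of tiles per row is a free choice that only has to be the same in both directions.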
Tileboards
Tileboard: flatten a 3D brick into 2D tiles
Merge HLL, HLH, HHL, HHH into a single 2D RGBA texture
[Figure: the bricks HHH, HHL, HLH and HLL, each flattened into a grid of tiles and merged into one texture]
Tileboards
Tileboard: flatten a 3D brick into 2D tiles
Merge LLL, LLH, LHL, LHH into a single 2D RGBA texture
[Figure: the bricks LLL, LLH, LHL and LHH, each flattened into a grid of tiles and merged into one texture]
H- and L-Tileboard
Pack the 8 coefficient bricks into H- and L-Tileboards
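The packing step can be sketched on the CPU as follows. This is a hedged illustration, not the talk's code: `flatten_brick` and `pack_tileboard` are hypothetical names, bricks are nested lists indexed `brick[z][y][x]`, the row-major tile layout is an assumption, and the channel order follows the comments in the talk's pseudocode, (LLL, LLH, LHL, LHH) and (HLL, HLH, HHL, HHH).

```python
L_NAMES = ("LLL", "LLH", "LHL", "LHH")  # R, G, B, A of the L-Tileboard
H_NAMES = ("HLL", "HLH", "HHL", "HHH")  # R, G, B, A of the H-Tileboard

def flatten_brick(brick, cols):
    """Lay the z slices of brick[z][y][x] out as tiles, `cols` tiles per row."""
    d = len(brick)
    rows = (d + cols - 1) // cols
    board = [[0.0] * (cols * d) for _ in range(rows * d)]
    for z in range(d):
        for y in range(d):
            for x in range(d):
                board[(z // cols) * d + y][(z % cols) * d + x] = brick[z][y][x]
    return board

def pack_tileboard(bricks, names, cols):
    """Merge four flattened bricks into one grid of (R, G, B, A) tuples."""
    boards = [flatten_brick(bricks[name], cols) for name in names]
    # Zip matching rows of the four boards, then matching texels within each row.
    return [list(zip(*four_rows)) for four_rows in zip(*boards)]

if __name__ == "__main__":
    bricks = {name: [[[float(i)] * 2 for _ in range(2)] for _ in range(2)]
              for i, name in enumerate(L_NAMES + H_NAMES)}
    l_board = pack_tileboard(bricks, L_NAMES, cols=2)
    print(l_board[0][0])
```

Because matching texels of four bricks share one RGBA tuple, one lookup per tileboard returns four coefficients, and two lookups return all eight.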
Reconstruction
The use of tileboards allows us to retrieve 4 coefficients with a single texture lookup
[Figure: the H-Tileboard and L-Tileboard (two 2D RGBA textures) are sampled while a proxy polygon is rendered; the wavelet reconstruction formula is evaluated for each fragment, producing 2d tiles of 2d x 2d in a pbuffer]
Reconstruction Details
Z reconstruction: combine HHH and LHH, HHL and LHL, HLH and LLH, HLL and LLL
[Figure: Z reconstruction pairs the bricks of the H-Tileboard with the bricks of the L-Tileboard, channel by channel (R, G, B, A)]
Reconstruction Details
Z reconstruction: combine RGBA from the H- and L-Tileboards (H** and L**)
Haar wavelets:
O_RGBA = (L_RGBA + H_RGBA) / sqrt(2) (even z)
O_RGBA = (L_RGBA - H_RGBA) / sqrt(2) (odd z)
Reconstruction Details
Y reconstruction: combine HH and LH, HL and LL
HH + LH = A + G
HL + LL = R + B
[Figure: after Z reconstruction the eight bricks (HHH, LHH, HHL, LHL, HLH, LLH, HLL, LLL) have become HH, LH, HL and LL, which the Y reconstruction combines in pairs]
Reconstruction Details
X reconstruction: combine H and L
[Figure: the H and L bricks combined along x (axes x, y, z)]
Reconstruction Details
Z reconstruction
O_RGBA = (L_RGBA + H_RGBA) / sqrt(2) (even z)
O_RGBA = (L_RGBA - H_RGBA) / sqrt(2) (odd z)
Y reconstruction
O_H = O_G +/- O_A
O_L = O_R +/- O_B (+ for even y, - for odd y)
X reconstruction
O = O_L +/- O_H (+ for even x, - for odd x)
All three steps run in a single fragment pass
Pseudocode
float4 haar(float2 c : TEX0,                 // Coords in output tileboard space
            uniform samplerRECT LTileboard,  // L-Tileboard
            uniform samplerRECT HTileboard)  // H-Tileboard
            : COLOR
{
    float3 d = CoordsTile2Dto3D(c);          // Coords in 3D brick space
    float2 e = Coords3DtoTile2D(d / 2);      // Coords in L- and H-tileboard space
    float4 L = texRECT(LTileboard, e);       // Fetch (LLL, LLH, LHL, LHH)
    float4 H = texRECT(HTileboard, e);       // Fetch (HLL, HLH, HHL, HHH)

    float4 RZ = L + H * ChooseSign(d.z);             // Reconstruct in Z
    float2 RY = RZ.rg + RZ.ba * ChooseSign(d.y);     // Reconstruct in Y
    float  RX = RY.r + RY.g * ChooseSign(d.x);       // Reconstruct in X

    return Color(RX);                        // Return A value
}

float ChooseSign(float x) { return 1 - 2 * fmod(x, 2); }  // 1 or -1
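The fragment program can be emulated on the CPU to check the arithmetic. The sketch below is an assumption-laden illustration rather than the talk's implementation: tileboards are lists of rows of (R, G, B, A) tuples, coordinates are integer texel indices, the tile layout is row-major with hypothetical `cols_in` and `cols_out` parameters, and the final division by 2 * sqrt(2) (one 1/sqrt(2) per axis) is added here because the pseudocode leaves normalization out.

```python
import math

def choose_sign(i):
    """Return 1 for even indices and -1 for odd ones."""
    return 1 - 2 * (i % 2)

def tile_2d_to_3d(u, v, n, cols):
    return u % n, v % n, (v // n) * cols + (u // n)

def coords_3d_to_tile_2d(x, y, z, n, cols):
    return (z % cols) * n + x, (z // cols) * n + y

def haar_fragment(u, v, l_board, h_board, d, cols_in, cols_out):
    """Reconstruct the output texel (u, v) from bricks of side d."""
    x, y, z = tile_2d_to_3d(u, v, 2 * d, cols_out)
    eu, ev = coords_3d_to_tile_2d(x // 2, y // 2, z // 2, d, cols_in)
    low = l_board[ev][eu]    # (LLL, LLH, LHL, LHH)
    high = h_board[ev][eu]   # (HLL, HLH, HHL, HHH)
    sz, sy, sx = choose_sign(z), choose_sign(y), choose_sign(x)
    rz = [l + sz * h for l, h in zip(low, high)]    # Z step: (LL, LH, HL, HH)
    ry = (rz[0] + sy * rz[2], rz[1] + sy * rz[3])   # Y step: (L, H)
    rx = ry[0] + sx * ry[1]                         # X step
    return rx / (2.0 * math.sqrt(2.0))

if __name__ == "__main__":
    c = 2.0 * math.sqrt(2.0)
    l_board = [[(c, 0.0, 0.0, 0.0)]]    # only LLL is non-zero
    h_board = [[(0.0, 0.0, 0.0, 0.0)]]
    print([haar_fragment(u, 0, l_board, h_board, 1, 1, 2) for u in range(4)])
```

With only the LLL coefficient set, every reconstructed voxel has the same value, and setting a single detail coefficient instead flips the sign between even and odd positions along the corresponding axis.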
Rendering
The goal is NOT to read out the reconstructed data from the pbuffer
3D volume rendering is performed using the reconstructed tileboard directly
[Figure: reconstructed tileboard, then 3D volume slicing and rendering, then final image]
Results
Both Haar and Daubechies wavelets were implemented
Experiments were done on a 3.0 GHz Xeon processor with an NVIDIA Quadro FX 3400 card
CPU vs. GPU
Visible Woman data set: 480^3, brick size: 64x64x64 (times in seconds)
Wavelet       CPU      GPU     Speedup
Haar          28.11    3.90    7.20
Daubechies    53.79    6.92    7.77
Brick Sizes vs. Reconstruction Time
Time includes uploading and reconstruction (in msec)
Brick size    Haar      Daubechies
128^3         102.89    124.61
64^3          33.28     33.44
32^3          16.64     16.80
Drop coefficient bricks
Coefficients can be dropped to trade quality for speed
Reconstruction time for the Visible Woman data using different numbers of coefficient bricks (in seconds)
# of coefficient bricks    Haar    Daubechies
8                          3.90    6.92
6                          2.92    5.99
4                          2.13    4.94
2                          1.18    4.27
1                          0.75    4.04
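The effect of dropping bricks on the data can be sketched for one 2x2x2 Haar block. This is an illustrative sketch only: the function names are hypothetical, the letters of each brick name are assumed to follow (z, y, x) order, and which bricks are dropped first is an assumption, since the talk does not list them.

```python
import math
from itertools import product

NAMES = ["".join(p) for p in product("LH", repeat=3)]  # letters ordered (z, y, x)
SCALE = 2.0 * math.sqrt(2.0)

def sign(name, offset):
    """An H letter flips the sign of the odd sample along its axis."""
    s = 1
    for letter, o in zip(name, offset):
        if letter == "H" and o == 1:
            s = -s
    return s

def forward_block(block):
    """block maps (dz, dy, dx) to a voxel value; returns {name: coefficient}."""
    return {n: sum(sign(n, o) * v for o, v in block.items()) / SCALE for n in NAMES}

def inverse_block(coeffs, keep):
    """Rebuild the 8 voxels, treating every brick not in `keep` as zero."""
    return {o: sum(sign(n, o) * coeffs[n] for n in keep) / SCALE
            for o in product((0, 1), repeat=3)}

if __name__ == "__main__":
    block = {o: float(i) for i, o in enumerate(product((0, 1), repeat=3))}
    coeffs = forward_block(block)
    print(inverse_block(coeffs, NAMES))    # all 8 bricks
    print(inverse_block(coeffs, ["LLL"]))  # LLL only
```

Keeping only LLL replaces every voxel in the block by the block average, which is the blocky look expected of Haar at low brick counts; keeping more bricks restores detail at the cost of more lookups and arithmetic.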
Drop coefficient bricks
Dropping bricks degrades image quality, and the degradation is more severe with Haar than with Daubechies wavelets.
[Figure: rendered images for Haar and Daubechies]
Multiresolution Rendering
Multiresolution can be achieved by feeding the reconstructed tileboard to the next resolution level.
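The level-by-level idea can be sketched in 1D with Haar wavelets, where the output of one reconstruction step is the low-frequency input of the next. This is a simplified analogue under stated assumptions, not the 3D tileboard implementation from the talk; `decompose`, `refine` and `reconstruct` are hypothetical names.

```python
import math

S = math.sqrt(2.0)

def decompose(a, levels):
    """Return the coarsest averages and the details per level, finest first."""
    low, details = list(a), []
    for _ in range(levels):
        pairs = list(zip(low[0::2], low[1::2]))
        details.append([(p - q) / S for p, q in pairs])
        low = [(p + q) / S for p, q in pairs]
    return low, details

def refine(low, high):
    """One reconstruction step: doubles the resolution."""
    out = []
    for l, h in zip(low, high):
        out += [(l + h) / S, (l - h) / S]
    return out

def reconstruct(low, details, steps):
    """Feed each reconstructed level into the next, stopping after `steps` levels."""
    out = list(low)
    for high in list(reversed(details))[:steps]:
        out = refine(out, high)
    return out

if __name__ == "__main__":
    low, details = decompose([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], levels=3)
    print(reconstruct(low, details, steps=2))  # half resolution
    print(reconstruct(low, details, steps=3))  # full resolution
```

Stopping after fewer steps yields a coarser version of the data, so the same loop serves every level of a multiresolution hierarchy.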
Conclusions
We have devised an algorithm that can successfully utilize GPUs to reconstruct 3D wavelet coefficients.
We have also embedded our implementation in multiresolution data hierarchies.
Ongoing Efforts
Encoding and reconstruction of time-varying data
Parallel algorithms for visualizing large-scale data