GPGPU Toolkit: SlabOps
SlabOps were created by Mark Harris (UNC, NVIDIA)
Date posted: 19-Dec-2015

Main Issue with GPU Programming
- The main issue is not writing the code for the graphics card.
- The main issue is interfacing with the graphics card.

Issues with Interfacing with GPUs
1. You forget to do something
   1. Forget to initialize FBOs
   2. Forget to enable the Cg program
   3. Forget to set the viewport correctly
   4. …
2. GPGPU algorithms are hacks
   1. You're rendering a quad to perform an algorithm on an array
3. It's not object oriented

Using SlabOps
- The GPGPU methods covered previously are fine for running one or two programs.
- What about trying to manage ten or twenty programs performing hundreds of passes?
- SlabOps to the rescue!

Using SlabOps
- SlabOps were created by Mark Harris while getting his PhD at the University of North Carolina.
- He used them in his GPU fluid simulator to manage the large number of fragment programs required for each pass.

Using SlabOps
3 Parts
1. Define
   1. Define the type of SlabOp that you need (more on this later)
2. Initialization
   1. Initialize the program to load
   2. Initialize the parameters to connect
   3. Initialize the output
3. Run
   1. Update any parameters that might have changed
   2. Call Compute() to run the program

Initialization

    void initSlabOps()
    {
        // Load the program
        g_addMatrixfp.InitializeFP(cgContext, "addMatrix.cg", "main");

        // Set the texture parameters
        g_addMatrixfp.SetTextureParameter("tex1", inputYTexID);
        g_addMatrixfp.SetTextureParameter("tex2", inputXTexID);

        // Set the texture coordinates and output rectangle
        g_addMatrixfp.SetTexCoordRect(0, 0, texSizeX, texSizeY);
        g_addMatrixfp.SetSlabRect(0, 0, texSizeX, texSizeY);

        // Set the output texture
        g_addMatrixfp.SetOutputTexture(outputTexID, texSizeX, texSizeY,
                                       textureParameters.texTarget,
                                       GL_COLOR_ATTACHMENT0_EXT);
    }

Run

    g_addMatrixfp.Compute();

One line to run the program:
- Sets the variables
- Enables the program
- Sets the viewport
- Builds the geometry to perform the processing
- Performs the computation
- Gets the output into the buffer or texture
- Disables the program
- Resets the viewport

Comparing Saxpy (SlabOp)

    // Do calculations
    for (int i = 0; i < numIterations; i++) {
        g_saxpyfp.SetTextureParameter("textureY", yTexID[readTex]);
        g_saxpyfp.SetOutputTexture(yTexID[writeTex], texSize, texSize,
                                   textureParameters.texTarget,
                                   attachmentpoints[writeTex]);
        g_saxpyfp.Compute();
        swap();
    }
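
The slides do not show what swap() does, so the following is only a CPU sketch of the ping-pong idea behind the loop above: one buffer is read, the other is written, and their roles are exchanged after every pass. PingPongBuffers and saxpyIterations are hypothetical names, std::vector stands in for the two y textures, and x and y0 are assumed to have the same length.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical CPU stand-in for the two "y" textures (yTexID[0] and yTexID[1]).
struct PingPongBuffers {
    std::vector<float> y[2];
    int readTex;   // index of the buffer that is read this pass
    int writeTex;  // index of the buffer that is written this pass

    explicit PingPongBuffers(const std::vector<float>& y0) : readTex(0), writeTex(1)
    {
        y[0] = y0;
        y[1].resize(y0.size());
    }

    // Exchange the roles: the buffer just written becomes the next input.
    void swap() { std::swap(readTex, writeTex); }
};

// Repeats y_new = alpha * x + y_old for numIterations passes.
// Assumes x.size() == y0.size().
std::vector<float> saxpyIterations(float alpha, const std::vector<float>& x,
                                   const std::vector<float>& y0, int numIterations)
{
    PingPongBuffers buf(y0);
    for (int i = 0; i < numIterations; i++) {
        const std::vector<float>& in = buf.y[buf.readTex];
        std::vector<float>& out = buf.y[buf.writeTex];
        for (std::size_t k = 0; k < x.size(); ++k)
            out[k] = alpha * x[k] + in[k];
        buf.swap();
    }
    // After the final swap, the most recent result sits in the read buffer.
    return buf.y[buf.readTex];
}
```

On the GPU the inner loop is replaced by a single Compute() call; only the index bookkeeping carries over.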

Comparing Saxpy (Non-SlabOp 1)

    // attach two textures to FBO
    glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, attachmentpoints[writeTex],
                              textureParameters.texTarget, yTexID[writeTex], 0);
    glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, attachmentpoints[readTex],
                              textureParameters.texTarget, yTexID[readTex], 0);
    // check if that worked
    if (!checkFramebufferStatus()) {
        printf("glFramebufferTexture2DEXT():\t [FAIL]\n");
        // PAUSE();
        exit(ERROR_FBOTEXTURE);
    } else if (mode == 0) {
        printf("glFramebufferTexture2DEXT():\t [PASS]\n");
    }
    // enable fragment profile
    cgGLEnableProfile(fragmentProfile);
    // bind saxpy program
    cgGLBindProgram(fragmentProgram);
    // enable texture x (read-only, not changed during the iteration)
    cgGLSetTextureParameter(xParam, xTexID);
    cgGLEnableTextureParameter(xParam);
    // enable scalar alpha (same)
    cgSetParameter1f(alphaParam, alpha);
    // Calling glFinish() is only necessary to get accurate timings,
    // and we need a high number of iterations to avoid timing noise.
    glFinish();

Comparing Saxpy (Non-SlabOp 2)

    for (int i = 0; i < numIterations; i++) {
        // set render destination
        glDrawBuffer(attachmentpoints[writeTex]);
        // enable texture y_old (read-only)
        cgGLSetTextureParameter(yParam, yTexID[readTex]);
        cgGLEnableTextureParameter(yParam);
        // and render multitextured viewport-sized quad
        // depending on the texture target, switch between
        // normalised ([0,1]^2) and unnormalised ([0,w]x[0,h])
        // texture coordinates
        // make quad filled to hit every pixel/texel
        // (should be default but we never know)
        glPolygonMode(GL_FRONT, GL_FILL);
        // and render the quad
        if (textureParameters.texTarget == GL_TEXTURE_2D) {
            // render with normalized texcoords
            glBegin(GL_QUADS);
            glTexCoord2f(0.0, 0.0); glVertex2f(0.0, 0.0);
            glTexCoord2f(1.0, 0.0); glVertex2f(texSize, 0.0);
            glTexCoord2f(1.0, 1.0); glVertex2f(texSize, texSize);
            glTexCoord2f(0.0, 1.0); glVertex2f(0.0, texSize);
            glEnd();
        } else {
            // render with unnormalized texcoords
            glBegin(GL_QUADS);
            glTexCoord2f(0.0, 0.0); glVertex2f(0.0, 0.0);
            glTexCoord2f(texSize, 0.0); glVertex2f(texSize, 0.0);
            glTexCoord2f(texSize, texSize); glVertex2f(texSize, texSize);
            glTexCoord2f(0.0, texSize); glVertex2f(0.0, texSize);
            glEnd();
        }
        // swap role of the two textures (read-only source becomes
        // write-only target and the other way round):
        swap();
    }

Comparing Saxpy
- OK, that looked a little worse than we know it is.
- But using SlabOps did look a little easier.
- Saxpy only had one program being run for multiple iterations.
- What about something more complicated? Fluid flow.

Fluids
- Follow Stam's method.
- We're not going to cover how to do fluids so much as the program flow and how SlabOps help contain the problem.

1. Advection
2. Impulse
3. Vorticity Confinement
4. Viscous Diffusion
5. Project Divergent Velocity
   1. Compute Divergence
   2. Compute Pressure Disturbances
   3. Subtract gradient(p) from u
6. Display

“Fast Fluid Dynamics Simulation on the GPU”, Mark Harris. In GPU Gems.
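
Step 5 is the only step whose sub-steps are spelled out, so it may help to see it written as equations. This is a sketch in the usual notation of the GPU Gems chapter cited above, not something taken from the slides: w is assumed to be the (divergent) velocity left after steps 1 to 4, p the pressure, and u the divergence-free result.

```latex
\begin{aligned}
\nabla^2 p &= \nabla \cdot \mathbf{w}
  && \text{(5.1 compute the divergence, 5.2 solve this Poisson equation for } p\text{)} \\
\mathbf{u} &= \mathbf{w} - \nabla p
  && \text{(5.3 subtract the pressure gradient)}
\end{aligned}
```

Each line corresponds to at least one fragment program, which is part of why the pass count grows so quickly.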

Let's not forget Boundary Conditions
- Boundaries and interior are computed in separate passes and may require separate programs.

Implementation
- Harris' implementation contained 15 GPU programs (including 4 for display).
- The simulation takes about 20 passes for each time-step (not including two 50-pass runs for the Poisson solver).
- Switch to code. (Note: the code can be found in GPU Gems 1.)
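
To give a feel for what one of those 50-pass Poisson runs does, here is a CPU sketch of a Jacobi solver for the pressure equation, the iterative method used in the GPU Gems chapter. It is an illustration under stated assumptions, not Harris' code: Grid, jacobiPass and solvePressure are hypothetical names, the grid is a square n x n array stored row-major, and the outermost cells are treated as fixed boundary values.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical CPU grid standing in for a pressure texture (row-major, n x n).
typedef std::vector<float> Grid;

// One Jacobi pass for laplacian(p) = div on the interior cells:
//   p_new(i,j) = (p(i-1,j) + p(i+1,j) + p(i,j-1) + p(i,j+1) - dx*dx*div(i,j)) / 4
// Boundary cells of `out` are left untouched.
void jacobiPass(const Grid& p, const Grid& div, Grid& out, std::size_t n, float dx)
{
    for (std::size_t j = 1; j + 1 < n; ++j) {
        for (std::size_t i = 1; i + 1 < n; ++i) {
            const std::size_t c = j * n + i;
            out[c] = (p[c - 1] + p[c + 1] + p[c - n] + p[c + n]
                      - dx * dx * div[c]) * 0.25f;
        }
    }
}

// Runs `passes` Jacobi passes, ping-ponging between two grids.
Grid solvePressure(Grid p, const Grid& div, std::size_t n, float dx, int passes)
{
    Grid scratch = p;  // copy so both buffers carry the boundary values
    for (int k = 0; k < passes; ++k) {
        jacobiPass(p, div, scratch, n, dx);
        std::swap(p, scratch);  // the pass output becomes the next input
    }
    return p;
}
```

On the GPU every call to jacobiPass would be one rendering pass, so two 50-pass runs add 100 passes on top of the roughly 20 counted above.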

Point:
- Creating something as complex as a fluid solver would be very difficult without some kind of abstraction.
- So what's so special about SlabOps?
  - Versatility
  - Policy-Based Design

SlabOp Versatility
- Remember we skipped over how to define a SlabOp.
- Each SlabOp is actually composed of 6 objects working together.
- Each of the six objects can be replaced according to the specific task.
- In other words, to alter a SlabOp to display to the screen instead of the back buffer, I just replace the Update object.

The 6 objects that define a SlabOp
- Render Target Policy: sets up / shuts down any special render target functionality needed by the SlabOp
- GL State Policy: sets and unsets the GL state needed for the SlabOp
- Vertex Pipe Policy: sets up / shuts down vertex programs
- Fragment Pipe Policy: sets up / shuts down fragment programs
- Compute Policy: performs the computation (usually via rendering)
- Update Policy: performs any copies or other update functions after the computation has been performed

Defining a SlabOp
- Luckily you do not need to create each of those objects.
- You just need to replace one when it doesn't do what you want.
- Harris created 3 predefined SlabOps:
  - DefaultSlabOp: performs a simple fragment program rendered to a quad
  - BCSlabOp: performs a boundary condition fragment program rendered as lines
  - DisplayOp: displays a texture to the screen

More complex SlabOps
Objects defined to perform:
- Flat 3D texture computations (computing for voxel grids)
  - Flat3DTexComputePolicy
  - Flat3DBoundaryComputePolicy
  - Flat3DVectorizedTexComputePolicy
  - Copy3DTexGLUpdatePolicy
- Multi-texture output (rendering with multiple texture outputs)
  - MultiTextureGLComputePolicy
- Volume computations (rendering with multiple texture coordinates)
  - VolumeComputePolicy, VolumeGLComputePolicy

Defining a SlabOp

    typedef SlabOp
    <
        NoopRenderTargetPolicy,
        NoopGLStatePolicy,
        NoopVertexPipePolicy,
        GenericCgGLFragmentPipePolicy,
        SingleTextureGLComputePolicy,
        CopyTexGLUpdatePolicy
    >
    DefaultSlabOp;

- Include a Noop where a policy is not used.
- Include the preferred policy where one is needed.

Next Generation SlabOps?
- The version on the course website has been extracted from Harris' fluid simulator and updated to use framebuffer objects instead of render texture.
- It would be easy to update SlabOps to use the geometry processor as well.
- Additional policies could be created to render to non-quad surfaces, i.e. an object.

How do SlabOps work?
- The rest of this lecture will explain policy-based design.
- There will be no more GPU talk during the remainder of the lecture.

Why?
- SlabOps were a good implementation of policy-based design.
- You should have some exposure to design patterns and templates.
- Because I'm the one holding the chalk.

Where did Policy Based Design Come from?
Modern C++ Design: Generic Programming and Design Patterns Applied
By: Andrei Alexandrescu
- Excellent bedtime reading
  - Asleep within 2 pages
- Contains unique implementations of design patterns using templates

What is a design pattern?
- Design Pattern: A general repeatable solution to a commonly occurring problem in software design.
  - Wikipedia (The irrefutable source on everything)
- The most commonly known design pattern?

The Singleton
- One of the simplest and most useful design patterns
- Goal: to have only one instance of an object, no matter where it is created in the program

The Singleton

    class Singleton {
    public:
        static Singleton& Instance();
        ~Singleton();
    private:
        Singleton() {}                  // private, so only Instance() can create it
        static Singleton* m_singleton;
    };

    Singleton& Singleton::Instance() {
        if (m_singleton == nullptr)
            m_singleton = new Singleton();
        return *m_singleton;
    }

    // in the .cpp file
    Singleton* Singleton::m_singleton = nullptr;
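
The pointer-based version above never deletes its instance and needs a separate definition for the static member. A common alternative, sketched here and not taken from the slides, keeps the instance as a function-local static (often called a Meyers singleton). Settings and counter are hypothetical names used only for illustration.

```cpp
// Hypothetical singleton holding one piece of shared state.
class Settings {
public:
    // The instance is constructed on the first call; since C++11 this
    // initialization is guaranteed to happen exactly once, even with threads.
    static Settings& Instance()
    {
        static Settings instance;
        return instance;
    }

    // Copying would create a second instance, so it is disabled.
    Settings(const Settings&) = delete;
    Settings& operator=(const Settings&) = delete;

    int counter;

private:
    // Private constructor: only Instance() can create the object.
    Settings() : counter(0) {}
};
```

Every call to Settings::Instance() returns a reference to the same object, so a value written through one call is visible through the next.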

C++ Templates
- Templates: functions and classes that can operate with generic types
- The STL is a library of templates, hence its name: Standard Template Library
- Examples built on templates: cout, cin, vector<int>, string

Example Template:

    template <class myType>
    myType GetMax(myType a, myType b)
    {
        return (a > b ? a : b);
    }

Example Template Use:

    int x = 3, y = 7;
    int larger = GetMax<int>(x, y);

Modern C++ Design: book on design patterns using templates
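
As a slightly fuller sketch of the same idea, the block below repeats GetMax from the slide and adds a small class template. MaxTracker is a hypothetical name introduced only to show that classes can be templated in the same way as functions.

```cpp
#include <string>

// Function template from the slide: works for any type that supports operator>.
template <class myType>
myType GetMax(myType a, myType b)
{
    return (a > b ? a : b);
}

// Hypothetical class template: remembers the largest value it has been given.
template <class T>
class MaxTracker {
public:
    explicit MaxTracker(T initial) : m_max(initial) {}

    void Add(T value) { m_max = GetMax(m_max, value); }
    T Max() const { return m_max; }

private:
    T m_max;
};
```

The type argument can usually be deduced, so GetMax(3, 7) and GetMax<int>(3, 7) call the same instantiation. It has to be spelled out when the arguments alone would pick the wrong type, for example GetMax<std::string>("apple", "pear").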

Policy Based Design
- Defines a class with complex behavior out of many little classes (called policies), each of which takes care of one behavioral or structural aspect.
- You can mix and match policies to achieve a combinatorial set of behaviors by using a small core of elementary components.

How it works
- Multiple Inheritance: one class that inherits the properties of numerous other classes
- Templates: systems that operate with generic types
- Multiple Inheritance + Templates => Policy Based Design
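
A minimal sketch of that combination, using a deliberately non-GPU example so the mechanism stands out. All names here (Greeter and the four policy structs) are hypothetical. The template parameters choose the policies, and multiple inheritance makes their member functions part of the host class.

```cpp
#include <string>

// Two interchangeable greeting policies.
struct EnglishGreeting { std::string Greeting() const { return "Hello"; } };
struct FrenchGreeting  { std::string Greeting() const { return "Bonjour"; } };

// Two interchangeable ending policies.
struct CalmEnding    { std::string Ending() const { return "."; } };
struct ExcitedEnding { std::string Ending() const { return "!"; } };

// Host class: inherits from whichever policies it is instantiated with.
template <class GreetingPolicy, class EndingPolicy>
class Greeter : public GreetingPolicy, public EndingPolicy {
public:
    std::string Greet(const std::string& name) const
    {
        // this-> is required because the base classes depend on the
        // template parameters.
        return this->Greeting() + ", " + name + this->Ending();
    }
};

// Four combinations are possible from four small policy classes; two are named here.
typedef Greeter<EnglishGreeting, ExcitedEnding> LoudEnglishGreeter;
typedef Greeter<FrenchGreeting, CalmEnding>     QuietFrenchGreeter;
```

Adding a third greeting policy would give six combinations without touching Greeter, which is the combinatorial payoff described on the previous slide.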

Policies
- Each policy is a simple class that implements one aspect of the overall goal.
- Policies do not need to be templates (in many cases they're not).
- Policies do need to implement specific, known functions.

Encapsulation Class
- One class needs to use multiple inheritance to combine all the policies together.

    template
    <
        class RenderTargetPolicy,
        class GLStatePolicy,
        class VertexPipePolicy,
        class FragmentPipePolicy,
        class ComputePolicy,
        class UpdatePolicy
    >
    class SlabOp : public RenderTargetPolicy,
                   public GLStatePolicy,
                   public VertexPipePolicy,
                   public FragmentPipePolicy,
                   public ComputePolicy,
                   public UpdatePolicy
    {
    public:
        SlabOp() {}
        ~SlabOp() {}
        void Compute();
    };

The Compute Method

        // The only method of the SlabOp host class is Compute(), which
        // uses the inherited policy methods to perform the slab computation.
        // Note that this also defines the interfaces that the policy classes
        // must have.
        void Compute()
        {
            // Activate the output slab, if necessary
            ActivateRenderTarget();

            // Set the necessary state for the slab operation
            GLStatePolicy::SetState();
            VertexPipePolicy::SetState();
            FragmentPipePolicy::SetState();
            SetViewport();

            // Put the results of the operation into the output slab.
            UpdateOutputSlab();

            // Perform the slab operation
            ComputePolicy::Compute();

            ResetViewport();

            // Reset state
            FragmentPipePolicy::ResetState();
            VertexPipePolicy::ResetState();
            GLStatePolicy::ResetState();

            // Deactivate the output slab, if necessary
            DeactivateRenderTarget();
        }
    };
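
To see that call sequence without a GPU, the sketch below builds a stand-in host from six stub policies that only record their own names. Everything here is hypothetical (MiniSlabOp, the Trace* policies, the Trace() log). Which policy owns SetViewport() and ResetViewport() is not stated on the slides, so placing them in the compute policy is an assumption. The calls are written with explicit policy qualifiers because unqualified calls into template-dependent base classes are rejected by standard-conforming compilers.

```cpp
#include <string>
#include <vector>

// Hypothetical shared log so the call order can be inspected afterwards.
std::vector<std::string>& Trace()
{
    static std::vector<std::string> trace;
    return trace;
}

// Stub policies: each member function only records that it was called.
struct TraceRenderTargetPolicy {
    void ActivateRenderTarget()   { Trace().push_back("ActivateRenderTarget"); }
    void DeactivateRenderTarget() { Trace().push_back("DeactivateRenderTarget"); }
};
struct TraceGLStatePolicy {
    void SetState()   { Trace().push_back("GLState::SetState"); }
    void ResetState() { Trace().push_back("GLState::ResetState"); }
};
struct TraceVertexPipePolicy {
    void SetState()   { Trace().push_back("VertexPipe::SetState"); }
    void ResetState() { Trace().push_back("VertexPipe::ResetState"); }
};
struct TraceFragmentPipePolicy {
    void SetState()   { Trace().push_back("FragmentPipe::SetState"); }
    void ResetState() { Trace().push_back("FragmentPipe::ResetState"); }
};
struct TraceComputePolicy {  // assumption: the viewport calls live here
    void SetViewport()   { Trace().push_back("SetViewport"); }
    void Compute()       { Trace().push_back("Compute"); }
    void ResetViewport() { Trace().push_back("ResetViewport"); }
};
struct TraceUpdatePolicy {
    void UpdateOutputSlab() { Trace().push_back("UpdateOutputSlab"); }
};

// Stand-in host class that follows the same order as the slide's Compute().
template <class RT, class GL, class VP, class FP, class CP, class UP>
class MiniSlabOp : public RT, public GL, public VP, public FP, public CP, public UP {
public:
    void Compute()
    {
        RT::ActivateRenderTarget();
        GL::SetState();
        VP::SetState();
        FP::SetState();
        CP::SetViewport();
        UP::UpdateOutputSlab();
        CP::Compute();
        CP::ResetViewport();
        FP::ResetState();
        VP::ResetState();
        GL::ResetState();
        RT::DeactivateRenderTarget();
    }
};

typedef MiniSlabOp<TraceRenderTargetPolicy, TraceGLStatePolicy,
                   TraceVertexPipePolicy, TraceFragmentPipePolicy,
                   TraceComputePolicy, TraceUpdatePolicy> TraceSlabOp;
```

Calling Compute() on a TraceSlabOp appends twelve entries to Trace(), with state set in one order and reset in the reverse order, mirroring the structure of the method above.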

The Other Methods
- But wait, what about all the other functions that we called inside our GPU program?
- Those exist in the individual policies.
- Example: InitializeFP(CGcontext context, string fpFileName, string entryPoint) exists in the FragmentPipePolicy.
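
The reason a call such as g_addMatrixfp.InitializeFP(...) compiles is that public inheritance makes every public member of a policy a member of the host. The sketch below shows this with hypothetical stand-ins: RecordingFragmentPipePolicy, NoopComputePolicy and TinySlabOp are invented for illustration, and this InitializeFP drops the CGcontext argument and only stores the strings it is given.

```cpp
#include <string>

// Hypothetical fragment-pipe policy: remembers what it was asked to load
// instead of compiling a Cg program.
struct RecordingFragmentPipePolicy {
    std::string fileName;
    std::string entryPoint;

    void InitializeFP(const std::string& fpFileName, const std::string& entry)
    {
        fileName = fpFileName;
        entryPoint = entry;
    }
    void SetState()   {}
    void ResetState() {}
};

// Hypothetical compute policy that does nothing.
struct NoopComputePolicy {
    void Compute() {}
};

// Reduced host with two policies; its only own method is Compute().
template <class FragmentPipePolicy, class ComputePolicy>
class TinySlabOp : public FragmentPipePolicy, public ComputePolicy {
public:
    void Compute()
    {
        FragmentPipePolicy::SetState();
        ComputePolicy::Compute();
        FragmentPipePolicy::ResetState();
    }
};

typedef TinySlabOp<RecordingFragmentPipePolicy, NoopComputePolicy> TinyOp;
```

A TinyOp object can call InitializeFP("addMatrix.cg", "main") directly even though TinySlabOp never declares that function. Swapping in a different fragment-pipe policy changes which setup functions the object offers.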

Conclusion
- SlabOps are one of many GPGPU abstractions.
- They happen to be my favorite because they are the most versatile and are easy to use.

Issues:
- They do not include basic GPGPU functions such as Reduce().
- There is a learning curve.
- It is difficult to find out where things are actually going on.