International Conference on Supercomputing, June 2013
TEAPOT: A Toolset for Evaluating Performance, Power and Image Quality on Mobile Graphics Systems
Jose-Maria Arnau, Joan-Manuel Parcerisa
Computer Architecture Department, Universitat Politecnica de Catalunya
Polychronis Xekalakis
Intel Labs, Intel Corporation
Motivation
Mobile GPU Simulator Requirements
1. Support for Mobile Applications
2. Full-System GPU Simulation
3. GPU & Screen Power Models
4. Flexible GPU Timing Simulator
[Figure: % of GPU time for Angry Birds, Android and Advertisement]
[Figure: % of energy for System Memory, GPU and Screen]
[Figure: Tile-Based Deferred Rendering vs. Immediate Mode Rendering; blocks: GPU, On-Chip Memory, External Memory]
Not supported by any publicly available GPU simulator
Tailored towards desktop-like, power-hungry GPUs
Outline
1. Motivation
2. Simulation Infrastructure
2.1. OpenGL ES Trace Generation
2.2. GPU Functional Simulation
2.3. Cycle-Accurate Timing Simulation
2.4. Power Model
2.5. Image Quality Assessment
3. Conclusions
Simulation Infrastructure - Overview
● OpenGL Trace Generation
  – Mobile Applications, Android 4.2 Jelly Bean
  – Android Emulator: Virtual GPU, OpenGL ES Trace Generator
  – Desktop GPU Driver
  – Output: OpenGL ES Trace (Vertex/Fragment programs (GLSL), Textures, Geometry)
● GPU Functional Simulation
  – Instrumented Gallium3D (softpipe driver)
  – Output: Frames, GPU Trace (GPU assembly instructions, Memory addresses)
● GPU Timing Simulation
  – Cycle-Accurate GPU Simulator, McPAT
● Screen Power Model
● Image Quality Assessment
● Statistics: GPU Execution Time, GPU Energy, Screen Energy, Image Quality
Legend: Tools unmodified, Tools adapted, Tools created from scratch, Trace files, Statistics
OpenGL ES Trace Generator
● Software stack: Mobile Game, Virtual Buttons and SurfaceFlinger on Android 4.2 Jelly Bean, running inside the Android Emulator
● Command path: GPU Driver, Virtual GPU, Desktop GPU Driver, Desktop GPU
● OpenGL ES calls such as glDrawArrays(...) pass through the OpenGL ES Trace Generator:
void glDrawArrays(...) { saveCommandInfo(...); real_glDrawArrays(...); }
● OpenGL ES Trace, organized by thread and OpenGL context:
  – Thread: Mobile Game, OpenGL Context: A
  – Thread: Virtual Buttons, OpenGL Context: B
  – Thread: SurfaceFlinger, OpenGL Context: C
  – Each entry holds: OpenGL ES Command List, Shaders, Geometry, Textures
● Support for multiple applications and OpenGL ES contexts
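The wrapper shown on this slide records each call before forwarding it. A minimal sketch of that pattern follows. It is an illustration under stated assumptions, not the TEAPOT source: the `TraceEntry` layout, the `trace_log` buffer, the body of `saveCommandInfo` and the `fake_driver_glDrawArrays` stand-in are all hypothetical, and the GL types are redefined locally so the sketch compiles without OpenGL ES headers.

```c
#include <stddef.h>

/* Local stand-ins for the OpenGL ES types (no GL headers are included). */
typedef unsigned int GLenum;
typedef int GLint;
typedef int GLsizei;

#define MAX_TRACE_ENTRIES 1024

/* One recorded command; this layout is hypothetical. */
typedef struct {
    const char *name;
    GLenum mode;
    GLint first;
    GLsizei count;
} TraceEntry;

static TraceEntry trace_log[MAX_TRACE_ENTRIES];
static size_t trace_len = 0;

/* Pointer to the real entry point. A real tool would resolve it from the
 * desktop GPU driver; here it is set by hand. */
static void (*real_glDrawArrays)(GLenum, GLint, GLsizei) = NULL;

/* Stand-in for the driver function, used only to observe forwarding. */
static int forwarded_calls = 0;
static void fake_driver_glDrawArrays(GLenum mode, GLint first, GLsizei count)
{
    (void)mode; (void)first; (void)count;
    forwarded_calls++;
}

/* Hypothetical body for the saveCommandInfo(...) named on the slide:
 * append the call and its arguments to an in-memory log. */
static void saveCommandInfo(const char *name, GLenum mode, GLint first,
                            GLsizei count)
{
    if (trace_len < MAX_TRACE_ENTRIES) {
        trace_log[trace_len].name = name;
        trace_log[trace_len].mode = mode;
        trace_log[trace_len].first = first;
        trace_log[trace_len].count = count;
        trace_len++;
    }
}

/* Wrapper with the same name as the intercepted call:
 * record first, then forward to the real implementation. */
void glDrawArrays(GLenum mode, GLint first, GLsizei count)
{
    saveCommandInfo("glDrawArrays", mode, first, count);
    if (real_glDrawArrays != NULL) {
        real_glDrawArrays(mode, first, count);
    }
}
```

Because the recording happens before the forwarded call, the application still renders normally while the log fills up.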
GPU Functional Simulation
● Input: OpenGL ES Trace (Vertex/Fragment programs (GLSL), Textures, Geometry)
● Gallium3D
  – OpenGL ES front-end
  – Intermediate Representation: TGSI (Tungsten Graphics Shader Infrastructure)
  – Instrumented Softpipe Driver: Software Rasterizer, TGSI Emulator
● Output: GPU Trace
  – Information stored per GPU command:
    ● Thread ID
    ● OpenGL ES Context ID
    ● GPU Assembly Instructions (TGSI)
    ● Memory addresses referenced for fetching vertices, texels and pixels
    ● ...
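The per-command information listed above could be held in a record like the one sketched below. The struct layout, the fixed capacity `MAX_ADDRS` and the helpers `record_access` and `count_accesses` are assumptions made for illustration. The slides do not describe the actual GPU trace file format.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ADDRS 64   /* hypothetical per-command capacity */

/* The three kinds of memory reference named on the slide. */
typedef enum { ACCESS_VERTEX, ACCESS_TEXEL, ACCESS_PIXEL } AccessKind;

typedef struct {
    AccessKind kind;
    uint64_t address;
} MemAccess;

/* One GPU command as it might appear in the GPU trace (layout assumed). */
typedef struct {
    uint32_t thread_id;
    uint32_t context_id;
    const char *tgsi_text;            /* shader instructions as TGSI text */
    MemAccess accesses[MAX_ADDRS];
    size_t num_accesses;
} GpuCommandRecord;

/* Append one memory reference; returns 1 on success, 0 if the record is full. */
int record_access(GpuCommandRecord *rec, AccessKind kind, uint64_t address)
{
    if (rec->num_accesses >= MAX_ADDRS) {
        return 0;
    }
    rec->accesses[rec->num_accesses].kind = kind;
    rec->accesses[rec->num_accesses].address = address;
    rec->num_accesses++;
    return 1;
}

/* Count how many recorded references are of the given kind. */
size_t count_accesses(const GpuCommandRecord *rec, AccessKind kind)
{
    size_t n = 0;
    for (size_t i = 0; i < rec->num_accesses; i++) {
        if (rec->accesses[i].kind == kind) {
            n++;
        }
    }
    return n;
}
```

Keeping the thread and context IDs in every record is what would let a later stage attribute work to an individual application.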
GPU Timing Simulator
● Immediate-Mode Rendering
  – Input: GPU Trace (GPU Command 0, GPU Command 1, GPU Command 2)
  – Command Processor
  – Geometry Unit: Vertex Fetcher, Vertex Cache, Vertex Processor, Primitive Assembly
  – Raster Unit 0, Raster Unit 1: Rasterizer, Early Depth Test, Fragment Processor (ALU, LD/ST), Pixel Caches, Texture Cache
  – L2 Cache, Memory Controller
  – Legend: Fixed-Function Stage, Programmable Stage, Memory Hierarchy
GPU Timing Simulator
● Tile-Based Deferred Rendering
  – Input: GPU Trace (GPU Command 0, GPU Command 1, GPU Command 2)
  – Command Processor
  – Geometry Unit: Vertex Fetcher, Vertex Cache, Vertex Processor, Primitive Assembly
  – Tiling Engine: Polygon List Builder, Tile Scheduler, Tile Cache
  – Raster Unit 0, Raster Unit 1: Rasterizer, Early Depth Test, Fragment Processor (ALU, LD/ST), Z-Buffer, Color Buffer, Texture Cache
  – L2 Cache, Memory Controller
  – Legend: Fixed-Function Stage, Programmable Stage, Memory Hierarchy
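The Polygon List Builder in a tile-based pipeline assigns each primitive to the screen tiles it may cover. One common way to do this is conservative bounding-box binning, sketched below. The slides do not say which binning method TEAPOT models, so this is only an illustration. The tile size, the grid dimensions and every name here (`bin_triangle`, `TileList`, `tiles`) are assumptions.

```c
#include <stddef.h>

#define TILE_SIZE 16            /* hypothetical tile edge in pixels */
#define TILES_X 4               /* hypothetical 64x64 pixel screen */
#define TILES_Y 4
#define MAX_PRIMS_PER_TILE 32

typedef struct { float x, y; } Vec2;
typedef struct { Vec2 v[3]; } Triangle;

/* Per-tile polygon list: IDs of the primitives that may touch the tile. */
typedef struct {
    int prim_ids[MAX_PRIMS_PER_TILE];
    size_t count;
} TileList;

static TileList tiles[TILES_Y][TILES_X];

static float min3(float a, float b, float c)
{
    float m = a < b ? a : b;
    return m < c ? m : c;
}

static float max3(float a, float b, float c)
{
    float m = a > b ? a : b;
    return m > c ? m : c;
}

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Add the triangle to every tile overlapped by its bounding box.
 * This is conservative: a tile can be listed even if the triangle
 * itself does not cover any pixel inside it. */
void bin_triangle(int prim_id, const Triangle *t)
{
    float minx = min3(t->v[0].x, t->v[1].x, t->v[2].x);
    float maxx = max3(t->v[0].x, t->v[1].x, t->v[2].x);
    float miny = min3(t->v[0].y, t->v[1].y, t->v[2].y);
    float maxy = max3(t->v[0].y, t->v[1].y, t->v[2].y);

    /* Skip triangles whose bounding box is entirely off screen. */
    if (maxx < 0.0f || maxy < 0.0f ||
        minx >= (float)(TILES_X * TILE_SIZE) ||
        miny >= (float)(TILES_Y * TILE_SIZE)) {
        return;
    }

    int tx0 = clamp_int((int)(minx / TILE_SIZE), 0, TILES_X - 1);
    int tx1 = clamp_int((int)(maxx / TILE_SIZE), 0, TILES_X - 1);
    int ty0 = clamp_int((int)(miny / TILE_SIZE), 0, TILES_Y - 1);
    int ty1 = clamp_int((int)(maxy / TILE_SIZE), 0, TILES_Y - 1);

    for (int ty = ty0; ty <= ty1; ty++) {
        for (int tx = tx0; tx <= tx1; tx++) {
            TileList *list = &tiles[ty][tx];
            if (list->count < MAX_PRIMS_PER_TILE) {
                list->prim_ids[list->count++] = prim_id;
            }
        }
    }
}
```

With per-tile lists in place, a tile scheduler can hand one tile at a time to a raster unit, keeping that tile's depth and color data on chip while it is rendered.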
GPU Timing Simulator
● Fragment/Vertex Processors
  – Simple in-order 4-stage pipeline
  – Multi-warp execution
  – Vectorial ISA
  – Texture Sampling Units
● Pipeline stages: Instruction Fetch, Instruction Decode, Execution, Write Back
● Blocks shown: Warp Scheduler, Instruction Memory, Constant Register File, Input Register File, Temporal Register File, Output Register File, Operand buffering & routing, SIMD ALU, SFU, MEM UNIT, TEX UNIT, Pixel Caches, Texture Caches
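Multi-warp execution needs a policy for choosing which warp issues next. The slides name a warp scheduler but do not state its policy, so the sketch below assumes plain round-robin among ready warps purely for illustration. `NUM_WARPS`, `WarpScheduler` and `pick_next_warp` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_WARPS 4   /* hypothetical; cf. num_warps_per_proc in the configuration */

typedef struct {
    bool ready[NUM_WARPS];   /* true if the warp can issue this cycle */
    size_t last_issued;      /* index of the warp that issued most recently */
} WarpScheduler;

/* Round-robin selection (an assumed policy): start searching just after the
 * last issued warp and return the first ready one, or -1 if none is ready. */
int pick_next_warp(WarpScheduler *s)
{
    for (size_t i = 1; i <= NUM_WARPS; i++) {
        size_t w = (s->last_issued + i) % NUM_WARPS;
        if (s->ready[w]) {
            s->last_issued = w;
            return (int)w;
        }
    }
    return -1;
}
```

Switching to another ready warp while one waits on a texture or memory access is what lets a simple in-order pipeline stay busy.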
GPU Power Model
● Based on McPAT
  – TEAPOT extensions:
    ● Multiple data caches per core
    ● Read-only caches
    ● Specialized graphics hardware (texture sampling units...)
    ● Output file in JSON format
  – Directly called from timing simulator
● Configuration file: num_raster_units, num_geometry_units, num_fragment_procs, num_vertex_procs, num_warps_per_proc, ...
● Cycle-Accurate GPU Simulator to McPAT: GPU description, Activity Factors
● McPAT results: Area, Leakage, Dynamic Power
Screen Power Model
● OLED displays
  – Consume different energy depending on the colors
  – Screen energy depends on the output generated by the GPU
● OLED-based displays power model
  – Provides three functions, f(R), f(G), f(B), that map pixel intensity into energy consumption
M. Dong, Y.-S. K. Choi, and L. Zhong. “Power Modeling of Graphical User Interfaces on OLED Displays”. In Proc. of DAC, pages 652–657, 2009.
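A per-frame screen energy estimate can be sketched by summing the three channel functions over every pixel of the frame produced by the GPU. The curves below are placeholders with made-up coefficients and arbitrary units, not values from the cited model. `OledModel`, `frame_energy` and the `placeholder_*` functions are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* A channel function maps an 8-bit intensity to an energy contribution. */
typedef double (*ChannelFn)(uint8_t intensity);

typedef struct {
    ChannelFn f_r;
    ChannelFn f_g;
    ChannelFn f_b;
} OledModel;

/* Placeholder curves in arbitrary units. These are NOT measured data;
 * a real model would use functions fitted to a specific display. */
static double placeholder_r(uint8_t v) { return 1.0 * v; }
static double placeholder_g(uint8_t v) { return 2.0 * v; }
static double placeholder_b(uint8_t v) { return 3.0 * v; }

/* Sum f(R) + f(G) + f(B) over all pixels of an interleaved RGB frame. */
double frame_energy(const OledModel *m, const uint8_t *rgb, size_t num_pixels)
{
    double total = 0.0;
    for (size_t i = 0; i < num_pixels; i++) {
        total += m->f_r(rgb[3 * i])
               + m->f_g(rgb[3 * i + 1])
               + m->f_b(rgb[3 * i + 2]);
    }
    return total;
}
```

With curves like these, an all-black frame costs nothing from the pixel terms, which matches the slide's point that OLED energy depends on the colors displayed.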
Image Quality Assessment
● Image Quality Metrics
  – Based on per-pixel errors
    ● MSE (Mean-Squared Error)
    ● PSNR (Peak Signal-to-Noise Ratio)
  – Based on the human visual perception system
    ● MSSIM (Mean Structural SIMilarity Index)
  – Require reference noise-free image for comparison
  – Evaluate distortion when trading quality for energy
Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli. “Image Quality Assessment: from Error Visibility to Structural Similarity”. IEEE Transactions on Image Processing, 2004.
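Of the per-pixel metrics, MSE is the simplest to state: the average squared difference between corresponding samples of the reference image and the distorted image. A minimal sketch, assuming 8-bit samples stored in flat arrays of equal length (the function name is hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/* Mean-squared error between a reference image and a test image,
 * both given as n 8-bit samples. Returns 0.0 for empty input. */
double mean_squared_error(const uint8_t *ref, const uint8_t *test, size_t n)
{
    if (n == 0) {
        return 0.0;
    }
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = (double)ref[i] - (double)test[i];
        sum += d * d;
    }
    return sum / (double)n;
}
```

PSNR follows from MSE as 10·log10(MAX²/MSE), with MAX = 255 for 8-bit samples. It is undefined when the two images are identical (MSE = 0).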
Conclusions
● The TEAPOT toolset is tailored towards the mobile segment since it:
  – Runs unmodified Android applications
  – Estimates performance, energy, area and image quality of mobile graphics systems
  – Provides a flexible timing simulator, supporting Immediate-Mode and Tile-Based Deferred Rendering
  – Reports statistics:
    ● Per Application, including Android OS (full-system)
    ● Per Frame
    ● Per Component: GPU, System Memory and Screen