Advanced Imaging on iOS

Advanced Imaging on iOS @rsebbe

Description

Want to know the untold secrets of imaging on iOS? This talk goes through performance considerations for a number of imaging APIs on iOS, including some examples of how we integrated them in our own apps. Image loading, processing, and display are analysed and discussed to find the best APIs for particular use cases.

Transcript of Advanced Imaging on iOS

Page 1: Advanced Imaging on iOS

Advanced Imaging on iOS (@rsebbe)

Page 2: Advanced Imaging on iOS

Foreword

• Don’t make any assumptions about imaging on iOS. Why?

• Because even if you’re right today, you’ll be wrong next year.

• iOS is a moving platform, constantly being optimized version after version.

• Experiment, and find out the best approach for your app.

Page 3: Advanced Imaging on iOS

Understanding

• Things to keep in mind at all times: execution speed & memory consumption.

• How to assess them: Instruments.

Page 4: Advanced Imaging on iOS

APIs

Core Image

Image IO

Core Animation

Core Graphics

GLKit

Core Video

AVFoundation

Many APIs, but a single reality: there’s the CPU and there’s the GPU. Each has pros & cons. Use them wisely, depending on your app’s particular needs.

Page 5: Advanced Imaging on iOS

Imaging 101

• On iOS, you typically use PNGs and/or JPEGs.

• PNG is lossless, typically less compressed*, CPU decode only, prepared by Xcode if in the app bundle. Use case: UI elements.

• JPEG is lossy, typically more compressed*, CPU or GPU decode. Use case: photos/textures.

• *: for images with single-color areas (UI), PNG can beat JPEG by a factor of 10x.

Page 6: Advanced Imaging on iOS

Imaging 101

• Image on iPhone 5/5s/5c: 3264 x 2448 pixels = 7,990,272 pixels = ~8 MP.

• Decoded, each pixel is (R,G,B,A), 4 bytes. The whole image is then ~32 MB in RAM. The original JPEG is ~3 MB.

Page 7: Advanced Imaging on iOS

Imaging Purpose

• What is your purpose? Load & display (preview thumbnails)? Load, process & display (image editing)? Load, process & save? Process only (augmented reality, page detection)?

• Amount of data: large images or small images? Large input image, small output?

Page 8: Advanced Imaging on iOS

Discrete vs. UMA

[Diagram: Decode -> Transfer -> Process -> Display pipelines, comparing a discrete GPU (Mac) with a Unified Memory Architecture (iOS/Mac).]

• Discrete GPU: data go through the bus, so going back & forth between CPU and GPU is expensive. Total speed depends on relative transfer & processing speeds.

• Unified Memory Architecture: GPU & CPU share the same memory, so going back & forth is cheap.

• iOS being a UMA gives a lot of flexibility.

Page 9: Advanced Imaging on iOS

Comparisons

[Diagram: three paths for displaying a transformed image. Pure GPU: decode & display on the GPU. CGContextDrawImage: decode and draw with transform on the CPU, transfer, then display. CALayer/UIView setTransform: decode on the CPU, transfer, then transform & display on the GPU.]

Page 10: Advanced Imaging on iOS

GPU

• Fast, but has constraints.

• Low-level APIs: OpenGL ES, GLSL.

• High-level APIs: GLKit, Sprite Kit, Core Animation, Core Image, Core Video.

• Max OpenGL texture size is device dependent:
• 4096x4096: iPad 2/mini/3/Air/mini2, iPhone 4S+
• 2048x2048: iPhone 3GS/4, iPad 1

• Has a fast hardware decoder for JPEGs / videos.

• Cannot run while the app is in the background (raises an exception).

Page 11: Advanced Imaging on iOS

CPU

• Slow, but flexible. Like “I’m 15x slower than my GPU friend, OK, but I can be smarter.”

• Low-level APIs: C, SIMD.

• High-level APIs: Accelerate, Core Graphics, ImageIO.

• Has a smart JPEG decoder.

• Can run in the background.

Page 12: Advanced Imaging on iOS

Core Animation

• Very efficient, GPU-accelerated 2D/3D transform of image-based content. Foundation for UIKit’s UIView.

• A CALayer/UIView needs some visuals. How? Either draw them (-drawLayer:inContext: / -drawRect:) or set them (-setContents:).

• Drawing (-drawLayer:inContext:, or -[UIView drawRect:]) uses Core Graphics (CPU) to generate an image (slow), and that image is then made into a GPU-backed texture that can move around (fast).

• If not drawing, but instead setting contents with -[CALayer setContents:] (or -[UIImageView setImage:]), you get the fast path, that is, GPU image decoding.
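As a minimal sketch of that fast path (`path`, `imageView`, and `layer` are hypothetical names for your own objects):

```objc
// Fast path: the layer gets a CGImage directly; decoding can happen on the GPU.
UIImage *photo = [UIImage imageWithContentsOfFile:path]; // `path`: your JPEG
imageView.image = photo;                                 // UIImageView fast path

// Equivalent at the CALayer level:
layer.contents = (__bridge id)photo.CGImage;

// Slow path, for contrast: overriding -drawRect: and calling
// [photo drawInRect:...] rasterizes on the CPU first.
```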

Page 13: Advanced Imaging on iOS

Fast Path

[Diagram: CALayer.contents (or UIImageView.image) takes the pure-GPU path: decode & display on the GPU. Drawing via -drawLayer:inContext: (or -[UIView drawRect:]) + CGContextDrawImage (or UIImage draw) decodes on the CPU, then transfers to the GPU for display.]

Page 14: Advanced Imaging on iOS

Demo 1: The Strong, the Weak, & the Ugly

• Comparison of CALayer.contents vs. -[UIView drawRect:] for small images (2 MP, 50x).
• Show relative execution speed.
• Show Instruments’ Time Profiler & OpenGL ES Driver.

Page 15: Advanced Imaging on iOS

Core Graphics / ImageIO

• CPU (mostly).

• CGImageRef, CGImageSourceRef, CGContextRef.

• Used with -drawRect: / -drawLayer:inContext:.

Page 16: Advanced Imaging on iOS

Core Graphics / ImageIO

• How to load a CGImageRef? Either using UIImage (easier) or CGImageSourceRef (more control).

• How to create a CGImageRef from existing pixel data? CGDataProviderRef.

• Having a CGImage object does not mean it’s decoded. It’s typically not, and may even reference mem-mapped data on disk -> no real device memory used.

• Sometimes you may want to have it in decoded form (repeated/high-performance drawing, access to pixel values).

• How do I do that?

Page 17: Advanced Imaging on iOS

Core Graphics / ImageIO

• Need access to pixel values? Use CGBitmapContext.

• Need to draw that image repeatedly? Use CGLayer, UIGraphicsBeginImageContext(), or ImageIO’s kCGImageSourceShouldCache option.
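A minimal sketch of the CGBitmapContext route, assuming `image` is an undecoded CGImageRef you already loaded:

```objc
// Force-decode `image` into a bitmap context to get at the pixel values.
size_t w = CGImageGetWidth(image), h = CGImageGetHeight(image);
CGColorSpaceRef cs = CGColorSpaceCreateDeviceRGB();
CGContextRef ctx = CGBitmapContextCreate(NULL, w, h, 8, 4 * w, cs,
                                         kCGImageAlphaPremultipliedLast);
CGContextDrawImage(ctx, CGRectMake(0, 0, w, h), image); // CPU decode happens here
uint8_t *pixels = CGBitmapContextGetData(ctx);          // RGBA, 4 bytes per pixel
// ... inspect `pixels` ...
CGContextRelease(ctx);
CGColorSpaceRelease(cs);
```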

Page 18: Advanced Imaging on iOS

Core Graphics / ImageIO

• Understanding kCGImageSourceShouldCache:

• It does *not* cache your image when creating it from the CGImageSourceRef.

• Instead, it caches it when drawing it the first time.

• It possibly caches at several sizes simultaneously: if you draw your image at 3 different sizes -> cached 3x.

• Check your memory consumption when using caching, and don’t keep that image around when not needed.
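In code, the option goes on image creation, but the cost shows up at draw time (`url` is a placeholder for your image’s URL):

```objc
// Opt into ImageIO's decode cache. Note: nothing is decoded here yet;
// the decoded pixels are cached lazily, at the first draw, per drawn size.
NSDictionary *options = @{ (id)kCGImageSourceShouldCache : @YES };
CGImageSourceRef source =
    CGImageSourceCreateWithURL((__bridge CFURLRef)url, NULL);
CGImageRef image =
    CGImageSourceCreateImageAtIndex(source, 0, (__bridge CFDictionaryRef)options);
// First draw: decodes and caches. Later draws at the same size: fast.
CGImageRelease(image);
CFRelease(source);
```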

Page 19: Advanced Imaging on iOS

Core Graphics / ImageIO

• A note on JPEG decoding (CPU):

• The image is divided into 8x8 blocks of pixels.

• Encoding: DCT (frequency domain).

• Decoding: skip the higher frequencies if not needed.

• That property can be used to make CPU decoding a lot faster.

Page 20: Advanced Imaging on iOS

Core Graphics / ImageIO

• If the source image is 3264 px wide:

• Drawing at 1632 px will trigger partial JPEG decoding (4x4 instead of 8x8) -> much faster. Drawing at 1633 px triggers full decoding + interpolation (much slower).

• Similarly, successive power-of-2 divisors bring additional speed gains: ÷8 faster than ÷4 faster than ÷2.

• If you need to draw a large image at a small size, use the Core Graphics API (CPU), not CALayer (GPU). GPU decoding always decodes at full resolution!
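A sketch of getting that reduced decode through ImageIO’s thumbnail API (`jpegURL` is a placeholder; the 1632 value assumes the 3264 px source above):

```objc
// Ask ImageIO for at most a half-width decode; because 1632 is exactly
// 3264 / 2, the JPEG decoder can skip the higher DCT frequencies.
NSDictionary *options = @{
    (id)kCGImageSourceCreateThumbnailFromImageAlways : @YES,
    (id)kCGImageSourceThumbnailMaxPixelSize : @1632
};
CGImageSourceRef source =
    CGImageSourceCreateWithURL((__bridge CFURLRef)jpegURL, NULL);
CGImageRef thumbnail =
    CGImageSourceCreateThumbnailAtIndex(source, 0, (__bridge CFDictionaryRef)options);
// ... draw or display the thumbnail ...
CGImageRelease(thumbnail);
CFRelease(source);
```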

Page 21: Advanced Imaging on iOS

I’m CPU, I’m weak but I’m smart!

[Diagram: drawing a small image from a large one. The GPU path (CALayer.contents) decodes the full image, with a large memory footprint. The CPU path (CGContextDrawImage with a small target size) does a reduced decode, with far less memory.]

Page 22: Advanced Imaging on iOS

Demo 2: The Strong & Idiot vs. the Weak & Smart

• 11 MP, 10x. Show the GPU is slower.
• Show the GPU version does the entire image decoding, while the CPU does smarter, reduced drawing.
• Show Time Profiler’s function trace.
• Show the VM Tracker tool, Dirty size.
• Change the code to show the influence of draw size on speed (+ function trace).

Page 23: Advanced Imaging on iOS

Core Image

• CPU or GPU, ~20x speed difference on a recent iPhone.

• Then why use the CPU? Background rendering (GPU not available), or as an OS fallback if the image is too large.

• API: CIImage, CIFilter, CIContext.

• CIImages are (immutable) recipes; they do not store image data by themselves.

• CIFilter (mutable) is used to transform/combine CIImages.

• CIContext is used to render into a destination.

Page 24: Advanced Imaging on iOS

Core Image

• A CIContext either targets an EAGLContext or not. If not, it’s meant to create CGImageRefs, or to render to CPU memory (void*). In both cases, CIContext uses the GPU to render, unless kCIContextUseSoftwareRenderer is YES.

• Using software rendering is slow. Very slow. Very, very slow. Like 15x slower. Not recommended.

• Depending on input image size / output target size, iOS will automatically fall back to software rendering. Query the CIContext with -inputImageMaximumSize / -outputImageMaximumSize.
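A sketch of creating a GPU-backed context and querying those limits (`image` stands for a CIImage of yours):

```objc
// GPU-backed Core Image context, targeting an EAGLContext (GL ES 2).
EAGLContext *glContext =
    [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
CIContext *ciContext = [CIContext contextWithEAGLContext:glContext];

// Check the hardware limits before rendering; exceeding them
// drops you onto the (very slow) software path.
CGSize maxInput  = [ciContext inputImageMaximumSize];
CGSize maxOutput = [ciContext outputImageMaximumSize];
if (image.extent.size.width  > maxInput.width ||
    image.extent.size.height > maxInput.height) {
    // tile the image instead of rendering it whole
}
```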

Page 25: Advanced Imaging on iOS

Core Image

• -inputImageMaximumSize: 4096 (iPhone 4S+, iPad 2+), 2048 (iPhone 4 and earlier, iPad 1).

• A 4000x3000 image (12 MP) fits. The camera sensor is 8 MP, OK.

• A 5000x4000 image (20 MP) does not fit.

• How do I process images larger than the limit?

Page 26: Advanced Imaging on iOS

Core Image

• Answer: image tiling.

• Large CIImage & -imageByCroppingToRect:? NO: CPU fallback, as Core Image still sees the original image (> limit).

• Do the cropping on the CGImage itself (CGImageCreateWithImageInRect), and *then* create a CIImage out of it.

• Render tiles as CGImages from the CIContext, and draw those tiles into the large, final CGContext (> limit).

• The art of tiling: Prizmo needs to process scanned images, which can be > 20 MP.
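A minimal sketch of one iteration of that tiling loop; `largeImage`, `tileRect`, `filter`, `ciContext`, `destContext`, and `destTileRect` are all hypothetical names for objects you would set up yourself:

```objc
// Crop on the CGImage *before* wrapping it, so Core Image never sees
// the full-size, over-limit original.
CGImageRef tileImage = CGImageCreateWithImageInRect(largeImage, tileRect);
CIImage *tileInput = [CIImage imageWithCGImage:tileImage];

// Apply the filter chain to this tile only.
[filter setValue:tileInput forKey:kCIInputImageKey];
CIImage *tileOutput = filter.outputImage;

// Render the tile on the GPU, then assemble it into the final CGContext.
CGImageRef rendered = [ciContext createCGImage:tileOutput
                                      fromRect:[tileOutput extent]];
CGContextDrawImage(destContext, destTileRect, rendered);

CGImageRelease(rendered);
CGImageRelease(tileImage);
```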

Page 27: Advanced Imaging on iOS

Core Image

[Diagram: tiling in Prizmo, with a perspective crop from the source image to the target image (result). Subdivide until source & target tiles both fit the GPU texture size limit.]

Page 28: Advanced Imaging on iOS

Core Image

• Tips & tricks:

• Core Image has access to the hardware JPEG decoder, just like Core Animation’s CALayer.contents API.

• Core Image is not programmable on iOS, but many unavailable functions can be expressed from the built-in CIFilters.

• Don’t find the filters you need? Give GPUImage a try.

• A perfect teammate for OpenGL and Core Video.

Page 29: Advanced Imaging on iOS

CIImage’s Fast Path

[Diagram: +[CIImage imageWithContentsOfURL:] (or a CGImage-backed CIImage) takes the pure-GPU path: decode, process, and display all on the GPU.]

Page 30: Advanced Imaging on iOS

Core Image

• Live processing or not? It depends.

Live Processing              Cached Processing
OpenGL layer/view            CATiledLayer
Atomic refresh               Visible tiled rendering
Faster computation overall   Slower computation
Slower interaction           Faster interaction

Page 31: Advanced Imaging on iOS

UIKit’s UIImage

• An abstraction above CGImage / CIImage.

• Can’t be both at the same time: either CGImage-backed or CIImage-backed.

• Has additional properties such as scale (determines how it’s rendered, Retina display) and imageOrientation.

• Nice utilities like -resizableImageWithCapInsets:.
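For instance, the cap-insets utility (the "button" asset name and `button` control are made up for the example):

```objc
// A stretchable button background: the 8-point edges stay crisp,
// only the middle region stretches to fill the control.
UIImage *background = [[UIImage imageNamed:@"button"]
    resizableImageWithCapInsets:UIEdgeInsetsMake(8, 8, 8, 8)];
[button setBackgroundImage:background forState:UIControlStateNormal];
```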

Page 32: Advanced Imaging on iOS

Core Video

• The entry point for media: both live camera streams & video file decoding/encoding.

• Defines image types, and an image buffer pool concept (reuse).

• The native format is generally YUV 420v (MP4/JPEG): a luminance plane (full size) + Cb, Cr planes (1:4).

• You can ask to get them as GPU-enabled CVPixelBuffers for I/O.

• As of iOS 7, you can render with OpenGL ES using R & RG textures (resp. 1 & 2 components, for the luma & Cb/Cr planes) -> no more conversion needed (iPhone 4S+).
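A hedged sketch of that R/RG path via the Core Video texture cache; `texCache` would come from CVOpenGLESTextureCacheCreate, `pixelBuffer` from your capture output, and error handling is omitted:

```objc
// Map the two planes of a YUV 420 CVPixelBuffer directly into GL textures
// (GL_EXT_texture_rg), avoiding any CPU-side YUV->RGB conversion.
CVOpenGLESTextureRef lumaTexture = NULL, chromaTexture = NULL;

// Plane 0: full-size luminance, one component -> GL_RED_EXT.
CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault, texCache,
    pixelBuffer, NULL, GL_TEXTURE_2D, GL_RED_EXT,
    (GLsizei)CVPixelBufferGetWidthOfPlane(pixelBuffer, 0),
    (GLsizei)CVPixelBufferGetHeightOfPlane(pixelBuffer, 0),
    GL_RED_EXT, GL_UNSIGNED_BYTE, 0, &lumaTexture);

// Plane 1: quarter-size interleaved Cb/Cr, two components -> GL_RG_EXT.
CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault, texCache,
    pixelBuffer, NULL, GL_TEXTURE_2D, GL_RG_EXT,
    (GLsizei)CVPixelBufferGetWidthOfPlane(pixelBuffer, 1),
    (GLsizei)CVPixelBufferGetHeightOfPlane(pixelBuffer, 1),
    GL_RG_EXT, GL_UNSIGNED_BYTE, 1, &chromaTexture);
```

The fragment shader then samples both textures and combines the planes itself.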

Page 33: Advanced Imaging on iOS

OpenGL ES

• OpenGL - GLSL - GLKit: low level. You must load the image as a texture, create a rectangle geometry, and define a shader that tells how to map the texture image to rendered fragments.

• Image processing mostly happens in the fragment shader.

• GPUImage is an interesting library with many available filters.

• CeedGL is a thin Obj-C wrapper for OpenGL objects (texture, framebuffer, shader, program, etc.).

Page 34: Advanced Imaging on iOS

OpenGL ES

• R / RG planes (GL 2.0, iPhone 4S+).

• Multiple Render Targets / framebuffer fetch (GL 3.0, iPhone 5s+).

• MRT, before: gl_FragColor.rgba = …

• MRT, after: my_FragColor.rgba = …; my_NormalMap.xy = …; etc., in a single shader pass.

Page 35: Advanced Imaging on iOS

GLKit Remark

• GLKit does not seem to allow hardware decoding of JPEGs (tested on iOS 7, iPhone 5). This could change.

Page 36: Advanced Imaging on iOS

Conclusion

• Use the CPU / GPU for what each does best.

• Don’t do more work than you need.

• Overwhelming either the CPU or the GPU is not good. Try to balance efforts to remain fluid at all times.

Page 37: Advanced Imaging on iOS

Cookbook

• Display thumbnails: have JPEGs ready at the target size; use CALayer.contents or UIImageView.image to get faster hardware decoding.

• Compute thumbnails from a large image: use CGImageSourceCreateThumbnailAtIndex (or CGBitmapContext / CGContextDrawImage).

• Live processing & display of large images: use CATiledLayer with a cached source image (CIImage) at various scales, or OpenGL rendering if size < 4096 and the processing can be done as a (fast) shader.

• Offscreen processing of large images: if size <= 4096, GPU Core Image (or GL); else, GPU Core Image (or GL) with custom tiling + CGContext.

Page 38: Advanced Imaging on iOS

@cocoaheadsBE