Using Xcode with OpenCV - Carnegie Mellon University

Transcript of Lecture 3, 16-423 - Designing Computer Vision Apps (Spring 2017). Source slides: 16423.courses.cs.cmu.edu/slides/Spring_2017/Lecture_3.pdf

Page 1:

Using Xcode with OpenCV

Instructor - Simon Lucey

16-423 - Designing Computer Vision Apps

Page 2:

Today

• OpenCV review.

• Using OpenCV in Xcode.

• Mobile Cameras.

Page 3:

Detecting a Face in OpenCV

• In your browser, please go to the address,

https://github.com/slucey-cs-cmu-edu/Detect_Lena

• Or, alternatively, you can clone it from the command line:

$ git clone https://github.com/slucey-cs-cmu-edu/Detect_Lena.git

• Question: why do you need to clone the Mat image when displaying it?
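For orientation, here is a minimal Objective-C++ sketch of the kind of Haar-cascade face detection the project performs (a sketch only, with an assumed cascade path; this is not the actual Detect_Lena source). Note how detections are drawn on a clone() of the input, so the original Mat stays untouched:

    #import <opencv2/opencv.hpp>

    // Sketch: detect faces in a BGR image and draw boxes on a *copy*.
    cv::Mat detectAndDraw(const cv::Mat &bgr)
    {
        // Cascade file path is a placeholder; the XML ships with OpenCV.
        cv::CascadeClassifier face_cascade("haarcascade_frontalface_alt.xml");

        cv::Mat gray;
        cv::cvtColor(bgr, gray, cv::COLOR_BGR2GRAY);   // detector works on grayscale

        std::vector<cv::Rect> faces;
        face_cascade.detectMultiScale(gray, faces);

        cv::Mat display = bgr.clone();                 // copy, so drawing does not
        for (const cv::Rect &r : faces)                // modify the source image
            cv::rectangle(display, r, cv::Scalar(0, 255, 0), 2);
        return display;
    }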

Pages 4-6: (code screenshot slides; no transcribed text)

Page 7:

Today

• OpenCV homework review.

• Using OpenCV in Xcode.

• About Mobile Cameras.

Page 8:

Objective C

• Developed in the early 80s; selected by NeXT as its main language, from which OS X and iOS are derived.

• Designed as an object-oriented extension to the C language; it is based on message passing.

• Objective C is a thin layer on top of C:
  • source code files have .m extensions,
  • header/interface files have a .h extension, and
  • Objective-C++ files are denoted with a .mm file extension.

• We will be using Objective-C++ files in most of our work in this course.

• Mainly used to interface with the UI.
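As a concrete illustration of why the .mm extension matters (the file and class names below are invented; the pattern itself is standard), an Objective-C++ file can hold Objective-C message sends and C++ OpenCV objects side by side:

    // ExampleViewController.mm - the .mm extension makes Xcode compile this
    // file as Objective-C++, so C++ and Objective-C mix freely.
    #import <opencv2/opencv.hpp>   // C++ header; import before UIKit
    #import <UIKit/UIKit.h>        // Objective-C header

    @interface ExampleViewController : UIViewController
    @end

    @implementation ExampleViewController
    - (void)viewDidLoad {
        [super viewDidLoad];                // Objective-C message passing
        cv::Mat gray(480, 640, CV_8UC1);    // a C++ OpenCV object, same file
        NSLog(@"Mat is %d x %d", gray.cols, gray.rows);
    }
    @end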

Page 9:

Objective C versus Swift

• The goal in this course is not for you to become an expert in either Objective C or Swift.

• However, you must have some degree of competency in order to Build and Run an iOS app.

Page 10: (repeat of the previous slide)

Page 11:

Swift

• Essentially "Objective C without the C".

• Swift does not expose pointers and other unsafe accessors.

• We are not going to use it much in this course; most iOS computer vision tutorials and books are still in Objective C.

Page 12:

Xcode Layout

"Perform your core development tasks in the Xcode workspace window, your primary interface for creating and managing projects. A project is the main unit of development in Xcode. It includes all the elements needed to build your app, framework, plug-in, or other software product. It also maintains the relationships between those elements. For more detail on projects, see 'A Project Is a Repository of Files and Resources for Building Apps' (page 31).

The workspace window automatically adapts itself to the task at hand, and you can further configure the window to fit your work style. You can open as many workspace windows as you need.

The components of the workspace window are shown in the following figure.

The workspace window always includes the editor area. When you select a file in your project, its contents appear in the editor area, where Xcode opens the file in an appropriate editor. For example, in the figure above, the editor area contains AdventureScene.swift, a Swift code file that is selected in the Navigator area on the left of the workspace window."

(Taken from "Xcode Overview", chapter "Develop Your App in the Workspace Window", Apple Inc., 2014.)

Page 13:

Displaying an Image in Xcode

• In your browser, please go to the address,

https://github.com/slucey-cs-cmu-edu/Intro_iOS_Lena

• Or, alternatively, you can clone it from the command line:

$ git clone https://github.com/slucey-cs-cmu-edu/Intro_iOS_Lena.git

Pages 14-18: (code walkthrough screenshots)

(Taken from the ViewController.mm file within the Detect_Lena project described earlier.)

Page 19:

UIImage Class

• The high-level way to work with images in Objective C.

• Relatively simple to use; you use the UIImageToMat function to convert to OpenCV's Mat class (and MatToUIImage to convert back).

• Be careful, as each conversion entails a memory copy, so frequent calls will affect performance.

• Be mindful of how color channels are ordered:
  • 8 bits per color channel.
  • Channel ordering in OpenCV's Mat class is BGR.
  • UIImage's ordering is RGB.
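A minimal sketch of the round trip (assuming the opencv2 iOS framework; the ios.h header location varies across OpenCV versions, and UIImageToMat typically produces a 4-channel RGBA Mat, hence the explicit channel reordering):

    #import <UIKit/UIKit.h>
    #import <opencv2/opencv.hpp>
    #import <opencv2/imgcodecs/ios.h>   // UIImageToMat / MatToUIImage

    // Convert UIImage -> BGR Mat, process, and convert back.
    // Each conversion copies pixel data, so avoid calling this per frame.
    UIImage *blurred(UIImage *input)
    {
        cv::Mat rgba, bgr;
        UIImageToMat(input, rgba);                      // UIImage -> RGBA Mat (copy)
        cv::cvtColor(rgba, bgr, cv::COLOR_RGBA2BGR);    // reorder for OpenCV's BGR

        cv::GaussianBlur(bgr, bgr, cv::Size(9, 9), 0);  // any processing goes here

        cv::Mat rgbaOut;
        cv::cvtColor(bgr, rgbaOut, cv::COLOR_BGR2RGBA); // back to UIImage ordering
        return MatToUIImage(rgbaOut);                   // Mat -> UIImage (another copy)
    }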

Page 20:

Objective-C Cheat Sheet and Quick Reference

Version 1.5. Copyright 2013 Ray Wenderlich. All rights reserved. Source: raywenderlich.com. Visit for more iOS resources and tutorials!

Class Header (.h)

    #import "AnyHeaderFile.h"

    @interface ClassName : SuperClass
    // define public properties
    // define public methods
    @end

Class Implementation (.m)

    #import "YourClassName.h"

    @interface ClassName ()
    // define private properties
    // define private methods
    @end

    @implementation ClassName {
        // define private instance variables
    }
    // implement methods
    @end

Defining Methods

    - (type)doIt;
    - (type)doItWithA:(type)a;
    - (type)doItWithA:(type)a b:(type)b;

Implementing Methods

    - (type)doItWithA:(type)a b:(type)b {
        // Do something with a and b...
        return retVal;
    }

Creating an Object

    ClassName *myObject = [[ClassName alloc] init];

Calling a Method

    [myObject doIt];
    [myObject doItWithA:a];
    [myObject doItWithA:a b:b];

Declaring Variables

    type myVariable;

Variable types

    int           1, 2, 500, 10000
    float/double  1.5, 3.14, 578.234
    BOOL          YES, NO
    ClassName *   NSString *, NSArray *, etc.
    id            Can hold reference to any object

Defining Properties

    @property (attribute1, attribute2) type propertyName;

    strong     Adds reference to keep object alive
    weak       Object can disappear, become nil
    assign     Normal assign, no reference
    copy       Make copy on assign
    nonatomic  Make not threadsafe, increase perf
    readwrite  Create getter & setter (default)
    readonly   Create just getter

Using Properties

    [myObject setPropertyName:a];
    myObject.propertyName = a; // alt
    a = [myObject propertyName];
    a = myObject.propertyName; // alt

What is a Property?

1) Automatically defines a private instance variable:

    type _propertyName;

2) Automatically creates a getter and setter:

    - (type)propertyName;
    - (void)setPropertyName:(type)name;

Using _propertyName uses the private instance variable directly. Using self.propertyName uses the getter/setter.

Custom Initializer Example

    - (id)initWithParam:(type)param {
        if ((self = [super init])) {
            _propertyName = param;
        }
        return self;
    }

NSString Quick Examples

    NSString *personOne = @"Ray";
    NSString *personTwo = @"Shawn";
    NSString *combinedString = [NSString stringWithFormat:
        @"%@: Hello, %@!", personOne, personTwo];
    NSLog(@"%@", combinedString);
    NSString *tipString = @"24.99";
    float tipFloat = [tipString floatValue];

NSArray Quick Examples

    NSMutableArray *array = [@[person1, person2] mutableCopy];
    [array addObject:@"Waldo"];
    NSLog(@"%d items!", [array count]);
    for (NSString *person in array) {
        NSLog(@"Person: %@", person);
    }
    NSString *waldo = array[2];

(Taken from Ray Wenderlich's Objective-C Cheat Sheet)

Pages 21-23: (repeats of the Objective-C cheat sheet from page 20)

Page 24:

OpenCV as a Framework??

• Frameworks are intended to simplify the process of handling dependencies.

• They encapsulate headers and binary files, so you do not need to handle them manually.

• A framework is just a specially structured folder.

• All dependencies are handled automatically.

• You may need to re-build opencv2.framework if you are using Xcode 7.
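Once opencv2.framework is added to the project, one common convention (a sketch of standard practice, not a project-specific requirement; the .pch file name is whatever your project uses) is to import OpenCV before any Apple headers, so that UIKit macros do not clash with OpenCV's C++ headers:

    // PrefixHeader.pch (name assumed)
    #ifdef __cplusplus
        // Import OpenCV first: Apple headers define macros (e.g. NO, MIN)
        // that can conflict with OpenCV's C++ code if included earlier.
        #import <opencv2/opencv.hpp>
    #endif

    #ifdef __OBJC__
        #import <UIKit/UIKit.h>
        #import <Foundation/Foundation.h>
    #endif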

Page 25:

Using OpenCV in Xcode

• In your browser, please go to the address,

https://github.com/slucey-cs-cmu-edu/Intro_iOS_OpenCV

• Or, alternatively, you can clone it from the command line:

$ git clone https://github.com/slucey-cs-cmu-edu/Intro_iOS_OpenCV.git

Page 26:

Today

• OpenCV homework review.

• Using OpenCV in Xcode.

• About Mobile Cameras.

Page 27:

Pinhole Camera

(Excerpt from Serge Belongie's CSE 252B: Computer Vision II lecture notes.)

Figure 1. The pinhole imaging model, from Forsyth & Ponce.

Let us begin by considering a mathematical description of the imaging process through this idealized camera. We will consider issues like lens distortion subsequently.

The pinhole camera, or the projective camera as it is known, images the scene by applying a perspective projection to it. In the following we shall refer to scene coordinates with upper case roman letters, {X, Y, Z, ...}. Image coordinates will be referred to using lower case roman letters, {x, y, z, ...}. Vectors shall be denoted by boldfaced symbols, e.g., X or x. (In class, when writing on the blackboard, I will put a tilde underneath the corresponding symbols to denote a vector.)

The scene is three dimensional, whereas the image is located in a two dimensional plane. Hence the perspective projection maps the 3D space to a 2D plane:

    (X, Y, Z)^T --(projection)--> (x, y)^T

The equations of perspective projection are given by

    x = f X / Z,    y = f Y / Z    (1.1)

where f is the focal length of the camera, i.e., the distance between the image plane and the pinhole.

The process is illustrated in Figure 2.

[Figure 2. Image formation in a projective camera: a scene point X is imaged at the point x where the ray from X to the camera centre C meets the image plane; Z is the principal axis and p the principal point.]

(Taken from Forsyth & Ponce)
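As a quick added illustration (not from the original slides): with a focal length of f = 500 pixels and a scene point at (X, Y, Z) = (1, 2, 10) in metres, equation (1.1) gives x = 500 * 1/10 = 50 and y = 500 * 2/10 = 100, i.e., image coordinates (50, 100) in pixels.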

Page 28: (repeat of the previous slide, with an "imaging sensor" label added to the figure)

Page 29:

Digital Cameras

• All digital cameras rely on the photoelectric effect to create an electrical signal from light.

• CCD (charge coupled device) and CMOS (complementary metal oxide semiconductor) are the two most common image sensors found in digital cameras.

• Both were invented in the late 60s / early 70s.

(Taken from https://www.teledynedalsa.com/imaging/knowledge-center/appnotes/ccd-vs-cmos/)

Page 30:

CCD versus CMOS

• CMOS and CCD imagers differ in the way that signals are converted from signal charge.

• CMOS imagers are inherently more parallel than CCDs.

• Consequently, high-speed CMOS imagers can be designed to have much lower noise than high-speed CCDs.

(Taken from https://www.teledynedalsa.com/imaging/knowledge-center/appnotes/ccd-vs-cmos/)

Page 31:

CCD versus CMOS

• CCD used to be the image sensor of choice, as it gave far superior images with the fabrication technology available.

• CMOS became of interest with the advent of mobile phones:
  • CMOS promised lower power consumption.
  • It lowered fabrication costs (reusing mainstream logic and memory device fabrication).

• An enormous amount of investment was made to develop and fine-tune CMOS imagers.

• As a result we witnessed great improvements in image quality, even as pixel sizes shrank.

• In the case of high-volume consumer-area imagers, CMOS imagers outperform CCDs on almost every performance parameter.

(Taken from https://www.teledynedalsa.com/imaging/knowledge-center/appnotes/ccd-vs-cmos/)

Page 32:

Taken from: http://9to5mac.com/2014/09/23/iphone-6-camera-compared-to-all-previous-iphones-gallery/

Page 33:

Rolling-Shutter Effect

• A drawback of CMOS sensors is the "rolling-shutter effect".

• CMOS captures images by scanning one line of the frame at a time.

• If anything is moving fast, this leads to weird distortions in still photos, and to rather odd effects in video.

• Check out the following video taken with the iPhone 4's CMOS camera.

• CCD-based cameras often use a "global" shutter to circumvent this problem.

Taken from: http://www.wired.com/2011/07/iphones-rolling-shutter-captures-amazing-slo-mo-guitar-string-vibrations/

Page 34: (repeat of the previous slide)

Page 35:

Rolling Shutter Effect

(Excerpt from Jia and Evans, "Probabilistic 3-D Motion Estimation for Rolling Shutter Video Rectification from Visual and Inertial Measurements", MMSP 2012.)

... from inertial measurement sensors. The readings of accelerometers capture not only the linear acceleration of cameras, but also gravity and acceleration caused by rotation. Besides, acceleration readings must be integrated twice to obtain the camera translation, which makes the estimation more prone to measurement noise. Even if we can obtain accurate camera translation, the video rectification and stabilization problem is still ill-posed, since it is impossible to obtain depth information for every image pixel. Dense warping [3] and image-based rendering [7] have been applied to approximate the stabilization results based on sparse 3-D scene reconstruction. However, they are computationally prohibitive for many handheld platforms.

Fortunately, camera shake and rolling shutter effects are caused primarily by camera rotations. In fact, [4] and [8] have shown that taking only camera rotations into account is sufficient to produce satisfactory videos. In our paper, we also use gyroscope readings. In the gyroscope-only method [4] the camera rotation is directly estimated by integrating the gyroscope readings (angular velocities). Another recent approach [5] uses both gyroscope and accelerometer readings to estimate the camera rotations based on an EKF. The gyroscope readings are used as the control inputs in the dynamic motion model. The authors assume that users usually try to hold the camera in a steady position, so that gravity is approximately the only source in the accelerometer measurements. Thus the accelerometer readings can be used as measurements of the camera rotation.

Our 3-D orientation estimation is also based on an EKF, but our measurement model is quite different from [5]. We find that the linear acceleration of the camera and the acceleration caused by rotation are sometimes non-negligible. Thus we do not use the accelerometer readings as orientation measurements. Instead, we use the tracked feature points extracted from the video frames, which provide an accurate geometric clue for the estimation of the camera motion. Based on the fact that matched feature points can be related by a homographic transformation under pure rotational motion, the relative rotation between consecutive frames can be measured [9].

Motion estimation based on visual and inertial measurement sensors has been extensively studied in the problem of simultaneous localization and mapping (SLAM) in robotics [10]. However, the rolling shutter camera model has never been considered in SLAM before. Our algorithm is the first EKF-based motion estimation method for rolling-shutter cameras that uses visual and inertial measurements. In our measurement model, tracked feature points in consecutive frames are only linked by the relative camera rotation between them. Therefore, our algorithm can be classified as a relative motion estimation method [11], [12].

III. CAMERA MODEL

For rolling shutter cameras, each row in a frame is exposed at a different time. Fig. 2 illustrates the image capture model of a rolling shutter camera, where t_r is the total readout time in each frame and t_id is the inter-frame idle time, so that

    t_r + t_id = 1 / (frames per second).

Thus for an image point u = [u_0, u_1]^T in frame i, the exposure time is

    t(u, i) = t_i + t_r * (u_1 / h)

where t_i is the timestamp of frame i and h is the total number of rows in each frame.

Assume the intrinsic camera matrix is K, and that the sequences of rotation matrices and translation vectors of the camera are R(t) and l(t). A 3-D point x and its projection image u in frame i should satisfy the following equation:

    u ~ K R(t(u, i)) (x + l(t(u, i)))    (1)

where ~ indicates equality up to scale.

Usually there is a constant delay t_d between the recorded timestamps of the gyroscope and the video. Thus, using the gyroscope timestamps as reference, the exposure time equation should be modified as

    t(u, i) = t_i + t_d + t_r * (u_1 / h).    (2)

When pure rotation is considered, the translation vector remains unchanged, and thus the image of a certain scene point in one frame can be mapped to another frame through a 3 x 3 homography matrix:

    u' ~ K R(t(u', i)) R^T(t(u, j)) K^{-1} u    (3)

where u' and u are the images in frames i and j respectively.

IV. ONLINE ROTATION ESTIMATION

Our online motion estimation is based on an EKF. Due to the special properties of the rolling shutter camera model and the pure rotation motion model, the state definition and the structure of the dynamical and measurement models need to be designed carefully.

A. State Vector and Dynamic Bayesian Network

The gyroscope in cell phone cameras usually has a higher sampling frequency (around 100 Hz) than the video frame rate, as illustrated in Fig. 3. In Fig. 3, several gyroscope readings are grouped together, since they are used to compute the camera rotations for the same frame during its corresponding exposure time. Note that the idle time t_id is large enough that no pixels in frame i, but only several pixels in frame i+1, are exposed after τ_{k+3}; thus ω_{k+3} is relegated to group i+1. Further, we assume that a certain 3-D feature point has its projection at u in frame i and u' in frame i+1. Without ...

[Fig. 2. Rolling shutter cameras sequentially expose rows.]

Taken from: Jia and Evans, "Probabilistic 3-D Motion Estimation for Rolling Shutter Video Rectification from Visual and Inertial Measurements", MMSP 2012.
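As a quick added illustration (not from the paper): for a 30 fps rolling-shutter camera, t_r + t_id = 1/30 s ≈ 33.3 ms. If the readout time is t_r = 25 ms and the frame has h = 720 rows, then the bottom row (u_1 = 719) begins its exposure about t_r * 719/720 ≈ 25 ms after the top row, and the inter-frame idle time is t_id ≈ 8.3 ms. Anything that moves appreciably within those ~25 ms is imaged at different positions by different rows, which is exactly the distortion the rectification methods above model.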

Page 36:

Rolling Shutter Effect = "Aliasing"

• The rolling shutter effect is an example of a broader phenomenon regularly studied in Signal Processing called "Aliasing".

• A common phenomenon:
  • Wagon wheels rolling the wrong way in movies.

Page 37: (repeat of the previous slide)

Page 38:

Rectifying Rolling Shutter

• What do you think the camera motion was here?

(The slide shows the first page of: Gustav Hanning, Nicklas Forslöw, Per-Erik Forssén, Erik Ringaby, David Törnqvist, Jonas Callmer, "Stabilizing Cell Phone Video using Inertial Measurement Sensors", Department of Electrical Engineering, Linköping University. http://www.liu.se/forskning/foass/per-erik-forssen/VGS)

Abstract

We present a system that rectifies and stabilizes video sequences on mobile devices with rolling-shutter cameras. The system corrects for rolling-shutter distortions using measurements from accelerometer and gyroscope sensors, and a 3D rotational distortion model. In order to obtain a stabilized video, and at the same time keep most content in view, we propose an adaptive low-pass filter algorithm to obtain the output camera trajectory. The accuracy of the orientation estimates has been evaluated experimentally using ground truth data from a motion capture system. We have conducted a user study, where the output from our system, implemented in iOS, has been compared to that of three other applications, as well as to the uncorrected video. The study shows that users prefer our sensor-based system.

1. Introduction

Most mobile video-recording devices of today make use of CMOS sensors with rolling-shutter (RS) readout [6]. An RS camera captures video by exposing every frame line-by-line from top to bottom. This is in contrast to a global shutter, where an entire frame is acquired at once.

The RS technique gives rise to image distortions in situations where either the device or the target is moving. Figure 1 shows an example of how an image is distorted when using a rolling shutter. Here, vertical lines such as the flag poles appear slanted as a result of panning the camera quickly from left to right during recording. Recording video by hand also leads to visible frame-to-frame jitter. The recorded video is perceived as "shaky" and is not very enjoyable to watch.

Since mobile video-recording devices are so common, there is an interest in correcting these types of distortions. The inertial sensors (accelerometers and gyroscopes) present in many of the new devices provide a new way of doing this: using the position and/or orientation of the device, as sensed during recording, the motion-induced distortions can be compensated for in a post-processing step.

[Figure 1. An example of rolling-shutter distortion. Top: frame from a video sequence recorded with an iPod touch. Bottom: rectification using the 3D rotation model and inertial measurements.]

1.1. Related Work

Early work on modeling the distortions caused by a rolling-shutter exposure is described in [7]. Rolling-shutter video has previously been rectified using image measurements; two recent, state-of-the-art methods are described in [3, 5]. To perform rectification, we use the 3D rotational model introduced in [5], but use inertial sensor data instead of image measurements. For stabilization we use a 3D rotation-based correction as in [13, 14], and a dynamical model derived from [17]. Differences compared to [13, 14] are the use of inertial sensor data instead of image measurements, and an adaptive ...

Taken from: Hanning et al., "Stabilizing Cell Phone Video using Inertial Measurement Sensors", ICCV 2011 Workshop.

Page 39:

Global versus Rolling Shutter

(The slide repeats the Jia and Evans camera-model excerpt and Fig. 2 from page 35, contrasting a global shutter, which exposes the whole frame at once, with a rolling shutter, which sequentially exposes rows. Taken from: Jia and Evans, "Probabilistic 3-D Motion Estimation for Rolling Shutter Video Rectification from Visual and Inertial Measurements", MMSP 2012.)

[Background graphic, tiled across the slide: a "Structure and Motion from Discrete Views" outline - scene geometry; camera motion; unknown camera viewpoints; introduction; computing the fundamental matrix, F, from corner correspondences; feature matching; RANSAC; estimation; determining ego-motion from F; SIFT for wide baseline matching; computing a homography, H, from corner correspondences; more than two views; batch and sequential solutions.]

Page 40: (repeat of the previous slide's content and background graphic)

Page 41:

Rectifying Rolling Shutter

• Result from rectification.

(The slide shows Figure 1 from Hanning et al. - top: a distorted frame from a video sequence recorded with an iPod touch; bottom: rectification using the 3D rotation model and inertial measurements - repeating the paper excerpt from page 38.)

Taken from: Hanning et al., "Stabilizing Cell Phone Video using Inertial Measurement Sensors", ICCV 2011 Workshop.

Page 42:

High-Frame Rate Cameras

• Another way around this is to create higher frame-rate cameras.

• We are increasingly seeing faster and faster CMOS cameras.

• This is opening up other exciting opportunities in computer vision.

• However, really fast motions still require an understanding of the rolling shutter effect.

Page 43: (repeat of the previous slide)

Page 44:

Homework - Building the App on Your Device

• Now let's capture an image on your iOS device.

• This will not work on the simulator; when playing with the camera it is important to work on your device.

• Use the example code I have provided here:

https://github.com/slucey-cs-cmu-edu/Intro_iOS_Camera

• We will review it briefly in class on Thursday.
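Looking ahead to that example (a hedged sketch only; the actual Intro_iOS_Camera code may differ), OpenCV's iOS framework provides a CvVideoCamera helper that delivers each camera frame as a cv::Mat to a delegate callback. Frames arrive as 4-channel BGRA, and the header location varies across OpenCV versions:

    // ViewController.mm (sketch; everything beyond the OpenCV/UIKit API is assumed)
    #import <opencv2/opencv.hpp>
    #import <opencv2/videoio/cap_ios.h>   // CvVideoCamera (opencv2/highgui/cap_ios.h in OpenCV 2.4)
    #import <UIKit/UIKit.h>

    @interface ViewController : UIViewController <CvVideoCameraDelegate>
    @property (nonatomic, strong) CvVideoCamera *videoCamera;
    @property (nonatomic, strong) UIImageView *previewView;
    @end

    @implementation ViewController

    - (void)viewDidLoad {
        [super viewDidLoad];
        self.previewView = [[UIImageView alloc] initWithFrame:self.view.bounds];
        [self.view addSubview:self.previewView];

        // The camera renders into the image view; we get a per-frame callback.
        self.videoCamera = [[CvVideoCamera alloc] initWithParentView:self.previewView];
        self.videoCamera.delegate = self;
        self.videoCamera.defaultFPS = 30;
        [self.videoCamera start];   // remember: runs on a device, not the simulator
    }

    // CvVideoCameraDelegate: called for every frame; edit `image` in place.
    - (void)processImage:(cv::Mat &)image {
        cv::Mat gray;
        cv::cvtColor(image, gray, cv::COLOR_BGRA2GRAY);  // frames arrive as BGRA
        cv::cvtColor(gray, image, cv::COLOR_GRAY2BGRA);  // simple grayscale preview
    }

    @end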