Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

42
Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely
  • date post

    21-Dec-2015
  • Category

    Documents

  • view

    230
  • download

    3

Transcript of Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Page 1: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Lecture 11: Structure from motion

CS6670: Computer VisionNoah Snavely

Page 2: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Announcements

• Project 2 out, due next Wednesday, October 14– Artifact due Friday, October 16

• Questions?

Page 3: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Readings

• Szeliski, Chapter 7.2• My thesis, Chapter 3

http://www.cs.cornell.edu/~snavely/publications/thesis/thesis.pdf

Page 4: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Energy minimization via graph cuts

Labels (disparities)

d1

d2

d3

edge weight

Page 5: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Other uses of graph cuts

Page 6: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Agarwala et al., Interactive Digital Photo Montage, SIGGRAPH 2004

Other uses of graph cuts

Page 7: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Panoramic stitching

• Problem: copy every pixel in the output image from one of the input images (without blending)

• Make transitions between two images seamless

Page 8: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Panoramic stitching

• Can be posed as a labeling problem: label each pixel in the output image with one of the input images

Input images:

Output panorama:

Page 9: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Panoramic stitching

• Number of labels: k = number of input images• Objective function (in terms of labeling L)

Input images:

Output panorama:

Page 10: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Panoramic stitching

Infinite cost of labeling a pixel (x,y) with image I if I doesn’t cover (x,y)

Else, cost = 0

Page 11: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Photographing long scenes with multi-viewpoint panoramas

Page 12: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Today: Drift

copy of first image

(xn,yn)

(x1,y1)

– add another copy of first image at the end– this gives a constraint: yn = y1– there are a bunch of ways to solve this problem

• add displacement of (y1 – yn)/(n - 1) to each image after the first

• compute a global warp: y’ = y + ax• run a big optimization problem, incorporating this constraint

– best solution, but more complicated– known as “bundle adjustment”

Page 13: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Global optimization

• Minimize a global energy function:– What are the variables?

• The translation tj = (xj, yj) for each image Ij

– What is the objective function?• We have a set of matched features pi,j = (ui,j, vi,j)

– We’ll call these tracks

• For each point match (pi,j, pi,j+1): pi,j+1 – pi,j = tj+1 – tj

I1 I2 I3 I4

p1,1p1,2 p1,3

p2,2

p2,3 p2,4

p3,3p3,4 p4,4p4,1

track

Page 14: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Global optimization

I1 I2 I3 I4

p1,1p1,2 p1,3

p2,2

p2,3 p2,4

p3,3p3,4 p4,4p4,1

p1,2 – p1,1 = t2 – t1

p1,3 – p1,2 = t3 – t2

p2,3 – p2,2 = t3 – t2

v4,1 – v4,4 = y1 – y4

minimizewij = 1 if track i is visible in images j and j+1 0 otherwise

Page 15: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Global optimization

I1 I2 I3 I4

p1,1p1,2 p1,3

p2,2

p2,3 p2,4

p3,3p3,4 p4,4p4,1

A2m x 2n 2n x 1

x2m x 1

b

Page 16: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Global optimization

Defines a least squares problem: minimize• Solution:• Problem: there is no unique solution for ! (det = 0)• We can add a global offset to a solution and get the same error

A2m x 2n 2n x 1

x2m x 1

b

Page 17: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Ambiguity in global location

• Each of these solutions has the same error• Called the gauge ambiguity• Solution: fix the position of one image (e.g., make the origin of the 1st image (0,0))

(0,0)

(-100,-100)

(200,-200)

Page 18: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Solving for camera parameters

Projection equation

1333

100

'0

'0

xxcy

cx

yfs

xfs

s

sv

su

p txR

Recap: a camera is described by several parameters• Translation t of the optical center from the origin of world coords• Rotation R of the image plane• focal length f, principle point (x’c, y’c), pixel size (sx, sy)• blue parameters are called “extrinsics,” red are “intrinsics”

K

Page 19: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Solving for camera rotation

• Instead of spherically warping the images and solving for translation, we can directly solve for the rotation Rj of each camera

• Can handle tilt / twist

Page 20: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Solving for rotations

R1R2

f

I1

I2

p12 = (u12, v12)

p11 = (u11, v11)

(u11, v11, f) = p11

R1p11

R2p22

Page 21: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Solving for rotations

minimize

Page 22: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

3D rotations

• How many degrees of freedom are there?• How do we represent a rotation?– Rotation matrix (too many degrees of freedom)– Euler angles (e.g. yaw, pitch, and roll) – bad idea– Quaternions (4-vector on unit sphere)

• Usually involves non-linear optimization

Page 23: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Revisiting stereo

• How do we find the epipolar lines?• Need to calibrate the cameras (estimate

relative position, orientation)

p p'epipolar line

Image I Image J

Page 24: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Solving for rotations and translations

• Structure from motion (SfM)• Unlike with panoramas, we often need to

solve for structure (3D point positions) as well as motion (camera parameters)

Page 25: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

p1,1

p1,2p1,3

Image 1

Image 2

Image 3

x1

x4

x3

x2

x5x6

x7

R1,t1

R2,t2

R3,t3

Page 26: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Structure from motion

• Input: images with points in correspondence pi,j = (ui,j,vi,j)

• Output• structure: 3D location xi for each point pi• motion: camera parameters Rj , tj

• Objective function: minimize reprojection error

Reconstruction (side) (top)

Page 27: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

p1,1

p1,2p1,3

Image 1

Image 2

Image 3

x1

x4

x3

x2

x5x6

x7

R1,t1

R2,t2

R3,t3

Page 28: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

SfM objective function• Given point x and rotation and translation R, t

• Minimize sum of squared reprojection errors:

predicted image location

observedimage location

Page 29: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Solving structure from motion• Minimizing g is difficult– g is non-linear due to rotations, perspective division– lots of parameters: 3 for each 3D point, 6 for each camera– difficult to initialize– gauge ambiguity: error is invariant to a similarity

transform (translation, rotation, uniform scale)

• Many techniques use non-linear least-squares (NLLS) optimization (bundle adjustment)– Levenberg-Marquardt is one common algorithm for NLLS– Lourakis, The Design and Implementation of a Generic

Sparse Bundle Adjustment Software Package Based on the Levenberg-Marquardt Algorithm, http://www.ics.forth.gr/~lourakis/sba/

– http://en.wikipedia.org/wiki/Levenberg-Marquardt_algorithm

Page 30: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Extensions to SfM• Can also solve for intrinsic parameters (focal

length, radial distortion, etc.)• Can use a more robust function than squared

error, to avoid fitting to outliers

• For more information, see: Triggs, et al, “Bundle Adjustment – A Modern Synthesis”, Vision Algorithms 2000.

Page 31: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Photo Tourism• Structure from motion on Internet photo

collections

Page 32: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Photo Tourism

Page 33: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Scene reconstruction

Feature detectionFeature detection

Pairwisefeature matching

Pairwisefeature matching

Incremental structure

from motion

Incremental structure

from motion

Correspondence estimation

Correspondence estimation

Page 34: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Feature detectionDetect features using SIFT [Lowe, IJCV 2004]

Page 35: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Feature detectionDetect features using SIFT [Lowe, IJCV 2004]

Page 36: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Feature detectionDetect features using SIFT [Lowe, IJCV 2004]

Page 37: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Feature matchingMatch features between each pair of images

Page 38: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Feature matchingRefine matching using RANSAC [Fischler & Bolles 1987] to estimate fundamental matrices between pairs

Page 39: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Image connectivity graph

(graph layout produced using the Graphviz toolkit: http://www.graphviz.org/)

Page 40: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Incremental structure from motion

Page 41: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Incremental structure from motion

Page 42: Lecture 11: Structure from motion CS6670: Computer Vision Noah Snavely.

Incremental structure from motion