Post on 31-Dec-2015
Stereo reconstruction
Given two or more images of the same scene or object, compute a representation of its shape
known camera viewpoints
How to estimate camera parameters?
- where is the camera?
- where is it pointing?
- what are internal parameters, e.g. focal length?
Calibration from 2D motion
Structure from motion (SFM)
- track points over a sequence of images
- estimate 3D positions and camera positions
- calibrate intrinsic camera parameters beforehand
Self-calibration
- solve for both intrinsic and extrinsic camera parameters
SFM = Holy Grail of 3D Reconstruction
Take movie of object
Reconstruct 3D model
Would be commercially highly viable
How to Get Feature Correspondences
Feature-based approach
- good for images
- feature detection (corners or SIFT features)
- feature matching using RANSAC (epipolar line)
Pixel-based approach
- good for video sequences
- patch-based registration with the Lucas-Kanade algorithm
- register features across the entire sequence
A Brief Introduction on Feature-based Matching
Find a few important features (aka Interest Points)
Match them across two images
Compute image transformation function h
Feature Detection
- Two images taken at the same place with different angles
- Projective transformation $H_{3\times 3}$
Feature Matching
How do we match features across images? Any criterion?
Feature Matching
Feature similarity (intensity or SIFT signature)
• The pixels around corresponding features should have similar intensity
• Cross-correlation, SSD
Distance constraint
• The displacement of features should be smaller than a given threshold
Epipolar line constraint
• Corresponding pixels satisfy the epipolar line constraint
• The constraint is encoded by the fundamental matrix H (written F in most texts)
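The first two criteria can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's implementation: the function names `ssd`, `ncc`, and `match_features`, the greedy nearest-patch strategy, and the `max_disp` parameter are all assumptions made for the example, and the epipolar constraint is left out.

```python
import numpy as np

def ssd(patch_a, patch_b):
    """Sum of squared differences: 0 for identical patches, larger is worse."""
    a = np.asarray(patch_a, dtype=float)
    b = np.asarray(patch_b, dtype=float)
    return float(np.sum((a - b) ** 2))

def ncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation in [-1, 1]; 1 is a perfect match."""
    a = np.asarray(patch_a, dtype=float).ravel()
    b = np.asarray(patch_b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def match_features(patches_1, patches_2, positions_1, positions_2, max_disp):
    """Greedy matching (hypothetical helper): for each feature in image 1, pick
    the lowest-SSD feature in image 2 whose displacement is at most max_disp."""
    matches = []
    for i, (p1, x1) in enumerate(zip(patches_1, positions_1)):
        best_j, best_cost = None, np.inf
        for j, (p2, x2) in enumerate(zip(patches_2, positions_2)):
            # Distance constraint: skip candidates that moved too far.
            if np.linalg.norm(np.subtract(x1, x2)) > max_disp:
                continue
            cost = ssd(p1, p2)  # feature similarity
            if cost < best_cost:
                best_j, best_cost = j, cost
        if best_j is not None:
            matches.append((i, best_j))
    return matches
```

SSD is sensitive to brightness changes between the two images, whereas the normalized cross-correlation is invariant to an affine change of intensity, which is one reason both appear on the slide.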
Feature-space Outlier Rejection
Can we now compute $H_{3\times 3}$ from the blue points?
• No! Still too many outliers…
• What can we do?
Robust estimation!
RANSAC for Estimating Projective Transformation
RANSAC loop:
1. Select four feature pairs (at random)
2. Compute the transformation matrix H (exact)
3. Compute inliers where $\mathrm{SSD}(p_i', H p_i) < \varepsilon$
4. Keep the largest set of inliers
5. Re-compute the least-squares H estimate on all of the inliers
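The loop above can be sketched as follows. This is a rough NumPy illustration under stated assumptions: the homography is fitted with an unnormalized direct linear transform, and the names `fit_homography`, `apply_homography`, `ransac_homography` as well as the defaults for `eps` and `n_iters` are choices made for this example, not values from the slides.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: 3x3 H with dst ~ H @ src, from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def apply_homography(H, pts):
    """Map Nx2 points through H using homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def ransac_homography(src, dst, eps=1.0, n_iters=500, seed=0):
    """Assumes at least four true inliers exist, so the final refit is well posed."""
    rng = np.random.default_rng(seed)
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), size=4, replace=False)   # step 1
        H = fit_homography(src[idx], dst[idx])               # step 2
        with np.errstate(divide="ignore", invalid="ignore"):
            err = np.sum((apply_homography(H, src) - dst) ** 2, axis=1)
        inliers = err < eps                                  # step 3
        if inliers.sum() > best_inliers.sum():               # step 4
            best_inliers = inliers
    H = fit_homography(src[best_inliers], dst[best_inliers]) # step 5
    return H / H[2, 2], best_inliers
```

In practice the point coordinates are usually normalized before the linear fit for numerical stability; that step is omitted here to keep the sketch short.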
For more detail, check
- http://research.microsoft.com/en-us/um/people/zhang/INRIA/software-FMatrix.html
- Philip H. S. Torr (1997). "The Development and Comparison of Robust Methods for Estimating
the Fundamental Matrix". International Journal of Computer Vision 24 (3): 271–300
Structure from Motion
Two Principal Solutions
• Bundle adjustment (nonlinear optimization)
• Factorization (SVD, through orthographic approximation, affine geometry)
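Projection Matrix
Perspective projection:
$$
\begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} \cong
\underbrace{\begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}}_{K}
\underbrace{\begin{bmatrix} r_1^T & t_1 \\ r_2^T & t_2 \\ r_3^T & t_3 \end{bmatrix}}_{[R \mid T]}
\underbrace{\begin{bmatrix} x_i \\ y_i \\ z_i \\ 1 \end{bmatrix}}_{P_i}
$$
The 2D coordinates are a nonlinear function of the 3D coordinates and the camera parameters:
$$
u_i = \frac{(f_x r_1^T + u_0 r_3^T) P_i + f_x t_1 + u_0 t_3}{r_3^T P_i + t_3} = f(K, R, T; P_i)
$$
$$
v_i = \frac{(f_y r_2^T + v_0 r_3^T) P_i + f_y t_2 + v_0 t_3}{r_3^T P_i + t_3} = g(K, R, T; P_i)
$$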
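As a small numerical illustration of the projection, the sketch below evaluates f and g for a batch of points. The function name `project` and the row-per-point array layout are assumptions for this example.

```python
import numpy as np

def project(K, R, T, P):
    """Project Nx3 world points P to Nx2 pixel coordinates (u, v).

    K is the 3x3 intrinsic matrix, R a 3x3 rotation, T a length-3 translation.
    """
    cam = P @ R.T + T                 # row i is R @ P_i + T (camera coordinates)
    uvw = cam @ K.T                   # homogeneous image coordinates
    return uvw[:, :2] / uvw[:, 2:3]   # perspective division by r_3^T P_i + t_3
```

The division in the last line is what makes the mapping nonlinear in the unknowns, which is why estimating the parameters leads to a nonlinear least-squares problem.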
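Nonlinear Approach for SFM
SFM (the 3D points $P_i$ are unknown):
$$
\arg\min_{K,\{R_j\},\{T_j\},\{P_i\}} \sum_{j=1}^{M} \sum_{i=1}^{N}
\left(u_i^j - f(K, R_j, T_j, P_i)\right)^2 + \left(v_i^j - g(K, R_j, T_j, P_i)\right)^2
$$
Camera calibration (the 3D points $P_i$ are known):
$$
\arg\min_{K,\{R_j\},\{T_j\}} \sum_{j=1}^{M} \sum_{i=1}^{N}
\left(u_i^j - f(K, R_j, T_j; P_i)\right)^2 + \left(v_i^j - g(K, R_j, T_j; P_i)\right)^2
$$
What’s the difference between camera calibration and SFM?
- camera calibration: known 3D and 2D
- SFM: unknown 3D and known 2D
- what is the 3D-to-2D registration problem?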
SFM: Bundle Adjustment
SFM = Nonlinear Least Squares problem
$$
\arg\min_{K,\{R_j\},\{T_j\},\{P_i\}} \sum_{j=1}^{M} \sum_{i=1}^{N}
\left(u_i^j - f(K, R_j, T_j, P_i)\right)^2 + \left(v_i^j - g(K, R_j, T_j, P_i)\right)^2
$$
Minimize through
• Gradient Descent
• Conjugate Gradient
• Gauss-Newton
• Levenberg-Marquardt (the common method)
Prone to local minima
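A toy version of this minimization can be written with a hand-rolled Levenberg-Marquardt loop. The sketch below makes several simplifying assumptions that are not on the slide: the intrinsics K are treated as known, rotations are parameterized as axis-angle vectors, the Jacobian is computed by finite differences, and all function names are hypothetical. Real bundle adjusters exploit the sparsity of the Jacobian, which this example ignores.

```python
import numpy as np

def rodrigues(w):
    """Axis-angle vector (3,) to a 3x3 rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    Kx = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * Kx + (1 - np.cos(theta)) * Kx @ Kx

def residuals(params, K, obs, M, N):
    """Stacked reprojection errors; obs has shape (M, N, 2).

    params holds 6 numbers per camera (axis-angle, translation), then 3 per point.
    """
    cams = params[:6 * M].reshape(M, 6)
    pts = params[6 * M:].reshape(N, 3)
    res = []
    for j in range(M):
        cam = pts @ rodrigues(cams[j, :3]).T + cams[j, 3:]
        uvw = cam @ K.T
        res.append(uvw[:, :2] / uvw[:, 2:3] - obs[j])
    return np.concatenate(res).ravel()

def numeric_jacobian(fun, x, h=1e-6):
    """Forward-difference Jacobian of fun at x."""
    r0 = fun(x)
    J = np.empty((r0.size, x.size))
    for k in range(x.size):
        dx = np.zeros_like(x)
        dx[k] = h
        J[:, k] = (fun(x + dx) - r0) / h
    return J

def levenberg_marquardt(fun, x0, n_iters=50, lam=1e-3):
    """Damped Gauss-Newton: accept a step only if it lowers the cost."""
    x = x0.copy()
    cost = np.sum(fun(x) ** 2)
    for _ in range(n_iters):
        r = fun(x)
        J = numeric_jacobian(fun, x)
        A = J.T @ J
        damped = A + lam * np.diag(np.diag(A)) + 1e-12 * np.eye(x.size)
        step = np.linalg.solve(damped, -J.T @ r)
        new_cost = np.sum(fun(x + step) ** 2)
        if new_cost < cost:
            x, cost, lam = x + step, new_cost, max(lam / 3.0, 1e-7)
        else:
            lam *= 10.0   # fall back toward gradient descent
    return x, cost
```

The damping term keeps the linear system solvable even though the problem has unobservable directions (global position, orientation, and scale); the result still depends on the starting point, in line with the local-minima warning above.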
Count # Constraints vs # Unknowns
M camera poses
N points
2MN point constraints
6M + 3N + 4 unknowns (6 per camera pose, 3 per point, 4 intrinsic camera parameters)
Suggests: need $2MN \ge 6M + 3N + 4$
But: can we really recover all parameters?
• Can’t recover origin, orientation (6 params)
• Can’t recover scale (1 param)
Thus, we need $2MN \ge 6M + 3N + 4 - 7$
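The inequality is easy to evaluate for concrete numbers. The helper names below are hypothetical, and the count is only a necessary condition: it says nothing about degenerate configurations.

```python
def sfm_counts(m, n, n_intrinsics=4, gauge=7):
    """Return (constraints, unknowns) for m camera poses and n points."""
    constraints = 2 * m * n
    unknowns = 6 * m + 3 * n + n_intrinsics - gauge
    return constraints, unknowns

def min_points(m):
    """Smallest n with 2mn >= 6m + 3n + 4 - 7 (requires m >= 2)."""
    if m < 2:
        raise ValueError("a single view never yields enough constraints")
    n = 1
    while True:
        constraints, unknowns = sfm_counts(m, n)
        if constraints >= unknowns:
            return n
        n += 1

print(min_points(2), min_points(3))  # prints "9 5"
```

With two views the count first balances at nine points (36 constraints, 36 unknowns); with three views five points are enough (30 constraints, 30 unknowns).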
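SFM Using Factorization
Assume an orthographic camera:
$$
\underbrace{\begin{bmatrix} u_i \\ v_i \end{bmatrix}}_{\text{Image}} =
\begin{bmatrix} r_1^T \\ r_2^T \end{bmatrix}
\underbrace{\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix}}_{\text{World}} +
\begin{bmatrix} t_1 \\ t_2 \end{bmatrix}
$$
Subtract the mean (with the world origin placed at the centroid of the 3D points):
$$
\begin{bmatrix} u_i - \frac{1}{N} \sum_{k=1}^{N} u_k \\ v_i - \frac{1}{N} \sum_{k=1}^{N} v_k \end{bmatrix} =
\begin{bmatrix} r_1^T \\ r_2^T \end{bmatrix}
\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix}
$$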
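SFM Using Factorization
Stack all the features from the same frame ($\tilde{u}_i$, $\tilde{v}_i$ are the mean-subtracted image coordinates):
$$
\begin{bmatrix} \tilde{u}_1 & \tilde{u}_2 & \cdots & \tilde{u}_N \\ \tilde{v}_1 & \tilde{v}_2 & \cdots & \tilde{v}_N \end{bmatrix} =
\begin{bmatrix} r_1^T \\ r_2^T \end{bmatrix}
\begin{bmatrix} x_1 & x_2 & \cdots & x_N \\ y_1 & y_2 & \cdots & y_N \\ z_1 & z_2 & \cdots & z_N \end{bmatrix}
$$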
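SFM Using Factorization
Stack all the features from all the images:
$$
\underbrace{\begin{bmatrix}
\tilde{u}_{1,1} & \tilde{u}_{1,2} & \cdots & \tilde{u}_{1,N} \\
\tilde{v}_{1,1} & \tilde{v}_{1,2} & \cdots & \tilde{v}_{1,N} \\
\vdots & \vdots & & \vdots \\
\tilde{u}_{F,1} & \tilde{u}_{F,2} & \cdots & \tilde{u}_{F,N} \\
\tilde{v}_{F,1} & \tilde{v}_{F,2} & \cdots & \tilde{v}_{F,N}
\end{bmatrix}}_{\tilde{W}_{2F \times N}} =
\underbrace{\begin{bmatrix} r_{1,1}^T \\ r_{1,2}^T \\ \vdots \\ r_{F,1}^T \\ r_{F,2}^T \end{bmatrix}}_{M_{2F \times 3}}
\underbrace{\begin{bmatrix} x_1 & x_2 & \cdots & x_N \\ y_1 & y_2 & \cdots & y_N \\ z_1 & z_2 & \cdots & z_N \end{bmatrix}}_{S_{3 \times N}}
$$
Factorize the matrix into two matrices using SVD (keeping the three largest singular values):
$$
\tilde{W}_{2F \times N} = U \Sigma V^T, \qquad
\tilde{M}_{2F \times 3} = U \Sigma^{1/2}, \qquad
\tilde{S}_{3 \times N} = \Sigma^{1/2} V^T
$$
The factorization is determined only up to an invertible matrix $Q_{3 \times 3}$:
$$
M_{2F \times 3} = \tilde{M}_{2F \times 3}\, Q_{3 \times 3}, \qquad
S_{3 \times N} = Q_{3 \times 3}^{-1}\, \tilde{S}_{3 \times N}
$$
How to compute the matrix $Q_{3 \times 3}$?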
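SFM Using Factorization
M is the stack of rotation matrices:
$$
M_{2F \times 3} = \begin{bmatrix} r_{1,1}^T \\ r_{1,2}^T \\ \vdots \\ r_{F,1}^T \\ r_{F,2}^T \end{bmatrix}, \qquad
M M^T = \begin{bmatrix} r_{1,1}^T \\ r_{1,2}^T \\ \vdots \\ r_{F,1}^T \\ r_{F,2}^T \end{bmatrix}
\begin{bmatrix} r_{1,1} & r_{1,2} & \cdots & r_{F,1} & r_{F,2} \end{bmatrix}
$$
Orthogonal constraints from rotation matrices (the 2 by 2 diagonal blocks of $M M^T$):
$$
\begin{bmatrix} r_{f,1}^T r_{f,1} & r_{f,1}^T r_{f,2} \\ r_{f,2}^T r_{f,1} & r_{f,2}^T r_{f,2} \end{bmatrix} =
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad f = 1, \dots, F
$$
$$
M M^T = \tilde{M}_{2F \times 3}\, Q_{3 \times 3}\, Q_{3 \times 3}^T\, \tilde{M}_{2F \times 3}^T
$$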
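SFM Using Factorization
$$
M M^T = \tilde{M}_{2F \times 3}\, Q_{3 \times 3}\, Q_{3 \times 3}^T\, \tilde{M}_{2F \times 3}^T
$$
Orthogonal constraints from rotation matrices:
$$
\begin{bmatrix} r_{f,1}^T r_{f,1} & r_{f,1}^T r_{f,2} \\ r_{f,2}^T r_{f,1} & r_{f,2}^T r_{f,2} \end{bmatrix} =
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad f = 1, \dots, F
$$
$QQ^T$: symmetric 3 by 3 matrix
How to compute $QQ^T$?
- least-squares solution
- 4F linear constraints, 9 unknowns (6 independent due to symmetric matrix)
How to compute $Q$ from $QQ^T$?
- SVD again: $QQ^T = U \Sigma V^T$, $Q = U \Sigma^{1/2}$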
SFM Using Factorization
$QQ^T$: symmetric 3 by 3 matrix
Computing $QQ^T$ is easy:
- 3F linear equations (the two off-diagonal constraints of each frame coincide)
- 6 independent unknowns
SFM Using Factorization
1. Form the measurement matrix $\tilde{W}_{2F \times N}$
2. Decompose the matrix into two matrices $\tilde{M}_{2F \times 3}$ and $\tilde{S}_{3 \times N}$ using SVD
3. Compute the matrix $Q$ with least squares and SVD
4. Compute the rotation matrix and shape matrix: $M = \tilde{M}_{2F \times 3}\, Q$ and $S = Q^{-1}\, \tilde{S}_{3 \times N}$
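The four steps can be sketched in NumPy as follows. This is an illustrative sketch under assumptions, not a reference implementation: it expects noise-free, complete tracks, rows of W ordered u_1, v_1, u_2, v_2, ..., and at least three views in general position; the function name `factorize` and the helper `row` are made up for the example.

```python
import numpy as np

def factorize(W):
    """Orthographic factorization of a 2F x N measurement matrix W.

    Returns M (2F x 3, stacked rotation rows) and S (3 x N, shape), both
    defined only up to a global rotation or reflection.
    """
    # Step 1: subtract the per-row mean to remove the translation.
    W_tilde = W - W.mean(axis=1, keepdims=True)

    # Step 2: rank-3 SVD factorization.
    U, s, Vt = np.linalg.svd(W_tilde, full_matrices=False)
    M_tilde = U[:, :3] * np.sqrt(s[:3])            # 2F x 3
    S_tilde = np.sqrt(s[:3])[:, None] * Vt[:3]     # 3 x N

    # Step 3: least squares for the symmetric matrix L = Q Q^T.
    def row(a, b):
        # Coefficients of a^T L b in (L00, L01, L02, L11, L12, L22).
        return [a[0] * b[0], a[0] * b[1] + a[1] * b[0], a[0] * b[2] + a[2] * b[0],
                a[1] * b[1], a[1] * b[2] + a[2] * b[1], a[2] * b[2]]

    A, rhs = [], []
    for f in range(W.shape[0] // 2):
        m1, m2 = M_tilde[2 * f], M_tilde[2 * f + 1]
        A += [row(m1, m1), row(m2, m2), row(m1, m2)]   # 3 equations per frame
        rhs += [1.0, 1.0, 0.0]
    l = np.linalg.lstsq(np.asarray(A), np.asarray(rhs), rcond=None)[0]
    L = np.array([[l[0], l[1], l[2]], [l[1], l[3], l[4]], [l[2], l[4], l[5]]])

    # SVD again: Q = U sqrt(Sigma); valid when L is positive semidefinite.
    Uq, sq, _ = np.linalg.svd(L)
    Q = Uq * np.sqrt(sq)

    # Step 4: upgrade to the rotation and shape matrices.
    M = M_tilde @ Q
    S = np.linalg.solve(Q, S_tilde)
    return M, S
```

With noisy measurements the estimated $QQ^T$ may fail to be positive semidefinite, in which case the square-root step needs extra care; that case is outside the scope of this sketch.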
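Weak-perspective Projection
Factorization also works for weak-perspective projection (scaled orthographic projection), with constant scale factor $d / z_0$:
$$
\begin{bmatrix} u_i \\ v_i \end{bmatrix} = \frac{d}{z_0} \left(
\begin{bmatrix} r_1^T \\ r_2^T \end{bmatrix}
\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix} +
\begin{bmatrix} t_1 \\ t_2 \end{bmatrix} \right)
$$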
SFM for Deformable Objects
For detail, click here
SFM Using Factorization
Bundle adjustment (nonlinear optimization)
- works with the perspective camera model
- works with incomplete data
- prone to local minima
Factorization
- closed-form solution for the weak-perspective camera
- simple and efficient
- usually needs complete data
- becomes complicated for the full-perspective camera model
Phil Torr’s structure from motion toolkit in matlab (click here)
Voodoo camera tracker (click here)