Agenda - Carnegie Mellon University (16720.courses.cs.cmu.edu/lec/transformations.pdf)
![Page 1: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/1.jpg)
Agenda
• Rotations
• Camera calibration
• Homography
• Ransac
![Page 2: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/2.jpg)
164 Computer Vision: Algorithms and Applications (September 3, 2010 draft)
| Transformation | Matrix | # DoF | Preserves |
|---|---|---|---|
| translation | $[\,I \mid t\,]_{2\times 3}$ | 2 | orientation |
| rigid (Euclidean) | $[\,R \mid t\,]_{2\times 3}$ | 3 | lengths |
| similarity | $[\,sR \mid t\,]_{2\times 3}$ | 4 | angles |
| affine | $[\,A\,]_{2\times 3}$ | 6 | parallelism |
| projective | $[\,\tilde{H}\,]_{3\times 3}$ | 8 | straight lines |

Table 3.5 Hierarchy of 2D coordinate transformations. Each transformation also preserves the properties listed in the rows below it, i.e., similarity preserves not only angles but also parallelism and straight lines. The 2×3 matrices are extended with a third $[0^T\ 1]$ row to form a full 3×3 matrix for homogeneous coordinate transformations.
…examples of such transformations, which are based on the 2D geometric transformations shown in Figure 2.4. The formulas for these transformations were originally given in Table 2.1 and are reproduced here in Table 3.5 for ease of reference.
In general, given a transformation specified by a formula $x' = h(x)$ and a source image $f(x)$, how do we compute the values of the pixels in the new image $g(x)$, as given in (3.88)? Think about this for a minute before proceeding and see if you can figure it out.
If you are like most people, you will come up with an algorithm that looks something like Algorithm 3.1. This process is called forward warping or forward mapping and is shown in Figure 3.46a. Can you think of any problems with this approach?
procedure forwardWarp(f, h, out g):
For every pixel x in f(x)
1. Compute the destination location x' = h(x).
2. Copy the pixel f(x) to g(x').
Algorithm 3.1 Forward warping algorithm for transforming an image f(x) into an image g(x') through the parametric transform x' = h(x).
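Algorithm 3.1 can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: the image is a plain list of rows, the names `forward_warp` and `translate` are hypothetical, and rounding each destination to the nearest pixel is an assumption made here to keep the example short.

```python
def forward_warp(f, h, out_shape):
    """Forward-warp image f (a list of rows) through h(x, y) -> (x', y')."""
    out_h, out_w = out_shape
    g = [[0] * out_w for _ in range(out_h)]  # destination image, zero-filled
    for y, row in enumerate(f):
        for x, value in enumerate(row):
            xp, yp = h(x, y)                 # step 1: destination location x' = h(x)
            xi, yi = round(xp), round(yp)    # assumption: snap to the nearest pixel
            if 0 <= xi < out_w and 0 <= yi < out_h:
                g[yi][xi] = value            # step 2: copy f(x) to g(x')
    return g


def translate(x, y):
    """Hypothetical transform for the demo: shift one pixel to the right."""
    return x + 1, y


src = [[1, 2, 3],
       [4, 5, 6]]
dst = forward_warp(src, translate, (2, 3))
```

With a non-integer transform, several source pixels can round to the same destination while other destination pixels receive nothing, which is one way to see why this approach is questioned in the text.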
Geometric Transformations
Let’s define families of transformations by the properties that they preserve
![Page 3: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/3.jpg)
Rotations
Definition: an orthogonal transformation preserves dot products
$$a^T b = F(a)^T F(b), \quad \text{where } F(a) = Aa,\ a \in \mathbb{R}^n,\ A \in \mathbb{R}^{n \times n}$$
$$a^T b = a^T A^T A b \iff A^T A = I$$
[can conclude by setting a, b = coordinate vectors]
Linear transformations that preserve distances and angles
Defn: A is a rotation matrix if $A^T A = I$, $\det(A) = 1$
Defn: A is a reflection matrix if $A^T A = I$, $\det(A) = -1$
![Page 4: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/4.jpg)
2D Rotations
$$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
1 DOF
![Page 5: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/5.jpg)
3D Rotations
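Think of as change of basis where $r_i = r(i,:)$ are orthonormal basis vectors
$$R \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$$
[Figure: rotated coordinate frame with axes $r_1$, $r_2$, $r_3$]
How many DOFs?
3 = (2 to point $r_1$ + 1 to rotate along $r_1$)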
![Page 6: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/6.jpg)
3D Rotations
Lots of parameterizations that try to capture 3 DOFs
Helpful one for vision: axis-angle representation
Represent a 3D rotation with a unit vector that represents the axis of rotation, and an angle of rotation about that vector
Shears
$$A = \begin{bmatrix} 1 & h_{xy} & h_{xz} & 0 \\ h_{yx} & 1 & h_{yz} & 0 \\ h_{zx} & h_{zy} & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
$h_{xy}$: shears y into x
Rotations
• 3D Rotations fundamentally more complex than in 2D!
• 2D: amount of rotation!
• 3D: amount and axis of rotation
[Figure: 2D -vs- 3D]
05-3DTransformations.key - February 9, 2015
![Page 7: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/7.jpg)
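Recall: cross-product
Dot product:
$$a \cdot b = \|a\|\,\|b\| \cos\theta$$
Cross product:
$$\begin{vmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} i - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} j + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} k$$
Cross product matrix:
$$a \times b = \hat{a}\, b = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$$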
![Page 8: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/8.jpg)
Approach
[Figure: a vector $x$, a unit axis $\omega \in \mathbb{R}^3$ with $\|\omega\| = 1$, and a rotation angle $\theta$]
![Page 9: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/9.jpg)
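Approach
[Figure: $x$ split into components $x_\parallel$ and $x_\perp$ relative to the unit axis $\omega \in \mathbb{R}^3$, $\|\omega\| = 1$, with rotation angle $\theta$]
1. Write $x$ as the sum of components parallel and perpendicular to $\omega$
2. Rotate the perpendicular component by a 2D rotation of $\theta$ in the plane orthogonal to $\omega$
$$R = I + \hat{\omega} \sin\theta + \hat{\omega}\hat{\omega}\,(1 - \cos\theta)$$
[$Rx$ can simplify to cross and dot product computations]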
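The remark that $Rx$ reduces to cross products can be made concrete with a small sketch. It assumes the axis is already unit length and uses the fact that multiplying by the cross-product matrix of $\omega$ is the same as taking $\omega \times x$; the function names `cross` and `rotate_axis_angle` are hypothetical.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def rotate_axis_angle(x, omega, theta):
    """Apply R = I + w_hat*sin(theta) + w_hat*w_hat*(1 - cos(theta)) to x.

    Assumes omega is a unit vector; w_hat is its cross-product matrix.
    """
    wx = cross(omega, x)      # w_hat x
    wwx = cross(omega, wx)    # w_hat (w_hat x)
    s, c = math.sin(theta), math.cos(theta)
    return [x[i] + s * wx[i] + (1.0 - c) * wwx[i] for i in range(3)]


# Rotating the x axis by 90 degrees about the z axis should give the y axis.
rotated = rotate_axis_angle([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], math.pi / 2)
```

No 3×3 matrix is ever formed here, which is the practical appeal of the axis-angle form for rotating individual points.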
![Page 10: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/10.jpg)
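Exponential map
[Figure: $x$ with components $x_\parallel$ and $x_\perp$ relative to the unit axis $\omega \in \mathbb{R}^3$, $\|\omega\| = 1$, and rotation angle $\theta$]
$$R = \exp(\hat{v}), \quad \text{where } v = \omega\theta$$
$$= I + \hat{v} + \frac{1}{2!}\hat{v}^2 + \ldots$$
[standard Taylor series expansion of exp(x) @ x=0 as $1 + x + \frac{1}{2!}x^2 + \ldots$]
Implication: we can approximate change in position due to a small rotation as $v \times x$, where $v = \omega\theta$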
![Page 11: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/11.jpg)
Agenda
• Rotations
• Camera calibration
• Homography
• Ransac
![Page 12: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/12.jpg)
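Perspective projection
[Figure: center of projection (COP), 3D point (X,Y,Z), and its image point (x,y,1), drawn with x, y, z axes in a right-handed coordinate system]
$$x = \frac{f}{Z}\,X, \qquad y = \frac{f}{Z}\,Y$$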
![Page 13: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/13.jpg)
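Perspective projection revisited
$$\lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$$
Given (X,Y,Z) and f, compute (x,y) and lambda:
$$\lambda x = fX, \qquad \lambda = Z, \qquad x = \frac{\lambda x}{\lambda} = \frac{fX}{Z}$$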
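The computation on this slide can be written out directly. The sketch below is illustrative only: the function name `project` is hypothetical, and rejecting points with $Z = 0$ is an added assumption, since such points have no finite image.

```python
def project(point, f):
    """Pinhole projection: returns ((x, y), lam) with lam * [x, y, 1] = diag(f, f, 1) @ [X, Y, Z]."""
    X, Y, Z = point
    lam = Z                      # third row of the matrix equation: lambda = Z
    if lam == 0:
        raise ValueError("point lies in the plane Z = 0 and has no finite projection")
    x = f * X / lam              # first row: lambda * x = f * X
    y = f * Y / lam              # second row: lambda * y = f * Y
    return (x, y), lam


image_point, depth = project((2.0, 4.0, 10.0), f=5.0)
```

Here `image_point` is `(1.0, 2.0)` and `depth` is `10.0`, matching $x = fX/Z$ and $y = fY/Z$.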
![Page 14: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/14.jpg)
Special case: f = 1
[Figure: COP, 3D point (X,Y,Z), and image point (x,y,1)]
$$Z \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$$
Natural geometric intuition:
• 3D point is obtained by scaling ray pointed at image coordinate
• Scale factor = true depth of point
[Aside: given an image with a focal length ‘f’, resize by ‘1/f’ to obtain unit-focal-length image]
![Page 15: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/15.jpg)
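Homogeneous notation
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} \sim \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}, \qquad \begin{bmatrix} x \\ y \\ z \end{bmatrix} \equiv \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$$
For now, think of above as shorthand notation for
$$\exists \lambda \ \text{s.t.} \ \lambda \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$$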
![Page 16: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/16.jpg)
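Camera projection
$$\lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
• First matrix: camera intrinsic matrix K (can include skew & non-square pixel size)
• Second matrix: camera extrinsics (rotation and translation)
• Vector: 3D point in world coordinates
[Figure: camera frame with axes $r_1$, $r_2$, $r_3$, offset by T from the world coordinate frame]
Aside: homogeneous notation is shorthand for $x = \frac{\lambda x}{\lambda}$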
![Page 17: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/17.jpg)
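Fancier intrinsics
non-square pixels: $x_s = s_x x$, $y_s = s_y y$
shifted origin: $x' = x_s + o_x$, $y' = y_s + o_y$
skewed image axes: $x'' = x' + s_\theta y'$
[Figure: image axes x and y with skew angle $\theta$]
$$K = \begin{bmatrix} s_x & s_\theta & o_x \\ 0 & s_y & o_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} f s_x & f s_\theta & o_x \\ 0 & f s_y & o_y \\ 0 & 0 & 1 \end{bmatrix}$$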
![Page 18: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/18.jpg)
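Notation
$$\lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f s_x & f s_\theta & o_x \\ 0 & f s_y & o_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
$$= K_{3\times 3} \begin{bmatrix} R_{3\times 3} & T_{3\times 1} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = M_{3\times 4} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
[Using Matlab’s rows x columns]
Claims (without proof):
1. A 3x4 matrix ‘M’ can be a camera matrix iff the determinant of its left 3x3 submatrix is not zero
2. M is determined only up to a scale factor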
![Page 19: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/19.jpg)
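Notation (more)
$$M_{3\times 4} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} A_{3\times 3} & b_{3\times 1} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = A_{3\times 3} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + b_{3\times 1}$$
$$M = \begin{bmatrix} m_1^T \\ m_2^T \\ m_3^T \end{bmatrix}, \qquad A = \begin{bmatrix} a_1^T \\ a_2^T \\ a_3^T \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$$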
![Page 20: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/20.jpg)
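Applying the projection matrix
$$\lambda = \begin{bmatrix} X & Y & Z \end{bmatrix} a_3 + b_3$$
$$x = \frac{1}{\lambda}\left(\begin{bmatrix} X & Y & Z \end{bmatrix} a_1 + b_1\right), \qquad y = \frac{1}{\lambda}\left(\begin{bmatrix} X & Y & Z \end{bmatrix} a_2 + b_2\right)$$
Set of 3D points that project to x = 0: $\begin{bmatrix} X & Y & Z \end{bmatrix} a_1 + b_1 = 0$
Set of 3D points that project to y = 0: $\begin{bmatrix} X & Y & Z \end{bmatrix} a_2 + b_2 = 0$
Set of 3D points that project to x = inf or y = inf: $\begin{bmatrix} X & Y & Z \end{bmatrix} a_3 + b_3 = 0$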
![Page 21: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/21.jpg)
Rows of the projection matrix describe the 3 planes defined by the image coordinate system
[Figure: image plane with x and y axes, the COP, and the three planes labeled $a_1$, $a_2$, $a_3$]
![Page 22: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/22.jpg)
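Other geometric properties
[Figure: image point (x,y), 3D point (X,Y,Z), and the COP]
What’s set of (X,Y,Z) points that project to same (x,y)?
$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \lambda w - A^{-1} b, \quad \text{where } w = A^{-1} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
What’s the position of COP / pinhole?
$$A \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + b = 0 \ \Rightarrow \ \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = -A^{-1} b$$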
![Page 23: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/23.jpg)
Affine Cameras
$$m_3^T = \begin{bmatrix} 0 & 0 & 0 & 1 \end{bmatrix}$$
$$x = \begin{bmatrix} X & Y & Z \end{bmatrix} a_1 + b_1, \qquad y = \begin{bmatrix} X & Y & Z \end{bmatrix} a_2 + b_2$$
Image coordinates (x,y) are an affine function of world coordinates (X,Y,Z)
Affine transformations = linear transformations plus an offset
• Example: Weak-perspective projection model
• Projection defined by 8 parameters
• Parallel lines are projected to parallel lines
• The transformation can be written as a direct linear transformation
![Page 24: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/24.jpg)
Geometric Transformations
Euclidean (trans + rot) preserves lengths + angles
Euclidean
Affine
Projective
Affine: preserves parallel lines
Projective: preserves lines
![Page 25: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/25.jpg)
Agenda
• Rotations
• Camera calibration
• Homography
• Ransac
![Page 26: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/26.jpg)
Calibration: Recover M from scene points P1,..,PN and the corresponding projections in the image plane p1,..,pN
Find M that minimizes the distance between the actual points in the image, pi, and their predicted projections MPi
Problems:
• The projection is (in general) non-linear
• M is defined up to an arbitrary scale factor
![Page 27: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/27.jpg)
PnP = Perspective n-Point
![Page 28: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/28.jpg)
The math for the calibration procedure follows a recipe that is used in many (most?) problems involving camera geometry, so it’s worth remembering:
Write relation between image point, projection matrix, and point in space:
$$p_i \equiv M P_i$$
Write non-linear relations between coordinates:
$$u_i = \frac{m_1^T P_i}{m_3^T P_i}, \qquad v_i = \frac{m_2^T P_i}{m_3^T P_i}$$
Make them linear:
$$m_1^T P_i - (m_3^T P_i)\,u_i = 0$$
$$m_2^T P_i - (m_3^T P_i)\,v_i = 0$$
![Page 29: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/29.jpg)
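Write them in matrix form:
$$\begin{bmatrix} P_i^T & 0^T & -u_i P_i^T \\ 0^T & P_i^T & -v_i P_i^T \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ m_3 \end{bmatrix} = 0$$
Put all the relations for all the points into a single matrix:
$$\begin{bmatrix} P_1^T & 0^T & -u_1 P_1^T \\ 0^T & P_1^T & -v_1 P_1^T \\ \vdots & \vdots & \vdots \\ P_N^T & 0^T & -u_N P_N^T \\ 0^T & P_N^T & -v_N P_N^T \end{bmatrix} m = 0$$
In noise-free case: $Lm = 0$ (vector of 0’s)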
![Page 30: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/30.jpg)
What about noisy case?
$$\min_{\|m\|_2 = 1} \|Lm\|_2$$
Min right singular vector of $L$ (or eigenvector of $L^T L$)
Is this the right error to minimize?
If not, what is?
![Page 31: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/31.jpg)
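Ideal error
[Figure: scene points $P_1, \ldots, P_i$ with x, y, z axes, observed image points $(u_1,v_1)$ and $(u_i,v_i)$, and the predicted projection $MP_i$]
$$\text{Error}(M) = \left(u_i - \frac{m_1 \cdot P_i}{m_3 \cdot P_i}\right)^2 + \left(v_i - \frac{m_2 \cdot P_i}{m_3 \cdot P_i}\right)^2$$
Initialize nonlinear optimization with “algebraic” solution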
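The reprojection error above is straightforward to evaluate for a candidate $M$. The sketch below is an illustration under stated assumptions: it sums the per-point term over all correspondences (the slide shows the term for a single point $i$), the names `dot` and `reprojection_error` are hypothetical, and $M$ is a list of three rows of four numbers.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def reprojection_error(M, points_3d, points_2d):
    """Sum of squared image-plane distances between observations and projections.

    M is a 3x4 camera matrix (rows m1, m2, m3); summing over points is an
    assumption of this sketch.
    """
    total = 0.0
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        P = (X, Y, Z, 1.0)                 # homogeneous scene point
        w = dot(M[2], P)                   # m3 . P (the scale lambda)
        total += (u - dot(M[0], P) / w) ** 2 + (v - dot(M[1], P) / w) ** 2
    return total


# Unit-focal-length camera at the origin; the second observation is off by 0.1 in u.
M = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
err = reprojection_error(M, [(1.0, 2.0, 2.0), (0.0, 0.0, 1.0)],
                         [(0.5, 1.0), (0.1, 0.0)])
```

A function of this shape is what a nonlinear optimizer would minimize, starting from the algebraic solution.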
![Page 32: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/32.jpg)
Radial Lens Distortions
![Page 33: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/33.jpg)
Radial Lens Distortions
No Distortion Barrel Distortion Pincushion Distortion
![Page 34: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/34.jpg)
![Page 35: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/35.jpg)
Correcting Radial Lens Distortions
Before After
http://www.grasshopperonline.com/barrel_distortion_correction_software.html
![Page 36: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/36.jpg)
Overall approach
Minimize reprojection error: Error(M, k’s)
Initialize with algebraic solution (approaches in literature based on various assumptions)
![Page 37: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/37.jpg)
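Revisiting homographies
Place world coordinate frame on object plane
$$\lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ 0 \\ 1 \end{bmatrix}$$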
![Page 38: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/38.jpg)
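Projection of planar points
$$\lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ 0 \\ 1 \end{bmatrix}$$
$$= \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & t_x \\ r_{21} & r_{22} & t_y \\ r_{31} & r_{32} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$$
$$= \begin{bmatrix} f r_{11} & f r_{12} & f t_x \\ f r_{21} & f r_{22} & f t_y \\ r_{31} & r_{32} & t_z \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$$
Convert between 2D location on object plane and image coordinate with a 3x3 matrix H
(Above holds for any intrinsic matrix K)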
![Page 39: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/39.jpg)
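Two-views of a plane
Image correspondences
$$\lambda_1 \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} = H_1 \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}, \qquad \lambda_2 \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = H_2 \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$$
$$\lambda \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = H \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$$
[Aside: H usually invertible]
$$\lambda \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = H_2 H_1^{-1} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix}$$
[LHS and RHS are related by a scale factor]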
![Page 40: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/40.jpg)
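Computing homography projections
$$\lambda \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix}$$
Given (x1,y1) and H, how do we compute (x2,y2)?
$$x_2 = \frac{\lambda x_2}{\lambda} = \frac{a x_1 + b y_1 + c}{g x_1 + h y_1 + i}$$
Is this operation linear in H or (x1,y1)?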
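Applying a homography to a point follows the same multiply-then-divide pattern as the projection equations. A minimal sketch, with the hypothetical name `apply_homography` and H given as three rows $(a,b,c)$, $(d,e,f)$, $(g,h,i)$:

```python
def apply_homography(H, x1, y1):
    """Map (x1, y1) through the 3x3 homography H and divide out the scale lambda."""
    (a, b, c), (d, e, f), (g, h, i) = H
    lam = g * x1 + h * y1 + i            # third row gives the scale factor
    x2 = (a * x1 + b * y1 + c) / lam
    y2 = (d * x1 + e * y1 + f) / lam
    return x2, y2


H_translate = [[1, 0, 2], [0, 1, 3], [0, 0, 1]]      # pure translation by (2, 3)
H_projective = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]     # non-trivial third row
H_scaled = [[5 * v for v in row] for row in H_projective]

p_translate = apply_homography(H_translate, 1, 1)
p_projective = apply_homography(H_projective, 1, 2)
p_scaled = apply_homography(H_scaled, 1, 2)
```

`p_projective` and `p_scaled` agree, which illustrates that H matters only up to a scale factor; the division by `lam` is what makes the map non-linear in $(x_1, y_1)$.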
![Page 41: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/41.jpg)
Estimating homographies
Image correspondences
Given corresponding 2D points in left and right image, estimate H
How many corresponding points needed? How many degrees of freedom in H?
$$x_2 (g x_1 + h y_1 + i) = a x_1 + b y_1 + c$$
$$\vdots$$
Homogeneous linear system
$$A\,H(:) = \begin{bmatrix} 0 \\ 0 \\ \vdots \end{bmatrix}$$
![Page 42: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/42.jpg)
Estimating homographies
Image correspondences
Given corresponding 2D points in left and right image, estimate H
$$A\,H(:) = \begin{bmatrix} 0 \\ 0 \\ \vdots \end{bmatrix}$$
H is determined only up to scale factor (8 DOFs)
Need 4 points minimum. How to handle more points?
$$\min_{\|H(:)\|_2 = 1} \|A\,H(:)\|_2$$
Minimum right singular vector of $A$ (eigenvector of $A^T A$)
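For exactly four correspondences, the linear constraints can be solved without an SVD. The sketch below is a swapped-in variant, not the unit-norm minimization on the slide: it fixes $i = 1$ (an assumption that breaks down when the true $i$ is zero) and solves the resulting 8×8 system by Gauss-Jordan elimination in plain Python. The names `solve_linear` and `homography_from_4_points` are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[k][n] / M[k][k] for k in range(n)]


def homography_from_4_points(src, dst):
    """Estimate H (with i fixed to 1) mapping four src points to four dst points."""
    A, rhs = [], []
    for (x1, y1), (x2, y2) in zip(src, dst):
        # x2 * (g*x1 + h*y1 + 1) = a*x1 + b*y1 + c, rearranged to be linear in a..h
        A.append([x1, y1, 1, 0, 0, 0, -x2 * x1, -x2 * y1])
        rhs.append(x2)
        # y2 * (g*x1 + h*y1 + 1) = d*x1 + e*y1 + f
        A.append([0, 0, 0, x1, y1, 1, -y2 * x1, -y2 * y1])
        rhs.append(y2)
    a, b, c, d, e, f, g, h = solve_linear(A, rhs)
    return [[a, b, c], [d, e, f], [g, h, 1.0]]


# Unit square mapped to a square of side 2 with corner at (2, 3).
H_est = homography_from_4_points([(0, 0), (1, 0), (1, 1), (0, 1)],
                                 [(2, 3), (4, 3), (4, 5), (2, 5)])
```

With more than four (noisy) points the system becomes overdetermined, which is where the minimum-singular-vector solution from the slide is the natural choice.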
![Page 43: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/43.jpg)
“Frontalizing” planes using homographies
Estimate homography on (at least) 4 pairs of corresponding points (e.g., corners of quad/rect)
Apply homography on all (x,y) coordinates inside target rectangle to compute source pixel location
![Page 44: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/44.jpg)
“Frontalizing” planes using homographies
![Page 45: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/45.jpg)
Special case of 2 views: rotations about camera center
LECTURE 4. PLANAR SCENES AND HOMOGRAPHY 5
cues (parallax) can only be recovered when T is nonzero. Looking at the homography equation, the limit of H as d approaches infinity is R. Thus any pair of images of an arbitrary scene captured by a purely rotating camera is related by a planar homography.
A planar panorama can be constructed by capturing many overlapping images at different rotations, picking an image to be a reference, and then finding corresponding points between the overlapping images. The pairwise homographies are derived from the corresponding points, forming a mosaic that typically is shaped like a “bow-tie,” as images farther away from the reference are warped outward to fit the homography. The figure below is from Pollefeys and Hartley & Zisserman.
4.7. Second Derivation of Homography Constraint
The homography constraint, element by element, in homogeneous coordinates is as follows:
$$\begin{bmatrix} x_2 \\ y_2 \\ z_2 \end{bmatrix} = \begin{bmatrix} H_{11} & H_{12} & H_{13} \\ H_{21} & H_{22} & H_{23} \\ H_{31} & H_{32} & H_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix}, \qquad x_2 \sim H x_1$$
In inhomogeneous coordinates ($x_2' = x_2/z_2$ and $y_2' = y_2/z_2$),
Can be modeled as planar transformations, regardless of scene geometry!
(a) incline_L.jpg (img1) (b) incline_R.jpg (img2) (c) img2 warped to img1’s frame
Figure 5: Example output for Q6.1: Original images img1 and img2 (left and center) and img2 warped to fit img1 (right). Notice that the warped image clips out of the image. We will fix this in Q6.2
H2to1 = computeH(p1, p2)
Inputs: p1 and p2 should be 2×N matrices of corresponding $(x, y)^T$ coordinates between two images.
Outputs: H2to1 should be a 3×3 matrix encoding the homography that best matches the linear equation derived above for Equation 8 (in the least squares sense).
Hint: Remember that a homography is only determined up to scale. The Matlab functions eig() or svd() will be useful. Note that this function can be written without an explicit for-loop over the data points.
6 Stitching it together: Panoramas (30 pts)
We can also use homographies to create a panorama image from multiple views of the same scene. This is possible for example when there is no camera translation between the views (e.g., only rotation about the camera center), as we saw in Q4.2.
First, you will generate panoramas using matched point correspondences between images using the BRIEF matching you implemented in Q2.4. We will assume that there is no error in your matched point correspondences between images (although there might be some errors).
In the next section you will extend the technique to use (potentially noisy) keypoint matches.
You will need to use the provided function warp_im = warpH(im, H, out_size), which warps image im using the homography transform H. The pixels in warp_im are sampled at coordinates in the rectangle (1, 1) to (out_size(2), out_size(1)). The coordinates of the pixels in the source image are taken to be (1, 1) to (size(im,2), size(im,1)) and transformed according to H.
• Q6.1 (15pts) In this problem you will implement and use the function (stub provided in matlab/imageStitching.m):
[panoImg] = imageStitching(img1, img2, H2to1)
on two images from the Duquesne incline. This function accepts two images and the output from the homography estimation function. This function will:
Figure 6: Final panorama view. With homography estimated with RANSAC.
• a folder matlab containing all the .m and .mat files you were asked to write and generate
• a pdf named writeup.pdf containing the results, explanations and images asked for in the assignment along with the answers to the questions on homographies.
Submit all the code needed to make your panorama generator run. Make sure all the .m files that need to run are accessible from the matlab folder without any editing of the path variable. If you downloaded and used a feature detector for the extra credit, include the code with your submission and mention it in your writeup. You may leave the data folder in your submission, but it is not needed. Please zip your homework as usual and submit it using blackboard.
Appendix: Image Blending
Note: This section is not for credit and is for informational purposes only.
For overlapping pixels, it is common to blend the values of both images. You can simply average the values but that will leave a seam at the edges of the overlapping images. Alternatively, you can obtain a blending value for each image that fades one image into the other. To do this, first create a mask like this for each image you wish to blend:
mask = zeros(size(im,1), size(im,2));
% mark the image border pixels
mask(1,:) = 1; mask(end,:) = 1; mask(:,1) = 1; mask(:,end) = 1;
% distance of every pixel to the nearest border pixel, scaled to [0, 1]
mask = bwdist(mask, 'city');
mask = mask/max(mask(:));
The function bwdist computes the distance transform of the binarized input image, so this mask will be zero at the borders and 1 at the center of the image. You can warp this mask just as you warped your images. How would you use the mask weights to compute a linear combination of the pixels in the overlap region? Your function should behave well where one or both of the blending constants are zero.
![Page 46: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/46.jpg)
Derivation
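…
$$\begin{bmatrix} X_2 \\ Y_2 \\ Z_2 \end{bmatrix} = R \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \end{bmatrix}$$
$$\lambda_2 \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} f_2 & 0 & 0 \\ 0 & f_2 & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{K_2} \begin{bmatrix} X_2 \\ Y_2 \\ Z_2 \end{bmatrix}$$
$$\lambda \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = K_2 R K_1^{-1} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix}$$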
![Page 47: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/47.jpg)
Take-home points for homographies
• If camera rotates about its center, then the images are related by a homography irrespective of scene depth.
• If the scene is planar, then images from any two cameras are related by a homography.
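• Homography mapping is a 3x3 matrix with 8 degrees of freedom.
$$\lambda \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix}$$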
![Page 48: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/48.jpg)
Matching features
What do we do about the “bad” matches?
![Page 49: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/49.jpg)
General problem: we are trying to fit a (geometric) model to noisy data
How about we choose the average vector (the least-squares solution)? Why will/won’t this work?
![Page 50: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/50.jpg)
Let’s generalize the problem a bit
Estimate best model (a line) that fits data $\{x_i, y_i\}$:
$$\min_{w,b} \sum_i \left(y_i - f_{w,b}(x_i)\right)^2, \qquad f_{w,b}(x_i) = w x_i + b$$
[Figure: plot of the data, with axes x and y]
![Page 51: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/51.jpg)
Let’s generalize the problem a bit
“Least-squares” solution
[Figure: least-squares line fit, with axes x and y]
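To see how the least-squares line behaves with bad data, here is a minimal sketch, assuming NumPy and synthetic data made up for illustration. It fits $f_{w,b}(x) = wx + b$ once to clean points on the line $y = 2x + 1$ and once after corrupting a single point.

```python
import numpy as np

def fit_line_least_squares(x, y):
    """Return (w, b) minimising sum_i (y_i - (w * x_i + b))**2."""
    A = np.column_stack([x, np.ones_like(x)])
    (w, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return w, b

# Assumed synthetic data: ten points exactly on y = 2x + 1.
x = np.arange(10, dtype=float)
y_clean = 2.0 * x + 1.0
w_clean, b_clean = fit_line_least_squares(x, y_clean)

# Corrupt one point to imitate a bad match.
y_bad = y_clean.copy()
y_bad[9] = 100.0
w_bad, b_bad = fit_line_least_squares(x, y_bad)
```

A single gross outlier is enough to pull the slope far from 2, because every point contributes its squared error to the objective.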
![Page 52: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/52.jpg)
RANSAC Line Fitting Example
Sample two points
![Page 53: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/53.jpg)
RANSAC Line Fitting Example
Fit Line
![Page 54: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/54.jpg)
RANSAC Line Fitting Example
Count the total number of points within a threshold distance of the line.
![Page 55: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/55.jpg)
RANSAC Line Fitting Example
Repeat until we get a good result
![Page 56: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/56.jpg)
RANSAC Line Fitting Example
Repeat until we get a good result
![Page 57: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/57.jpg)
RANSAC Line Fitting Example
Repeat until we get a good result
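The line-fitting steps above (sample two points, fit a line, count points within a threshold, repeat) can be sketched as follows. This is a minimal illustration, assuming NumPy; the function name `ransac_line`, the threshold, the fixed iteration count, and the synthetic data are all assumptions rather than values from the lecture.

```python
import numpy as np

def ransac_line(x, y, threshold=0.5, iterations=200, seed=0):
    """Return a boolean inlier mask for the best line found by RANSAC."""
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(x), dtype=bool)
    for _ in range(iterations):
        i, j = rng.choice(len(x), size=2, replace=False)  # sample two points
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as y = w x + b
        w = (y[j] - y[i]) / (x[j] - x[i])                 # line through the two points
        b = y[i] - w * x[i]
        dist = np.abs(w * x + b - y) / np.sqrt(w * w + 1.0)  # perpendicular distances
        mask = dist < threshold                           # points within the threshold
        if mask.sum() > best_mask.sum():                  # keep the largest consensus set
            best_mask = mask
    return best_mask

# Assumed synthetic data: noisy points on y = 2x + 1 plus three gross outliers.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 30)
y = 2.0 * x + 1.0 + rng.normal(scale=0.05, size=x.size)
outliers = [3, 12, 25]
y[outliers] += 20.0

inliers = ransac_line(x, y)
w, b = np.polyfit(x[inliers], y[inliers], deg=1)  # final least-squares fit on inliers only
```

Here "a good result" is approximated by running a fixed number of iterations and keeping the best consensus set; other stopping rules are possible.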
![Page 58: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/58.jpg)
RAndom SAmple Consensus
Select one match, count inliers
![Page 59: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/59.jpg)
RAndom SAmple Consensus
Select one match, count inliers
![Page 60: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/60.jpg)
Least squares fit
Find “average” translation vector for the largest group of inliers
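For a pure translation, a single match already determines the model, so the procedure on these slides can be sketched as below. This is a minimal illustration, assuming NumPy; the function name `ransac_translation`, the threshold, the iteration count, and the synthetic matches are assumptions made for the example.

```python
import numpy as np

def ransac_translation(p, q, threshold=1.0, iterations=50, seed=0):
    """Estimate t with q ≈ p + t from matched (N, 2) point arrays p and q."""
    rng = np.random.default_rng(seed)
    d = q - p                                  # displacement implied by each match
    best_mask = np.zeros(len(d), dtype=bool)
    for _ in range(iterations):
        t = d[rng.integers(len(d))]            # select one match
        mask = np.linalg.norm(d - t, axis=1) < threshold  # count inliers
        if mask.sum() > best_mask.sum():
            best_mask = mask
    # "average" translation vector over the largest group of inliers
    return d[best_mask].mean(axis=0), best_mask

# Assumed synthetic matches: true translation (5, -3), five bad matches.
rng = np.random.default_rng(2)
p = rng.uniform(0.0, 100.0, size=(20, 2))
q = p + np.array([5.0, -3.0]) + rng.normal(scale=0.05, size=p.shape)
q[:5] = p[:5] + np.array([[40.0, 10.0], [-30.0, 25.0], [60.0, -50.0],
                          [-20.0, -45.0], [15.0, 70.0]])

t_est, mask = ransac_translation(p, q)
```

Averaging only over the inliers is what distinguishes this from the plain average of all displacement vectors, which the bad matches would bias.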
![Page 61: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/61.jpg)
RANSAC for estimating transformation
RANSAC loop:
1. Select feature pairs (at random)
2. Compute transformation T (exact)
3. Compute inliers (point matches where $\|p_i' - T p_i\|^2 < \varepsilon$)
4. Keep largest set of inliers
5. Re-compute least-squares estimate of transformation T using all of the inliers
![Page 62: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/62.jpg)
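Recall homography estimation: $A\mathbf{h} = \mathbf{0}$, where $A \in \mathbb{R}^{8 \times 9}$, $\mathbf{h} \in \mathbb{R}^{9}$, and $\mathbf{0} \in \mathbb{R}^{8}$.
How do we estimate with all inlier points?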
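One common way to use all inlier points is to stack two rows per match into a taller matrix $A$ and take the right singular vector with the smallest singular value as $\mathbf{h}$. The sketch below assumes NumPy and is not necessarily the estimator the lecture has in mind; the function name `homography_dlt` and the test data are hypothetical, and coordinate normalisation, which is usually recommended for pixel-scale coordinates, is omitted for brevity.

```python
import numpy as np

def homography_dlt(p1, p2):
    """Least-squares homography H with p2 ~ H p1, from N >= 4 matches ((N, 2) arrays)."""
    rows = []
    for (x1, y1), (x2, y2) in zip(p1, p2):
        # Two linear constraints on the 9 entries of H per match.
        rows.append([-x1, -y1, -1.0, 0.0, 0.0, 0.0, x2 * x1, x2 * y1, x2])
        rows.append([0.0, 0.0, 0.0, -x1, -y1, -1.0, y2 * x1, y2 * y1, y2])
    A = np.asarray(rows)            # shape (2N, 9); 8 x 9 when N = 4
    _, _, Vt = np.linalg.svd(A)
    h = Vt[-1]                      # unit vector minimising ||A h||
    return h.reshape(3, 3)

# Assumed example: six exact matches generated from a known homography.
H_true = np.array([[1.2, 0.1, 0.5],
                   [-0.2, 0.9, -0.3],
                   [0.1, 0.2, 1.0]])
p1 = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0],
               [0.0, 1.0], [0.5, 0.2], [0.3, 0.8]])
p1_h = np.hstack([p1, np.ones((len(p1), 1))]) @ H_true.T
p2 = p1_h[:, :2] / p1_h[:, 2:3]

H_est = homography_dlt(p1, p2)
H_est = H_est / H_est[2, 2]         # fix the arbitrary scale (and sign) for comparison
```

With exact matches the recovered matrix equals `H_true` up to scale; with noisy inliers the same code returns the algebraic least-squares estimate.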
![Page 63: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/63.jpg)
![Page 64: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/64.jpg)
RANSAC for alignment
![Page 65: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/65.jpg)
RANSAC for alignment
![Page 66: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/66.jpg)
RANSAC for alignment
![Page 67: Agenda - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/transformations.pdf164 Computer Vision: Algorithms and Applications (September 3, 2010 draft) Transformation Matrix](https://reader030.fdocuments.us/reader030/viewer/2022040803/5e3ee070519f0a24ae1c5408/html5/thumbnails/67.jpg)
Planar object recognition (which transformation is used? How many pairs must be selected in the initial step?)