3D Vision – Real Objects, Real Cameras · 2017. 2. 21. · Without a 3-D model of the world,...
3D Vision – Real Objects, Real Cameras Chapter 11 (parts of) , 12 (parts of) Computerized Image Analysis 2 Anders Brun, [email protected]
3D Vision
• Philosophy
• Image formation
  ◦ The pinhole camera
  ◦ Projective geometry
  ◦ Artefacts and challenges
• Camera calibration
• Stereo vision
• Structured light
Philosophy: Why 3-D?
• Why do we model things in 3-D?
• Without a 3-D model of the world, events are more difficult to predict: movement, grasping, collision estimation, real size estimation, …
• Example:
  ◦ 2-D: A car on the highway looks bigger and appears to drive faster as it approaches
  ◦ 3-D: A car on the highway has constant size and speed as it approaches
[Figure: two views of the car example, with axes x–z and x–y]
Philosophy: 3-D cues …
Photo: Greg Keene
• Shape from:
  ◦ Focus
  ◦ Lighting
  ◦ Stereo
  ◦ Structured light
  ◦ …
Philosophy: Marr and 2.5-D
• Primal sketch: Edges and areas
• 2.5-D sketch: Texture and depth
• 3-D model: A hierarchical 3-D model of the world
Teddy dataset, from http://cat.middlebury.edu/
Philosophies
• Build an accurate 3-D world representation
  1. Build a complete 3-D model of the scene
  2. Plan the task using the 3-D model
  3. Example: Build a model of the scene, then find the teddy bear and send a robot arm to grab it.
• Plan as you go, act and react
  1. Collect features from the scene
  2. Use the features to guide your actions
  3. Example: Find the teddy bear using template matching in the image, then send the robot hand in that direction. Possibly take more images when halfway.
Passive, Active and Dynamic Vision
• Passive vision:
  ◦ The camera has a fixed location
• Dynamic vision:
  ◦ The camera is moving but cannot be steered
• Active vision:
  ◦ The camera can be steered
The pinhole camera
• The pinhole camera is an idealized model
• A real aperture is not a point.
• A real aperture has a non-vanishing area and typically also a lens…
The pinhole camera model
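• Where is the point P projected on the image plane inside the camera?
[Figure: pinhole camera with the point P = (X, Y, Z), the focal point or origin (the "pinhole"), the image plane at distance f, and the image coordinate x]
$$x = -\frac{fX}{Z}$$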
The pinhole camera model (alternative)
• Imagine an observer is located at the focal point
• A screen is placed at distance f from the observer.
• Where on this screen is P projected?
[Figure: the focal point (the "observer"), the screen at distance f = focal length, the point P = (X, Y, Z) and its screen coordinate x]
$$x = +\frac{fX}{Z}, \qquad y = +\frac{fY}{Z}$$
The pinhole camera model
• In the pinhole camera, the world appears to be upside down (or 180° rotated).
• The alternative interpretation is useful in computer graphics. It tells you exactly where to draw P on a screen, in front of the observer, in order to make it appear real for the observer. (Note the change of sign.)
• The alternative interpretation leads directly to "projective geometry".
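The two projection formulas can be tried numerically. The following is a minimal Python sketch, not part of the original slides: the function names are illustrative, and the y-coordinate of the pinhole version is assumed to carry the same sign flip as x, in line with the 180° rotation mentioned above.

```python
def project_pinhole(point, f):
    """Image-plane coordinates inside a pinhole camera: x = -fX/Z, y = -fY/Z.

    The y formula is an assumption by symmetry; the slide only gives x.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("the point must lie in front of the camera (Z > 0)")
    return (-f * X / Z, -f * Y / Z)


def project_screen(point, f):
    """Screen coordinates in the 'observer' interpretation: x = +fX/Z, y = +fY/Z."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("the point must lie in front of the observer (Z > 0)")
    return (f * X / Z, f * Y / Z)


if __name__ == "__main__":
    P = (1.0, 2.0, 4.0)
    print(project_pinhole(P, f=2.0))   # mirrored through the pinhole
    print(project_screen(P, f=2.0))    # same magnitudes, opposite sign
    # The same point at half the distance appears twice as large on the screen.
    print(project_screen((1.0, 2.0, 2.0), f=2.0))
```

The last call mirrors the car example from the introduction: the 3-D size is constant, while the projected size grows as Z shrinks.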
Projective Geometry (Very Briefly)
• Points in 2-D are represented by lines in 3-D
• The 3-D space is called the embedding space
• All points along a line are equivalent
• This is analogous to a photograph: every point (position) in a photograph (2-D) corresponds to a line or ray in reality (3-D)
[Figure: an equivalence class; the points x and αx lie on the same line]
Projective Geometry (Very Briefly)
• We can convert points in the ordinary plane to the projective plane
• 2-D (x, y) → 3-D (x, y, 1)
• In general: D-dimensional → (D+1)-dimensional
• Points x and αx are equivalent, α ≠ 0
[Figure: the plane at height 1 in the embedding space; the equivalence class of x contains every αx]
Projective Geometry (Very Briefly)
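[Figure: a point x on the plane at height 1 is mapped by the (linear) transformation H to x']
$$\alpha \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$$
$$\Leftrightarrow$$
$$x' = \frac{h_{11}x + h_{12}y + h_{13}}{h_{31}x + h_{32}y + h_{33}}, \qquad y' = \frac{h_{21}x + h_{22}y + h_{23}}{h_{31}x + h_{32}y + h_{33}}$$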
Projective Geometry (Very Briefly)
• Homography, a map from (D+1)-dim to (D+1)-dim
• Linear in the (D+1)-dim embedding space
• x' = H x
• Represents a perspective transformation in D-dim space
• This is very nice!
[Figure: a point x on the plane at height 1 is mapped by the (linear) transformation H to x']
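To make the embedding concrete, here is a small Python sketch (an illustration only; the helper names are not from the slides) that lifts a 2-D point to homogeneous coordinates, applies a 3×3 matrix H, and divides by the last coordinate. It also shows that H and αH describe the same transformation.

```python
def to_homogeneous(p):
    """Embed a 2-D point (x, y) as (x, y, 1)."""
    return [p[0], p[1], 1.0]


def from_homogeneous(v):
    """Pick the representative of the equivalence class with last coordinate 1."""
    if v[2] == 0:
        raise ValueError("point at infinity has no 2-D representative")
    return (v[0] / v[2], v[1] / v[2])


def apply_homography(H, p):
    """Compute x' = H x in the embedding space and map the result back to 2-D."""
    v = to_homogeneous(p)
    w = [sum(H[i][j] * v[j] for j in range(3)) for i in range(3)]
    return from_homogeneous(w)


if __name__ == "__main__":
    H = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.5, 0.0, 1.0]]          # last row differs from (0, 0, 1): perspective effect
    H_scaled = [[5.0 * h for h in row] for row in H]
    print(apply_homography(H, (2.0, 4.0)))
    print(apply_homography(H_scaled, (2.0, 4.0)))   # same point: H and 5H are equivalent
```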
Projective Geometry (Very Briefly)
• Using homographies, we can express a rich class of transformations using linear mappings
  ◦ Identity: $H = I$
  ◦ Similarity: $H = \begin{pmatrix} sR & -Rt \\ 0 & 1 \end{pmatrix}$
  ◦ Isometric: $H = \begin{pmatrix} R & -Rt \\ 0 & 1 \end{pmatrix}$
  ◦ Affine: $H = \begin{pmatrix} A & t \\ 0 & 1 \end{pmatrix}$
  ◦ Perspective: $\det(H) \neq 0$
Perspective Transformations
• Remember this example? We wanted to compute the perspective transformation parameters.
From Feature based methods for structure and motion estimation by P. H. S. Torr and A. Zisserman
Perspective Transformations
• Estimating H from point correspondences (simplified version, check the book for a more advanced version)
$$x' = \frac{h_{11}x + h_{12}y + h_{13}}{h_{31}x + h_{32}y + h_{33}}, \qquad y' = \frac{h_{21}x + h_{22}y + h_{23}}{h_{31}x + h_{32}y + h_{33}}$$
$$\Leftrightarrow$$
$$h_{31}xx' + h_{32}yx' + h_{33}x' - h_{11}x - h_{12}y - h_{13} = 0$$
$$h_{31}xy' + h_{32}yy' + h_{33}y' - h_{21}x - h_{22}y - h_{23} = 0$$
$$\Leftrightarrow$$
$$\begin{pmatrix} P_x(x, y, x', y') \\ P_y(x, y, x', y') \end{pmatrix} h = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
• Each point correspondence translates to 2 linear equations (in the coefficients of H)
• Assuming h33 = 1, we need 4 corresponding 2-D point pairs (x, y, x', y') to solve this equation system (8 unknowns).
• This way of solving for the parameters has severe practical disadvantages, but it shows that it is possible at least...
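The simplified estimation can be sketched in a few lines of Python. This is an illustration under the slide's assumption h33 = 1 and with exactly four point pairs; the function names and the plain Gaussian elimination are choices made here, not the method from the book, and the approach inherits the practical disadvantages mentioned above.

```python
def solve_linear(a, b):
    """Solve a square system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (degenerate point configuration?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def estimate_homography(pairs):
    """Estimate H (with h33 fixed to 1) from four pairs (x, y, x', y')."""
    if len(pairs) != 4:
        raise ValueError("this simplified version needs exactly 4 point pairs")
    a, b = [], []
    for x, y, xp, yp in pairs:
        # Two linear equations per correspondence, unknowns h11..h32.
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -x * xp, -y * xp])
        b.append(xp)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -x * yp, -y * yp])
        b.append(yp)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def map_point(H, x, y):
    """Apply the fractional form of the homography to a 2-D point."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


if __name__ == "__main__":
    H_true = [[2.0, 0.0, 1.0], [0.0, 3.0, 2.0], [0.1, 0.2, 1.0]]
    corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    pairs = [(x, y) + map_point(H_true, x, y) for x, y in corners]
    for row in estimate_homography(pairs):
        print([round(v, 6) for v in row])
```

Fixing h33 = 1 fails when the true h33 is zero, which is one reason to prefer a formulation that treats all nine coefficients alike.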
Perspective Transformations
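• A cleaner and more stable solution
• Multiply both sides with the "cross product matrix"
$$\alpha \begin{pmatrix} 0 & -1 & y' \\ 1 & 0 & -x' \\ -y' & x' & 0 \end{pmatrix} \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} 0 & -1 & y' \\ 1 & 0 & -x' \\ -y' & x' & 0 \end{pmatrix} \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$$
$$\Leftrightarrow$$
$$0 = \begin{pmatrix} 0 & -1 & y' \\ 1 & 0 & -x' \\ -y' & x' & 0 \end{pmatrix} \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$$
$$\Leftrightarrow$$
$$0 = Q(x, y, x', y')\, h$$
"Now three equations killing two unknowns"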
Single perspective camera
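[Figure: single perspective camera; labels C, O_i, X, u and f]
$$\alpha u = \underbrace{\begin{pmatrix} f & s & -w_0 \\ 0 & g & -v_0 \\ 0 & 0 & 1 \end{pmatrix}}_{\text{internal parameters}} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \underbrace{\begin{pmatrix} R & -Rt \\ 0^T & 1 \end{pmatrix}}_{\text{external parameters}} X$$
$$\alpha u = MX$$
M: Projection matrix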
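The factorisation of M can be written out directly. The sketch below is illustrative Python, not code from the lecture: it multiplies the internal matrix, the 3×4 projection and the external matrix, then projects a 3-D point. The parameter names follow the symbols on the slide, and t is read as the camera position, which is how the block −Rt behaves.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def projection_matrix(f, g, s, w0, v0, R, t):
    """Build M = K [I | 0] [[R, -Rt], [0^T, 1]] from internal and external parameters."""
    K = [[f, s, -w0],
         [0.0, g, -v0],
         [0.0, 0.0, 1.0]]
    P0 = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0]]
    Rt = [sum(R[i][k] * t[k] for k in range(3)) for i in range(3)]
    E = [list(R[i]) + [-Rt[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]
    return matmul(matmul(K, P0), E)


def project(M, X):
    """Project a 3-D point X = (X, Y, Z) and divide out the scale factor alpha."""
    Xh = list(X) + [1.0]
    u = [sum(M[i][j] * Xh[j] for j in range(4)) for i in range(3)]
    if u[2] == 0:
        raise ValueError("point lies in the plane through the camera centre")
    return (u[0] / u[2], u[1] / u[2])


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Camera placed at (0, 0, -4), looking along the z axis.
    M = projection_matrix(f=2.0, g=2.0, s=0.0, w0=0.0, v0=0.0,
                          R=identity, t=[0.0, 0.0, -4.0])
    print(project(M, (1.0, 2.0, 0.0)))
```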
Single perspective camera
• Estimation of M from known coordinates (X, Y, Z, 1) and their projections in a camera (x', y', 1)
• This is analogous to the homographic projection
• Algorithms exist to solve this with 6 correspondences
$$\alpha \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$
Single perspective camera
• This enables calibration from 6 known points
• M can be factored: You can estimate camera focal length, image coordinate systems, camera position and rotation.
• Triangulation: If you know several M_i, then you can also estimate a position X (3-D) using several camera projections u_i (2-D).
Marker based motion capture
Images: courtesy of Lennart Svensson
Mocap
Images: courtesy of Lennart Svensson
External calibration
• Rotation + position, 6 DoF, "calibration"
Images: courtesy of Lennart Svensson
Motion capture applications
• Animation
• Biomechanical analysis
• Industrial analysis
Images: courtesy of Lennart Svensson
Image formation – Lenses
• Thin lens
$$z z' = f^2$$
[Figure: thin lens; labels: object plane, object focal point, image focal point, image plane, and the distances z, f, f and z']
Image formation – Lenses
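• Magnification, m = x/X
• From similarity, x/z' = X/f
$$m = \frac{x}{X} = \frac{f}{z} = \frac{z'}{f}$$
[Figure: thin lens; labels: object plane with object size X, object focal point, image focal point, image plane with image size x, and the distances z, f, f and z']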
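A short numeric check of the two lens formulas, written as a Python sketch with illustrative function names. It assumes, as in the formula z z' = f², that z and z' are measured from the object and image focal points respectively.

```python
def image_distance(z, f):
    """Distance z' of the image plane from the image focal point, from z * z' = f^2."""
    if z == 0:
        raise ValueError("object at the focal point: the image is at infinity")
    return f * f / z


def magnification(z, f):
    """Magnification m = x / X = f / z (equivalently z' / f)."""
    if z == 0:
        raise ValueError("object at the focal point: magnification is unbounded")
    return f / z


if __name__ == "__main__":
    f, z = 50.0, 200.0            # e.g. millimetres
    z_prime = image_distance(z, f)
    print(z_prime, magnification(z, f), z_prime / f)   # the last two values agree
```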
Image formation – Lenses
• Depth of field
• Thus, objects within the depth of field are scattered within an area smaller than a pixel, i.e. they are depicted sharp
[Figure: thin lens; labels: object plane, object focal point, image focal point, image plane, and the distances z, f, f and z'; the depth of field Δz is marked on both sides of the object plane, and ε = size of a pixel]
Image formation – Lenses
• Depth of field
• Aperture size and focal length both affect the depth of field. A larger aperture will yield a smaller depth of field.
[Figure: thin lens; labels: object plane, object focal point, image focal point, image plane, and the distances z, f, f and z'; the depth of field Δz is marked on both sides of the object plane, and ε = size of a pixel]
Image formation – Lenses
• Depth of focus
• "Depth of focus" is analogous: how much the image plane can be shifted without scattering light from a point in focus over more than a pixel
[Figure: thin lens; labels: object plane, object focal point, image focal point, image plane, and the distances z, f, f and z'; the depth of focus Δz' is marked on both sides of the image plane, and ε = size of a pixel]
AACAM – @ Matlab File Exchange
• Matlab code for a non-perfect pinhole camera
  ◦ Set aperture radius and focal length
  ◦ Set depth of field
  ◦ Set object distance and aperture radius
Image formation – Lenses
• (Systems of) lenses → distortions:
• Spherical aberration
• Shorter focal length close to the edges of the lens
(Image from Wikipedia)
Image formation – Lenses
• (Systems of) lenses → distortions:
• Coma
(Image from Wikipedia)
Image formation – Lenses
• (Systems of) lenses → distortions:
• Chromatic aberration
(Image from Wikipedia)
Image formation – Lenses
• (Systems of) lenses → distortions:
• Astigmatism
(Image from Wikipedia)
Image formation – Lenses
• (Systems of) lenses → distortions:
• Geometric distortion
(Image from Wikipedia)
[Figure panels: barrel distortion, pincushion distortion]
Is this really a problem?
• In old and cheap cameras, yes
• Uppsala 1999-01-01
From http://www.uu.se/carpediem/1999/
Is this really a problem?
• But also for e.g. modern GoPro cameras!
Camera Calibration Toolbox
• A Matlab toolbox for camera calibration:
• http://www.vision.caltech.edu/bouguetj/calib_doc/
• Freely available
Camera Calibration Toolbox
• Focal length: The focal length in pixels is stored in the 2x1 vector fc.
• Principal point: The principal point coordinates are stored in the 2x1 vector cc.
• Skew coefficient: The skew coefficient defining the angle between the x and y pixel axes is stored in the scalar alpha_c.
• Distortions: The image distortion coefficients (radial and tangential distortions) are stored in the 5x1 vector kc.
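To show how such parameters are typically used, here is a Python sketch of a common radial plus tangential distortion model. It is an assumption that this matches the toolbox exactly: in particular the ordering of the entries in `kc` (two radial terms, two tangential terms, then a third radial term) and the way `alpha_c` enters should be checked against the toolbox documentation before relying on it.

```python
def project_with_distortion(point, fc, cc, alpha_c, kc):
    """Project a 3-D point in camera coordinates to pixel coordinates.

    Assumed model (verify against the toolbox docs):
      kc[0], kc[1], kc[4] : radial coefficients for r^2, r^4, r^6
      kc[2], kc[3]        : tangential coefficients
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("the point must lie in front of the camera (Z > 0)")
    x, y = X / Z, Y / Z                       # ideal (normalised) pinhole projection
    r2 = x * x + y * y
    radial = 1.0 + kc[0] * r2 + kc[1] * r2 ** 2 + kc[4] * r2 ** 3
    dx = 2.0 * kc[2] * x * y + kc[3] * (r2 + 2.0 * x * x)
    dy = kc[2] * (r2 + 2.0 * y * y) + 2.0 * kc[3] * x * y
    xd, yd = radial * x + dx, radial * y + dy  # distorted normalised coordinates
    return (fc[0] * (xd + alpha_c * yd) + cc[0], fc[1] * yd + cc[1])


if __name__ == "__main__":
    P = (1.0, 2.0, 4.0)
    fc, cc = (100.0, 100.0), (50.0, 60.0)
    print(project_with_distortion(P, fc, cc, 0.0, [0.0] * 5))              # no distortion
    print(project_with_distortion(P, fc, cc, 0.0, [0.1, 0.0, 0.0, 0.0, 0.0]))
```

With all coefficients set to zero the function reduces to the plain pinhole projection scaled to pixels, which gives a quick sanity check.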
Stereo – Basic equations
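[Figure: two cameras with focal length f, separated by the baseline B along the x axis (depth along z), observing the point P = (X, Y, Z) at image coordinates x1 and x2]
$$x_1 = -\frac{fX}{Z}, \qquad x_2 = -\frac{f(X - B)}{Z} \quad\Rightarrow\quad Z = \frac{fB}{x_2 - x_1} = \frac{fB}{d}$$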
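The basic stereo equations can be verified with a round trip: project a point into both cameras, form the disparity d = x2 − x1, and recover the depth. This Python sketch uses illustrative function names and assumes the parallel camera setup of the slide.

```python
def stereo_projections(point, f, B):
    """Image coordinates x1, x2 of P = (X, Y, Z) in two cameras separated by baseline B."""
    X, _, Z = point
    if Z <= 0:
        raise ValueError("the point must lie in front of the cameras (Z > 0)")
    x1 = -f * X / Z
    x2 = -f * (X - B) / Z
    return x1, x2


def depth_from_disparity(x1, x2, f, B):
    """Recover Z = f * B / d with disparity d = x2 - x1."""
    d = x2 - x1
    if d == 0:
        raise ValueError("zero disparity: the point is infinitely far away")
    return f * B / d


if __name__ == "__main__":
    f, B = 2.0, 0.5
    x1, x2 = stereo_projections((1.0, 0.0, 4.0), f, B)
    print(x1, x2, depth_from_disparity(x1, x2, f, B))   # the depth 4.0 is recovered
```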
Stereo – the general case
• It may happen that the relation between the two cameras is not a pure parallax translation
• Then the "epipolar constraint" applies
• By "rectification", epipolar lines are aligned with scanlines
From: Epipolar Rectification by Fusiello et al.
Stereo – Disparity Estimation
• Search horizontally for patch disparity, use e.g. sum of squared differences (SSD)
Teddy dataset, from http://cat.middlebury.edu/
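The horizontal patch search can be sketched on a single pair of scanlines. The Python below is a toy illustration, not the method used to produce the figures: it assumes rectified images, integer disparities, and the convention that a feature at position x in the left scanline appears at x − d in the right scanline.

```python
def ssd(a, b):
    """Sum of squared differences between two equally long patches."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def disparity_for_pixel(left, right, x, half, max_disp):
    """Return the disparity d in [0, max_disp] that minimises the SSD patch cost.

    The patch covers left[x - half : x + half + 1]; the caller must keep it
    inside the scanline.
    """
    patch = left[x - half:x + half + 1]
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        start = x - d - half
        if start < 0:                 # candidate patch would leave the image
            break
        cost = ssd(patch, right[start:start + 2 * half + 1])
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


if __name__ == "__main__":
    left = [0, 0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0]
    right = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0]   # same profile, shifted by 2 pixels
    print(disparity_for_pixel(left, right, x=6, half=1, max_disp=4))
```

In flat regions every candidate has the same cost, so the estimate there is arbitrary; this is where the smoothness constraints discussed later come in.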
Stereo – Depth estimation
• A simple formula converting disparity d to distance Z when the inter-camera distance is B:
$$Z = \frac{fB}{d}$$
[Figure panels: patch-based estimate, ground truth]
Stereo – Constraints
• Constraints (Marr and Poggio):
  ◦ Each point in each image is assigned at most one disparity value
  ◦ The disparity varies smoothly at most locations in the images
• However…
• Different regularization may be applied to the depth function
[Figure: depth function in the x–z plane, with image coordinates x1 and x2]
Stereo from Segmentation
• Alternative approach:
  ◦ Make a segmentation of the image first
  ◦ Apply a linear model in each segmented region
  ◦ Refine the models in the regions …
From Segment-based Stereo Matching Using Graph Cuts by Hong and Chen
Large Scale 3D Maps (C3/SAAB)
Courtesy of Petter Torle C3 Technologies
Structured Light
• A light source helps the stereo algorithm to find matching points.
• Often used in industrial applications
From: http://mesh.brown.edu/3DPGP-2009/homework/hw2/hw2.html
More Structured Light
• Microsoft Kinect, using infrared light
  ◦ http://www.youtube.com/watch?v=nvvQJxgykcU
Other Computer Vision Code
• OpenCV
  ◦ Free to use
  ◦ Supports IPP speedups
  ◦ http://en.wikipedia.org/wiki/OpenCV
  ◦ http://sourceforge.net/projects/opencvlibrary/
  ◦ http://opencv.willowgarage.com/wiki/
• Intel® Integrated Performance Primitives 6.0
  ◦ http://www.intel.com/cd/software/products/asmo-na/eng/302910.htm
  ◦ Commercial (but cheap)
  ◦ Includes Computer Vision, Signal Processing, Data Compression, …
Typical Exam Questions …
• Project this object (points) using a pinhole camera
• Can geometric transformation compensate for lens distortions in general?
• Explain the parameters building up the projection matrix M
$$u = \begin{pmatrix} f & s & -w_0 \\ 0 & g & -v_0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} R & -Rt \\ 0^T & 1 \end{pmatrix} X$$
$$u = MX$$
Thank You!
• Email questions to: [email protected]