Motion ECE 847: Digital Image Processing Stan Birchfield Clemson University.
Motion
ECE 847: Digital Image Processing
Stan Birchfield, Clemson University
What if you only had one eye?
Depth perception is possible by moving eye
Parallax
Motion parallax – the apparent displacement of an object viewed along two different lines of sight
http://www.infovis.net/imagenes/T1_N144_A6_DifMotion.gif
Head movements for depth perception
Some animals move their heads to generate parallax
http://www.animalwebguide.com/Praying-Mantis.htm
http://pinknpurplelizard.files.wordpress.com/2008/06/pigeon1.jpg
pigeon praying mantis
Motion field and optical flow
• Motion field – The actual 3D motion projected onto image plane
• Optical flow – The “apparent” motion of the brightness pattern in an image (sometimes called optic flow)
When are the two different?
Barber pole illusion – optical flow differs from the true motion field
Rotating ping-pong ball – motion field, but no optical flow
Television / movies – optical flow, but no motion field
http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT12/node4.html
* From Marc Pollefeys COMP 256 2003
Optical flow breakdown
Perhaps an aperture problem (discussed later).
Another illusion
Motion field
Point in world; projection onto image:
motion of point:
where
Expanding yields:
Motion field (cont.)
Velocity of projection:
Expanding yields:
Note that rotation gives no depth information.
Motion field (cont.)
Note: This is a radial field emanating from the focus of expansion.
Two special cases:
Optical Flow Assumptions:Brightness Constancy
* Slide from Michael Black, CS143 2003
Optical flow
Image at time t; image at time t + δt.
Brightness constancy assumption: I(x + u δt, y + v δt, t + δt) = I(x, y, t)
Taylor series expansion: I(x + u δt, y + v δt, t + δt) ≈ I(x, y, t) + Ix·u δt + Iy·v δt + It·δt
Putting together yields: Ix·u δt + Iy·v δt + It·δt = 0

Optical flow (cont.)
From the previous slide: Ix·u δt + Iy·v δt + It·δt = 0
Divide both sides by δt and take the limit:
  Ix·u + Iy·v + It = 0   (the standard optical flow equation)
More compactly: ∇I · v + It = 0
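As a sanity check on the equation Ix·u + Iy·v + It = 0, the following minimal sketch (pure Python; the translating linear ramp and all names are illustrative) verifies numerically that the residual vanishes for the true velocity:

```python
# Numeric check of Ix*u + Iy*v + It = 0 on a synthetic brightness pattern
# that translates rigidly with velocity (u, v) = (2, -1).

def intensity(x, y, t, u=2.0, v=-1.0, a=3.0, b=5.0):
    """Linear ramp translating at (u, v): I(x,y,t) = a*(x - u*t) + b*(y - v*t)."""
    return a * (x - u * t) + b * (y - v * t)

def central_diff(f, h=1e-4):
    """Central-difference derivative of a 1-argument function at 0."""
    return (f(h) - f(-h)) / (2 * h)

x0, y0, t0 = 1.0, 2.0, 0.5
Ix = central_diff(lambda d: intensity(x0 + d, y0, t0))
Iy = central_diff(lambda d: intensity(x0, y0 + d, t0))
It = central_diff(lambda d: intensity(x0, y0, t0 + d))

residual = Ix * 2.0 + Iy * (-1.0) + It   # ~0 for the true (u, v)
print(abs(residual) < 1e-6)
```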
Aperture problem
From the previous slide: Ix·u + Iy·v + It = 0
This is one equation in two unknowns! (Underconstrained problem)
We can only compute the component of motion in the direction of the gradient.
[Figure: isophotes of I(x,y,t) and I(x,y,t+Δt), showing the true motion, another possible answer, and the gradient direction]
Key idea: Any function looks linear through a small ‘aperture’
Aperture Problem Exposed
Motion along just an edge is ambiguous
from G. Bradski, CS223B
Two approaches to overcome aperture problem
• Horn-Schunck (1981)
  – Assume neighboring pixels are similar
  – Add a regularization term to enforce smoothness
  – Compute (u,v) for every pixel in the image → dense optical flow
• Lucas-Kanade (1981)
  – Assume neighboring pixels are the same
  – Use the additional equations to solve for the motion of the pixel
  – Compute (u,v) for a small number of pixels (features); each feature is treated independently of the other features → sparse optical flow
Lucas-Kanade
Recall the scalar equation with two unknowns: Ix·u + Iy·v + It = 0
Assume neighboring pixels have the same motion, giving one such equation per pixel:
  A d = b, where row i of A is (Ix(i), Iy(i)), d = (u, v)ᵀ, b has entries −It(i),
  and N is the number of pixels in the window (A is N×2, b is N×1)
Can solve this directly using least squares, or …

Lucas-Kanade (cont.)
Multiply by Aᵀ:  AᵀA d = Aᵀb
or, written out (2x2 matrix, 2x1 vector):
  [ Σ Ix²    Σ Ix·Iy ] [u]     [ Σ Ix·It ]
  [ Σ Ix·Iy  Σ Iy²   ] [v] = − [ Σ Iy·It ]
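A minimal pure-Python sketch of this 2x2 solve (Cramer's rule on AᵀA d = Aᵀb; the function name and the synthetic window data are assumptions for illustration):

```python
# One Lucas-Kanade step: stack Ix*u + Iy*v = -It per pixel in the window,
# then solve the 2x2 normal equations (A^T A) d = A^T b for d = (u, v).

def lucas_kanade_step(Ix, Iy, It):
    """Ix, Iy, It: per-pixel derivative lists over the window."""
    # Accumulate the entries of A^T A (2x2) and the right-hand side.
    sxx = sum(ix * ix for ix in Ix)
    sxy = sum(ix * iy for ix, iy in zip(Ix, Iy))
    syy = sum(iy * iy for iy in Iy)
    sxt = sum(ix * it for ix, it in zip(Ix, It))
    syt = sum(iy * it for iy, it in zip(Iy, It))
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("A^T A is (nearly) singular: aperture problem")
    # Cramer's rule on  sxx*u + sxy*v = -sxt ,  sxy*u + syy*v = -syt.
    u = (-sxt * syy + syt * sxy) / det
    v = (-syt * sxx + sxt * sxy) / det
    return u, v

# Synthetic window whose derivatives are consistent with motion (u, v) = (1, 2):
Ix = [1.0, 0.0, 2.0, 1.0]
Iy = [0.0, 1.0, 1.0, 2.0]
It = [-(ix * 1.0 + iy * 2.0) for ix, iy in zip(Ix, Iy)]
uv = lucas_kanade_step(Ix, Iy, It)
print(uv)   # -> (1.0, 2.0)
```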
RGB version
• For color images, we can get more equations by using all color channels
• E.g., for a 7x7 window we have 49*3 = 147 equations!

Improving accuracy
• Recall our small motion assumption
• This is not exact
  – To do better, we need to add higher-order terms back in
• This is a polynomial root-finding problem
  – Can solve using Newton’s method (also known as the Newton-Raphson method)
  – The Lucas-Kanade method does one iteration of Newton’s method
  – Better results are obtained via more iterations
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
Iterative Lucas-Kanade Algorithm
1. Solve the 2x2 equation to get the motion for each pixel (solving the 2x2 equation is easy)
2. Shift the second image using the estimated motion (use interpolation to improve accuracy)
3. Repeat until convergence
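The loop above can be sketched in 1D: estimate a sub-pixel shift between two sampled signals, re-sampling the second signal with linear interpolation at each iteration. All names here are illustrative, not the course's reference code.

```python
# Iterative Lucas-Kanade in 1D: recover the shift d between f1(x) and
# f2(x) = f1(x - d), refining the estimate with a Newton-style update.

def sample(f, x):
    """Linear interpolation of a list f at real-valued x (clamped)."""
    x = min(max(x, 0.0), len(f) - 1.001)
    i = int(x)
    a = x - i
    return (1 - a) * f[i] + a * f[i + 1]

def track_1d(f1, f2, x0, iters=20):
    d = 0.0
    for _ in range(iters):
        fx = (sample(f1, x0 + 1) - sample(f1, x0 - 1)) / 2.0  # spatial derivative
        ft = sample(f2, x0 + d) - sample(f1, x0)              # residual ("temporal")
        d -= ft / fx                                          # 1D LK / Newton update
    return d

# A ramp signal shifted by 0.75 samples (linear, so convergence is immediate;
# real images need several iterations).
f1 = [2.0 * x for x in range(20)]
f2 = [sample(f1, x - 0.75) for x in range(20)]
d = track_1d(f1, f2, x0=10)
print(d)   # -> 0.75
```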
Linear interpolation (1D function)
Samples at x0−1, x0, x0+1, x0+2; evaluate at x with x0 ≤ x < x0+1 and α = x − x0, 0 ≤ α < 1:
  f(x) ≈ f(x0) + (x − x0)·( f(x0+1) − f(x0) )
       = f(x0) + α·( f(x0+1) − f(x0) )
       = (1 − α)·f(x0) + α·f(x0+1)
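A direct transcription of the rule as a sketch (names are illustrative; assumes 0 ≤ x ≤ len(f) − 1):

```python
# 1D linear interpolation: f(x) ~ (1 - a) * f[x0] + a * f[x0 + 1],
# where x0 = floor(x) and a = x - x0.

def linear_interp(f, x):
    x0 = int(x)          # assumes 0 <= x <= len(f) - 1
    a = x - x0
    if a == 0.0:
        return f[x0]     # avoids reading past the end at integer x
    return (1 - a) * f[x0] + a * f[x0 + 1]

samples = [10.0, 20.0, 40.0, 80.0]
print(linear_interp(samples, 1.5))   # halfway between 20 and 40 -> 30.0
```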
Bilinear interpolation (2D function)
Grid points x0−1 … x0+2 (horizontal) and y0−1 … y0+2 (vertical); let α = x − x0, β = y − y0, and write f00 = f(x0, y0), f10 = f(x0+1, y0), f01 = f(x0, y0+1), f11 = f(x0+1, y0+1).
Interpolate along y first:
  f(x0, y)   ≈ (1−β)·f00 + β·f01
  f(x0+1, y) ≈ (1−β)·f10 + β·f11
Then along x:
  f(x, y) ≈ (1−α)·[(1−β)·f00 + β·f01] + α·[(1−β)·f10 + β·f11]
          = (1−α)(1−β)·f00 + (1−α)β·f01 + α(1−β)·f10 + αβ·f11
Bilinear interpolation (2D function, other order)
Interpolate along x first:
  f(x, y0)   ≈ (1−α)·f00 + α·f10
  f(x, y0+1) ≈ (1−α)·f01 + α·f11
Then along y:
  f(x, y) ≈ (1−β)·[(1−α)·f00 + α·f10] + β·[(1−α)·f01 + α·f11]
          = (1−α)(1−β)·f00 + α(1−β)·f10 + (1−α)β·f01 + αβ·f11
Same result.
Bilinear interpolation
• Simply compute a doubly weighted average of the four nearest pixels
• Be careful to handle boundaries correctly
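A minimal sketch of the doubly weighted average with boundary clamping (names are illustrative; g is a row-major 2D grid):

```python
# Bilinear interpolation: interpolate along x at the two bracketing rows,
# then along y (the order does not matter, as the slides show).

def bilinear(g, x, y):
    x0, y0 = int(x), int(y)
    a, b = x - x0, y - y0
    x1 = min(x0 + 1, len(g[0]) - 1)   # clamp at the right/bottom boundary
    y1 = min(y0 + 1, len(g) - 1)
    top = (1 - a) * g[y0][x0] + a * g[y0][x1]
    bot = (1 - a) * g[y1][x0] + a * g[y1][x1]
    return (1 - b) * top + b * bot

grid = [[0.0, 10.0],
        [20.0, 30.0]]
print(bilinear(grid, 0.5, 0.5))   # average of the four corners -> 15.0
```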
How does Lucas-Kanade work?
• Recall Newton’s method (or Newton-Raphson)
• To find the root of a function, use a first-order approximation
• Start with an initial guess
• Iterate:
  – Use the derivative to find the root, under the assumption that the function is linear
  – The result becomes the new estimate for the next iteration
http://en.wikipedia.org/wiki/Newton-Raphson_method
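A minimal Newton-Raphson sketch (illustrative), finding sqrt(2) as the root of f(x) = x² − 2:

```python
# Newton-Raphson: linearize f at the current guess and step to the root
# of that line, x_{n+1} = x_n - f(x_n) / f'(x_n).

def newton(f, fprime, x, iters=10):
    for _ in range(iters):
        x = x - f(x) / fprime(x)
    return x

root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x=1.0)
print(round(root, 6))   # -> 1.414214
```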
Optical Flow: 1D Case
Brightness constancy assumption:
  f(t) ≡ I(x(t), t) = I(x(t + dt), t + dt)
Because there is no change in brightness with time,
  df/dt = (∂I/∂x)·(dx/dt) + ∂I/∂t = 0   (evaluated at x(t))
so with v = dx/dt:
  Ix·v + It = 0,  i.e.,  v = −It / Ix
from G. Bradski, CS223B
Tracking in the 1D case:
Given I(x, t) and I(x, t+1), track the pixel at position p.
Spatial derivative: Ix = ∂I/∂x, evaluated at x = p, time t
Temporal derivative: It = ∂I/∂t, evaluated at x = p
Then the velocity estimate is
  v ≈ −It / Ix
Assumptions:
• Brightness constancy
• Small motion
from G. Bradski, CS223B
Tracking in the 1D case (iterating):
Recompute the temporal derivative at the 2nd iteration.
Iterating helps refine the velocity vector:
  v ← v_previous − It / Ix
Can keep the same estimate for the spatial derivative.
Converges in about 5 iterations.
from G. Bradski, CS223B
From 1D to 2D tracking
1D:  Ix·v + It = 0   (evaluated at x(t), t)
2D:  Ix·u + Iy·v + It = 0   (evaluated at x(t), y(t), t)
One equation, two velocity unknowns (u, v)
from G. Bradski, CS223B
From 1D to 2D tracking
* Slide from Michael Black, CS143 2003
We get at most “normal flow” – with one point we can only detect the component of movement along the brightness gradient (perpendicular to the edge). The solution is to take a patch of pixels around the pixel of interest.
From 1D to 2D tracking
The math is very similar:
  v = G⁻¹ b
where, summing over a window around the pixel p,
  G = Σ [ Ix²    Ix·Iy ]
        [ Ix·Iy  Iy²   ]
  b = − Σ [ Ix·It ]
          [ Iy·It ]
Aperture problem: a single pixel gives one equation; summing over the window gives a solvable 2x2 system.
Window size here ~ 11x11
from G. Bradski, CS223B
When does Lucas-Kanade work?
• AᵀA must be invertible
• In the real world, all matrices are invertible
• Instead, we must ensure that AᵀA is well-conditioned (not close to singular):
  – Both eigenvalues are large
  – The ratio of eigenvalues λmax / λmin is not too large
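A small sketch of this conditioning check, using the closed-form eigenvalues of a symmetric 2x2 matrix (the thresholds min_eig and max_ratio are illustrative assumptions, not values from the course):

```python
# Check whether the 2x2 gradient matrix A^T A is well conditioned.

import math

def eigenvalues_2x2_sym(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    r = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean + r, mean - r   # (lmax, lmin)

def trackable(a, b, c, min_eig=1.0, max_ratio=10.0):
    lmax, lmin = eigenvalues_2x2_sym(a, b, c)
    return lmin > min_eig and lmax / lmin < max_ratio

print(trackable(100.0, 0.0, 80.0))   # textured patch: both eigenvalues large
print(trackable(100.0, 0.0, 0.1))    # edge-like patch: aperture problem
```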
Eigenvalues of Hessian
• Z is the gradient covariance matrix
  – Related to the autocorrelation of I (Moravec interest operator)
  – Sometimes called the Hessian
• Recall from PCA:
  – The (square roots of the) eigenvalues give the lengths of the best-fitting ellipse
  – Large eigenvalues mean large gradient vectors
  – A small ratio means information in all directions
[Figure: distribution of gradient vectors (Ix, Iy)]
Finding good features
Three cases:
1. λ1 and λ2 small: not enough texture for tracking
2. λ1 large, λ2 small: on an intensity edge. Aperture problem: only motion perpendicular to the edge can be found
3. λ1 and λ2 large: good feature to track
[Figure: (Ix, Iy) gradient distributions for the three cases]
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Note: Even though tracking is a two-frame problem, we can determine good features from only one frame
Finding good features
• In practice, eigenvalues cannot be too large, b/c image values range over [0, 255]
• Solution: threshold the minimum eigenvalue (Shi and Tomasi 1994)
• An alternative approach (Harris and Stephens 1988):
  – Note that det(Z) = λ1·λ2 and trace(Z) = λ1 + λ2
  – Use det(Z) − k·trace(Z)², where k = 0.04. The second term reduces the effect of having a small eigenvalue with a large dominant eigenvalue (known as the Harris corner detector or Plessey operator)
• Another alternative: det(Z) / trace(Z) = 1 / (1/λ1 + 1/λ2)
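The three measures can be compared on example eigenvalue pairs; this sketch is illustrative (the test eigenvalues are made up):

```python
# Three cornerness measures on eigenvalues (l1, l2) of Z: all should rank
# a corner-like pair above an edge-like pair.

def shi_tomasi(l1, l2):
    return min(l1, l2)                       # threshold the minimum eigenvalue

def harris(l1, l2, k=0.04):
    return l1 * l2 - k * (l1 + l2) ** 2      # det(Z) - k * trace(Z)^2

def harmonic(l1, l2):
    return (l1 * l2) / (l1 + l2)             # det(Z) / trace(Z)

corner, edge = (50.0, 40.0), (90.0, 1.0)
print(shi_tomasi(*corner) > shi_tomasi(*edge))   # True
print(harris(*corner) > harris(*edge))           # True
print(harmonic(*corner) > harmonic(*edge))       # True
```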
Good features code (MATLAB):

% Harris Corner detector - by Kashif Shahzad
sigma = 2; thresh = 0.1; sze = 11; disp = 0;

% Derivative masks
dy = [-1 0 1; -1 0 1; -1 0 1];
dx = dy';   % dx is the transpose of dy

% Ix and Iy are the horizontal and vertical edges of the image
Ix = conv2(bw, dx, 'same');
Iy = conv2(bw, dy, 'same');

% Smooth the squared image derivatives with a Gaussian
g = fspecial('gaussian', max(1, fix(6*sigma)), sigma);
Ix2 = conv2(Ix.^2, g, 'same');
Iy2 = conv2(Iy.^2, g, 'same');
Ixy = conv2(Ix.*Iy, g, 'same');

% My preferred measure: det(Z) / trace(Z)
cornerness = (Ix2.*Iy2 - Ixy.^2) ./ (Ix2 + Iy2 + eps);

% Nonmaximal suppression and thresholding
mx = ordfilt2(cornerness, sze^2, ones(sze));              % grey-scale dilate
cornerness = (cornerness == mx) & (cornerness > thresh);  % find maxima
[rws, cols] = find(cornerness);                           % row, col coordinates

clf; imshow(bw);
hold on;
p = [cols rws];
plot(p(:,1), p(:,2), 'or');
title('\bf Harris Corners')
from Sebastian Thrun, CS223B Computer Vision, Winter 2005
Example (threshold = 0.1)
from Sebastian Thrun, CS223B Computer Vision, Winter 2005
Example (threshold = 0.01)
from Sebastian Thrun, CS223B Computer Vision, Winter 2005
Example (threshold = 0.001)
from Sebastian Thrun, CS223B Computer Vision, Winter 2005
Feature tracking
• Identify features and track them over video
  – Usually use a few hundred features
  – When many features have been lost, renew feature detection
  – Assume a small difference between frames
  – Potentially a large difference overall
• Two problems:
  – Motion between frames may be large
  – The translation assumption is fine between consecutive frames, but long periods of time introduce deformations
Revisiting the small motion assumption
• Is this motion small enough?
  – Probably not; it's much larger than one pixel (2nd-order terms dominate)
  – How might we solve this problem?
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
Reduce the resolution!
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
image It-1, image I
Gaussian pyramid of image It-1; Gaussian pyramid of image I
u = 10 pixels (full resolution)
u = 5 pixels
u = 2.5 pixels
u = 1.25 pixels (coarsest level)
Coarse-to-fine optical flow estimation
Gaussian pyramid of image It-1; Gaussian pyramid of image I
Starting at the coarsest level:
1. Run iterative L-K
2. Warp & upsample the estimate to the next finer level
3. Repeat down the pyramid
slides from Bradski and Thrun
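A 1D stand-in for the coarse-to-fine scheme (downsampling by pair-averaging rather than a true Gaussian pyramid; all helper names are assumptions for this sketch):

```python
# Coarse-to-fine 1D shift estimation: halve the resolution until the shift
# is small, estimate there, then double the estimate and refine per level.

def downsample(f):
    """Halve resolution by averaging pairs (crude pyramid stand-in)."""
    return [(f[i] + f[i + 1]) / 2.0 for i in range(0, len(f) - 1, 2)]

def sample(f, x):
    """Linear interpolation at real-valued x (clamped to the valid range)."""
    x = min(max(x, 0.0), len(f) - 1.000001)
    i = int(x)
    a = x - i
    return (1 - a) * f[i] + a * f[i + 1]

def refine(f1, f2, x0, d, iters=10):
    """Iterative 1D Lucas-Kanade refinement of the shift estimate d at x0."""
    for _ in range(iters):
        fx = (sample(f1, x0 + 1) - sample(f1, x0 - 1)) / 2.0  # spatial derivative
        ft = sample(f2, x0 + d) - sample(f1, x0)              # residual
        d -= ft / fx
    return d

# f2 is f1 shifted by 6 samples -- large for a single-level estimate.
f1 = [float(x) for x in range(64)]
f2 = [sample(f1, x - 6.0) for x in range(64)]

levels = [(f1, f2)]
for _ in range(3):                       # build a 3-level "pyramid"
    fa, fb = levels[-1]
    levels.append((downsample(fa), downsample(fb)))

d = 0.0
for fa, fb in reversed(levels):          # coarsest to finest
    d = refine(fa, fb, x0=len(fa) // 2, d=2.0 * d)   # double estimate, then refine
print(d)   # -> 6.0
```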
Alternative derivation
• Compute the translation assuming it is small
• Differentiate
• Affine motion is also possible, but a bit harder (6x6 instead of 2x2)
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Example
Simple displacement is sufficient between consecutive frames, but not to compare to reference template
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Example
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Synthetic example
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Good features to keep tracking
Perform affine alignment between the first and last frames
Stop tracking features with too-large errors
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Errors in Lucas-Kanade
What are the potential causes of errors in this procedure?
– Suppose AᵀA is easily invertible
– Suppose there is not much noise in the image
• Errors remain when our assumptions are violated:
  – Brightness constancy is not satisfied
  – The motion is not small
  – A point does not move like its neighbors
    • the window size is too large
    • what is the ideal window size?
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
Feature matching vs. tracking
Matching: extract features independently, then match by comparing descriptors
Tracking: extract features in the first image, then try to find the same feature back in the next view
What is a good feature?
Image-to-image correspondences are key to passive triangulation-based 3D reconstruction
M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/
Horn-Schunck
• Dense optical flow
• Minimize an energy that combines a data term and a smoothness term
Note: We want to find u and v to minimize the integral, but u and v are themselves functions of x and y!
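Written out, the functional and the Euler-Lagrange equations it leads to are (a standard form following Horn & Schunck 1981; the placement of the smoothness weight λ here is an assumption chosen to match the later slides):

```latex
E(u,v) = \iint \Big[ \underbrace{(I_x u + I_y v + I_t)^2}_{\text{data}}
       + \lambda \, \underbrace{\big(u_x^2 + u_y^2 + v_x^2 + v_y^2\big)}_{\text{smoothness}} \Big]\, dx\, dy
```

Minimizing this via the calculus of variations (next slides) yields:

```latex
I_x (I_x u + I_y v + I_t) = \lambda \nabla^2 u, \qquad
I_y (I_x u + I_y v + I_t) = \lambda \nabla^2 v
```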
Calculus of variations
• A function maps values to real numbers
  – To find the value that minimizes a function, use calculus
• A functional maps functions to real numbers
  – To find the function that minimizes a functional, use the calculus of variations
Calculus of variations
The integral
  I = ∫ F(x, u, u′) dx
(with independent variable x, dependent variable u, and explicit function F) is stationary where the Euler-Lagrange equation holds:
  ∂F/∂u − d/dx (∂F/∂u′) = 0
C.o.v. applied to o.f.
Independent variables: x, y. Dependent variables: u, v. Explicit function: the integrand F(x, y, u, v, ux, uy, vx, vy).
Euler-Lagrange equations:
  ∂F/∂u − ∂/∂x (∂F/∂ux) − ∂/∂y (∂F/∂uy) = 0
  ∂F/∂v − ∂/∂x (∂F/∂vx) − ∂/∂y (∂F/∂vy) = 0
C.o.v. applied to o.f. (cont.)
Take derivatives and expand: the data term contributes Ix·(Ix u + Iy v + It) (and similarly with Iy for v); the smoothness term contributes the Laplacians ∇²u and ∇²v.
Plug back in and rearrange:
  Ix·(Ix u + Iy v + It) = λ ∇²u
  Iy·(Ix u + Iy v + It) = λ ∇²v
Approximate the Laplacian:
  ∇²u ≈ κ (ū − u)
where ū is a local mean of u (difference of Gaussians) and κ is a constant.
What if λ = 0?
C.o.v. applied to o.f. (cont.)
Solve the equations:
This is a pair of equations for each pixel. The combined system of equations is 2N x 2N, where N is the number of pixels in the image.
Because it is a sparse matrix, iterative methods (Jacobi iterations, Gauss-Seidel, successive over-relaxation) are more appropriate.
Iterate (taking κ = 1):
  u ← ū − Ix (Ix ū + Iy v̄ + It) / (λ + Ix² + Iy²)
  v ← v̄ − Iy (Ix ū + Iy v̄ + It) / (λ + Ix² + Iy²)
where ū and v̄ are the local means.
Stationary iterative methods
The problem is to solve a sparse linear system A x = b, with the splitting
  A = D − L − U   (diagonal − lower − upper)
Jacobi method (simply solve for each element, using the existing estimates):
  xi ← (bi − Σ_{j≠i} aij·xj) / aii
Gauss-Seidel is “sloppy Jacobi” (use updates as soon as they are available).
Successive over-relaxation (SOR) speeds convergence:
  xi ← (1 − ω)·xi + ω·(G-S iterate),   ω = 1 means Gauss-Seidel
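A minimal sketch of these solvers on a small diagonally dominant system (illustrative; the 60-sweep count is an arbitrary choice, and plain lists stand in for sparse storage):

```python
# Jacobi vs. SOR on A x = b with A = [[4,1],[1,3]], b = [1,2];
# exact solution x = (1/11, 7/11).

def jacobi(A, b, x, iters=60):
    n = len(b)
    for _ in range(iters):
        # Every update uses only the estimates from the previous sweep.
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

def sor(A, b, x, omega=1.0, iters=60):
    n = len(b)
    x = list(x)
    for _ in range(iters):
        for i in range(n):
            # Gauss-Seidel step: uses updated entries as soon as available.
            gs = (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            x[i] = (1 - omega) * x[i] + omega * gs   # omega = 1 is plain Gauss-Seidel
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
xj = jacobi(A, b, [0.0, 0.0])
xs = sor(A, b, [0.0, 0.0], omega=1.2)
print(xj, xs)   # both close to (1/11, 7/11)
```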
Horn-Schunck Algorithm
www.cs.unc.edu/~sud/courses/256/optical_flow.ppt
note that this is defined differently
Time-to-Collision
Beyond two-frame tracking
Background subtraction and frame differencing
Spatiotemporal volumes
http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/COLLINS/TobyCollins.pdf
Layers
• Wang and Adelson