Computer Vision : CISC 4/689
Gradients and edges
• Points of sharp change in an image are interesting:
– change in reflectance
– change in object
– change in illumination
– noise
• Sometimes called edge points
• General strategy
– determine image gradient
– now mark points where gradient magnitude is particularly large with respect to neighbours (ideally, curves of such points).
The Gradient and Edges
• Consider image intensities as a 2-D height function I(x, y). Then the image gradient is the vector field defined by: $\nabla I = \left( \frac{\partial I}{\partial x}, \frac{\partial I}{\partial y} \right)$
• Definition of an edge
– Line segment separating regions of contrasting intensity
– Location: where gradient magnitude is high
– Direction: orthogonal to the gradient
Edge Causes
• Depth discontinuity
• Surface orientation discontinuity
• Reflectance discontinuity (i.e., change in surface material properties)
• Illumination discontinuity (e.g., shadow)
Edge Detection
• An edge point can be regarded as a point in an image where a discontinuity (in gradient) occurs across some line. A discontinuity may be classified as one of five types:
– Gradient discontinuity: where the gradient of the pixel values changes across a line. This type of discontinuity can be classed as roof edges, ramp edges, convex edges, or concave edges by noting the sign of the component of the gradient perpendicular to the edge on either side of the edge. Ramp edges have the same signs in the gradient components on either side of the discontinuity, while roof edges have opposite signs in the gradient components.
– Jump or step discontinuity: where pixel values themselves change suddenly across some line.
– Bar discontinuity: where pixel values rapidly increase then decrease again (or vice versa) across some line.
• Searching for edges:
– Filter: smooth image
– Enhance: apply numerical derivative approximation
– Detect: threshold to find strong edges
– Localize/analyze: reject spurious edges, include weak but justified edges
Source: http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/MARSHALL/node28.html
Step edge detection: First Derivative Operators
• Method: Differentiate and find extrema
• Examples
– Sobel operator (Matlab: edge(I, 'sobel'))
– Prewitt, Roberts cross
– Derivative of Gaussian
Sobel x:
-1  0  1
-2  0  2
-1  0  1
Sobel y:
-1 -2 -1
 0  0  0
 1  2  1
Book uses this format
Sobel Edge Filtering Example
Kernel:
1 0 -1
2 0 -2
1 0 -1
Image:
0 0 2 2
0 0 2 2
0 0 2 2
0 0 2 2
Rotate the kernel by 180° before sliding it over the image:
-1 0 1
-2 0 2
-1 0 1
Steps 1-4
[Figure: the rotated kernel is centred on each pixel of the top row of the zero-padded image in turn; the four outputs are 0, 6, 6, -6.]
The final -6 is an edge effect from zero-padding.
Sobel Edge Filtering Example: Result
0 6 6 -6
0 8 8 -8
0 8 8 -8
0 6 6 -6
(Pad the boundary with zeroes again.) And then we threshold…
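The filtering example above can be checked with a few lines of code. The following is a minimal sketch in plain Python, not the course's implementation; the function name `convolve2d_zero_pad` is hypothetical, and it assumes true convolution (the kernel is rotated by 180° before being slid over the zero-padded image).

```python
def convolve2d_zero_pad(image, kernel):
    """3x3 convolution with zero-padding; output has the same size as image."""
    h, w = len(image), len(image[0])
    # Convolution equals correlation with the kernel rotated by 180 degrees.
    rotated = [row[::-1] for row in kernel[::-1]]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    iy, ix = y + ky - 1, x + kx - 1
                    # Pixels outside the image contribute 0 (zero-padding).
                    if 0 <= iy < h and 0 <= ix < w:
                        total += rotated[ky][kx] * image[iy][ix]
            out[y][x] = total
    return out


kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
image = [[0, 0, 2, 2] for _ in range(4)]
result = convolve2d_zero_pad(image, kernel)
for row in result:
    print(row)
```

The last column is negative only because the padded zeros look like a falling edge at the image boundary.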
Sobel Edge Detection: Gradient Approximation
Horizontal difference:
-1  0  1
-2  0  2
-1  0  1
Vertical difference:
-1 -2 -1
 0  0  0
 1  2  1
Note anisotropy of edge finding
Sobel
• The two kernel responses, G_x and G_y, can then be combined to find the absolute magnitude of the gradient at each point and the orientation of that gradient. The gradient magnitude is given by: $|G| = \sqrt{G_x^2 + G_y^2}$
• An approximate magnitude, which is much faster to compute, is: $|G| \approx |G_x| + |G_y|$
• The angle of orientation of the edge (relative to the pixel grid) giving rise to the spatial gradient is given by: $\theta = \arctan(G_y / G_x)$
In this case, orientation 0 is taken to mean that the direction of maximum contrast from black to white runs from left to right on the image, and other angles are measured anti-clockwise from this.
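As a rough illustration of the magnitude and orientation computations described above, here is a small sketch. The function names are hypothetical, and `math.atan2` is used in place of a plain arctangent of the ratio as an assumption, so that a zero horizontal response does not cause a division by zero. Whether positive angles come out anti-clockwise on screen depends on the direction of the image's y axis.

```python
import math


def gradient_magnitude(gx, gy):
    """Exact gradient magnitude from the two kernel responses."""
    return math.sqrt(gx * gx + gy * gy)


def approx_magnitude(gx, gy):
    """Cheaper approximation: sum of absolute responses."""
    return abs(gx) + abs(gy)


def gradient_orientation(gx, gy):
    """Gradient angle in radians; atan2 keeps the quadrant and handles gx == 0."""
    return math.atan2(gy, gx)


print(gradient_magnitude(3, 4), approx_magnitude(3, 4))
```

The approximation always over-estimates or equals the exact value, which matters when one threshold is shared between the two.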
Derivative of Gaussian
Smoothing and Differentiation
• Issue: noise
– smooth before differentiation
– two convolutions: to smooth, then differentiate?
– actually, no - we can use a derivative of Gaussian filter
• because differentiation is convolution, and convolution is associative
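The associativity argument above can be checked numerically. This is a minimal 1-D sketch under stated assumptions: the unnormalised binomial kernel [1, 2, 1] stands in for a Gaussian, the central difference [1, 0, -1] stands in for differentiation, and `convolve1d` is a hypothetical helper.

```python
def convolve1d(a, b):
    """Full discrete convolution of two sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out


signal = [0, 0, 1, 3, 2, 5, 5, 5]
smooth = [1, 2, 1]        # stand-in for a Gaussian (assumption)
derivative = [1, 0, -1]   # central difference

# Two passes: smooth the signal, then differentiate the result.
two_passes = convolve1d(derivative, convolve1d(smooth, signal))

# One pass: differentiate the smoothing kernel once, then filter the signal.
combined_filter = convolve1d(derivative, smooth)
one_pass = convolve1d(combined_filter, signal)

print(combined_filter)
print(two_passes == one_pass)
```

The combined filter is computed once and reused for every image, which is where the saving comes from.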
The Laplacian of Gaussian
• Another way to detect an extremal first derivative is to look for a zero second derivative
– the Laplacian
• Bad idea to apply a Laplacian without smoothing
– smooth with Gaussian, apply Laplacian
– this is the same as filtering with a Laplacian of Gaussian filter
• Now mark the zero points where there is a sufficiently large (first) derivative, and enough contrast
Marr-Hildreth operator
• The Laplacian is linear and rotationally symmetric. Thus, we can search for the zero crossings of an image that is first smoothed with a Gaussian mask and then differentiated twice; or, equivalently, we can convolve the image with the Laplacian of the Gaussian, also known as the LoG operator.
• This defines the Marr-Hildreth operator.
• One can also get a shape similar to G'' by taking the difference of two Gaussians having different standard deviations. A ratio of standard deviations of 1:1.6 gives a close approximation to G''. This is known as the DoG operator (Difference of Gaussians), or the Mexican Hat operator.
• Still sensitive to noise.
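For reference, the equivalence used above can be written out explicitly. This is a sketch assuming an isotropic 2-D Gaussian with standard deviation σ (the symbol G_σ is introduced here for illustration and does not appear on the slides):

```latex
G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} \exp\!\left(-\frac{x^2 + y^2}{2\sigma^2}\right)

\nabla^2 (G_\sigma * I) = (\nabla^2 G_\sigma) * I

\nabla^2 G_\sigma(x, y) = \frac{x^2 + y^2 - 2\sigma^2}{\sigma^4}\, G_\sigma(x, y)
```

The last line follows from differentiating the Gaussian twice in x and in y and adding the two results.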
Step edge detection: 2nd-Derivative Operators
• Method: 2nd derivative is 0 for 1st-derivative extrema, so find “zero-crossings”
– Laplacian
– Isotropic (finds edges regardless of orientation)
Three commonly used discrete approximations to the Laplacian filter. (Note, we have defined the Laplacian using a negative peak because this is more common, however, it is equally valid to use the opposite sign convention.) Source: http://www.cee.hw.ac.uk/hipr/html/log.html
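To make the zero-crossing idea concrete, here is a minimal 1-D sketch. It assumes the second difference [1, -2, 1] as a 1-D analogue of the negative-peak Laplacian; the helper names are hypothetical, and a real detector would work in 2-D and also check the local contrast.

```python
def second_difference(signal):
    """Apply the 1-D Laplacian [1, -2, 1] at every interior sample."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]


def zero_crossings(values):
    """Indices i where values[i] and values[i + 1] have opposite signs."""
    return [i for i in range(len(values) - 1)
            if values[i] * values[i + 1] < 0]


signal = [0, 0, 0, 1, 3, 4, 4, 4]   # a blurred step edge
d2 = second_difference(signal)       # d2[k] belongs to sample k + 1
crossings = zero_crossings(d2)
print(d2, crossings)
```

The single crossing lies between samples 3 and 4, which is exactly where the first difference of the signal (3 - 1 = 2) is largest.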
Laplacian of Gaussian
• Matlab: fspecial('log', …)
Below: Discrete approximation to LoG function with Gaussian σ = 1.4
Sobel vs. LoG Edge Detection: Matlab Automatic Thresholds
[Figure panels: Sobel; LoG]
There are three major issues: 1) The gradient magnitude at different scales is different; which should we choose? 2) The gradient magnitude is large along a thick trail (see the third figure); how do we identify the significant points? 3) How do we link the relevant points up into curves?
[Figure panels: σ = 1; σ = 2]
We wish to mark points along the curve where the magnitude is biggest. We can do this by looking for a maximum along a slice normal to the curve (non-maximum suppression). These points should form a curve. There are then two algorithmic issues: at which point is the maximum, and where is the next one?
Non-maximum suppression
At q, we have a maximum if the value is larger than those at both p and at r. Interpolate to get these values.
Predicting the next edge point
Assume the marked point is an edge point. Then we construct the tangent to the edge curve (which is normal to the gradient at that point) and use it to predict the next point (here either r or s).
Remaining issues
• Check that the maximum value of the gradient is sufficiently large
– drop-outs? use hysteresis
– use a high threshold to start edge curves and a low threshold to continue them.
Fine scale, high threshold (be strict in accepting edge points)
Coarse scale, high threshold
Coarse scale, low threshold
Canny Edge Detection
• Steps
1. Apply derivative of Gaussian (not Laplacian!)
2. Non-maximum suppression
• Thin multi-pixel wide “ridges” down to single pixel
3. Thresholding
• Low, high edge-strength thresholds
• Accept all edges over the low threshold that are connected to an edge over the high threshold (in the stage of predicting the next edge point)
• Matlab: edge(I, 'canny')
Edge “Smearing”
from Forsyth & Ponce
[Figure: Input (step-edge image of 0s and 2s) and Result (Sobel response with values 6 and 8 in two adjacent columns)]
Sobel filter example: yields a 2-pixel-wide edge “band”
We want to localize the edge to within 1 pixel
Non-Maximum Suppression: Steps
1. Consider 9-pixel neighborhood around each edge candidate (i.e., already over a threshold)
2. Interpolate edge strengths E at neighborhood boundaries in negative & positive gradient directions from the center pixel
3. If the pixel under consideration is not greater than these two values (i.e. not a maximum), it is suppressed
Interpolating the E value:
$E(r) = (1 - a)\,E(x, y) + a\,E(x + 1, y)$
where r lies between (x, y) and (x + 1, y), at distance a from (x, y) and 1 − a from (x + 1, y).
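The three steps above can be sketched as follows. This is an illustrative version rather than the slides' implementation: the names `bilinear` and `non_max_suppress` are hypothetical, the edge strengths are sampled where the gradient direction crosses the boundary of the 3×3 neighbourhood, border pixels are simply left at zero, and no prior thresholding of candidates is performed.

```python
import math


def bilinear(E, x, y):
    """Interpolate E at real-valued (x, y); E is indexed as E[row][col]."""
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, len(E[0]) - 1), min(y0 + 1, len(E) - 1)
    a, b = x - x0, y - y0
    top = (1 - a) * E[y0][x0] + a * E[y0][x1]
    bottom = (1 - a) * E[y1][x0] + a * E[y1][x1]
    return (1 - b) * top + b * bottom


def non_max_suppress(E, gx, gy):
    """Keep E[y][x] only where it exceeds both neighbours along the gradient."""
    h, w = len(E), len(E[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            scale = max(abs(gx[y][x]), abs(gy[y][x]))
            if scale == 0:
                continue  # no gradient direction, so nothing to compare along
            # Step to the boundary of the 3x3 neighbourhood.
            dx, dy = gx[y][x] / scale, gy[y][x] / scale
            ahead = bilinear(E, x + dx, y + dy)
            behind = bilinear(E, x - dx, y - dy)
            if E[y][x] > ahead and E[y][x] > behind:
                out[y][x] = E[y][x]
    return out


# A 3-pixel-wide ridge with a horizontal gradient direction everywhere.
E = [[0, 6, 8, 3, 0] for _ in range(3)]
gx = [[1] * 5 for _ in range(3)]
gy = [[0] * 5 for _ in range(3)]
thinned = non_max_suppress(E, gx, gy)
print(thinned[1])
```

Only the peak of the ridge survives in the interior row, so the multi-pixel band is thinned to a single pixel.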
Example: Non-Maximum Suppression
courtesy of G. Loy
[Figure panels: Original image; Gradient magnitude; Non-maxima suppressed]
Edge “Streaking”
• Can predict next pixel in edge orthogonal to gradient to make edge chain
– Can also just use 8-connectedness to define chains
• Streaking: Gaps in edge chain due to edge strength dipping below threshold
courtesy of G. Loy
Original image Strong edges
gap
Edge Hysteresis
• Hysteresis: A lag or momentum factor
• Idea: Maintain two thresholds k_high and k_low
– Use k_high to find strong edges to start edge chain
– Use k_low to find weak edges which continue edge chain
• Usual ratio of thresholds is roughly k_high / k_low = 2 or 3
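A minimal sketch of two-threshold hysteresis follows. It assumes 8-connectedness to define chains rather than tangent-based prediction, treats a value equal to a threshold as passing it, and uses a hypothetical function name `hysteresis`.

```python
from collections import deque


def hysteresis(E, k_low, k_high):
    """Mark strong pixels, then grow chains through connected weak pixels."""
    h, w = len(E), len(E[0])
    edge = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the chains with pixels that pass the high threshold.
    for y in range(h):
        for x in range(w):
            if E[y][x] >= k_high:
                edge[y][x] = True
                queue.append((y, x))
    # Continue chains into 8-connected neighbours that pass the low threshold.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edge[ny][nx] and E[ny][nx] >= k_low):
                    edge[ny][nx] = True
                    queue.append((ny, nx))
    return edge


strengths = [[9, 4, 4, 0, 4]]
marked = hysteresis(strengths, k_low=3, k_high=8)
print(marked)
```

The isolated weak pixel at the end is rejected because no chain starting from a strong pixel reaches it.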
Example: Canny Edge Detection
courtesy of G. Loy
[Figure panels: Original image; Strong edges only; Weak edges; Strong + connected weak edges (gap is gone)]
Example: Canny Edge Detection
(Matlab automatically set thresholds)
Image Pyramids
• Observation: Fine-grained template matching is expensive over a full image
– Idea: Represent image at smaller scales, allowing efficient coarse-to-fine search
• Downsampling: Cut width, height in half at each iteration:
from Forsyth & Ponce
Gaussian Pyramid
• Let the base (the finest resolution) of an n-level Gaussian pyramid be defined as P_0 = I. Then the i-th level is reduced from the level below it by: $P_i(I) = S^{\downarrow}(G_\sigma * P_{i-1}(I))$
• Upsampling $S^{\uparrow}(I)$: Double size of image, interpolate missing pixels
courtesy of Wolfram
Gaussian pyramid
Reconstruction
Laplacian Pyramids
• The tip (the coarsest resolution) of an n-level Laplacian pyramid is the same as the Gaussian pyramid at that level: $L_n(I) = P_n(I)$
• The i-th level is obtained from the level above according to $L_i(I) = P_i(I) - S^{\uparrow}(P_{i+1}(I))$
• Synthesizing the original image: Get I back by summing upsampled Laplacian pyramid levels
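The construction and synthesis described above can be sketched on a 1-D signal. This is a simplification under stated assumptions: a binomial [1, 2, 1]/4 filter stands in for the Gaussian, borders are replicated, upsampling uses linear interpolation, and all function names are hypothetical.

```python
def smooth(s):
    """Binomial [1, 2, 1] / 4 smoothing with replicated borders."""
    n = len(s)
    return [(s[max(i - 1, 0)] + 2 * s[i] + s[min(i + 1, n - 1)]) / 4
            for i in range(n)]


def upsample(s, size):
    """Expand s to the given size, linearly interpolating missing samples."""
    out = []
    for i in range(size):
        lo = i // 2
        hi = min(lo + 1, len(s) - 1)
        frac = (i % 2) / 2
        out.append((1 - frac) * s[lo] + frac * s[hi])
    return out


def gaussian_pyramid(s, levels):
    """Level 0 is the signal; each further level is smoothed and halved."""
    pyr = [list(s)]
    for _ in range(levels):
        pyr.append(smooth(pyr[-1])[::2])
    return pyr


def laplacian_pyramid(gauss):
    """Each level minus the upsampled coarser level; the tip is kept as is."""
    lap = [[p - u for p, u in zip(gauss[i], upsample(gauss[i + 1], len(gauss[i])))]
           for i in range(len(gauss) - 1)]
    return lap + [gauss[-1]]


def collapse(lap):
    """Rebuild the signal by upsampling and adding, coarsest level first."""
    s = lap[-1]
    for level in reversed(lap[:-1]):
        s = [l + u for l, u in zip(level, upsample(s, len(level)))]
    return s


signal = [0, 0, 2, 2, 4, 8, 8, 1]
gauss = gaussian_pyramid(signal, 2)
lap = laplacian_pyramid(gauss)
rebuilt = collapse(lap)
print([len(level) for level in gauss], rebuilt)
```

Reconstruction is exact because each Laplacian level stores precisely what the upsampled coarser level fails to predict.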
Laplacian Pyramid
• The differences of images at successive levels of the Gaussian pyramid define the Laplacian pyramid. To calculate a difference, the image at the higher (coarser) level in the pyramid must be increased in size by a factor of four (a factor of two in each dimension) prior to subtraction. This computes the pyramid.
• The original image may be reconstructed from the Laplacian pyramid by reversing the previous steps: interpolate and add the images at successive levels of the pyramid, beginning with the coarsest level.
• The Laplacian pyramid is largely uncorrelated, and so may be represented pixel by pixel with many fewer bits than the Gaussian pyramid.
courtesy of Wolfram
Splining
• Build Laplacian pyramids LA and LB for images A and B
• Build a Gaussian pyramid GR from selected region R
• Form a combined pyramid LS from LA and LB using nodes of GR as weights:
LS(i,j) = GR(i,j)*LA(i,j) + (1-GR(i,j))*LB(i,j)
• Collapse the LS pyramid to get the final blended image
Splining (Blending)
• Splining two images simply requires: 1) generating a Laplacian pyramid for each image, 2) generating a Gaussian pyramid for the bitmask indicating how the two images should be merged, 3) merging each Laplacian level of the two images using the bitmask from the corresponding Gaussian level, and 4) collapsing the resulting Laplacian pyramid.
• i.e.
GS = Gaussian pyramid of bitmask
LA = Laplacian pyramid of image "A"
LB = Laplacian pyramid of image "B"
therefore, Lout = (GS)LA + (1-GS)LB
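The four splining steps can be sketched in 1-D as follows. This is an illustrative simplification, not the slides' code: signals replace images, a binomial [1, 2, 1]/4 filter stands in for the Gaussian, and every function name is hypothetical.

```python
def smooth(s):
    """Binomial [1, 2, 1] / 4 smoothing with replicated borders."""
    n = len(s)
    return [(s[max(i - 1, 0)] + 2 * s[i] + s[min(i + 1, n - 1)]) / 4
            for i in range(n)]


def upsample(s, size):
    """Expand s to the given size, linearly interpolating missing samples."""
    out = []
    for i in range(size):
        lo = i // 2
        hi = min(lo + 1, len(s) - 1)
        frac = (i % 2) / 2
        out.append((1 - frac) * s[lo] + frac * s[hi])
    return out


def gaussian_pyramid(s, levels):
    pyr = [list(s)]
    for _ in range(levels):
        pyr.append(smooth(pyr[-1])[::2])
    return pyr


def laplacian_pyramid(s, levels):
    g = gaussian_pyramid(s, levels)
    lap = [[p - u for p, u in zip(g[i], upsample(g[i + 1], len(g[i])))]
           for i in range(levels)]
    return lap + [g[-1]]


def collapse(lap):
    s = lap[-1]
    for level in reversed(lap[:-1]):
        s = [l + u for l, u in zip(level, upsample(s, len(level)))]
    return s


def blend(a, b, mask, levels=2):
    """Lout = GS * LA + (1 - GS) * LB at every level, then collapse."""
    la, lb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    gs = gaussian_pyramid(mask, levels)
    lout = [[m * x + (1 - m) * y for m, x, y in zip(gm, xa, xb)]
            for gm, xa, xb in zip(gs, la, lb)]
    return collapse(lout)


a = [0, 2, 4, 6, 8, 6, 4, 2]
b = [1] * 8
half = blend(a, b, [1, 1, 1, 1, 0, 0, 0, 0])
print(half)
```

Because the mask is itself smoothed at each level, coarse structure is mixed over a wide band while fine detail switches over sharply near the seam.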
Example images from GTech
[Figure panels, top row: image 1; bit-mask; image 2]
[Figure panels, bottom row: direct addition; splining; bad bit-mask choice]
Outline
• Corner detection
• RANSAC
Matching with Invariant Features
Darya Frolova, Denis Simakov
The Weizmann Institute of Science
March 2004
Example: Build a Panorama
M. Brown and D. G. Lowe. Recognising Panoramas. ICCV 2003
How do we build a panorama?
• We need to match (align) images
Matching with Features
• Detect feature points in both images
• Find corresponding pairs
• Use these pairs to align images
Matching with Features
• Problem 1:
– Detect the same point independently in both images
[Figure: different points detected in the two images: no chance to match!]
We need a repeatable detector
Matching with Features
• Problem 2:
– For each point correctly recognize the corresponding one
We need a reliable and distinctive descriptor
More motivation…
• Feature points are also used for:
– Image alignment (homography, fundamental matrix)
– 3D reconstruction
– Motion tracking
– Object recognition
– Indexing and database retrieval
– Robot navigation
– … other
Corner Detection
• Basic idea: Find points where two edges meet—i.e., high gradient in two directions
• “Cornerness” is undefined at a single pixel, because there’s only one gradient per point
– Look at the gradient behavior over a small window
• Categorize image windows based on gradient statistics
– Constant: Little or no brightness change
– Edge: Strong brightness change in single direction
– Flow: Parallel stripes
– Corner/spot: Strong brightness changes in orthogonal directions
Corner Detection: Analyzing Gradient Covariance
• Intuitively, in corner windows both Ix and Iy should be high
– Can’t just set a threshold on them directly, because we want rotational invariance
• Analyze distribution of gradient components over a window to differentiate between types from previous slide: $C = \begin{bmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{bmatrix}$
• The two eigenvectors and eigenvalues λ1, λ2 of C (Matlab: eig(C)) encode the predominant directions and magnitudes of the gradient, respectively, within the window
• Corners are thus where min(λ1, λ2) is over a threshold
courtesy of Wolfram
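The eigenvalue test above can be sketched without any linear algebra library, since a symmetric 2×2 matrix has closed-form eigenvalues. The function names and the toy derivative values are hypothetical; each window is given as flat lists of per-pixel derivatives.

```python
import math


def gradient_covariance(Ix, Iy):
    """Sum the gradient products over a window; returns (a, b, c) of [[a, b], [b, c]]."""
    a = sum(ix * ix for ix in Ix)
    b = sum(ix * iy for ix, iy in zip(Ix, Iy))
    c = sum(iy * iy for iy in Iy)
    return a, b, c


def eigenvalues(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], largest first."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean + radius, mean - radius


# Edge window: every pixel has the same horizontal gradient.
edge = eigenvalues(*gradient_covariance([2, 2, 2, 2], [0, 0, 0, 0]))
# Corner window: half the pixels change in x, the other half in y.
corner = eigenvalues(*gradient_covariance([2, 2, 0, 0], [0, 0, 2, 2]))
print(min(edge), min(corner))
```

Only the corner window has a large smaller eigenvalue, which is the quantity being thresholded.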
Contents
• Harris Corner Detector
– Description
– Analysis
• Detectors
– Rotation invariant
– Scale invariant
– Affine invariant
• Descriptors
– Rotation invariant
– Scale invariant
– Affine invariant
Harris Detector: Mathematics
Change of intensity for the shift [u, v]:
$E(u, v) = \sum_{x, y} w(x, y)\,[I(x + u, y + v) - I(x, y)]^2$
where w(x, y) is the window function, I(x + u, y + v) is the shifted intensity, and I(x, y) is the intensity.
Window function w(x, y) = 1 in window, 0 outside; or Gaussian
Taylor series: f(x+dx, y+dy) = f(x,y) + f_x(x,y) dx + f_y(x,y) dy + … http://mathworld.wolfram.com.TaylorSeries.html
Harris Detector: Mathematics
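For small shifts [u, v] we have a bilinear approximation:
$E(u, v) \cong \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$
where M is a 2×2 matrix computed from image derivatives:
$M = \sum_{x, y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$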
Harris Detector: Mathematics
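Intensity change in shifting window: eigenvalue analysis
$E(u, v) \cong \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$
λ1, λ2: eigenvalues of M
[Figure: ellipse E(u, v) = const, with axis length $(\lambda_{\max})^{-1/2}$ along the direction of the fastest change and $(\lambda_{\min})^{-1/2}$ along the direction of the slowest change]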
Harris Detector: Mathematics
Classification of image points using eigenvalues of M:
[Figure: the (λ1, λ2) plane divided into regions]
– “Corner”: λ1 and λ2 are large, λ1 ~ λ2; E increases in all directions
– “Edge”: λ1 >> λ2 or λ2 >> λ1
– “Flat” region: λ1 and λ2 are small; E is almost constant in all directions
Harris Detector: Mathematics
Measure of corner response:
$R = \det M - k\,(\operatorname{trace} M)^2$
$\det M = \lambda_1 \lambda_2$
$\operatorname{trace} M = \lambda_1 + \lambda_2$
(k – empirical constant, k = 0.04-0.06)
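The response measure above needs only the determinant and trace, so the eigenvalues never have to be computed explicitly. The following sketch assumes M is passed as its three distinct entries (a, b, c) of [[a, b], [b, c]]; the function name, the toy matrices, and the choice k = 0.05 (inside the quoted range) are illustrative.

```python
def harris_response(a, b, c, k=0.05):
    """R = det(M) - k * trace(M)^2 for the symmetric matrix M = [[a, b], [b, c]]."""
    det = a * c - b * b
    trace = a + c
    return det - k * trace * trace


r_corner = harris_response(8, 0, 8)    # both eigenvalues are 8
r_edge = harris_response(16, 0, 0)     # eigenvalues are 16 and 0
r_flat = harris_response(0, 0, 0)      # both eigenvalues are 0
print(r_corner, r_edge, r_flat)
```

The corner gives a positive response, the edge a negative one, and the flat window a response of zero.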
Harris Detector: Mathematics
• R depends only on eigenvalues of M
• R is large for a corner
• R is negative with large magnitude for an edge
• |R| is small for a flat region
[Figure: the (λ1, λ2) plane: “Corner” where R > 0; “Edge” where R < 0 (two regions); “Flat” where |R| is small]
Harris Detector
• The Algorithm:
– Find points with large corner response function R (R > threshold)
– Take the points of local maxima of R