Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun...
Transcript of Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun...
![Page 1: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/1.jpg)
Visual Recognition: Filtering and Transformations
Raquel Urtasun
TTI Chicago
Jan 15, 2012
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 1 / 65
![Page 2: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/2.jpg)
Today’s lecture ...
More on Image Filtering
Additional transformations
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 2 / 65
![Page 3: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/3.jpg)
Readings
Chapter 2 and 3 of Rich Szeliski’s book
Available online here
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 3 / 65
![Page 4: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/4.jpg)
Image Sub-Sampling
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 4 / 65
![Page 5: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/5.jpg)
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 5 / 65
![Page 6: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/6.jpg)
Image Sub-Sampling
Throw away every other row and column to create a 1/2 size image
1/4
1/8
[Source: S. Seitz]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 6 / 65
![Page 7: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/7.jpg)
Image Sub-Sampling
Why does this look so crufty?
!"#$$%&'$())*+$ !",$$%#'$())*+$!"&$
[Source: S. Seitz]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 7 / 65
![Page 8: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/8.jpg)
Image Sub-Sampling
[Source: F. Durand]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 8 / 65
![Page 9: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/9.jpg)
Even worse for synthetic images
What’s happening?
[Source: L. Zhang]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 9 / 65
![Page 10: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/10.jpg)
Aliasing
Occurs when your sampling rate is not high enough to capture the amountof detail in your image
To do sampling right, need to understand the structure of your signal/image
The minimum sampling rate is called the Nyquist rate
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 10 / 65
![Page 11: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/11.jpg)
Aliasing
Occurs when your sampling rate is not high enough to capture the amountof detail in your image
To do sampling right, need to understand the structure of your signal/image
The minimum sampling rate is called the Nyquist rate
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 10 / 65
![Page 12: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/12.jpg)
Aliasing
Occurs when your sampling rate is not high enough to capture the amountof detail in your image
To do sampling right, need to understand the structure of your signal/image
The minimum sampling rate is called the Nyquist rate
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 10 / 65
![Page 13: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/13.jpg)
Aliasing problems
Shannons Sampling Theorem shows that the minimum sampling
fs ≥ 2fmax
If you haven’t seen this... take a class on Fourier analysis... everyone shouldhave at least one!
Figure: example of a 1D signal [R. Szeliski et al.]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 11 / 65
![Page 14: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/14.jpg)
Nyquist limit 2D example
Good sampling
Bad sampling
[Source: N. Snavely]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 12 / 65
![Page 15: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/15.jpg)
Going back to Downsampling ...
When downsampling by a factor of two, the original image has frequenciesthat are too high
How can we fix this?
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 13 / 65
![Page 16: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/16.jpg)
Going back to Downsampling ...
When downsampling by a factor of two, the original image has frequenciesthat are too high
How can we fix this?
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 13 / 65
![Page 17: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/17.jpg)
Gaussian pre-filtering
Solution: filter the image, then subsample
G 1/4
G 1/8
Gaussian 1/2
[Source: S. Seitz]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 14 / 65
![Page 18: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/18.jpg)
Subsampling with Gaussian pre-filtering
G 1/4 G 1/8 Gaussian 1/2
[Source: S. Seitz]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 15 / 65
![Page 19: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/19.jpg)
Compare with ...
1/4 (2x zoom) 1/8 (4x zoom) 1/2
[Source: S. Seitz]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 16 / 65
![Page 20: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/20.jpg)
And in 2D...
Figure: (a) Example of a 2D signal. (b–d) downsampled with different filters
[Source: R. Szeliski]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 17 / 65
![Page 21: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/21.jpg)
Gaussian pre-filtering
Solution: filter the image, then subsample
!"#$%
!"###$#&%
'#!'()*"+% !"#$% '#!'()*"+% ,%!%#
!%###$#&%
!&#!"#
[Source: N. Snavely]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 18 / 65
![Page 22: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/22.jpg)
Gaussian pre-filtering
!"#$%
!"###$#&%
'#!'()*"+% !"#$% '#!'()*"+% ,%!%#
!%###$#&%
!&#!"#
{ !"#$$%"&'()*"+%,'
[Source: S. Seitz]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 19 / 65
![Page 23: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/23.jpg)
Gaussian Pyramids [Burt and Adelson, 1983]
In computer graphics, a mip map [Williams, 1983]
A precursor to wavelet transform
How much space does a Gaussian pyramid take compared to the originalimage?
[Source: S. Seitz]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 20 / 65
![Page 24: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/24.jpg)
Gaussian Pyramids [Burt and Adelson, 1983]
In computer graphics, a mip map [Williams, 1983]
A precursor to wavelet transform
How much space does a Gaussian pyramid take compared to the originalimage?
[Source: S. Seitz]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 20 / 65
![Page 25: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/25.jpg)
Example of Gaussian Pyramid
[Source: N. Snavely]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 21 / 65
![Page 26: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/26.jpg)
Decimation or Sub-sampling
Decimation: reduces resolution
g(i , j) =∑k,l
f (k , l)h(i − k/r , j − l/r)
with r the down-sampling rate.
Different filters exist to do this.
What would you use?
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 22 / 65
![Page 27: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/27.jpg)
Decimation or Sub-sampling
Decimation: reduces resolution
g(i , j) =∑k,l
f (k , l)h(i − k/r , j − l/r)
with r the down-sampling rate.
Different filters exist to do this.
What would you use?
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 22 / 65
![Page 28: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/28.jpg)
Decimation or Sub-sampling
Decimation: reduces resolution
g(i , j) =∑k,l
f (k , l)h(i − k/r , j − l/r)
with r the down-sampling rate.
Different filters exist to do this.
What would you use?
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 22 / 65
![Page 29: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/29.jpg)
Image Up-Sampling
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 23 / 65
![Page 30: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/30.jpg)
Image Up-Sampling
This image is too small, how can we make it 10 times as big?
Simplest approach: repeat each row and column 10 times (Nearest neighborinterpolation)
[Source: N. Snavely]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 24 / 65
![Page 31: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/31.jpg)
Image Interpolation
!" #" $" %" &"
d = 1 in this example
Recall how a digital image is formed
F [x , y ] = quantize{f (xd , yd)}
It is a discrete point-sampling of a continuous function
If we could somehow reconstruct the original function, any new image couldbe generated, at any resolution and scale
[Source: N. Snavely, S. Seitz]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 25 / 65
![Page 32: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/32.jpg)
Image Interpolation
!" #" $" %" &"
d = 1 in this example
Recall how a digital image is formed
F [x , y ] = quantize{f (xd , yd)}
It is a discrete point-sampling of a continuous function
If we could somehow reconstruct the original function, any new image couldbe generated, at any resolution and scale
[Source: N. Snavely, S. Seitz]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 25 / 65
![Page 33: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/33.jpg)
Image Interpolation
!" #" $" %" &"
d = 1 in this example
Recall how a digital image is formed
F [x , y ] = quantize{f (xd , yd)}
It is a discrete point-sampling of a continuous function
If we could somehow reconstruct the original function, any new image couldbe generated, at any resolution and scale
[Source: N. Snavely, S. Seitz]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 25 / 65
![Page 34: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/34.jpg)
Image Interpolation
!" #" $" %" &"#'&"
!" d = 1 in this example
What if we don’t know f ?
Guess an approximation: Can be done in a principled way via filtering
Convert F to a continuous function
fF (x) =
{F ( x
d ) if xd is an integer
0 otherwise
Reconstruct by convolution with a reconstruction filter, h
f = h ∗ fF
[Source: N. Snavely, S. Seitz]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 26 / 65
![Page 35: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/35.jpg)
Image Interpolation
!" #" $" %" &"#'&"
!" d = 1 in this example
What if we don’t know f ?
Guess an approximation: Can be done in a principled way via filtering
Convert F to a continuous function
fF (x) =
{F ( x
d ) if xd is an integer
0 otherwise
Reconstruct by convolution with a reconstruction filter, h
f = h ∗ fF
[Source: N. Snavely, S. Seitz]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 26 / 65
![Page 36: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/36.jpg)
Image Interpolation
!" #" $" %" &"#'&"
!" d = 1 in this example
What if we don’t know f ?
Guess an approximation: Can be done in a principled way via filtering
Convert F to a continuous function
fF (x) =
{F ( x
d ) if xd is an integer
0 otherwise
Reconstruct by convolution with a reconstruction filter, h
f = h ∗ fF
[Source: N. Snavely, S. Seitz]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 26 / 65
![Page 37: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/37.jpg)
Image Interpolation
!" #" $" %" &"#'&"
!" d = 1 in this example
What if we don’t know f ?
Guess an approximation: Can be done in a principled way via filtering
Convert F to a continuous function
fF (x) =
{F ( x
d ) if xd is an integer
0 otherwise
Reconstruct by convolution with a reconstruction filter, h
f = h ∗ fF
[Source: N. Snavely, S. Seitz]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 26 / 65
![Page 38: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/38.jpg)
Image Interpolation
!"#$%&'()$*+,-.)/*0+,(
1$%)$-.2,$3456+)(3,.$)7+&%0+,(
83,$%)(3,.$)7+&%0+,(
9%/--3%,()$*+,-.)/*0+,(
:+/)*$;(<=(>/)&$--(
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 27 / 65
![Page 39: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/39.jpg)
Reconstruction filters
What does the 2D version of this hat function look like?
!"#$%#&'(()*+",#(*+-"#!%),.%+(
/-"+-($0+1.%+2(!"#$%#&'((!"#"$%&'("$)%'*+#&,+$(
Often implemented without cross-correlation, e.g.,http://en.wikipedia.org/wiki/Bilinear_interpolation
Better filters give better resampled images: Bicubic is a common choice
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 28 / 65
![Page 40: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/40.jpg)
Reconstruction filters
What does the 2D version of this hat function look like?
!"#$%#&'(()*+",#(*+-"#!%),.%+(
/-"+-($0+1.%+2(!"#$%#&'((!"#"$%&'("$)%'*+#&,+$(
Often implemented without cross-correlation, e.g.,http://en.wikipedia.org/wiki/Bilinear_interpolation
Better filters give better resampled images: Bicubic is a common choice
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 28 / 65
![Page 41: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/41.jpg)
Reconstruction filters
What does the 2D version of this hat function look like?
!"#$%#&'(()*+",#(*+-"#!%),.%+(
/-"+-($0+1.%+2(!"#$%#&'((!"#"$%&'("$)%'*+#&,+$(
Often implemented without cross-correlation, e.g.,http://en.wikipedia.org/wiki/Bilinear_interpolation
Better filters give better resampled images: Bicubic is a common choice
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 28 / 65
![Page 42: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/42.jpg)
Image Interpolation
Original image
Interpolation results
!"#$"%&'(")*+,-$.)(&"$/-0#1-(. 2)0)("#$.)(&"$/-0#1-(. 2)34,)3.)(&"$/-0#1-(.
[Source: N. Snavely]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 29 / 65
![Page 43: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/43.jpg)
Image Interpolation
What operation have we done?
!"#$%&#'(%)$*%!"#$%&'()*+
[Source: N. Snavely]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 30 / 65
![Page 44: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/44.jpg)
Depixelating Pixel Art
Published by [Kopt et al., SIGGRAPH 2011]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 31 / 65
![Page 45: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/45.jpg)
More Examples
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 32 / 65
![Page 46: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/46.jpg)
When are Pyramids Useful?
We might want to change resolution of an image before processing.
We might not know which scale we want, e.g., when searching for a facein an image.
In this case, we will generate a full pyramid of different image sizes.
Can also be used to accelerate the search, by first finding at the coarserlevel of the pyramid and then at the full resolution.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65
![Page 47: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/47.jpg)
When are Pyramids Useful?
We might want to change resolution of an image before processing.
We might not know which scale we want, e.g., when searching for a facein an image.
In this case, we will generate a full pyramid of different image sizes.
Can also be used to accelerate the search, by first finding at the coarserlevel of the pyramid and then at the full resolution.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65
![Page 48: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/48.jpg)
When are Pyramids Useful?
We might want to change resolution of an image before processing.
We might not know which scale we want, e.g., when searching for a facein an image.
In this case, we will generate a full pyramid of different image sizes.
Can also be used to accelerate the search, by first finding at the coarserlevel of the pyramid and then at the full resolution.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65
![Page 49: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/49.jpg)
When are Pyramids Useful?
We might want to change resolution of an image before processing.
We might not know which scale we want, e.g., when searching for a facein an image.
In this case, we will generate a full pyramid of different image sizes.
Can also be used to accelerate the search, by first finding at the coarserlevel of the pyramid and then at the full resolution.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65
![Page 50: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/50.jpg)
Image Pyramid
[Source: R. Szeliski]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 34 / 65
![Page 51: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/51.jpg)
Interpolation and Decimation
To interpolate (or upsample) an image to a higher resolution, we need toselect an interpolation kernel with which to convolve the image
g(i , j) =∑k,l
f (k , l)h(i − rk, j − rl)
with r the up-sampling rate.
The linear interpolator (corresponding to the tent kernel) producesinterpolating piecewise linear curves.
More complex kernels, e.g., B-splines.
Decimation: reduces resolution
g(i , j) =∑k,l
f (k, l)h(i − k/r , j − l/r)
with r the down-sampling rate.
Different filters exist as well.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 35 / 65
![Page 52: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/52.jpg)
Interpolation and Decimation
To interpolate (or upsample) an image to a higher resolution, we need toselect an interpolation kernel with which to convolve the image
g(i , j) =∑k,l
f (k , l)h(i − rk, j − rl)
with r the up-sampling rate.
The linear interpolator (corresponding to the tent kernel) producesinterpolating piecewise linear curves.
More complex kernels, e.g., B-splines.
Decimation: reduces resolution
g(i , j) =∑k,l
f (k, l)h(i − k/r , j − l/r)
with r the down-sampling rate.
Different filters exist as well.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 35 / 65
![Page 53: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/53.jpg)
Interpolation and Decimation
To interpolate (or upsample) an image to a higher resolution, we need toselect an interpolation kernel with which to convolve the image
g(i , j) =∑k,l
f (k , l)h(i − rk, j − rl)
with r the up-sampling rate.
The linear interpolator (corresponding to the tent kernel) producesinterpolating piecewise linear curves.
More complex kernels, e.g., B-splines.
Decimation: reduces resolution
g(i , j) =∑k,l
f (k , l)h(i − k/r , j − l/r)
with r the down-sampling rate.
Different filters exist as well.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 35 / 65
![Page 54: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/54.jpg)
Interpolation and Decimation
To interpolate (or upsample) an image to a higher resolution, we need toselect an interpolation kernel with which to convolve the image
g(i , j) =∑k,l
f (k , l)h(i − rk, j − rl)
with r the up-sampling rate.
The linear interpolator (corresponding to the tent kernel) producesinterpolating piecewise linear curves.
More complex kernels, e.g., B-splines.
Decimation: reduces resolution
g(i , j) =∑k,l
f (k , l)h(i − k/r , j − l/r)
with r the down-sampling rate.
Different filters exist as well.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 35 / 65
![Page 55: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/55.jpg)
Interpolation and Decimation
To interpolate (or upsample) an image to a higher resolution, we need toselect an interpolation kernel with which to convolve the image
g(i , j) =∑k,l
f (k , l)h(i − rk, j − rl)
with r the up-sampling rate.
The linear interpolator (corresponding to the tent kernel) producesinterpolating piecewise linear curves.
More complex kernels, e.g., B-splines.
Decimation: reduces resolution
g(i , j) =∑k,l
f (k , l)h(i − k/r , j − l/r)
with r the down-sampling rate.
Different filters exist as well.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 35 / 65
![Page 56: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/56.jpg)
Multi-Resolution Representations
The most used one is the Laplacian pyramid:
We first blur and subsample the original image by a factor of two and storethis in the next level of the pyramid.
Subtract then this low-pass version from the original to yield the band-passLaplacian image.
The pyramid has perfect reconstruction: the Laplacian images plus thebase-level Gaussian are sufficient to exactly reconstruct the original image.
Wavelets are alternative pyramids. We will not see them here.
[Source: R. Szeliski]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 36 / 65
![Page 57: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/57.jpg)
Multi-Resolution Representations
The most used one is the Laplacian pyramid:
We first blur and subsample the original image by a factor of two and storethis in the next level of the pyramid.
Subtract then this low-pass version from the original to yield the band-passLaplacian image.
The pyramid has perfect reconstruction: the Laplacian images plus thebase-level Gaussian are sufficient to exactly reconstruct the original image.
Wavelets are alternative pyramids. We will not see them here.
[Source: R. Szeliski]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 36 / 65
![Page 58: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/58.jpg)
Multi-Resolution Representations
The most used one is the Laplacian pyramid:
We first blur and subsample the original image by a factor of two and storethis in the next level of the pyramid.
Subtract then this low-pass version from the original to yield the band-passLaplacian image.
The pyramid has perfect reconstruction: the Laplacian images plus thebase-level Gaussian are sufficient to exactly reconstruct the original image.
Wavelets are alternative pyramids. We will not see them here.
[Source: R. Szeliski]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 36 / 65
![Page 59: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/59.jpg)
Multi-Resolution Representations
The most used one is the Laplacian pyramid:
We first blur and subsample the original image by a factor of two and storethis in the next level of the pyramid.
Subtract then this low-pass version from the original to yield the band-passLaplacian image.
The pyramid has perfect reconstruction: the Laplacian images plus thebase-level Gaussian are sufficient to exactly reconstruct the original image.
Wavelets are alternative pyramids. We will not see them here.
[Source: R. Szeliski]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 36 / 65
![Page 60: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/60.jpg)
Laplacian Pyramid Construction
How do we reconstruct back?
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 37 / 65
![Page 61: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/61.jpg)
Laplacian Pyramid Construction
How do we reconstruct back?
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 37 / 65
![Page 62: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/62.jpg)
Laplacian Pyramid Re-construction
When is this useful?
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 38 / 65
![Page 63: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/63.jpg)
Laplacian Pyramid Re-construction
When is this useful?
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 38 / 65
![Page 64: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/64.jpg)
More Complex Filters
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 39 / 65
![Page 65: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/65.jpg)
Steerable Filters
Oriented filters are used in many vision and image processing tasks:texture analysis, edge detection, image data compression, motion analysis.
One approach to finding the response of a filter at many orientations is toapply many versions of the same filter, each different from the others bysome small rotation in angle.
More efficient is to apply a few filters corresponding to a few angles andinterpolate between the responses.
One then needs to know how many filters are required and how toproperly interpolate between the responses.
With the correct filter set and the correct interpolation rule, it is possible todetermine the response of a filter of arbitrary orientation without explicitlyapplying that filter.
Steerable filters are a class of filters in which a filter of arbitrary orientationis synthesized as a linear combination of a set of basis filters.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 40 / 65
![Page 66: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/66.jpg)
Steerable Filters
Oriented filters are used in many vision and image processing tasks:texture analysis, edge detection, image data compression, motion analysis.
One approach to finding the response of a filter at many orientations is toapply many versions of the same filter, each different from the others bysome small rotation in angle.
More efficient is to apply a few filters corresponding to a few angles andinterpolate between the responses.
One then needs to know how many filters are required and how toproperly interpolate between the responses.
With the correct filter set and the correct interpolation rule, it is possible todetermine the response of a filter of arbitrary orientation without explicitlyapplying that filter.
Steerable filters are a class of filters in which a filter of arbitrary orientationis synthesized as a linear combination of a set of basis filters.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 40 / 65
![Page 67: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/67.jpg)
Steerable Filters
Oriented filters are used in many vision and image processing tasks:texture analysis, edge detection, image data compression, motion analysis.
One approach to finding the response of a filter at many orientations is toapply many versions of the same filter, each different from the others bysome small rotation in angle.
More efficient is to apply a few filters corresponding to a few angles andinterpolate between the responses.
One then needs to know how many filters are required and how toproperly interpolate between the responses.
With the correct filter set and the correct interpolation rule, it is possible todetermine the response of a filter of arbitrary orientation without explicitlyapplying that filter.
Steerable filters are a class of filters in which a filter of arbitrary orientationis synthesized as a linear combination of a set of basis filters.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 40 / 65
![Page 68: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/68.jpg)
Steerable Filters
Oriented filters are used in many vision and image processing tasks:texture analysis, edge detection, image data compression, motion analysis.
One approach to finding the response of a filter at many orientations is toapply many versions of the same filter, each different from the others bysome small rotation in angle.
More efficient is to apply a few filters corresponding to a few angles andinterpolate between the responses.
One then needs to know how many filters are required and how toproperly interpolate between the responses.
With the correct filter set and the correct interpolation rule, it is possible todetermine the response of a filter of arbitrary orientation without explicitlyapplying that filter.
Steerable filters are a class of filters in which a filter of arbitrary orientationis synthesized as a linear combination of a set of basis filters.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 40 / 65
![Page 69: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/69.jpg)
Steerable Filters
Oriented filters are used in many vision and image processing tasks:texture analysis, edge detection, image data compression, motion analysis.
One approach to finding the response of a filter at many orientations is toapply many versions of the same filter, each different from the others bysome small rotation in angle.
More efficient is to apply a few filters corresponding to a few angles andinterpolate between the responses.
One then needs to know how many filters are required and how toproperly interpolate between the responses.
With the correct filter set and the correct interpolation rule, it is possible todetermine the response of a filter of arbitrary orientation without explicitlyapplying that filter.
Steerable filters are a class of filters in which a filter of arbitrary orientationis synthesized as a linear combination of a set of basis filters.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 40 / 65
![Page 70: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/70.jpg)
Steerable Filters
Oriented filters are used in many vision and image processing tasks:texture analysis, edge detection, image data compression, motion analysis.
One approach to finding the response of a filter at many orientations is toapply many versions of the same filter, each different from the others bysome small rotation in angle.
More efficient is to apply a few filters corresponding to a few angles andinterpolate between the responses.
One then needs to know how many filters are required and how toproperly interpolate between the responses.
With the correct filter set and the correct interpolation rule, it is possible todetermine the response of a filter of arbitrary orientation without explicitlyapplying that filter.
Steerable filters are a class of filters in which a filter of arbitrary orientationis synthesized as a linear combination of a set of basis filters.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 40 / 65
![Page 71: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/71.jpg)
Example of Steerable Filter
2D symmetric Gaussian with σ = 1 and assume constant is 1
G (x , y , σ) = exp(−x2 + y2
)The directional derivative operator is steerable.
The first derivative
G 01 =
∂
∂xexp
(−x2 + y2
)= −2x exp
(−x2 + y2
)and the same function rotated 90 degrees is
G 901 =
∂
∂yexp
(−x2 + y2
)= −2y exp
(−x2 + y2
)A filter of arbitrary orientation θ can be synthesized by taking a linearcombination of G 0
1 and G 901
G θ1 = cos θG 01 + sin θG 90
1
G 01 and G 90
1 are the basis filters and cos θ and sin θ are the interpolationfunctions
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 41 / 65
![Page 72: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/72.jpg)
Example of Steerable Filter
2D symmetric Gaussian with σ = 1 and assume constant is 1
G (x , y , σ) = exp(−x2 + y2
)The directional derivative operator is steerable.
The first derivative
G 01 =
∂
∂xexp
(−x2 + y2
)= −2x exp
(−x2 + y2
)and the same function rotated 90 degrees is
G 901 =
∂
∂yexp
(−x2 + y2
)= −2y exp
(−x2 + y2
)
A filter of arbitrary orientation θ can be synthesized by taking a linearcombination of G 0
1 and G 901
G θ1 = cos θG 01 + sin θG 90
1
G 01 and G 90
1 are the basis filters and cos θ and sin θ are the interpolationfunctions
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 41 / 65
![Page 73: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/73.jpg)
Example of Steerable Filter
2D symmetric Gaussian with σ = 1 and assume constant is 1
G (x , y , σ) = exp(−x2 + y2
)The directional derivative operator is steerable.
The first derivative
G 01 =
∂
∂xexp
(−x2 + y2
)= −2x exp
(−x2 + y2
)and the same function rotated 90 degrees is
G 901 =
∂
∂yexp
(−x2 + y2
)= −2y exp
(−x2 + y2
)A filter of arbitrary orientation θ can be synthesized by taking a linearcombination of G 0
1 and G 901
G θ1 = cos θG 01 + sin θG 90
1
G 01 and G 90
1 are the basis filters and cos θ and sin θ are the interpolationfunctions
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 41 / 65
![Page 74: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/74.jpg)
Example of Steerable Filter
2D symmetric Gaussian with σ = 1 and assume constant is 1
G (x , y , σ) = exp(−x2 + y2
)The directional derivative operator is steerable.
The first derivative
G 01 =
∂
∂xexp
(−x2 + y2
)= −2x exp
(−x2 + y2
)and the same function rotated 90 degrees is
G 901 =
∂
∂yexp
(−x2 + y2
)= −2y exp
(−x2 + y2
)A filter of arbitrary orientation θ can be synthesized by taking a linearcombination of G 0
1 and G 901
G θ1 = cos θG 01 + sin θG 90
1
G 01 and G 90
1 are the basis filters and cos θ and sin θ are the interpolationfunctions
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 41 / 65
![Page 75: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/75.jpg)
More on steerable filters
Because convolution is a linear operation, we can synthesize an imagefiltered at an arbitrary orientation by taking linear combinations of theimages filtered with G 0
1 and G 901
if R01 = G 0
1 ∗ I and R901 = G 90
1 ∗ I then Rθ1 = cos θR01 + sin θR90
1
Check yourself that this is the case.
See [Freeman & Adelson, 91] for the conditions on when a filter is steerableand how many basis are necessary.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 42 / 65
![Page 76: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/76.jpg)
More on steerable filters
Because convolution is a linear operation, we can synthesize an imagefiltered at an arbitrary orientation by taking linear combinations of theimages filtered with G 0
1 and G 901
if R01 = G 0
1 ∗ I and R901 = G 90
1 ∗ I then Rθ1 = cos θR01 + sin θR90
1
Check yourself that this is the case.
See [Freeman & Adelson, 91] for the conditions on when a filter is steerableand how many basis are necessary.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 42 / 65
![Page 77: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/77.jpg)
More on steerable filters
Because convolution is a linear operation, we can synthesize an imagefiltered at an arbitrary orientation by taking linear combinations of theimages filtered with G 0
1 and G 901
if R01 = G 0
1 ∗ I and R901 = G 90
1 ∗ I then Rθ1 = cos θR01 + sin θR90
1
Check yourself that this is the case.
See [Freeman & Adelson, 91] for the conditions on when a filter is steerableand how many basis are necessary.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 42 / 65
![Page 78: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/78.jpg)
[Source: W. Freeman 91]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 43 / 65
![Page 79: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/79.jpg)
More complex filters
What about the second order derivative?
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 44 / 65
![Page 80: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/80.jpg)
More complex filters
What about the second order derivative?
Only three basis are required
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 44 / 65
![Page 81: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/81.jpg)
More complex filters
What about the second order derivative?
Only three basis are required
Guu = u2Gxx + 2uvGx,y + v2Gy ,y
with u = (u, v)
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 44 / 65
![Page 82: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/82.jpg)
Other transformations
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 45 / 65
![Page 83: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/83.jpg)
Integral Images
If an image is going to be repeatedly convolved with different box filters, itis useful to compute the summed area table.
It is the running sum of all the pixel values from the origin
s(i , j) =i∑
k=0
j∑l=0
f (k , l)
This can be efficiently computed using a recursive (raster-scan) algorithm
s(i , j) = s(i − 1, j) + s(i , j − 1)− s(i − 1, j − 1) + f (i , j)
The image s(i , j) is called an integral image and can actually be computedusing only two additions per pixel if separate row sums are used.
To find the summed area (integral) inside a rectangle [i0, i1]× [j0, j1] wesimply combine four samples from the summed area table.
S([i0, i1]× [j0, j1]) = s(i1, j1)− s(i1, j0 − 1)− s(i0 − 1, j1) + s(i0 − 1, j0 − 1)
Summed area tables have been used in face detection [Viola & Jones, 04]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 46 / 65
![Page 84: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/84.jpg)
Integral Images
If an image is going to be repeatedly convolved with different box filters, itis useful to compute the summed area table.
It is the running sum of all the pixel values from the origin
s(i , j) =i∑
k=0
j∑l=0
f (k , l)
This can be efficiently computed using a recursive (raster-scan) algorithm
s(i , j) = s(i − 1, j) + s(i , j − 1)− s(i − 1, j − 1) + f (i , j)
The image s(i , j) is called an integral image and can actually be computedusing only two additions per pixel if separate row sums are used.
To find the summed area (integral) inside a rectangle [i0, i1]× [j0, j1] wesimply combine four samples from the summed area table.
S([i0, i1]× [j0, j1]) = s(i1, j1)− s(i1, j0 − 1)− s(i0 − 1, j1) + s(i0 − 1, j0 − 1)
Summed area tables have been used in face detection [Viola & Jones, 04]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 46 / 65
![Page 85: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/85.jpg)
Integral Images
If an image is going to be repeatedly convolved with different box filters, itis useful to compute the summed area table.
It is the running sum of all the pixel values from the origin
s(i , j) =i∑
k=0
j∑l=0
f (k , l)
This can be efficiently computed using a recursive (raster-scan) algorithm
s(i , j) = s(i − 1, j) + s(i , j − 1)− s(i − 1, j − 1) + f (i , j)
The image s(i , j) is called an integral image and can actually be computedusing only two additions per pixel if separate row sums are used.
To find the summed area (integral) inside a rectangle [i0, i1]× [j0, j1] wesimply combine four samples from the summed area table.
S([i0, i1]× [j0, j1]) = s(i1, j1)− s(i1, j0 − 1)− s(i0 − 1, j1) + s(i0 − 1, j0 − 1)
Summed area tables have been used in face detection [Viola & Jones, 04]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 46 / 65
![Page 86: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/86.jpg)
Integral Images
If an image is going to be repeatedly convolved with different box filters, itis useful to compute the summed area table.
It is the running sum of all the pixel values from the origin
s(i , j) =i∑
k=0
j∑l=0
f (k , l)
This can be efficiently computed using a recursive (raster-scan) algorithm
s(i , j) = s(i − 1, j) + s(i , j − 1)− s(i − 1, j − 1) + f (i , j)
The image s(i , j) is called an integral image and can actually be computedusing only two additions per pixel if separate row sums are used.
To find the summed area (integral) inside a rectangle [i0, i1]× [j0, j1] wesimply combine four samples from the summed area table.
S([i0, i1]× [j0, j1]) = s(i1, j1)− s(i1, j0 − 1)− s(i0 − 1, j1) + s(i0 − 1, j0 − 1)
Summed area tables have been used in face detection [Viola & Jones, 04]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 46 / 65
![Page 87: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/87.jpg)
Integral Images
If an image is going to be repeatedly convolved with different box filters, itis useful to compute the summed area table.
It is the running sum of all the pixel values from the origin
s(i , j) =i∑
k=0
j∑l=0
f (k , l)
This can be efficiently computed using a recursive (raster-scan) algorithm
s(i , j) = s(i − 1, j) + s(i , j − 1)− s(i − 1, j − 1) + f (i , j)
The image s(i , j) is called an integral image and can actually be computedusing only two additions per pixel if separate row sums are used.
To find the summed area (integral) inside a rectangle [i0, i1]× [j0, j1] wesimply combine four samples from the summed area table.
S([i0, i1]× [j0, j1]) = s(i1, j1)− s(i1, j0 − 1)− s(i0 − 1, j1) + s(i0 − 1, j0 − 1)
Summed area tables have been used in face detection [Viola & Jones, 04]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 46 / 65
![Page 88: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/88.jpg)
Integral Images
If an image is going to be repeatedly convolved with different box filters, itis useful to compute the summed area table.
It is the running sum of all the pixel values from the origin
s(i , j) =i∑
k=0
j∑l=0
f (k , l)
This can be efficiently computed using a recursive (raster-scan) algorithm
s(i , j) = s(i − 1, j) + s(i , j − 1)− s(i − 1, j − 1) + f (i , j)
The image s(i , j) is called an integral image and can actually be computedusing only two additions per pixel if separate row sums are used.
To find the summed area (integral) inside a rectangle [i0, i1]× [j0, j1] wesimply combine four samples from the summed area table.
S([i0, i1]× [j0, j1]) = s(i1, j1)− s(i1, j0 − 1)− s(i0 − 1, j1) + s(i0 − 1, j0 − 1)
Summed area tables have been used in face detection [Viola & Jones, 04]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 46 / 65
![Page 89: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/89.jpg)
Example of Integral Images
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 47 / 65
![Page 90: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/90.jpg)
Non-linear filters: Median filter
We have seen linear filters, i.e., their response to a sum of two signals isthe same as the sum of the individual responses.
h ◦ (f + g) = h ◦ f + h ◦ g
Median filter: Non linear filter that selects the median value from eachpixels neighborhood.
Robust to outliers, but not good for Gaussian noise.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 48 / 65
![Page 91: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/91.jpg)
Non-linear filters: Median filter
We have seen linear filters, i.e., their response to a sum of two signals isthe same as the sum of the individual responses.
h ◦ (f + g) = h ◦ f + h ◦ g
Median filter: Non linear filter that selects the median value from eachpixels neighborhood.
Robust to outliers, but not good for Gaussian noise.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 48 / 65
![Page 92: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/92.jpg)
Non-linear filters: Median filter
We have seen linear filters, i.e., their response to a sum of two signals isthe same as the sum of the individual responses.
h ◦ (f + g) = h ◦ f + h ◦ g
Median filter: Non linear filter that selects the median value from eachpixels neighborhood.
Robust to outliers, but not good for Gaussian noise.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 48 / 65
![Page 93: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/93.jpg)
Example of non-linear filters
(Median filter) (α-trimmed mean)
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 49 / 65
![Page 94: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/94.jpg)
Non-linear filters: Median filter
We have seen linear filters, i.e., their response to a sum of two signals isthe same as the sum of the individual responses.
h ◦ (f + g) = h ◦ f + h ◦ g
Median filter: Non linear filter that selects the median value from eachpixels neighborhood.
Robust to outliers, but not good for Gaussian noise.
α-trimmed mean: averages together all of the pixels except for the αfraction that are the smallest and the largest.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 50 / 65
![Page 95: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/95.jpg)
Non-linear filters: Median filter
We have seen linear filters, i.e., their response to a sum of two signals isthe same as the sum of the individual responses.
h ◦ (f + g) = h ◦ f + h ◦ g
Median filter: Non linear filter that selects the median value from eachpixels neighborhood.
Robust to outliers, but not good for Gaussian noise.
α-trimmed mean: averages together all of the pixels except for the αfraction that are the smallest and the largest.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 50 / 65
![Page 96: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/96.jpg)
Example of non-linear filters
(Median filter) (α-trimmed mean)
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 51 / 65
![Page 97: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/97.jpg)
Bilateral Filtering
Weighted filter kernel with a better outlier rejection.
Instead of rejecting a fixed percentage, we reject (in a soft way) pixels whosevalues differ too much from the central pixel value.
The output pixel value depends on a weighted combination of neighboringpixel values
g(i , j) =
∑k,l f (k, l)w(i , j , k, l)∑
k,l w(i , j , k, l)
Data-dependent bilateral weight function
w(i , j , k, l) = exp
(− (i − k)2 + (j − l)2
2σ2d
− ||f (i , j)− f (k, l)||2
2σ2r
)composed of the domain kernel and the range kernel.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 52 / 65
![Page 98: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/98.jpg)
Bilateral Filtering
Weighted filter kernel with a better outlier rejection.
Instead of rejecting a fixed percentage, we reject (in a soft way) pixels whosevalues differ too much from the central pixel value.
The output pixel value depends on a weighted combination of neighboringpixel values
g(i , j) =
∑k,l f (k , l)w(i , j , k , l)∑
k,l w(i , j , k , l)
Data-dependent bilateral weight function
w(i , j , k, l) = exp
(− (i − k)2 + (j − l)2
2σ2d
− ||f (i , j)− f (k, l)||2
2σ2r
)composed of the domain kernel and the range kernel.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 52 / 65
![Page 99: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/99.jpg)
Bilateral Filtering
Weighted filter kernel with a better outlier rejection.
Instead of rejecting a fixed percentage, we reject (in a soft way) pixels whosevalues differ too much from the central pixel value.
The output pixel value depends on a weighted combination of neighboringpixel values
g(i , j) =
∑k,l f (k , l)w(i , j , k , l)∑
k,l w(i , j , k , l)
Data-dependent bilateral weight function
w(i , j , k , l) = exp
(− (i − k)2 + (j − l)2
2σ2d
− ||f (i , j)− f (k, l)||2
2σ2r
)composed of the domain kernel and the range kernel.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 52 / 65
![Page 100: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/100.jpg)
Bilateral Filtering
Weighted filter kernel with a better outlier rejection.
Instead of rejecting a fixed percentage, we reject (in a soft way) pixels whosevalues differ too much from the central pixel value.
The output pixel value depends on a weighted combination of neighboringpixel values
g(i , j) =
∑k,l f (k , l)w(i , j , k , l)∑
k,l w(i , j , k , l)
Data-dependent bilateral weight function
w(i , j , k , l) = exp
(− (i − k)2 + (j − l)2
2σ2d
− ||f (i , j)− f (k, l)||2
2σ2r
)composed of the domain kernel and the range kernel.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 52 / 65
![Page 101: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/101.jpg)
Example Bilateral Filtering
Figure: Bilateral filtering [Durand & Dorsey, 02]. (a) noisy step edge input. (b)domain filter (Gaussian). (c) range filter (similarity to center pixel value). (d)bilateral filter. (e) filtered step edge output. (f) 3D distance between pixels
[Source: R. Szeliski]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 53 / 65
![Page 102: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/102.jpg)
Distance Transform
Useful to quickly precomputing the distance to a curve or a set of points.
Let d(k, l) be some distance metric between pixel offsets, e.g., Manhattandistance
d(k , l) = |k |+ |l |
or Euclidean distanced(k , l) =
√k2 + l2
The distance transform D(i , j) of a binary image b(i , j) is defined as
D(i , j) = mink,l ;b(k,l)=0
d(i − k , j − l)
it is the distance to the nearest pixel whose value is 0.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 54 / 65
![Page 103: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/103.jpg)
Distance Transform
Useful to quickly precomputing the distance to a curve or a set of points.
Let d(k, l) be some distance metric between pixel offsets, e.g., Manhattandistance
d(k , l) = |k |+ |l |
or Euclidean distanced(k , l) =
√k2 + l2
The distance transform D(i , j) of a binary image b(i , j) is defined as
D(i , j) = mink,l ;b(k,l)=0
d(i − k , j − l)
it is the distance to the nearest pixel whose value is 0.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 54 / 65
![Page 104: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/104.jpg)
Distance Transform
Useful to quickly precomputing the distance to a curve or a set of points.
Let d(k, l) be some distance metric between pixel offsets, e.g., Manhattandistance
d(k , l) = |k |+ |l |
or Euclidean distanced(k , l) =
√k2 + l2
The distance transform D(i , j) of a binary image b(i , j) is defined as
D(i , j) = mink,l ;b(k,l)=0
d(i − k , j − l)
it is the distance to the nearest pixel whose value is 0.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 54 / 65
![Page 105: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/105.jpg)
Distance Transform Algorithm
The Manhattan distance can be computed using a forward and backwardpass of a simple raster-scan algorithm.
Forward pass: each non-zero pixel in b is replaced by the minimum of 1 +the distance of its north or west neighbor.
Backward pass: the same, but the minimum is both over the current valueD and 1 + the distance of the south and east neighbors.
Figure: City block distance transform: (a) original binary image; (b) top to bottom (forward)raster sweep: green values are used to compute the orange value; (c) bottom to top (backward)raster sweep: green values are merged with old orange value; (d) final distance transform.
[Source: R. Szeliski]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 55 / 65
![Page 106: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/106.jpg)
Distance Transform Algorithm
The Manhattan distance can be computed using a forward and backwardpass of a simple raster-scan algorithm.
Forward pass: each non-zero pixel in b is replaced by the minimum of 1 +the distance of its north or west neighbor.
Backward pass: the same, but the minimum is both over the current valueD and 1 + the distance of the south and east neighbors.
Figure: City block distance transform: (a) original binary image; (b) top to bottom (forward)raster sweep: green values are used to compute the orange value; (c) bottom to top (backward)raster sweep: green values are merged with old orange value; (d) final distance transform.
[Source: R. Szeliski]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 55 / 65
![Page 107: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/107.jpg)
Distance Transform Algorithm
The Manhattan distance can be computed using a forward and backwardpass of a simple raster-scan algorithm.
Forward pass: each non-zero pixel in b is replaced by the minimum of 1 +the distance of its north or west neighbor.
Backward pass: the same, but the minimum is both over the current valueD and 1 + the distance of the south and east neighbors.
Figure: City block distance transform: (a) original binary image; (b) top to bottom (forward)raster sweep: green values are used to compute the orange value; (c) bottom to top (backward)raster sweep: green values are merged with old orange value; (d) final distance transform.
[Source: R. Szeliski]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 55 / 65
![Page 108: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/108.jpg)
Example of Distance Transform
More complicated in the Euclidean case.
Example of a distance transform
The ridges is the skeleton or medial axis.
Extension: Signed distance transform.
[Source: P. Felzenszwalb]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 56 / 65
![Page 109: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/109.jpg)
Fourier Transform
Fourier analysis could be used to analyze the frequency characteristics ofvarious filters.
How can we analyze what a given filter does to high, medium, and lowfrequencies?
Pass a sinusoid of known frequency through the filter and to observe by howmuch it is attenuated
s(x) = sin(2πfx + φi ) = sin(ωx + φi )
with frequency f , angular frequency ω and phase φi .
If we convolve the sinusoidal signal s(x) with a filter whose impulse responseis h(x), we get another sinusoid of the same frequency but differentmagnitude and phase
o(x) = h(x) ∗ s(x) = A sin(ωx + φo)
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 57 / 65
![Page 110: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/110.jpg)
Fourier Transform
Fourier analysis could be used to analyze the frequency characteristics ofvarious filters.
How can we analyze what a given filter does to high, medium, and lowfrequencies?
Pass a sinusoid of known frequency through the filter and to observe by howmuch it is attenuated
s(x) = sin(2πfx + φi ) = sin(ωx + φi )
with frequency f , angular frequency ω and phase φi .
If we convolve the sinusoidal signal s(x) with a filter whose impulse responseis h(x), we get another sinusoid of the same frequency but differentmagnitude and phase
o(x) = h(x) ∗ s(x) = A sin(ωx + φo)
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 57 / 65
![Page 111: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/111.jpg)
Fourier Transform
Fourier analysis could be used to analyze the frequency characteristics ofvarious filters.
How can we analyze what a given filter does to high, medium, and lowfrequencies?
Pass a sinusoid of known frequency through the filter and to observe by howmuch it is attenuated
s(x) = sin(2πfx + φi ) = sin(ωx + φi )
with frequency f , angular frequency ω and phase φi .
If we convolve the sinusoidal signal s(x) with a filter whose impulse responseis h(x), we get another sinusoid of the same frequency but differentmagnitude and phase
o(x) = h(x) ∗ s(x) = A sin(ωx + φo)
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 57 / 65
![Page 112: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/112.jpg)
Fourier Transform
Fourier analysis could be used to analyze the frequency characteristics ofvarious filters.
How can we analyze what a given filter does to high, medium, and lowfrequencies?
Pass a sinusoid of known frequency through the filter and to observe by howmuch it is attenuated
s(x) = sin(2πfx + φi ) = sin(ωx + φi )
with frequency f , angular frequency ω and phase φi .
If we convolve the sinusoidal signal s(x) with a filter whose impulse responseis h(x), we get another sinusoid of the same frequency but differentmagnitude and phase
o(x) = h(x) ∗ s(x) = A sin(ωx + φo)
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 57 / 65
![Page 113: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/113.jpg)
Filtering and Fourier
Convolution can be expressed as a weighted summation of shifted inputsignals (sinusoids); so it is just a single sinusoid at that frequency.
o(x) = h(x) ∗ s(x) = A sin(ωx + φo)
A is the gain or magnitude of the filter, while the phase difference∆φ = φo − φi is the shift or phase
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 58 / 65
![Page 114: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/114.jpg)
Complex notation
The sinusoid is express as s(x) = e jωx = cosωx + j sinωx and the filtersinusoid as
o(x) = h(x) ∗ s(x) = Ae jωx+φ
The Fourier transform pair is
h(x)←→ H(ω)
The Fourier transform in continuous domain
H(ω) =
∫ ∞−∞
h(x)e−jωxdx
The Fourier transform in discrete domain
H(k) =1
N
N−1∑x=0
h(x)e−j2πkxN
where N is the length of the signal.
The discrete form is known as the Discrete Fourier Transform (DFT).
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 59 / 65
![Page 115: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/115.jpg)
Complex notation
The sinusoid is express as s(x) = e jωx = cosωx + j sinωx and the filtersinusoid as
o(x) = h(x) ∗ s(x) = Ae jωx+φ
The Fourier transform pair is
h(x)←→ H(ω)
The Fourier transform in continuous domain
H(ω) =
∫ ∞−∞
h(x)e−jωxdx
The Fourier transform in discrete domain
H(k) =1
N
N−1∑x=0
h(x)e−j2πkxN
where N is the length of the signal.
The discrete form is known as the Discrete Fourier Transform (DFT).
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 59 / 65
![Page 116: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/116.jpg)
Complex notation
The sinusoid is express as s(x) = e jωx = cosωx + j sinωx and the filtersinusoid as
o(x) = h(x) ∗ s(x) = Ae jωx+φ
The Fourier transform pair is
h(x)←→ H(ω)
The Fourier transform in continuous domain
H(ω) =
∫ ∞−∞
h(x)e−jωxdx
The Fourier transform in discrete domain
H(k) =1
N
N−1∑x=0
h(x)e−j2πkxN
where N is the length of the signal.
The discrete form is known as the Discrete Fourier Transform (DFT).
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 59 / 65
![Page 117: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/117.jpg)
Complex notation
The sinusoid is express as s(x) = e jωx = cosωx + j sinωx and the filtersinusoid as
o(x) = h(x) ∗ s(x) = Ae jωx+φ
The Fourier transform pair is
h(x)←→ H(ω)
The Fourier transform in continuous domain
H(ω) =
∫ ∞−∞
h(x)e−jωxdx
The Fourier transform in discrete domain
H(k) =1
N
N−1∑x=0
h(x)e−j2πkxN
where N is the length of the signal.
The discrete form is known as the Discrete Fourier Transform (DFT).
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 59 / 65
![Page 118: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/118.jpg)
Complex notation
The sinusoid is express as s(x) = e jωx = cosωx + j sinωx and the filtersinusoid as
o(x) = h(x) ∗ s(x) = Ae jωx+φ
The Fourier transform pair is
h(x)←→ H(ω)
The Fourier transform in continuous domain
H(ω) =
∫ ∞−∞
h(x)e−jωxdx
The Fourier transform in discrete domain
H(k) =1
N
N−1∑x=0
h(x)e−j2πkxN
where N is the length of the signal.
The discrete form is known as the Discrete Fourier Transform (DFT).
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 59 / 65
![Page 119: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/119.jpg)
Properties Fourier Transform
[Source: R. Szeliski]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 60 / 65
![Page 120: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/120.jpg)
[Source: R. Szeliski]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 61 / 65
![Page 121: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/121.jpg)
[Source: R. Szeliski]Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 62 / 65
![Page 122: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/122.jpg)
2D Fourier Transform
Same as 1D, but in 2D. Now the sinusoid is
s(x , y) = sin(ωxx + ωyy)
The 2D Fourier in continuous domain is then
H(ωx , ωy ) =
∫ ∞−∞
∫ ∞−∞
h(x , y)e−j(ωxx+ωy y)dxdy
and in the discrete domain
H(kx , ky ) =1
MN
M−1∑x=0
N−1∑y=0
h(x , y)e−2πjkx x+ky y
MN
where M and N are the width and height of the image.
All the properties carry over to 2D.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 63 / 65
![Page 123: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/123.jpg)
2D Fourier Transform
Same as 1D, but in 2D. Now the sinusoid is
s(x , y) = sin(ωxx + ωyy)
The 2D Fourier in continuous domain is then
H(ωx , ωy ) =
∫ ∞−∞
∫ ∞−∞
h(x , y)e−j(ωxx+ωy y)dxdy
and in the discrete domain
H(kx , ky ) =1
MN
M−1∑x=0
N−1∑y=0
h(x , y)e−2πjkx x+ky y
MN
where M and N are the width and height of the image.
All the properties carry over to 2D.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 63 / 65
![Page 124: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/124.jpg)
2D Fourier Transform
Same as 1D, but in 2D. Now the sinusoid is
s(x , y) = sin(ωxx + ωyy)
The 2D Fourier in continuous domain is then
H(ωx , ωy ) =
∫ ∞−∞
∫ ∞−∞
h(x , y)e−j(ωxx+ωy y)dxdy
and in the discrete domain
H(kx , ky ) =1
MN
M−1∑x=0
N−1∑y=0
h(x , y)e−2πjkx x+ky y
MN
where M and N are the width and height of the image.
All the properties carry over to 2D.
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 63 / 65
![Page 125: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/125.jpg)
Example of 2D Fourier Transform
[Source: A. Jepson]
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 64 / 65
![Page 126: Visual Recognition: Filtering and Transformationsurtasun/courses/CV/lecture03.pdf · Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 33 / 65. When are Pyramids Useful? We might](https://reader036.fdocuments.us/reader036/viewer/2022070107/6023fd4285671c184365b42a/html5/thumbnails/126.jpg)
Next class ... image features
Raquel Urtasun (TTI-C) Visual Recognition Jan 15, 2012 65 / 65