Post on 26-Jun-2018
Overview
•Importance of Features
•Mathematical Notation & Background
•Fourier Transform
•Windowed Fourier Transform
•Wavelets
•Feature Anecdote
•Literature & Homework
Fall 2004 Pattern Recognition for Vision
General Remarks
x1, x2 ,…, xFeature vector (
Classifier
Feature Extraction
n)
Photograph by MIT OCW.
“The choice of features is more important than the choice of the classifier.”
Fall 2004 Pattern Recognition for Vision
General Remarks—Application specific
Traffic sign recognition Color, shape (Hough transform)
Texture Recognition
DFT, WT
Action Recognition
Motion-based features
Fall 2004 Pattern Recognition for Vision
General Remarks
Is the choice of features really more important than the choice of the classifier?
We know that a fly uses optical flow features for navigation.
Still, technical systems using optical flow for navigation are (far) behind capabilities of a fly.
Fall 2004 Pattern Recognition for Vision
General Remarks—why talk about FT, WFT, WT?
Fourier Transform(FT), Windowed FT (WFT) and Wavelet Transform (WT)
•used in many computer vision applications •derivation from signal processing •basic tools for engineers
Other features:color, motion features (optical flow),gradient features, SIFT (orientation histograms),affine invariant features, steerable filters (overcomplete wavelets),…
Fall 2004 Pattern Recognition for Vision
ˆ ( )f ω
( )f t
, ,
Norm
2j tt j πωπω πω+ =
,f g
f
2 , , ( )L RH L
2 2a jb a b+ = +
Background—Notation
Complex conjugate
Z R C
Inner product
Fourier transform
integers, real, complex
cos 2 sin 2 t e Euler formula
function spaces
Absolute value
Fall 2004 Pattern Recognition for Vision
Background—Vector and Function Spaces
[ ]1
T
Nu u=u
f t t ∈ R
f n n∈ Z
,...,
( ),
Vectors and Functions
( ),
Fall 2004 Pattern Recognition for Vision
Background—Inner Product&Norm
1
, N
n n n
u v =
≡ ∑u v 2
,≡u
f t ∞
−∞
≡ ∫ 2
( ) ( )f t ≡
2( ) ( )f n ≡( )f n
∞
−∞
≡ ∑
2 22
2 nL n
L u= ∑u
Inner Product&Norm
u u
( ), ( ) ( ) ( ) t g t f t g t d ( ), f t f t
( ), f n f n ( ), ( ) ( ) n g n f n g
norm:
Fall 2004 Pattern Recognition for Vision
, , , , , ,h c f g f c+ = + ∈ ∈CH
, ,f g g f=
, 0f t f f> ∈ ≠H
, , ,f g f g f g f g f g+ ≤ + ≤ ∈H
2 ( )L R
2 2 : ,f f f t≡ → ≡ ∫R CH
Background—Function Spaces
Inner Product cont.
for f cg f h g h
( ) 0 for all Positivity:
Hermiticity:
Linearity:
Triangle & Schwarz inequality
for all
Function space, finite energy
( ) dt < ∞
Fall 2004 Pattern Recognition for Vision
1
1 1
, ,
N N N
N n
n n N n n
u u u u =
∀ ∈
= =∑
b b C u C
u b b u
1
1
( ) ( ) , ( ( ) ( )
N
N n
n n n N n n
f f g
g t c c c f
g t g g g f ω ωω ω ω ω
=
∀ ∈
= =
= =
∑
∫
H H
0 ,
1 f fω ω
ω ω ω ω ′
′≠ = ′=
( ) ,f f gω ω ωω= =
Background—Basis
Basis of a Vector and Function Space
,..., is a basis of if
,..., is unique,
,..., is a basis of if
( ) ( ), ,..., is unique, ( ), ( )
( ) ) is unique, ( ),
c f t t g t
f t d t g t
Orthonormal Basis
f g
Fall 2004 Pattern Recognition for Vision
Fourier Series—Continuous Signal
/ 2 2
/ 2
2
2 2
2
( )
, ,
1 , ,
1( )
T j kt
T k
T
j kt lT
k k l kT
k k kT T k
j kt
T k
k
c dt
s e T
c s f c T
f t T
π
π
π
δ
−
−
∞
∞
=
= =
= =
=
∫
∑
∑
-15 -10 -5 0 5 10 15 -1
0
1
T
[ ]( )2( ) / / 2
f t T
f t L T T∈ −
( )
f t e
s s
f t
c e
=−∞
=−∞
Continuous, periodic signal
-0.5
0.5
1.5
( ) periodic with period
2,
Fall 2004 Pattern Recognition for Vision
Fourier Series—Examples
-5 -4 -3 -2 -1 0 1 2 3 4 5 0
1
2
T
1/ T− 1/ T -8 -6 -4 -2 0 2 4 6 8
-1
0
1
0 1
0
1
T
-4 -3 -2 -1 0 1 2 3 4
0
1
0 1
1/τ
1/ T τ
0.2
0.4
0.6
0.8
1.2
1.4
1.6
1.8
-0.8
-0.6
-0.4
-0.2
0.2
0.4
0.6
0.8
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
-0.4
-0.2
0.2
0.4
0.6
0.8
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Fall 2004 Pattern Recognition for Vision
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
/ 2 2 2
/ 2
2 2
2
ˆ( ) ( )
1( )
ˆ( ) ( )
k
k
T j t j t
k
T
j kt jT
k k k k k
j t
c dt f dt
c e c e T
T f e d
πω πω
π πω
πω
ω
ω
ω ω
∞ − −
−
∞ ∞
∞
−∞
= → =
= = ∆
→ ∞ =
∫ ∫
∑ ∑
∫
01
-2 -1 0 1 2 0
1
x
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
2( ) ( )f t L∈ R
[ ]( )2 2
1
/ 2 ( )
/
1 k
k k
T
L T T L
k T
T
ω
ω ω ω+
→ ∞
− →
=
∆ = − =
R
Pattern Recognition for Vision Fall 2004
Fourier Transform
( ) f t e f t e
f t
f t
−∞
=−∞ =−∞
Continuous signals,
0.5
-1.5 -0.5 0.5 1.5
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
f(x)
/ 2,
Fourier Transform
Another look at the inverse Fourier transform
' dt e j2πωt ∞ ∞
f t e − j2πωt ' ' f t ∫ ∫( ) = ( ) dω −∞ −∞
∞ ∞ ∞ '
= ∫ ∫ ' −f t e− j2πω (t t )d dt ' = ∫ ' ' ' ( ) ω f t δ − )( ) ( t t dt −∞ −∞ −∞
Fall 2004 Pattern Recognition for Vision
02
0
' ' '
' ' '
ˆ ˆ( ) ( )
ˆ( ) ( )
ˆ ˆ) ( ) ( )
( ) ˆ2 ( )
1 ˆ( )
ˆ ˆ) ( ) ( )
j t
B A f Bg
e f
t f g
j f dt
f At f A A
t f g
πω
ω ω
ω
ω ω
ω
ω ω
−
∞
−∞
∞
−∞
+ +
−
−
+
∫
∫ 2
' ' ' ˆ( ) ( )t f ω ∞
−∞
+∫
Fourier Transform—Properties
Linearity: ( ) ( )
Shift:
Convolution: ( ) (
Derivative:
Scaling:
Correlation : ( ) (
Auto
A f t g t
f t t
f t g t t d
d f t
f t g t t d
πω ω
correlation: ( ) f t f t t d
Fall 2004 Pattern Recognition for Vision
Fourier Transform—Plancherel&Parseval
22
22
ˆ
ˆ
f dt f d
f f
ω ∞ ∞
−∞
=
=
∫ ∫
ˆ ˆ
ˆ ˆ, ,
f g dt
f g
ω ∞ ∞
−∞ −∞
=
=
∫ ∫
−∞
Plancherel’s Theorem
f g d
f g
Parseval Identity
Fall 2004 Pattern Recognition for Vision
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Fourier Transform—Examples
01
-2 -1 0 1 1.5 2 0
1
x
0
1
2 2 )te σ−
01
-1 0 0.2 0.4 1 0
1
x
0
1
( / )T
-4 -3 -2 -1 0 1 2 3 4
0
1
0 1
0
1
) )
TT T T
T
πω ωπω
=
01
-2 -1 0 1 1.5 2 0
1
f
0
1
2 2 222 eπσ −
0.5
-1.5 -0.5 0.5
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
f(x)
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
/(2
0.5
-0.8 -0.6 -0.4 -0.2 0.6 0.8
0.5
1.5
f(x)
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
rect t
-0.4
-0.2
0.2
0.4
0.6
0.8
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
sin( sinc(
0.5
-1.5 -0.5 0.5
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
F(f)
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
π σ ω
Time Frequency
Fall 2004 Pattern Recognition for Vision
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Fourier Transform—Examples
01
-1 0 0.2 0.4 1 0
1
x
0
1
ω Ω
-4 -3 -2 -1 0 1 2 3 4
0
1
0 1
0
1
)2 2 )
2
t t
t
π π
ΩΩ = Ω ΩΩ
Ω
0.5
-0.8 -0.6 -0.4 -0.2 0.6 0.8
0.5
1.5
f(x)
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
( /(2 )) rect
-0.4
-0.2
0.2
0.4
0.6
0.8
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
sin(2 sinc(2
Time Frequency
Fall 2004 Pattern Recognition for Vision
Fourier Transform—Sampling Theorem
Ω
( )
ˆ ( )
( ) 2 2n
f
nf t
ω ω ∞
=
= Ω ∑
-4 -3 -2 -1 0 1 2 3 4 0
1
)Ω
0, for
sinc t n f =−∞
> Ω
Ω −
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1/(2
Fall 2004 Pattern Recognition for Vision
Fourier Transform—Sampling Theorem
( )
( ) ( )
ˆ ( ) 0, for , ( ) sinc 2 2
, ( ) sinc 2 ( )2
n
n n n n
nf f t t n f
n t f t t t f t
ω ω ∞
=−∞
∞
=−∞
= > Ω = Ω − Ω
= = Ω − Ω
∑
∑
2 2 2
Inverse FT
2 2
2 ( ) 2
ˆexpand ( ) into a Fourier series:
1ˆ ˆ ˆ( ) , ( ) ( ) ( )2
ˆ ˆ( ) ( ) ( )
1 1( ) ( )
2 2
n n n
n
j t j t j t n n n
j t j t
j t t j n n
f
f c e c f e d f e d f t
f t f e d f e d
f t e d f t e
πω πω πω
πω πω
πω πω
ω
ω ω ω ω ω
ω ω ω ω
ω
Ω ∞ −
−Ω −∞
∞ Ω
−∞ −Ω Ω
−
−Ω
= = = = Ω
= =
= = Ω Ω
∑ ∫ ∫
∫ ∫
∑∫
( ) ( )
( )
Inverse FT of rect function
sinc 2 ( ) basis of bandlimited functions
nt t
n n n
d
t t f t
ω Ω
−
−Ω
∞
=−∞
= Ω −
∑ ∫
∑
Fall 2004 Pattern Recognition for Vision
Fourier Series—Discrete Signals
21
0
21
0
( )
1( )
j knN N
k n
j knN N
k k
c
f n N
π
π
−−
=
−
=
=
=
∑
∑ -4 -3 -2 -1 0 1 2 3 4
0
1
0
1
Discrete, periodic signal
f n e
c e 0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
Discrete Fourier Transform (DFT)
Fall 2004 Pattern Recognition for Vision
Fourier Transform—Summary
FS
FTContinuous Continuous
Continuous, periodic
Discrete
Time Frequency
Fall 2004 Pattern Recognition for Vision
Fourier Transform—Summary
Time
Discrete, periodic Discrete, periodic
DFT
Frequency
Fall 2004 Pattern Recognition for Vision
Discrete Fourier Transform—Fast Fourier Transform
Decimation in Space N −1 − j 2π kn
( ) (ck =∑ f n e N O N 2 ) n=0
N / 2 1 − j 2π k n N / 2 1 − j 2π k (2n+1)− 2 − N N(2 )f n e + ∑ f (2n +1) e= ∑
n=0 n=0
N / 2 1 − j 2π kn − j 2π k N / 2 1 − j 2π kn− − N / 2 + e N N / 2(2 ) (2= ∑ f n e f n +1) e ∑
n=0 Wk n=0
e ockck
− j 2π k − j 2π (k N / 2)+
e N = −e N ⇔ Wk = −Wk N / 2+
N / 2 1 − j 2π kn N / 2 1 − j 2π ( k N / 2)n− − +
n e N / 2 n e N / 2 ⇔ cke ef (2 ) f (2 ) +∑ = ∑ = ck N / 2
n=0 n=0
N / 2 1 − j 2π kn N / 2 1 − j 2π ( k N / 2)n− − + N / 2 ⇔ ck
o o(2 (2∑ f n +1) e N / 2 = ∑ f n +1) e c= k N / 2+n=0 n=0
Fall 2004 Pattern Recognition for Vision
Discrete Fourier Transform— Fast Fourier Transform
Wk = −Wk N / 2 ck e +Wk k
o
+ = ck c e e 0 ≤ < ck + e oc= k N / 2 ck N / 2 = ck −W c
k N / 2 + k k
o ock +c= k N / 2
Recursion: N −1 − j 2π kn
( ) N e ock =∑ f n e c0 = c0 + c0n=0
N ' −1 − j2π kn c1 = c0 e − c0
o ' e n e Nc = ∑ f (2 )
k n=0 Complexity: N ' −1 − j2π kn
'oc = ∑ f (2n +1) e N M / 2 ld( M ) Multiplications k
n=0 M ld(M ) Summations ' (N = N / 2 O N 2 / 2)
Fall 2004 Pattern Recognition for Vision
Discrete Fourier Transform—Image Analysis
221 1
0 0
221 1
0 0
1( , ) ( , )
1 ( , )
yx
yx
jjM N N M
x y y x
jjM N N M
y x
e MN
e MN
ππ
ππ
−−− −
= =
−−− −
= =
=
=
∑ ∑
∑ ∑
21
0
21
0
( , ) ( , )
( , )
x
y
jN N
x x
j k yM M
x y x y
π
π
−−
= −−
=
=
=
∑
∑
Mrows
N
y
x
Courtesy of Professors Tomaso Poggio and
k y k x
k y k x
c k k f x y e
f x y e
( , )
k x
c k y f x y e
c k k c k y e
1-D DFT’s
1-D DFT’s columns
Sayan Mukherjee. Used with permission.
Fall 2004 Pattern Recognition for Vision
Discrete Fourier Transform—Image Analysis
xk ykx
y
Example
Fall 2004 Pattern Recognition for Vision
Discrete Fourier Transform—Image Analysis
(A) (B)
(A) (B)
y
x
yω
xω
Image
Amplitude of (A) & Phase of (B)
Spectrum Vertical edges
Horizontal edges
Fall 2004 Courtesy of Professors Tomaso Poggio and Sayan Mukherjee. Used with permission. Pattern Recognition for Vision
Discrete Fourier Transform—Image Analysis
Template Matching∞
' ' ( ) = f (x ) ( ) xcorr x ∫ g x − x dx ' g′( ) = g(−x) −∞
∞ ' ' ' ( ) = ∗ g′ = f (x )g′(x − x )dx f ( ) ˆ ωcorr x f ∫ ω g′( )
−∞
( )f x
( )g x
( )corr x
Fall 2004 Pattern Recognition for Vision
Discrete Fourier Transform—Image Analysis
ϕ
x
y
r
ϕ r
Shape description with Fourier descriptors
Invariance: •Scale •Rotation •Translation
Fall 2004 Pattern Recognition for Vision
Discrete Fourier Transform—Image Analysis
Im Re
t Im
tRe
t
Shape description with Fourier descriptors
Fall 2004 Pattern Recognition for Vision
Windowed Fourier Transform—Motivation
Fourier transform of a chirp signal
2cos( π t ), ω = tinst
Fall 2004 Pattern Recognition for Vision
2) j uf t duπωω ∞
−
−∞
= −∫ [ ]g T⊂ −
( ) )tf u t= − [ ],tf t T t⊂ −
( )f u
u
( )g u t−
t T− t
2j u tf t duπωω
∞ −
−∞
= ∫
Windowed Fourier Transform
( , ) ( ) ( f u g u t e
Windowed Fourier Transform (WFT)
supp , 0
( ) ( f u g u supp
Fourier transform: ( , ) ( ) f u e
Fall 2004 Pattern Recognition for Vision
2) j uf t duπωω ∞
−
−∞
= −∫
2 2ˆˆ ( )j t j tf t e g f e dπω πνω ν ν ∞
−
−∞
= −∫
2 ,
, , ,
( )
ˆˆ, ,
j u t
t t t
g u
f t g u g f g f
πω ω
ω ω ωω ∞
−∞
−
= = =∫
2 2 2 ( ) ,
2 ( ) 2 ( )
ˆ ( ) ( ) ,
by :
ˆ( ) ( )
j u j u j u t
j j t
g e du du
u u t
du e g
πω πν ω
π
ν
ν ω
∞ ∞ − − −
−∞ −∞
∞ ′− + − − −
−∞
= − = −
′ −
′ ′ = −
∫ ∫
∫
Windowed Fourier Transform—Time Frequency Symmetry
( , ) ( ) ( f u g u t e
( , ) ( ) ν ω
substitute by ( )
( , ) ( ) ( )
g u t e
u f u dParseval’s Identity
)(
( )
substitute
u t
g u t e g u t e
g u e
π ν ω
ν ω π ν ω
Fall 2004 Pattern Recognition for Vision
Windowed Fourier Transform—Time Frequency Localization
2 2ˆˆ ( )j t j tf t e g f e dπω πνω ν ν ∞
−
−∞
= −∫
2) j uf t duπωω ∞
−
−∞
= −∫
( )f u
u
0( )g u t−
0t
0( , )f tω
ω
t 0tˆ ( )f ν
ν
0ˆ ( )g ν ω−
0ω
0f tω
t
ω
0ω
[ ]/ / 2g T T⊂ −
Time Frequency Symmetry
( , ) ( ) ν ω
( , ) ( ) ( f u g u t e
( , )
supp 2,
Fall 2004 Pattern Recognition for Vision
2( )mt dt
∞
−∞
= ∫ 2ˆ ( )m g dω ω ω ω
∞
−∞
= ∫ 22 2( )t mt t dtσ
∞
−∞
= −∫ 22 2 ˆ( ) ( )m g dωσ ω ω
∞
−∞
= −∫
4 1tω ≥
2( ) 1g t =
2ˆ ( ) 1g ω =
2
( ) atg t a e π−= 21/ 4 /ˆ ( ) ag a e πωω −=
0m mt ω= =
1
4t a σ
π =
4
a ωσ
π =
Windowed Fourier Transform—Time Frequency Localization
t g t
Time Frequency Localization
( ) g t ω ω
πσ σ Heisenberg’s uncertainty principle
1/ 4 (2 )
(2 / )
Fall 2004 Pattern Recognition for Vision
Windowed Fourier Transform—Time Frequency Localization
1/ T− 1/ T -8 -6 -4 -2 0 2 4 6 8
-1
0
1
0 1
0
1
T
T− T -8 -6 -4 -2 0 2 4 6 8 -1
0
1
0 1
0
1
1/ T
Uncertainty Principle
-0.8
-0.6
-0.4
-0.2
0.2
0.4
0.6
0.8
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
Time Frequency
-0.8
-0.6
-0.4
-0.2
0.2
0.4
0.6
0.8
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
Fall 2004 Pattern Recognition for Vision
t
ω
tσ ωσ
tσ ωσ
tσ ωσ
tσ ωσ
0ω
0t
[ ] [ ]
0 0
0 0
( ,
t
f t f
t ω
ω
σ ω σ± ±
1
4t ωσ σ π
>
Windowed Fourier Transform—Time Frequency Localization
Time Frequency Localization
) describes
within a resolution cell:
Uncertainty principle:
Fall 2004 Pattern Recognition for Vision
Windowed Fourier—Reconstruction
( ) ( ) ( )tf u f u g u t= −
2( , ) ( ) j u tf t f u e duπωω
∞ −
−∞
= ∫
Reconstruction Formula
2
,
1( ) ( ) ( , )tf u g u f t d dt C g
C ω ω ω= =∫∫
,
2
2 2
( )
( ) ( ) ( , )
( ) ( ) ( , ) ( )
t
j u
j u
g u
C
g u t f u f t e d
f u g u t dt f t g u t e d dt
ω
πω
πω
ω ω
ω ω
∞
−∞ ∞ ∞ ∞
−∞ −∞ −∞
− =
− = −
∫
∫ ∫ ∫
Fall 2004 Pattern Recognition for Vision
Windowed Fourier—Reconstruction
( )f u
u
t
ω
Reconstruction Formula
Reconstruction
Fall 2004 Pattern Recognition for Vision
Windowed Fourier—Redundancy
Redundancy
( , , )
1( , ) ( , , ) ( , )
g
g
K t t
f t K t t f t dt d C
ω ω
ω ω ω ω ω ′ ′
′ ′ ′ ′ ′ ′= ∫∫
2 2 2, ,( ) ( )
1( , ) ( ), ( ) ( , ), ( , )t tL R L R
f t g u f u g t f t Cω ωω ω ω′ ′ ′ ′= =
, , , , ,( , ) ( ), ( ) ( , , ) ( ) ( )t t t g t tg t g u g u K t t g u g u duω ω ω ω ωω ω ω ∞
′ ′ ′ ′ −∞
′ ′ ′ ′= = = ∫
2
2
-Values of are correlated
-Not any function ( , ) can be a WFT
- ( , ) , is a reproducing kernel Hilbert space
f
f t L
f t L
ω ω
∈
∈ ⊂
F F
Parseval for WFT ,
2
( )
1( ) ( , ) ( )
t
j u
g u
f u f t g u t e d dt C
ω
πωω ω ∞ ∞
−∞ −∞
= −∫ ∫
Fall 2004 Pattern Recognition for Vision
Windowed Fourier Transform—Sampling Theorem
t
ω
sω
st
1s stω <
2( , ) ) sj sf n m duπ ω
∞ −
−∞
= −∫
Sampling Theorem
( ) ( m u f u g u nt e
Reconstruction if
Fall 2004 Pattern Recognition for Vision
Windowed Fourier Transform—Examples
2cos( πu ), ω = uinst
Fall 2004 Pattern Recognition for Vision
Time Scale Analysis—Motivation
( )f t
t
t
ω tσ
tσ
Reconstruction
Features with time scales much shorter and longer than
have to be synthesized with many notes.
Fall 2004 Pattern Recognition for Vision
2 ,
1( ) , , 0s t p
u t u u p
ss ψ ψ ψ− = ∈ ≠ >
( ) 2 , , , ,s t s tf s t u f f Lψ ψ
∞
−∞
= = ∈∫
2
10
( )
red
uu ueψ ψ ψ ψ
−
− −
=
Time Scale Analysis—Continuous Wavelet Transform
( ) 0, L s
( , ) ( ) u f u d
0.5,
1,0
2,15
green
blue
Fall 2004 Pattern Recognition for Vision
2
0
ˆ ( )0 C d
ψ ω ω
ω
∞
±
± < = ∫
, 2 3
0
, ,
( ) ( )s t p
s t s t
f u dtdsψ
ψ ψ
∞ ∞ −
−∞
= ∫ ∫
, , ,,
2 2 s t
s t
C CC C ψ ψ ψ− + = = =
2 3 ,
0
2( ) ( ) p
s tf u dtds C
ψ ∞ ∞
−
−∞
= ∫ ∫
Wavelet Transform—Reconstruction
Reconstruction Formula
< ∞ Admissibility condition:
( , )
reciprocal wavelet family of
u f s t s
if then is self reciprocal: s t
( , ) u f s t s
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Reconstruction
2
0
ˆ ( )0 C d
ψ ω ω
ω
∞
±
± < = ∫
ˆ 0 0u duψ ψ ∞
−∞
= =∫
C C− + =
ˆif
ˆ ˆ( ) ( )
ψ ψ ψ ψ ω ψ ω− =
< ∞
(0) ( )
is symmetric, is symmetric
if is real-valued,
Admissibility condition:
Symmetry condition:
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Plancherel&Parseval
2
2 ( ), , , ( )
Lf g f g= ∀ ∈
R R
L
2 31 ,
pf g s sdt
C C
∞ ∞ −
+ −
= + ∫ ∫
L
22 31 ,
pf f s dsdt
C C
∞ ∞ −
+ −
= + ∫ ∫
L
f g L
Plancherel Formula
( , ) ( , ) f s t g s t d−∞ −∞
( , ) f s t −∞ −∞
Parseval Identity
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Time Frequency Symmetry
2ˆ ˆ( ) ( )j t s s e sπωψ ω ψ ω−=
,
1 s t
u t p s u
ss ψ ψ − = > =
, , , ˆ ˆˆ ˆ, , ( ) ( )s t s t s tf s t f f f duψ ψ ψ ω ω
∞
−∞
= = = ∫
2ˆˆ ( ) ( ) j tf s t s s f e dπωψ ω ω ω ∞
−∞
= ∫
1( , )
u tf s t u
ss ψ
∞
−∞
− =
∫
0.5, 0 : ( )
( , )
( , )
( ) f u d
Fall 2004 Pattern Recognition for Vision
0 0ˆ, (tu t ωψ σ ψ ω ω σ
1( , )
u tf s t u
ss ψ
∞
−∞
′− ′ =
∫
2ˆˆ ( ) ( ) j tf s t s s f e dπωψ ω ω ω ∞
′
−∞
′ = ∫
0 ts t t sσ′+
0 / /s sωω σ
Wavelet Transform—Time Frequency Localization
Time Frequency Localization
( ) is centered at with ) is centered at with
( ) f u d
( , )
time window centered at with
frequency window centered at with
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Time Frequency Localization
t
ω
t ωσ σ ωσ
tσ ωσ
tσ
0.5 ωσ 2 tσ
0.5 ωσ 2 tσ
2 ωσ
tσ
2 ωσ
tσ
0ω
0 / 2ω
02ω
s
1
2
Time Frequency Localization
size of resolution cells
is constant:
0.5 0.5 1/ 2
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Redundancy
2 3 ( , ( , )
pf s t s K s tψ
− ′ ′ ′ ′ ′ ′= ∫∫
2, ,, ,s t s tL f fψ ψ= =
L 21 2 1 2, ,
Lf f f f=
L
2, , ,, ( ,s t s t L K s t s tψψ ′ ′ ′ ′= =
2
2- , ce
f
L
L
∈
∈ ⊂
L L
Redundancy
( , ) , ) s t s t f s t d
( , ) f s t
, ) s t ψ ψ ′ ′
-Values of are correlated
-Not any function ( , ) can be a WFT
( , ) is a reproducing kernel Hilbert spa
f s t
f s t
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Frames
1u
2u
3u 2e
1e
3e T
X Y
* *
, , :
, , , :
j j j
T T X Y
T T X
= →
= →
∑x e
x y 1 , ,M
X X N
X M N
Y Y M
= ∈ >
= u u
T=y x
, ,s tf s t fψ=
x
1 can beMu u
Notion of Frames
T Y
x u
x y
Hilbert space , dim
,...,
Hilbert space , dim
( , )
Unlike the basis the frames ,..., linearly dependent
Fall 2004 Pattern Recognition for Vision
1
if
M
X T Y
X X
T
∈ ∈ ∈
x x
u u
1
2 2 2 , 0
M X X
A T B
∈
≤ ≤
u u
x x x x
* 1 *
*
:
( )
T
S −
= = = = →
x y x
x y y
Wavelet Transform—Frames
is uniquely determined by
then ,..., is a frame of
is injective
,..., is a frame of if:
X B A ∀ ∈ ≥ ≥
Reconstruction of from
is selfadjoint eigenvalues are real
T T T
G T T
Frames are the framework for continuous and discrete wavelets
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Discrete Wavelets
2 3 ( , ( , )
pf s t s s tψ
− ′ ′ ′ ′ ′ ′= ∫∫
, as a σ σ= >
, 0ab= >
, ( , ), ,ψ ψ∈ ∈R Z
( , ) , ) K s t s t f s t d
( ) 1, elementary dilation step
Redundancy: Discrete wavelets
( , ) t a b σ β β
( , ), to s t s t a b a b
Exponential sampling
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Discrete Wavelets
0a =
t
1σ 0σ
xσ
β
,s t
1a =
a x=
1b = 2b =
s
, 0ab= >( ) , 1as a σ σ= >
Sampling of the plane
( , ) t a b σ β β
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Multiresolution Analysis
Multiresolution analysis of discrete signals
( ), f n n ∈ Ζ
Sampling f s t ( , ) on a dyadic grid, orthogonal wavelets: a a= ,σ = 2, β = 1: s = 2 , t b 2 a b ∈ Ζ
Two half band filters with impulse response h n g n ( ), ( )
n ( ) ( ) = f k h n k )fH ( ) = f n ∗ h n ∑ ( ) ( − k
( ) = f n ( ) = f k g n k )fL n ( ) ∗ g n ∑ ( ) ( − k
Downsampling: f * ( ) = (2 ), n nn fH n f * ( ) = (2 ) fLH L
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Multiresolution Analysis
( )h n
2
2
( )f n
* , ( )
L lf n
* , ( )
H lf n
( )g
* ( )g n
, ( )L lf n
* ( )h n2* , ( )
H lf n
2 +
( )g n
Analysis Wavelet coefficients
( ), n h n
Reconstruction
are a Quadrature mirror filter Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Multiresolution Analysis
1
1 t
h
1
t
2
g h
Haar Wavelets
0.5 are orthonormal
easy to compute but poor frequency resolution
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Multiresolution Analysis
( )f t
( )f n
* ( )f n
( )g t ( )h t ( )g t ( )h t
Daubechie Wavelets
Matlab Documentation
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Multiresolution Analysis
Haar Wavelets (Matlab Toolbox)
Screenshot from Matlab Toolbox removed due to copyright reasons.
Fall 2004 Pattern Recognition for Vision
Wavelet Transform—Applications
•Image Compression
•Texture Analysis
•De-noising
•Features for Object Detection and Recognition
Fall 2004 Pattern Recognition for Vision
My Face Detector in 2000
x1, x2 ,…, xn)
Photograph by MIT OCW.
Photograph by MIT OCW.
Search for faces at different resolutions and locations
Feature vector (
Classifier
Off-line training
Face examples
Non-face examples
Feature Extraction
Pixel pattern
Classification Result
19x19
Non-linear SVM 5 min / image
Fall 2004 Pattern Recognition for Vision
My Face Detector in 2000
My experiments with different types of features
Gradient
AI Memo 2000
Fall 2004 Pattern Recognition for Vision
Photographs courtesy of CMU/VASC Image Database at http://vasc.ri.cmu.edu/idb/html/face/frontal_images/_________________________________________
19x19 polynomialclassifier
Speeding-up Face Detection
l il i
i l ifier
i il i
7%
8% 1%
Fi ion
Photograph courtesy of CMU/VASC Image Database at http://vasc.ri.cmu.edu/idb/html/face/frontal_images/
3x3 c ass fier 9x9 c ass fier
17x17 l near c ass
Or ginal mage 17x17 polynomial c ass fier
100%
70%
22%
30%
Removed
Propagated
nal detect
Fall 2004 Pattern Recognition for Vision
Hierarchical Face Detection—Heisele, CVPR 2001
Fall 2004 Pattern Recognition for Vision
Viola&Jones Detector—Viola&Jones, CVPR 2001
Speed is proportional to the average number of features computed per sub-window.
On the MIT+CMU test set, an average of 9 features out of a total of 6061 are computed per sub-window.
On a 700 Mhz Pentium III, a 384x288 pixel image takes about 0.067 seconds to process (15 fps).
Roughly 15 times faster than Rowley-Baluja-Kanade and 600 times faster than Schneiderman-Kanade.
Fall 2004 Pattern Recognition for Vision
Viola&Jones Detector—Image Features
ht(x) = +1 if ft(x) > θt -1
100 =×
, P., and M. Jones. Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference1 (2001): I-511 - I-518. Graphs courtesy of IEEE. Copyright 2001 IEEE. Used with Permission.
“Rectangle filters”
Similar to Haar wavelets
Differences between sums of pixels in adjacent rectangles
otherwise 000,000,16 000,160
Unique Features
Series of photos and figures from Rapid object detection using a boosted cascade of simple features. Viola
Fall 2004 Pattern Recognition for Vision
Photo removed due to
copyright considerations.
Viola&Jones Detector
How exactly does it work??Read the CVPR 2001 paper…….or wait until the lecture on object detection by Mike Jones
Fall 2004 Pattern Recognition for Vision
Integral Image—Viola&Jones CVPR 2001
∑ ≤ ≤
= yy xx
yI
' '
(),
D
BACADCBAA
D
= +++−++++=
+−+= )()(
41
Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference1 (2001): I-511 - I-518. Graphs courtesy of IEEE. Copyright 2001 IEEE. Used with Permission.
Define the Integral Image
Any rectangular sum can be computed in constant time:
Rectangle features can be computed as differences between rectangles
x I y x ) ' , ' ( '
)3 2 (
Series of photos and figures from Rapid object detection using a boosted cascade of simple features. Viola, P., and M. Jones.
Fall 2004 Pattern Recognition for Vision
Example Classifier for Face Detection—Viola&Jones CVPR 2001
Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference1 (2001): I-511 - I-518. Graphs courtesy of IEEE. Copyright 2001 IEEE. Used with Permission.
ROC curve for 200 feature classifier
A classifier with 200 rectangle features was learned using AdaBoost
95% correct detection on test set with 1 in 14084 false positives.
Not quite competitive...
Series of photos and figures from Rapid object detection using a boosted cascade of simple features. Viola, P., and M. Jones.
Fall 2004 Pattern Recognition for Vision
Photographs removed due to
copyright considerations.
Literature
Ingrid Daubechies “Ten Lectures on Wavelets”For mathematicians only, many proofs
Gerald Kaiser “A Friendly Guide to Wavelets” Not friendly, but easier to understand than above
Christian Blatter “Wavelets—A Primer”more friendly than Kaiser…
Stephane Mallat “Multifrequency Channel Decomposition” IEEE Acoustics Speech & Signal Proc. 1989 One of the early papers on multiresolution analysis
Fall 2004 Pattern Recognition for Vision
Homework
•Some proofs (optional)
•Template matching with Fourier transform Shape representation with Fourier descriptors (stick to instructions)
•Playing with wavelets
Fall 2004 Pattern Recognition for Vision