Transcript of Lec10 matching
The University of Ontario
CS 433/557: Algorithms for Image Analysis
Template Matching
Acknowledgements: Dan Huttenlocher
Matching and Registration
- Template Matching
  • intensity based (correlation measures)
  • feature based (distance transforms)
- Flexible Templates
  • pictorial structures
    – Dynamic Programming on trees
    – generalized distance transforms
Extra Material:
Intensity-Based Template Matching: Basic Idea
Find the best template "position" in the image.
(Figure: a left ventricle template and a face template, each matched against an image.)
Intensity-Based Rigid Template Matching
A pixel p in the template coordinate system of T maps to pixel p+s in the image coordinate system.
For each position s of the template compute some goodness-of-match measure Q(s), e.g. the sum of squared differences:
  Q(s) = 1 / (1 + α·Σ_{p∈T} |I(p+s) − T(p)|²)
where the sum is over all pixels p in the template T.
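The SSD-based measure above can be sketched in a few lines. This is a minimal illustration on toy lists-of-lists images; the function names and the exhaustive-search helper are mine, not from the slides.

```python
# A minimal sketch of the SSD-based match measure Q(s) from the slide,
# assuming grayscale images stored as lists of lists (toy sizes only).

def ssd_match_quality(image, template, s, alpha=1.0):
    """Q(s) = 1 / (1 + alpha * sum_{p in T} |I(p+s) - T(p)|^2)."""
    sy, sx = s
    ssd = 0.0
    for py in range(len(template)):
        for px in range(len(template[0])):
            diff = image[py + sy][px + sx] - template[py][px]
            ssd += diff * diff
    return 1.0 / (1.0 + alpha * ssd)

def best_shift(image, template):
    """Exhaustive search over all valid shifts s, maximizing Q(s)."""
    h, w = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    shifts = [(sy, sx) for sy in range(h - th + 1)
                       for sx in range(w - tw + 1)]
    return max(shifts, key=lambda s: ssd_match_quality(image, template, s))
```

An exact copy of the template embedded in the image gives Q(s) = 1 at the embedding shift, the maximum possible value.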
Intensity-Based Rigid Template Matching
Search over all plausible positions s and find the optimal one, i.e. the position with the largest goodness-of-match value Q(s). (In the figure, Q(s1) < Q(s2): position s2 is the better match.)
Intensity-Based Rigid Template Matching
- What if the intensities of your image are not exactly the same as in the template? (This may happen, e.g., due to a different gain setting at image acquisition.)
Other intensity-based goodness-of-match measures
- Normalized correlation:
  Q(s) = Σ_{p∈T} I(p+s)·T(p) / sqrt( Σ_{p∈T} I(p+s)·I(p+s) · Σ_{p∈T} T(p)·T(p) )
- Mutual Information (next slide)
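A short sketch of the normalized correlation measure as reconstructed above; plain Python lists, and the helper name is illustrative rather than from the slides. The point of the normalization is gain invariance: multiplying all image intensities by a constant does not change Q(s).

```python
# A sketch of normalized correlation between template T and image I at shift s.
import math

def ncc(image, template, s):
    sy, sx = s
    num = den_i = den_t = 0.0
    for py in range(len(template)):
        for px in range(len(template[0])):
            i_val = image[py + sy][px + sx]
            t_val = template[py][px]
            num += i_val * t_val       # correlation term
            den_i += i_val * i_val     # image energy under the template
            den_t += t_val * t_val     # template energy
    return num / math.sqrt(den_i * den_t)
```

Note that an image patch that is the template scaled by any positive gain scores exactly 1.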
Other goodness-of-match measures: Mutual Information
- Will work even in extreme cases. In this example the spatial structure of the template and of the image object are similar, while the actual intensities are completely different.
Other goodness-of-match measures: Mutual Information
Fix s and consider the joint histogram of intensity "pairs" (T_p, I_{p+s}) for p ∈ T.
- Mutual information between template T and image I (for a given transformation s) describes the "peakedness" of the joint histogram.
- It measures how well spatial structures in T and I align.
(Figure: for position s1 the joint histogram over (T, I) is spread out; for s2 it is more concentrated, i.e. peaked.)
Mutual Information (technical definition)
Assuming two random variables X and Y, their mutual information is
  MI(X,Y) = e(X) + e(Y) − e(X,Y)
where the entropy e(X) and joint entropy e(X,Y), which measure the "peakedness" of a histogram/distribution, are
  e(X) = − Σ_{x∈range(X)} Pr(x)·ln Pr(x)      (marginal histogram/distribution)
  e(X,Y) = − Σ_{x,y} Pr(x,y)·ln Pr(x,y)       (joint histogram/distribution)
Mutual Information: computing MI for a given position s
We want to find s that maximizes MI, which can be written as
  MI = Σ_{x,y} Pr(x,y)·ln [ Pr(x,y) / (Pr(x)·Pr(y)) ]
where Pr(x,y) is the joint distribution (normalized histogram) of (T, I) intensity pairs for a fixed given s, and the marginal distributions are
  Pr(x) = Σ_y Pr(x,y),   Pr(y) = Σ_x Pr(x,y).
NOTE: one has to be careful when computing this. For example, what if H(x,y)=0 for a given pair (x,y)?
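The MI formula above can be sketched as follows, assuming intensities are small integers so the histograms can be ordinary dictionaries; the pairs (T(p), I(p+s)) are passed in as two parallel lists. Zero-count bins are simply skipped in the sum, which is one common way to handle the Pr(x,y) = 0 case flagged in the NOTE.

```python
# A sketch of MI between paired intensity samples (one pair per template pixel).
import math
from collections import Counter

def mutual_information(template_vals, image_vals):
    n = len(template_vals)
    joint = Counter(zip(template_vals, image_vals))  # joint histogram
    px = Counter(template_vals)                      # marginal of T
    py = Counter(image_vals)                         # marginal of I
    mi = 0.0
    for (x, y), c in joint.items():                  # empty bins never appear
        p_xy = c / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi
```

Perfectly dependent pairs give MI = ln 2 for two equally likely symbols (a peaked joint histogram), while independent pairs give MI = 0 (a spread-out one).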
Finding optimal template position s
- We need to search over all feasible values of s
  • The template T could be large: the bigger the template, the more time we spend computing the goodness-of-match measure at each s
  • The search space (of feasible positions s) could be huge: besides translation/shift, the position s could include scale, rotation angle, and other parameters (e.g. shear)
- Q: efficient search over all s?
Finding optimal template position s
- One possible solution: a hierarchical approach
  1. Subsample both the template and the image. Note that the search space is significantly reduced; the template size is also reduced.
  2. Once a good solution (or solutions) is found at the coarser scale, go to a finer scale. Refine the search in the neighborhood of the coarser-scale solution.
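The two steps above can be sketched as a rough coarse-to-fine search, assuming a 2x2 block-average downsampling, the SSD score, and a single pyramid level; the pyramid depth and refinement-neighborhood radius are illustrative choices, not prescribed by the slides.

```python
# A rough coarse-to-fine sketch of steps 1-2 (one pyramid level, SSD score).

def downsample(img):
    """Average 2x2 blocks (image dimensions assumed even)."""
    return [[(img[2*y][2*x] + img[2*y][2*x+1] +
              img[2*y+1][2*x] + img[2*y+1][2*x+1]) / 4.0
             for x in range(len(img[0]) // 2)]
            for y in range(len(img) // 2)]

def ssd(image, template, sy, sx):
    return sum((image[py+sy][px+sx] - template[py][px]) ** 2
               for py in range(len(template))
               for px in range(len(template[0])))

def coarse_to_fine_search(image, template, radius=1):
    # 1. exhaustive search on the subsampled image/template pair
    im2, t2 = downsample(image), downsample(template)
    coarse = [(sy, sx) for sy in range(len(im2) - len(t2) + 1)
                       for sx in range(len(im2[0]) - len(t2[0]) + 1)]
    cy, cx = min(coarse, key=lambda s: ssd(im2, t2, *s))
    # 2. refine around the up-scaled coarse solution at full resolution
    cands = [(2*cy + dy, 2*cx + dx)
             for dy in range(-radius, radius + 1)
             for dx in range(-radius, radius + 1)
             if 0 <= 2*cy + dy <= len(image) - len(template)
             and 0 <= 2*cx + dx <= len(image[0]) - len(template[0])]
    return min(cands, key=lambda s: ssd(image, template, *s))
```

The coarse pass scans roughly a quarter as many positions on a quarter-size template, and the fine pass only touches a small neighborhood.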
Feature-Based Template Matching
- Features: edges, corners, … (found via filtering)
- Distance transforms of binary images
- Chamfer and Hausdorff matching
- Iterated Closest Points
Feature-Based Binary Templates/Models: what are they?
- What are features?
  • Object edges, corners, junctions, etc.
    – Features can be detected by the corresponding image filters
  • Intensity can also be considered a feature, but it may not be very robust (e.g. due to illumination changes)
- A model (binary template) is a set of feature points in N-dimensional space (also called feature space):
  M = {M_1, …, M_n} ⊂ R^N, with each feature M_i ∈ R^N defined by a descriptor (vector).
Binary Feature Templates (Models): 2D example
- The model's features M_i are represented by points
  – a descriptor could be a 2D vector specifying the feature position with respect to the model's coordinate system (its reference point)
  – feature spaces could be 3D (or higher): e.g., the position of an edge in a medical volume is a 3D vector, and even in 2D images edge features can be described by 3D vectors (add the edge's angular orientation to its 2D location)
- Links M_i–M_j may represent neighborhood relationships between the features of the model
For simplicity, we will mainly concentrate on 2D feature space examples.
Matching Binary Template to Image
L is the model's positioning, so L ⊕ M_i is the position of feature i.
At a fixed position L we can compute the match quality Q(L) using some goodness-of-match criterion.
Example: Q(L) = number of (exact) matches between model and image features (e.g. edges).
The object is detected at all positions L̂ that are local maxima of the function Q(L) such that Q(L̂) > K, where K is some presence threshold.
Exact feature matching is not robust
Counting exact matches may be sensitive to even minor deviations in shape between the model and the actual object appearance.
Distance Transform
More robust goodness-of-match measures use the distance transform of image features:
1. Detect the desired image features (edges, corners, etc.) using appropriate filters
2. For every image pixel p find the distance D(p) to the nearest image feature
(Figure: D(p) = 0 at a feature pixel p, D(q) > 0 off the features, and D(s) > D(q) for a pixel s that is further away.)
Distance Transform
The Distance Transform D_I(·) is a function that for each image pixel p assigns a non-negative number D_I(p) corresponding to the distance from p to the nearest feature in the image I.
(Figure: 2D image features and the resulting grid of distance-transform values.)
Distance Transform
The Distance Transform D_I can be visualized as a gray-scale image (shown here for edge features).
Metric properties of discrete Distance Transforms
- Manhattan (L1) metric: computed with a forward mask and a backward mask of unit steps; the set of equidistant points is a diamond.
- Euclidean (L2) metric: masks with entries 1 and 1.4 give a better approximation of the Euclidean metric; the set of equidistant points is (approximately) a circle.
The exact Euclidean Distance Transform can be computed fairly efficiently (in linear time) without bigger masks: www.cs.cornell.edu/~dph/matchalgs/
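The forward/backward mask idea for the L1 metric can be sketched as a two-pass sweep over a binary feature grid; this is a minimal version using unit steps (the 1/1.4 masks for the L2 approximation would differ only in the mask weights and diagonal neighbors).

```python
# A sketch of the two-pass (forward/backward mask) L1 distance transform.

INF = float("inf")

def distance_transform_l1(features):
    """features[y][x] is True at feature pixels; returns L1 distances."""
    h, w = len(features), len(features[0])
    d = [[0 if features[y][x] else INF for x in range(w)] for y in range(h)]
    # forward pass: propagate distances from the top and left neighbors
    for y in range(h):
        for x in range(w):
            if y > 0:
                d[y][x] = min(d[y][x], d[y-1][x] + 1)
            if x > 0:
                d[y][x] = min(d[y][x], d[y][x-1] + 1)
    # backward pass: propagate distances from the bottom and right neighbors
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                d[y][x] = min(d[y][x], d[y+1][x] + 1)
            if x < w - 1:
                d[y][x] = min(d[y][x], d[y][x+1] + 1)
    return d
```

Two sweeps over the image suffice: each pixel ends up with its exact Manhattan distance to the nearest feature.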
Goodness of Match via Distance Transforms
At each model position one can "probe" the distance-transform values at the locations specified by the model (template) features, and use these values as evidence of proximity to image features.
Goodness-of-Match Measures using Distance Transforms
- Chamfer measure: the sum of the distance-transform values "probed" by the template features
- Hausdorff measure: the k-th largest value of the distance transform at the locations "probed" by the template features; (equivalently) the number of template features with "probed" distance-transform values less than a fixed (small) threshold, i.e. count the template features "sufficiently" close to image features
- Spatially coherent matching
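The Chamfer and Hausdorff measures reduce to probing a precomputed distance transform; a minimal sketch, assuming the model is a list of (y, x) feature offsets and `dt` is an already-computed distance grid (lower scores are better for both measures here).

```python
# "Probing" a precomputed distance transform at shifted model feature locations.

def chamfer_score(dt, model_points, shift):
    """Chamfer: sum of DT values probed by the shifted model features."""
    sy, sx = shift
    return sum(dt[y + sy][x + sx] for (y, x) in model_points)

def hausdorff_score(dt, model_points, shift, k=1):
    """Hausdorff: k-th largest probed DT value (k=1 is the classical case)."""
    sy, sx = shift
    probed = sorted((dt[y + sy][x + sx] for (y, x) in model_points),
                    reverse=True)
    return probed[k - 1]
```

Choosing k as a fraction of the model size makes the Hausdorff measure tolerant of partial occlusion, since the worst (1 − k/n) fraction of probes is ignored.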
Hausdorff Matching
Counting matches with a dilated set of image features.
Spatial Coherence of Feature Matches
(Figure: two locations L′ and L″, each with 50% of the model features matched; at one location the matches are spatially coherent, at the other they are spatially incoherent.)
Spatial coherence:
- few "discontinuities" between neighboring features
- the neighborhood is defined by links between template/model features
Spatially Coherent Matching
Separate the template/model features into three subsets:
- Matchable (red): near image features
- Boundary (blue circle): matchable but "near" un-matchable features (links define "near" for model features)
- Un-matchable (gray): far from image features
Count the number of non-boundary matchable features.
Spatially Coherent Matching
Percentage of non-boundary matchable features (spatially coherent matches): 0% at one location versus ≈50% at the other.
Comparing different match measures
- Monte Carlo experiments with known object location and synthetic clutter and occlusion; matching edge locations
- Varying percent clutter: probability of an edge pixel 2.5–15%
- Varying occlusion: a single missing interval covering 10–25% of the boundary
- Search over location, scale, orientation
(Figure: the binary model (edges) and a 5% clutter image.)
Comparing different match measures: ROC curves
Probability of false alarm versus detection, for 10% and 15% occlusion with 5% clutter. Chamfer is lowest and Hausdorff (f=0.8) is highest; Chamfer with a truncated distance is better than with a trimmed one.
ROC's for Spatial Coherence Matching
(Figure: four ROC plots of correct detection (CD) versus false alarms (FA), for clutter 3% and 5% combined with occlusion 20% and 40%, each comparing β = 0 against β > 0.)
- The parameter β defines the degree of connectivity between model features.
- If β = 0 then the model features are not connected at all; in this case spatially coherent matching reduces to plain Hausdorff matching.
Edge Orientation Information
- Match edge orientation (in addition to location): edge normals or gradient direction
- 3D model feature space (2D location + orientation); extract 3D (edge) features from the image as well
- Requires a 3D distance transform of the image features: weight orientation versus location; the fast forward-backward pass algorithm still applies
- Increases detection robustness and speeds up matching: better able to discriminate the object from clutter, and better able to eliminate cells in branch-and-bound search
ROC's for Oriented Edge Pixels
- Vast improvement for moderate clutter: on images with 5% randomly generated contours, matching is good for 20–25% occlusion rather than 2–5%.
(Figure: ROC curves for oriented edges versus location only.)
Efficient search for good matching positions L
- The distance transform of the observed image features needs to be computed only once (a fast operation).
- We need to compute the match quality for all possible template/model locations L (global search); a hierarchical approach can efficiently prune the search space.
- Alternatively, use gradient descent from a given initial position (e.g. the Iterated Closest Point algorithm, …later); this easily gets stuck at local minima and is sensitive to initialization.
Global Search: Hierarchical Search Space Pruning
The entire box of positions might be pruned out if the match quality is sufficiently bad at the center of the box (how? … in a moment).
Global Search: Hierarchical Search Space Pruning
If a box is not pruned, subdivide it into smaller boxes and test the centers of these smaller boxes.
Global Search: Hierarchical Search Space Pruning
Continue in this fashion until the object is localized.
Pruning a Box (preliminary technicality)
Location L′ is uniformly better than L″ if for all model features i
  D_I(L′ ⊕ M_i) ≤ D_I(L″ ⊕ M_i).
A uniformly better location is guaranteed to have better match quality!
(Figure: two placements L′ and L″ of the model over the distance-transform grid.)
Pruning a Box (preliminary technicality)
Assume that a hypothetical location λ is uniformly better than any location L ∈ Box. Then the match quality satisfies Q(λ) ≥ Q(L) for any L ∈ Box.
If the presence test fails at λ (Q(λ) < K for a given threshold K), then any location L ∈ Box must also fail the test.
The entire box can be pruned by one test at λ!
Building "λ" for a Box of "Radius" n
Place λ at the center of the box.
- The value of the distance transform changes by at most 1 between neighboring pixels.
- Hence the value D(p_i) probed at the box center can decrease by at most n (the box radius) at any other box position:
  D(p_i) − n ≤ D(L ⊕ M_i) for any L ∈ Box.
- So the hypothetical location λ probes the values max{ D(p_i) − n, 0 }.
Global Hierarchical Search (Branch and Bound)
- Hierarchical search works in the more general case where the "position" L includes translation, scale, and orientation of the model (an N-dimensional search space).
- It is a guaranteed, or admissible, search heuristic: it bounds how good the answer could be in an unexplored region, so it cannot miss an answer.
- In the worst case it won't rule anything out, but in practice it rules out the vast majority of template locations (transformations).
Local Search (gradient descent): Iterated Closest Point algorithm
- ICP: iterate until convergence
  1. Estimate a correspondence between each template feature i and some image feature located at F(i) (Fitzgibbon: use the DT)
  2. Move the model to minimize the sum of distances between the corresponding features (like chamfer matching):
     ΔL ~ −∇_L Σ_i ( L ⊕ M_i − F(i) )²
- Alternatively, find a local move ΔL of the model improving the DT-based match quality function Q(L):
  ΔL ~ −∇Q(L)
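The two ICP steps can be sketched for the translation-only case: step 1 finds the nearest image feature for each model point (brute force here; the Fitzgibbon variant would read this off a DT), and step 2 moves the model by the mean residual, which is the exact least-squares update for a pure translation. The fixed iteration count is an illustrative simplification of "iterate until convergence".

```python
# A translation-only ICP sketch over 2D point sets.

def icp_translation(model, features, iters=10):
    ty = tx = 0.0
    for _ in range(iters):
        # 1. correspondence: nearest image feature for each shifted model point
        residuals = []
        for (my, mx) in model:
            py, px = my + ty, mx + tx
            fy, fx = min(features,
                         key=lambda f: (f[0] - py) ** 2 + (f[1] - px) ** 2)
            residuals.append((fy - py, fx - px))
        # 2. update: the mean residual is the least-squares translation step
        ty += sum(dy for dy, dx in residuals) / len(residuals)
        tx += sum(dx for dy, dx in residuals) / len(residuals)
    return ty, tx
```

With a decent initialization the loop converges in a few iterations; with a bad one it can lock onto the wrong correspondences, which is exactly the local-minimum problem discussed next.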
Problems with ICP and gradient descent matching
- Slow
  • Can take many iterations
  • ICP: each iteration is slow due to the search for correspondences (Fitzgibbon: improve this by using the DT)
- No convergence guarantees
  • Can get stuck in local minima; not much can be done about this, though it helps to use robust distance measures (e.g. a truncated Euclidean measure)
Observations on DT-based matching
- The main point of the DT: it allows us to measure match quality without explicitly finding correspondences between pairs of model and image features (a hard problem!)
- Hierarchical search over the entire transformation space
- It is important to use a robust distance: straight Chamfer is very sensitive to outliers, while a truncated DT can be computed very fast
- Fast exact or approximate methods exist for the DT (L2 metric)
- For edge features, use orientation too (edge normals or intensity gradients)
Rigid 2D templates: should we really care?
- So far we have studied matching for 2D images and rigid 2D templates/models of objects. When do rigid 2D templates work?
  – There are rigid 2D objects (e.g. fingerprints)
  – A 3D object may be imaged from the same viewpoint:
    • controlled image-based databases (e.g. photos of employees, criminals)
    • 2D satellite images always view 3D objects from above
    • X-rays, microscope photography, etc.
More general 3D objects
- 3D image volumes and 3D objects: distance transforms, DT-based matching criteria, and hierarchical search techniques generalize easily; mainly medical applications
- 2D images and 3D objects: 3D objects may be represented by a collection of 2D templates (e.g. tree-structured templates, next slide) or by flexible 2D templates (soon)
Tree-structured templates
Larger pair-wise differences sit higher in the tree.
Tree-structured templates
- Rule out multiple templates simultaneously
  • Speeds up matching
  • Coarse-to-fine search, where the coarse granularity can rule out many templates
- Applies to a variety of DT-based matching measures: Chamfer, Hausdorff, robust Chamfer
Flexible Templates
A flexible template combines a number of rigid templates (parts) connected by flexible springs:
- parts connected by springs, with an appearance model for each part
- used for human bodies, faces
- Fischler & Elschlager, 1973; considerable recent work (e.g. Felzenszwalb & Huttenlocher, 2003)
Flexible Templates: Why?
- To account for significant deviations between the proportions of a generic model (e.g. an average face template) and the multitude of actual object appearances
- Non-rigid (3D) objects may consist of multiple rigid parts with (relatively) view-independent 2D appearance
Flexible Templates: Formal Definition
- Set of parts V = {v_1, …, v_n}
- Positioning configuration L = {l_1, …, l_n}: specifies the locations of the parts
- Appearance model m_i(l_i): matching quality of part i at location l_i
- Edge e_ij = (v_i, v_j) ∈ E for connected parts: explicit dependency between edge-connected parts
- Interaction/connection energy C_ij(l_i, l_j), e.g. elastic energy C_ij(l_i, l_j) = ||l_i − l_j||²
Flexible Templates: Formal Definition
- Find the configuration L (the locations of all parts) that minimizes
  E(L) = Σ_i m_i(l_i) + Σ_{ij∈E} C_ij(l_i, l_j)
- The difficulty depends on the graph structure: which parts are connected (E) and how (C)
- General case: exponential time
Flexible Templates: a simplistic example from the past
- Discrete snakes (a chain of control points v_1, …, v_6)
  • What graph?
  • What appearance model?
  • What connection/interaction model?
  • What optimization algorithm?
Flexible Templates: special cases
- Pictorial structures
  • What graph?
  • What appearance model? (an intensity-based match measure, or a DT-based match measure for binary templates)
  • What connection/interaction model? (elastic springs)
  • What optimization algorithm?
Dynamic Programming for Flexible Template Matching
- DP can be used to minimize E(L) = Σ_i m_i(l_i) + Σ_{ij∈E} C_ij(l_i, l_j) for tree graphs (no loops!)
(Figure: a tree of parts v_1, …, v_11.)
Dynamic Programming for Flexible Template Matching
- DP algorithm on trees
  • Choose a post-order traversal for any selected "root" site/part
  • Compute E_i(l) = m_i(l) for all "leaf" parts
  • Process a part only after its children have been processed:
    – if part i has only one child a:
      E_i(l) = m_i(l) + min_{l_a} { E_a(l_a) + C_{a,i}(l_a, l) }
    – if part i has two (or more) children a, b, …:
      E_i(l) = m_i(l) + min_{l_a} { E_a(l_a) + C_{a,i}(l_a, l) } + min_{l_b} { E_b(l_b) + C_{b,i}(l_b, l) } + …
  • Select the best-energy position for the "root" and backtrack to the "leaves"
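The recurrences above can be sketched compactly for a tree given as a children map and a discrete set of candidate locations; the function signature and names are illustrative (backtracking to recover the argmin configuration is omitted for brevity).

```python
# A compact sketch of the tree DP: E_i(l) per the recurrences above,
# returning the minimum energy at the root.

def tree_dp(children, m, C, root, locations):
    """children: {part: [child parts]}, m(i, l): unary cost,
    C(la, l): pairwise cost, locations: candidate positions (shared)."""
    E = {}

    def solve(i):
        # E_i(l) = m_i(l) + sum over children a of min_{la} {E_a(la) + C(la, l)}
        for a in children.get(i, []):
            solve(a)
        E[i] = {}
        for l in locations:
            total = m(i, l)
            for a in children.get(i, []):
                total += min(E[a][la] + C(la, l) for la in locations)
            E[i][l] = total

    solve(root)
    return min(E[root].values())
```

Each part is processed once and each parent-child pair costs m² minimizations over location pairs, which gives the O(n·m²) complexity discussed on the next slide.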
Dynamic Programming for Flexible Template Matching
- DP's complexity on trees (the same as for 1D snakes) is O(n·m²)
  – n parts, m positions per part
  – this is acceptable for local search, where m is relatively small (e.g. in snakes), such as tracking a flexible model from frame to frame in a video sequence
Searching in the whole image (large m)
When m = image size (or m = image size × number of rotations):
- the O(n·m²) complexity is no longer good
- for some interactions C it can be improved to O(n·m) using the Generalized Distance Transform (from computational geometry)
This is an amazing complexity for matching n dependent parts: note that n·m is already the number of operations needed to find n independent matches.
Generalized Distance Transform
- Idea: improve the efficiency of the key computational step (performed for each parent-child pair, n times):
  Ê(y) = min_x { E(x) + C(x, y) }    (m² operations)
Intuitively: if x and y describe all feasible positions of "parts" in the image, then the energy functions E(x) and Ê(y) can be thought of as gray-scale images (e.g. like responses of the original image to some filters).
Generalized Distance Transform
- Idea: improve the efficiency of the key computational step Ê(y) = min_x { E(x) + C(x, y) } (m² operations performed for each parent-child pair)
- Let C(x, y) = α·||x − y|| (the distance between x and y): a reasonable interaction model!
- Then Ê(y) is called a Generalized Distance Transform of E(x).
From Distance Transform to Generalized Distance Transform
- Assuming
  E(x) = 0 if x is an image feature, and E(x) = ∞ otherwise
  (i.e. E is +∞ away from the locations of the binary image features), then
  Ê(y) = min_x { E(x) + ||x − y|| }
  is the standard Distance Transform (of the image features).
From Distance Transform to Generalized Distance Transform
- For a general E(x) and any fixed α,
  Ê(y) = min_x { E(x) + α·||x − y|| }
  is called a Generalized Distance Transform of E(x).
- Here E(x) may represent non-binary image features (e.g. the image intensity gradient), and α controls whether Ê(y) prefers the strength of E(x) or proximity.
Algorithm for computing the Generalized Distance Transform
- A straightforward generalization of the forward-backward pass algorithm for standard Distance Transforms:
  • initialize to E(x) instead of δ(x) = 0 at image features and ∞ otherwise
  • use α instead of 1
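In 1D the generalized forward-backward pass is just a few lines. A minimal sketch with C(x, y) = α·|x − y| (for which the two-pass sweep is exact); initializing E to 0 at features and ∞ elsewhere recovers the standard DT, exactly as the slide says.

```python
# A 1D generalized distance transform: E_hat(y) = min_x { E(x) + alpha*|x-y| },
# computed by the forward/backward pass initialized to E(x) itself.

def generalized_dt_1d(E, alpha=1.0):
    d = list(E)                           # initialize to E(x), not 0/infinity
    for x in range(1, len(d)):            # forward pass (left neighbor)
        d[x] = min(d[x], d[x - 1] + alpha)
    for x in range(len(d) - 2, -1, -1):   # backward pass (right neighbor)
        d[x] = min(d[x], d[x + 1] + alpha)
    return d
```

Two linear sweeps give the exact result, which is where the O(m) per parent-child pair complexity comes from.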
Flexible Template Matching Complexity
- Computing Ê(y) = min_x { E(x) + α·||x − y|| } via the Generalized Distance Transform takes O(m) operations instead of the previous O(m²) (m is the number of positions x and y).
- This improves the complexity of Flexible Template Matching to O(n·m) in the case of interactions C(x, y) = α·||x − y||.
"Simple" Flexible Template Example: Central Part Model
- Consider the special case in which parts translate with respect to a common origin (useful, e.g., for faces)
  • Parts V = {v_1, …, v_n}
  • A distinguished central part v_1
  • Connect each v_i to v_1
  • Elastic spring costs
NOTE: for simplicity (only) we consider part positions l_i that are translations only (no rotation or scaling of parts).
Central Part Model example
- The "ideal" location of part v_i w.r.t. l_1 is given by T_i(l_1) = l_1 + o_i, where o_i is a fixed translation vector for each i > 1.
- The spring cost for deformation from this "ideal" location is α_i·||l_i − T_i(l_1)||.
- Whole template energy:
  E(L) = m_1(l_1) + Σ_{i>1} { m_i(l_i) + α_i·||l_i − T_i(l_1)|| }
Central Part Model: summary of the search algorithm
1. For each non-central part i > 1 compute the matching cost m_i(l_i) for all possible positions l_i of that part in the image
2. For each i > 1 compute the Generalized DT of m_i(x):
   Ê_i(y) = min_x { m_i(x) + α·||x − y|| }
3. For all possible positions l_1 of the central part compute the energy
   Ê_1(l_1) = m_1(l_1) + Σ_{i>1} Ê_i(T_i(l_1))
4. Select the best location l_1, or select all locations whose energy Ê_1(l_1) is better than a fixed threshold
Total complexity: O(n·m).
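Steps 1-4 can be sketched for a 1D toy version of the model, assuming integer part positions, precomputed matching-cost arrays m_i, and offsets T_i(l_1) = l_1 + o_i; the 1D setting and function name are illustrative simplifications of the 2D algorithm.

```python
# A 1D sketch of the central part model search (steps 1-4 above).

def central_part_search(m, offsets, alpha=1.0):
    """m[0]: cost array of the central part; m[1:]: non-central parts;
    offsets[i] is o_{i+1}, so part i+1's ideal position is l1 + offsets[i]."""
    npos = len(m[0])
    # step 2: generalized DT of each non-central part's matching cost
    gdts = []
    for mi in m[1:]:
        d = list(mi)
        for x in range(1, npos):              # forward pass
            d[x] = min(d[x], d[x - 1] + alpha)
        for x in range(npos - 2, -1, -1):     # backward pass
            d[x] = min(d[x], d[x + 1] + alpha)
        gdts.append(d)
    # step 3: total energy E_hat_1(l1) for every central-part position
    def energy(l1):
        e = m[0][l1]
        for o, d in zip(offsets, gdts):
            t = l1 + o                        # ideal child position T_i(l1)
            e += d[t] if 0 <= t < npos else float("inf")
        return e
    # step 4: best central-part location
    return min(range(npos), key=energy)
```

Each part contributes one O(m) transform plus one O(m) probe pass, matching the O(n·m) total claimed above.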
Search Algorithm for tree-based pictorial structures
- The algorithm is basically the same as for the Central Part Model.
- Each "parent" part knows the ideal positions of its "child" parts.
- Spring deformations are accounted for by the Generalized Distance Transform of the children's positioning energies.
Total complexity: O(n·m).