Transcript of Lec10 matching
The University of Ontario
CS 433/557: Algorithms for Image Analysis
Template Matching
Acknowledgements: Dan Huttenlocher
Matching and Registration
- Template Matching
  • intensity based (correlation measures)
  • feature based (distance transforms)
- Flexible Templates
  • pictorial structures
    – Dynamic Programming on trees
    – generalized distance transforms
Extra Material:
Intensity-Based Template Matching: Basic Idea
Find the best template "position" in the image.
(Figure: a left ventricle template and a face template, each matched against an image.)
Intensity-Based Rigid Template Matching
A pixel p in the template coordinate system of T maps to pixel p+s in the image coordinate system.
For each position s of the template compute some goodness-of-match measure Q(s), e.g. the sum of squared differences:
  Q(s) = 1 / (1 + α·Σ_{p∈T} |I(p+s) − T(p)|²)
where the sum is over all pixels p in the template T.
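The SSD-based measure above can be sketched in a few lines. This is a minimal illustration on toy lists-of-lists images; the function names and the exhaustive-search helper are mine, not from the slides.

```python
# A minimal sketch of the SSD-based match measure Q(s) from the slide,
# assuming grayscale images stored as lists of lists (toy sizes only).

def ssd_match_quality(image, template, s, alpha=1.0):
    """Q(s) = 1 / (1 + alpha * sum_{p in T} |I(p+s) - T(p)|^2)."""
    sy, sx = s
    ssd = 0.0
    for py in range(len(template)):
        for px in range(len(template[0])):
            diff = image[py + sy][px + sx] - template[py][px]
            ssd += diff * diff
    return 1.0 / (1.0 + alpha * ssd)

def best_shift(image, template):
    """Exhaustive search over all valid shifts s, maximizing Q(s)."""
    h, w = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    shifts = [(sy, sx) for sy in range(h - th + 1)
                       for sx in range(w - tw + 1)]
    return max(shifts, key=lambda s: ssd_match_quality(image, template, s))
```

An exact copy of the template embedded in the image gives Q(s) = 1 at the embedding shift, the maximum possible value.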
Intensity-Based Rigid Template Matching
Search over all plausible positions s and find the optimal one, i.e. the position with the largest goodness-of-match value Q(s). (In the figure, Q(s1) < Q(s2): position s2 is the better match.)
Intensity-Based Rigid Template Matching
- What if the intensities of your image are not exactly the same as in the template? (This may happen, e.g., due to a different gain setting at image acquisition.)
Other intensity-based goodness-of-match measures
- Normalized correlation:
  Q(s) = Σ_{p∈T} I(p+s)·T(p) / sqrt( Σ_{p∈T} I(p+s)·I(p+s) · Σ_{p∈T} T(p)·T(p) )
- Mutual Information (next slide)
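A short sketch of the normalized correlation measure as reconstructed above; plain Python lists, and the helper name is illustrative rather than from the slides. The point of the normalization is gain invariance: multiplying all image intensities by a constant does not change Q(s).

```python
# A sketch of normalized correlation between template T and image I at shift s.
import math

def ncc(image, template, s):
    sy, sx = s
    num = den_i = den_t = 0.0
    for py in range(len(template)):
        for px in range(len(template[0])):
            i_val = image[py + sy][px + sx]
            t_val = template[py][px]
            num += i_val * t_val       # correlation term
            den_i += i_val * i_val     # image energy under the template
            den_t += t_val * t_val     # template energy
    return num / math.sqrt(den_i * den_t)
```

Note that an image patch that is the template scaled by any positive gain scores exactly 1.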
Other goodness-of-match measures: Mutual Information
- Will work even in extreme cases. In this example the spatial structure of the template and of the image object are similar, while the actual intensities are completely different.
Other goodness-of-match measures: Mutual Information
Fix s and consider the joint histogram of intensity "pairs" (T_p, I_{p+s}) for p ∈ T.
- Mutual information between template T and image I (for a given transformation s) describes the "peakedness" of the joint histogram.
- It measures how well spatial structures in T and I align.
(Figure: for position s1 the joint histogram over (T, I) is spread out; for s2 it is more concentrated, i.e. peaked.)
Mutual Information (technical definition)
Assuming two random variables X and Y, their mutual information is
  MI(X,Y) = e(X) + e(Y) − e(X,Y)
where the entropy e(X) and joint entropy e(X,Y), which measure the "peakedness" of a histogram/distribution, are
  e(X) = − Σ_{x∈range(X)} Pr(x)·ln Pr(x)      (marginal histogram/distribution)
  e(X,Y) = − Σ_{x,y} Pr(x,y)·ln Pr(x,y)       (joint histogram/distribution)
Mutual Information: computing MI for a given position s
We want to find s that maximizes MI, which can be written as
  MI = Σ_{x,y} Pr(x,y)·ln [ Pr(x,y) / (Pr(x)·Pr(y)) ]
where Pr(x,y) is the joint distribution (normalized histogram) of (T, I) intensity pairs for a fixed given s, and the marginal distributions are
  Pr(x) = Σ_y Pr(x,y),   Pr(y) = Σ_x Pr(x,y).
NOTE: one has to be careful when computing this. For example, what if H(x,y)=0 for a given pair (x,y)?
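The MI formula above can be sketched as follows, assuming intensities are small integers so the histograms can be ordinary dictionaries; the pairs (T(p), I(p+s)) are passed in as two parallel lists. Zero-count bins are simply skipped in the sum, which is one common way to handle the Pr(x,y) = 0 case flagged in the NOTE.

```python
# A sketch of MI between paired intensity samples (one pair per template pixel).
import math
from collections import Counter

def mutual_information(template_vals, image_vals):
    n = len(template_vals)
    joint = Counter(zip(template_vals, image_vals))  # joint histogram
    px = Counter(template_vals)                      # marginal of T
    py = Counter(image_vals)                         # marginal of I
    mi = 0.0
    for (x, y), c in joint.items():                  # empty bins never appear
        p_xy = c / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi
```

Perfectly dependent pairs give MI = ln 2 for two equally likely symbols (a peaked joint histogram), while independent pairs give MI = 0 (a spread-out one).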
Finding optimal template position s
- We need to search over all feasible values of s
  • The template T could be large: the bigger the template, the more time we spend computing the goodness-of-match measure at each s
  • The search space (of feasible positions s) could be huge: besides translation/shift, the position s could include scale, rotation angle, and other parameters (e.g. shear)
- Q: efficient search over all s?
Finding optimal template position s
- One possible solution: a hierarchical approach
  1. Subsample both the template and the image. Note that the search space is significantly reduced; the template size is also reduced.
  2. Once a good solution (or solutions) is found at the coarser scale, go to a finer scale. Refine the search in the neighborhood of the coarser-scale solution.
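The two steps above can be sketched as a rough coarse-to-fine search, assuming a 2x2 block-average downsampling, the SSD score, and a single pyramid level; the pyramid depth and refinement-neighborhood radius are illustrative choices, not prescribed by the slides.

```python
# A rough coarse-to-fine sketch of steps 1-2 (one pyramid level, SSD score).

def downsample(img):
    """Average 2x2 blocks (image dimensions assumed even)."""
    return [[(img[2*y][2*x] + img[2*y][2*x+1] +
              img[2*y+1][2*x] + img[2*y+1][2*x+1]) / 4.0
             for x in range(len(img[0]) // 2)]
            for y in range(len(img) // 2)]

def ssd(image, template, sy, sx):
    return sum((image[py+sy][px+sx] - template[py][px]) ** 2
               for py in range(len(template))
               for px in range(len(template[0])))

def coarse_to_fine_search(image, template, radius=1):
    # 1. exhaustive search on the subsampled image/template pair
    im2, t2 = downsample(image), downsample(template)
    coarse = [(sy, sx) for sy in range(len(im2) - len(t2) + 1)
                       for sx in range(len(im2[0]) - len(t2[0]) + 1)]
    cy, cx = min(coarse, key=lambda s: ssd(im2, t2, *s))
    # 2. refine around the up-scaled coarse solution at full resolution
    cands = [(2*cy + dy, 2*cx + dx)
             for dy in range(-radius, radius + 1)
             for dx in range(-radius, radius + 1)
             if 0 <= 2*cy + dy <= len(image) - len(template)
             and 0 <= 2*cx + dx <= len(image[0]) - len(template[0])]
    return min(cands, key=lambda s: ssd(image, template, *s))
```

The coarse pass scans roughly a quarter as many positions on a quarter-size template, and the fine pass only touches a small neighborhood.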
Feature-Based Template Matching
- Features: edges, corners, … (found via filtering)
- Distance transforms of binary images
- Chamfer and Hausdorff matching
- Iterated Closest Points
Feature-Based Binary Templates/Models: what are they?
- What are features?
  • Object edges, corners, junctions, etc.
    – Features can be detected by the corresponding image filters
  • Intensity can also be considered a feature, but it may not be very robust (e.g. due to illumination changes)
- A model (binary template) is a set of feature points in N-dimensional space (also called feature space):
  M = {M_1, …, M_n} ⊂ R^N, with each feature M_i ∈ R^N defined by a descriptor (vector).
Binary Feature Templates (Models): 2D example
- The model's features M_i are represented by points
  – a descriptor could be a 2D vector specifying the feature position with respect to the model's coordinate system (its reference point)
  – feature spaces could be 3D (or higher): e.g., the position of an edge in a medical volume is a 3D vector, and even in 2D images edge features can be described by 3D vectors (add the edge's angular orientation to its 2D location)
- Links M_i–M_j may represent neighborhood relationships between the features of the model
For simplicity, we will mainly concentrate on 2D feature space examples.
Matching Binary Template to Image
L is the model's positioning, so L ⊕ M_i is the position of feature i.
At a fixed position L we can compute the match quality Q(L) using some goodness-of-match criterion.
Example: Q(L) = number of (exact) matches between model and image features (e.g. edges).
The object is detected at all positions L̂ that are local maxima of the function Q(L) such that Q(L̂) > K, where K is some presence threshold.
Exact feature matching is not robust
Counting exact matches may be sensitive to even minor deviations in shape between the model and the actual object appearance.
Distance Transform
More robust goodness-of-match measures use the distance transform of image features:
1. Detect the desired image features (edges, corners, etc.) using appropriate filters
2. For every image pixel p find the distance D(p) to the nearest image feature
(Figure: D(p) = 0 at a feature pixel p, D(q) > 0 off the features, and D(s) > D(q) for a pixel s that is further away.)
Distance Transform
The Distance Transform D_I(·) is a function that for each image pixel p assigns a non-negative number D_I(p) corresponding to the distance from p to the nearest feature in the image I.
(Figure: 2D image features and the resulting grid of distance-transform values.)
Distance Transform
The Distance Transform D_I can be visualized as a gray-scale image (shown here for edge features).
Metric properties of discrete Distance Transforms
- Manhattan (L1) metric: computed with a forward mask and a backward mask of unit steps; the set of equidistant points is a diamond.
- Euclidean (L2) metric: masks with entries 1 and 1.4 give a better approximation of the Euclidean metric; the set of equidistant points is (approximately) a circle.
The exact Euclidean Distance Transform can be computed fairly efficiently (in linear time) without bigger masks: www.cs.cornell.edu/~dph/matchalgs/
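The forward/backward mask idea for the L1 metric can be sketched as a two-pass sweep over a binary feature grid; this is a minimal version using unit steps (the 1/1.4 masks for the L2 approximation would differ only in the mask weights and diagonal neighbors).

```python
# A sketch of the two-pass (forward/backward mask) L1 distance transform.

INF = float("inf")

def distance_transform_l1(features):
    """features[y][x] is True at feature pixels; returns L1 distances."""
    h, w = len(features), len(features[0])
    d = [[0 if features[y][x] else INF for x in range(w)] for y in range(h)]
    # forward pass: propagate distances from the top and left neighbors
    for y in range(h):
        for x in range(w):
            if y > 0:
                d[y][x] = min(d[y][x], d[y-1][x] + 1)
            if x > 0:
                d[y][x] = min(d[y][x], d[y][x-1] + 1)
    # backward pass: propagate distances from the bottom and right neighbors
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                d[y][x] = min(d[y][x], d[y+1][x] + 1)
            if x < w - 1:
                d[y][x] = min(d[y][x], d[y][x+1] + 1)
    return d
```

Two sweeps over the image suffice: each pixel ends up with its exact Manhattan distance to the nearest feature.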
Goodness of Match via Distance Transforms
At each model position one can "probe" the distance-transform values at the locations specified by the model (template) features, and use these values as evidence of proximity to image features.
Goodness-of-Match Measures using Distance Transforms
- Chamfer measure: the sum of the distance-transform values "probed" by the template features
- Hausdorff measure: the k-th largest value of the distance transform at the locations "probed" by the template features; (equivalently) the number of template features with "probed" distance-transform values less than a fixed (small) threshold, i.e. count the template features "sufficiently" close to image features
- Spatially coherent matching
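The Chamfer and Hausdorff measures reduce to probing a precomputed distance transform; a minimal sketch, assuming the model is a list of (y, x) feature offsets and `dt` is an already-computed distance grid (lower scores are better for both measures here).

```python
# "Probing" a precomputed distance transform at shifted model feature locations.

def chamfer_score(dt, model_points, shift):
    """Chamfer: sum of DT values probed by the shifted model features."""
    sy, sx = shift
    return sum(dt[y + sy][x + sx] for (y, x) in model_points)

def hausdorff_score(dt, model_points, shift, k=1):
    """Hausdorff: k-th largest probed DT value (k=1 is the classical case)."""
    sy, sx = shift
    probed = sorted((dt[y + sy][x + sx] for (y, x) in model_points),
                    reverse=True)
    return probed[k - 1]
```

Choosing k as a fraction of the model size makes the Hausdorff measure tolerant of partial occlusion, since the worst (1 − k/n) fraction of probes is ignored.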
Hausdorff Matching
Counting matches with a dilated set of image features.
Spatial Coherence of Feature Matches
(Figure: two locations L′ and L″, each with 50% of the model features matched; at one location the matches are spatially coherent, at the other they are spatially incoherent.)
Spatial coherence:
- few "discontinuities" between neighboring features
- the neighborhood is defined by links between template/model features
Spatially Coherent Matching
Separate the template/model features into three subsets:
- Matchable (red): near image features
- Boundary (blue circle): matchable but "near" un-matchable features (links define "near" for model features)
- Un-matchable (gray): far from image features
Count the number of non-boundary matchable features.
Spatially Coherent Matching
Percentage of non-boundary matchable features (spatially coherent matches): 0% at one location versus ≈50% at the other.
Comparing different match measures
- Monte Carlo experiments with known object location and synthetic clutter and occlusion; matching edge locations
- Varying percent clutter: probability of an edge pixel 2.5–15%
- Varying occlusion: a single missing interval covering 10–25% of the boundary
- Search over location, scale, orientation
(Figure: the binary model (edges) and a 5% clutter image.)
Comparing different match measures: ROC curves
Probability of false alarm versus detection, for 10% and 15% occlusion with 5% clutter. Chamfer is lowest and Hausdorff (f=0.8) is highest; Chamfer with a truncated distance is better than with a trimmed one.
ROC's for Spatial Coherence Matching
(Figure: four ROC plots of correct detection (CD) versus false alarms (FA), for clutter 3% and 5% combined with occlusion 20% and 40%, each comparing β = 0 against β > 0.)
- The parameter β defines the degree of connectivity between model features.
- If β = 0 then the model features are not connected at all; in this case spatially coherent matching reduces to plain Hausdorff matching.
Edge Orientation Information
- Match edge orientation (in addition to location): edge normals or gradient direction
- 3D model feature space (2D location + orientation); extract 3D (edge) features from the image as well
- Requires a 3D distance transform of the image features: weight orientation versus location; the fast forward-backward pass algorithm still applies
- Increases detection robustness and speeds up matching: better able to discriminate the object from clutter, and better able to eliminate cells in branch-and-bound search
ROC's for Oriented Edge Pixels
- Vast improvement for moderate clutter: on images with 5% randomly generated contours, matching is good for 20–25% occlusion rather than 2–5%.
(Figure: ROC curves for oriented edges versus location only.)
Efficient search for good matching positions L
- The distance transform of the observed image features needs to be computed only once (a fast operation).
- We need to compute the match quality for all possible template/model locations L (global search); a hierarchical approach can efficiently prune the search space.
- Alternatively, use gradient descent from a given initial position (e.g. the Iterated Closest Point algorithm, …later); this easily gets stuck at local minima and is sensitive to initialization.
Global Search: Hierarchical Search Space Pruning
The entire box of positions might be pruned out if the match quality is sufficiently bad at the center of the box (how? … in a moment).
Global Search: Hierarchical Search Space Pruning
If a box is not pruned, subdivide it into smaller boxes and test the centers of these smaller boxes.
Global Search: Hierarchical Search Space Pruning
Continue in this fashion until the object is localized.
Pruning a Box (preliminary technicality)
Location L′ is uniformly better than L″ if for all model features i
  D_I(L′ ⊕ M_i) ≤ D_I(L″ ⊕ M_i).
A uniformly better location is guaranteed to have better match quality!
(Figure: two placements L′ and L″ of the model over the distance-transform grid.)
Pruning a Box (preliminary technicality)
Assume that a hypothetical location λ is uniformly better than any location L ∈ Box. Then the match quality satisfies Q(λ) ≥ Q(L) for any L ∈ Box.
If the presence test fails at λ (Q(λ) < K for a given threshold K), then any location L ∈ Box must also fail the test.
The entire box can be pruned by one test at λ!
Building "λ" for a Box of "Radius" n
Place λ at the center of the box.
- The value of the distance transform changes by at most 1 between neighboring pixels.
- Hence the value D(p_i) probed at the box center can decrease by at most n (the box radius) at any other box position:
  D(p_i) − n ≤ D(L ⊕ M_i) for any L ∈ Box.
- So the hypothetical location λ probes the values max{ D(p_i) − n, 0 }.
Global Hierarchical Search (Branch and Bound)
- Hierarchical search works in the more general case where the "position" L includes translation, scale, and orientation of the model (an N-dimensional search space).
- It is a guaranteed, or admissible, search heuristic: it bounds how good the answer could be in an unexplored region, so it cannot miss an answer.
- In the worst case it won't rule anything out, but in practice it rules out the vast majority of template locations (transformations).
Local Search (gradient descent): Iterated Closest Point algorithm
- ICP: iterate until convergence
  1. Estimate a correspondence between each template feature i and some image feature located at F(i) (Fitzgibbon: use the DT)
  2. Move the model to minimize the sum of distances between the corresponding features (like chamfer matching):
     ΔL ~ −∇_L Σ_i ( L ⊕ M_i − F(i) )²
- Alternatively, find a local move ΔL of the model improving the DT-based match quality function Q(L):
  ΔL ~ −∇Q(L)
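The two ICP steps can be sketched for the translation-only case: step 1 finds the nearest image feature for each model point (brute force here; the Fitzgibbon variant would read this off a DT), and step 2 moves the model by the mean residual, which is the exact least-squares update for a pure translation. The fixed iteration count is an illustrative simplification of "iterate until convergence".

```python
# A translation-only ICP sketch over 2D point sets.

def icp_translation(model, features, iters=10):
    ty = tx = 0.0
    for _ in range(iters):
        # 1. correspondence: nearest image feature for each shifted model point
        residuals = []
        for (my, mx) in model:
            py, px = my + ty, mx + tx
            fy, fx = min(features,
                         key=lambda f: (f[0] - py) ** 2 + (f[1] - px) ** 2)
            residuals.append((fy - py, fx - px))
        # 2. update: the mean residual is the least-squares translation step
        ty += sum(dy for dy, dx in residuals) / len(residuals)
        tx += sum(dx for dy, dx in residuals) / len(residuals)
    return ty, tx
```

With a decent initialization the loop converges in a few iterations; with a bad one it can lock onto the wrong correspondences, which is exactly the local-minimum problem discussed next.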
Problems with ICP and gradient descent matching
- Slow
  • Can take many iterations
  • ICP: each iteration is slow due to the search for correspondences (Fitzgibbon: improve this by using the DT)
- No convergence guarantees
  • Can get stuck in local minima; not much can be done about this, though it helps to use robust distance measures (e.g. a truncated Euclidean measure)
Observations on DT-based matching
- The main point of the DT: it allows us to measure match quality without explicitly finding correspondences between pairs of model and image features (a hard problem!)
- Hierarchical search over the entire transformation space
- It is important to use a robust distance: straight Chamfer is very sensitive to outliers, while a truncated DT can be computed very fast
- Fast exact or approximate methods exist for the DT (L2 metric)
- For edge features, use orientation too (edge normals or intensity gradients)
Rigid 2D templates: should we really care?
- So far we have studied matching for 2D images and rigid 2D templates/models of objects. When do rigid 2D templates work?
  – There are rigid 2D objects (e.g. fingerprints)
  – A 3D object may be imaged from the same viewpoint:
    • controlled image-based databases (e.g. photos of employees, criminals)
    • 2D satellite images always view 3D objects from above
    • X-rays, microscope photography, etc.
More general 3D objects
- 3D image volumes and 3D objects: distance transforms, DT-based matching criteria, and hierarchical search techniques generalize easily; mainly medical applications
- 2D images and 3D objects: 3D objects may be represented by a collection of 2D templates (e.g. tree-structured templates, next slide) or by flexible 2D templates (soon)
Tree-structured templates
Larger pair-wise differences sit higher in the tree.
Tree-structured templates
- Rule out multiple templates simultaneously
  • Speeds up matching
  • Coarse-to-fine search, where the coarse granularity can rule out many templates
- Applies to a variety of DT-based matching measures: Chamfer, Hausdorff, robust Chamfer
Flexible Templates
A flexible template combines a number of rigid templates (parts) connected by flexible springs:
- parts connected by springs, with an appearance model for each part
- used for human bodies, faces
- Fischler & Elschlager, 1973; considerable recent work (e.g. Felzenszwalb & Huttenlocher, 2003)
Flexible Templates: Why?
- To account for significant deviations between the proportions of a generic model (e.g. an average face template) and the multitude of actual object appearances
- Non-rigid (3D) objects may consist of multiple rigid parts with (relatively) view-independent 2D appearance
Flexible Templates: Formal Definition
- Set of parts V = {v_1, …, v_n}
- Positioning configuration L = {l_1, …, l_n}: specifies the locations of the parts
- Appearance model m_i(l_i): matching quality of part i at location l_i
- Edge e_ij = (v_i, v_j) ∈ E for connected parts: explicit dependency between edge-connected parts
- Interaction/connection energy C_ij(l_i, l_j), e.g. elastic energy C_ij(l_i, l_j) = ||l_i − l_j||²
Flexible Templates: Formal Definition
- Find the configuration L (the locations of all parts) that minimizes
  E(L) = Σ_i m_i(l_i) + Σ_{ij∈E} C_ij(l_i, l_j)
- The difficulty depends on the graph structure: which parts are connected (E) and how (C)
- General case: exponential time
Flexible Templates: a simplistic example from the past
- Discrete snakes (a chain of control points v_1, …, v_6)
  • What graph?
  • What appearance model?
  • What connection/interaction model?
  • What optimization algorithm?
Flexible Templates: special cases
- Pictorial structures
  • What graph?
  • What appearance model? (an intensity-based match measure, or a DT-based match measure for binary templates)
  • What connection/interaction model? (elastic springs)
  • What optimization algorithm?
Dynamic Programming for Flexible Template Matching
- DP can be used to minimize E(L) = Σ_i m_i(l_i) + Σ_{ij∈E} C_ij(l_i, l_j) for tree graphs (no loops!)
(Figure: a tree of parts v_1, …, v_11.)
Dynamic Programming for Flexible Template Matching
- DP algorithm on trees
  • Choose a post-order traversal for any selected "root" site/part
  • Compute E_i(l) = m_i(l) for all "leaf" parts
  • Process a part only after its children have been processed:
    – if part i has only one child a:
      E_i(l) = m_i(l) + min_{l_a} { E_a(l_a) + C_{a,i}(l_a, l) }
    – if part i has two (or more) children a, b, …:
      E_i(l) = m_i(l) + min_{l_a} { E_a(l_a) + C_{a,i}(l_a, l) } + min_{l_b} { E_b(l_b) + C_{b,i}(l_b, l) } + …
  • Select the best-energy position for the "root" and backtrack to the "leaves"
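The recurrences above can be sketched compactly for a tree given as a children map and a discrete set of candidate locations; the function signature and names are illustrative (backtracking to recover the argmin configuration is omitted for brevity).

```python
# A compact sketch of the tree DP: E_i(l) per the recurrences above,
# returning the minimum energy at the root.

def tree_dp(children, m, C, root, locations):
    """children: {part: [child parts]}, m(i, l): unary cost,
    C(la, l): pairwise cost, locations: candidate positions (shared)."""
    E = {}

    def solve(i):
        # E_i(l) = m_i(l) + sum over children a of min_{la} {E_a(la) + C(la, l)}
        for a in children.get(i, []):
            solve(a)
        E[i] = {}
        for l in locations:
            total = m(i, l)
            for a in children.get(i, []):
                total += min(E[a][la] + C(la, l) for la in locations)
            E[i][l] = total

    solve(root)
    return min(E[root].values())
```

Each part is processed once and each parent-child pair costs m² minimizations over location pairs, which gives the O(n·m²) complexity discussed on the next slide.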
Dynamic Programming for Flexible Template Matching
- DP's complexity on trees (the same as for 1D snakes) is O(n·m²)
  – n parts, m positions per part
  – this is acceptable for local search, where m is relatively small (e.g. in snakes), such as tracking a flexible model from frame to frame in a video sequence
Searching in the whole image (large m)
When m = image size (or m = image size × number of rotations):
- the O(n·m²) complexity is no longer good
- for some interactions C it can be improved to O(n·m) using the Generalized Distance Transform (from computational geometry)
This is an amazing complexity for matching n dependent parts: note that n·m is already the number of operations needed to find n independent matches.
Generalized Distance Transform
- Idea: improve the efficiency of the key computational step (performed for each parent-child pair, n times):
  Ê(y) = min_x { E(x) + C(x, y) }    (m² operations)
Intuitively: if x and y describe all feasible positions of "parts" in the image, then the energy functions E(x) and Ê(y) can be thought of as gray-scale images (e.g. like responses of the original image to some filters).
Generalized Distance Transform
- Idea: improve the efficiency of the key computational step Ê(y) = min_x { E(x) + C(x, y) } (m² operations performed for each parent-child pair)
- Let C(x, y) = α·||x − y|| (the distance between x and y): a reasonable interaction model!
- Then Ê(y) is called a Generalized Distance Transform of E(x).
From Distance Transform to Generalized Distance Transform
- Assuming
  E(x) = 0 if x is an image feature, and E(x) = ∞ otherwise
  (i.e. E is +∞ away from the locations of the binary image features), then
  Ê(y) = min_x { E(x) + ||x − y|| }
  is the standard Distance Transform (of the image features).
From Distance Transform to Generalized Distance Transform
- For a general E(x) and any fixed α,
  Ê(y) = min_x { E(x) + α·||x − y|| }
  is called a Generalized Distance Transform of E(x).
- Here E(x) may represent non-binary image features (e.g. the image intensity gradient), and α controls whether Ê(y) prefers the strength of E(x) or proximity.
Algorithm for computing the Generalized Distance Transform
- A straightforward generalization of the forward-backward pass algorithm for standard Distance Transforms:
  • initialize to E(x) instead of δ(x) = 0 at image features and ∞ otherwise
  • use α instead of 1
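In 1D the generalized forward-backward pass is just a few lines. A minimal sketch with C(x, y) = α·|x − y| (for which the two-pass sweep is exact); initializing E to 0 at features and ∞ elsewhere recovers the standard DT, exactly as the slide says.

```python
# A 1D generalized distance transform: E_hat(y) = min_x { E(x) + alpha*|x-y| },
# computed by the forward/backward pass initialized to E(x) itself.

def generalized_dt_1d(E, alpha=1.0):
    d = list(E)                           # initialize to E(x), not 0/infinity
    for x in range(1, len(d)):            # forward pass (left neighbor)
        d[x] = min(d[x], d[x - 1] + alpha)
    for x in range(len(d) - 2, -1, -1):   # backward pass (right neighbor)
        d[x] = min(d[x], d[x + 1] + alpha)
    return d
```

Two linear sweeps give the exact result, which is where the O(m) per parent-child pair complexity comes from.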
Flexible Template Matching Complexity
- Computing Ê(y) = min_x { E(x) + α·||x − y|| } via the Generalized Distance Transform takes O(m) operations instead of the previous O(m²) (m is the number of positions x and y).
- This improves the complexity of Flexible Template Matching to O(n·m) in the case of interactions C(x, y) = α·||x − y||.
"Simple" Flexible Template Example: Central Part Model
- Consider the special case in which parts translate with respect to a common origin (useful, e.g., for faces)
  • Parts V = {v_1, …, v_n}
  • A distinguished central part v_1
  • Connect each v_i to v_1
  • Elastic spring costs
NOTE: for simplicity (only) we consider part positions l_i that are translations only (no rotation or scaling of parts).
Central Part Model example
- The "ideal" location of part v_i w.r.t. l_1 is given by T_i(l_1) = l_1 + o_i, where o_i is a fixed translation vector for each i > 1.
- The spring cost for deformation from this "ideal" location is α_i·||l_i − T_i(l_1)||.
- Whole template energy:
  E(L) = m_1(l_1) + Σ_{i>1} { m_i(l_i) + α_i·||l_i − T_i(l_1)|| }
Central Part Model: summary of the search algorithm
1. For each non-central part i > 1 compute the matching cost m_i(l_i) for all possible positions l_i of that part in the image
2. For each i > 1 compute the Generalized DT of m_i(x):
   Ê_i(y) = min_x { m_i(x) + α·||x − y|| }
3. For all possible positions l_1 of the central part compute the energy
   Ê_1(l_1) = m_1(l_1) + Σ_{i>1} Ê_i(T_i(l_1))
4. Select the best location l_1, or select all locations whose energy Ê_1(l_1) is better than a fixed threshold
Total complexity: O(n·m).
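Steps 1-4 can be sketched for a 1D toy version of the model, assuming integer part positions, precomputed matching-cost arrays m_i, and offsets T_i(l_1) = l_1 + o_i; the 1D setting and function name are illustrative simplifications of the 2D algorithm.

```python
# A 1D sketch of the central part model search (steps 1-4 above).

def central_part_search(m, offsets, alpha=1.0):
    """m[0]: cost array of the central part; m[1:]: non-central parts;
    offsets[i] is o_{i+1}, so part i+1's ideal position is l1 + offsets[i]."""
    npos = len(m[0])
    # step 2: generalized DT of each non-central part's matching cost
    gdts = []
    for mi in m[1:]:
        d = list(mi)
        for x in range(1, npos):              # forward pass
            d[x] = min(d[x], d[x - 1] + alpha)
        for x in range(npos - 2, -1, -1):     # backward pass
            d[x] = min(d[x], d[x + 1] + alpha)
        gdts.append(d)
    # step 3: total energy E_hat_1(l1) for every central-part position
    def energy(l1):
        e = m[0][l1]
        for o, d in zip(offsets, gdts):
            t = l1 + o                        # ideal child position T_i(l1)
            e += d[t] if 0 <= t < npos else float("inf")
        return e
    # step 4: best central-part location
    return min(range(npos), key=energy)
```

Each part contributes one O(m) transform plus one O(m) probe pass, matching the O(n·m) total claimed above.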
Search Algorithm for tree-based pictorial structures
- The algorithm is basically the same as for the Central Part Model.
- Each "parent" part knows the ideal positions of its "child" parts.
- Spring deformations are accounted for by the Generalized Distance Transform of the children's positioning energies.
Total complexity: O(n·m).