
Maximum Entropy Matching: An Approach to Fast Template Matching

Frans Lundberg

October 25, 2000

Contents

1 Introduction

2 Maximum Entropy Matching
  2.1 The cornerstones of Maximum Entropy Matching
  2.2 Bitset creation
  2.3 Bitset comparison

3 PAIRS and the details of the bitset comparison algorithm
  3.1 PAIRS
  3.2 Motivation for PAIRS
  3.3 The bitset comparison algorithm
  3.4 Implementation issues
  3.5 Speed

4 A comparison between PAIRS and normalized cross-correlation
  4.1 Test setup
  4.2 Generation of image distortions
    4.2.1 Gaussian noise, NOISE
    4.2.2 Rotation of the image, ROT
    4.2.3 Scaling of the image, ZOOM
    4.2.4 Perspective change, PERSP
    4.2.5 Salt and pepper noise, SALT
    4.2.6 A gamma correction of the intensity values, GAMMA
    4.2.7 NODIST and STD
  4.3 Relevance of the distortions
  4.4 Results
  4.5 Other template sizes
  4.6 Performance using other images

5 Statistics of PAIRS bitsets
  5.1 Statistics of acquired bitsets

6 Comments

1 Introduction

One important problem in image analysis is the localization of a template in a larger image. Applications where the solution of this problem can be used include tracking, optical flow, and stereo vision. The matching method studied here solves this problem by defining a new similarity measurement between a template and an image neighborhood. This similarity is computed for all possible integer positions of the template within the image. The position for which we get the highest similarity is considered to be the match. The similarity is not necessarily computed using the original pixel values directly, but can be derived from higher-level image features.

The similarity measurement can be computed in different ways, and the simplest approaches are correlation-type algorithms. Aschwanden and Guggenbühl [2] have compared such algorithms. One of the best and simplest algorithms they tested is normalized cross-correlation (NCC). This algorithm has therefore been used as the baseline for the PAIRS algorithm, which is developed by the author and described in this text. PAIRS uses a completely different similarity measurement based on sets of bits extracted from the template and the image.

This work is done within WITAS, a project dealing with UAVs (unmanned aerial vehicles). Two specific applications of the developed template matching algorithm have been studied.

1. One application is tracking of cars in video sequences from a helicopter.

2. The other one is computing optical flow in such video sequences in order to detect moving objects, especially vehicles on roads.

The video from the helicopter is in color (RGB), and this fact is used in the presented tracking algorithm. The PAIRS algorithm has been applied to these two applications and the results are reported.

A part of this text concerns a general approach to template matching called Maximum Entropy Matching (MEM) that is developed here. The main idea of MEM is that the more data we compare on a computer the longer it takes, and therefore the data that we compare should have maximum average information, that is, maximum entropy. We will see that this approach can be used to create template matching algorithms that are on the order of 10 times faster than correlation (NCC) without decreasing the performance.

2 Maximum Entropy Matching

2.1 The cornerstones of Maximum Entropy Matching

The purpose of template matching in image processing is to find the displacement $r$ such that the image function $I(x - r)$ is as similar as possible to the template function $T(x)$. This can be expressed as

$$ r_{\text{match}} = \arg\max_{r} \, \operatorname{similarity}\bigl(T(x),\, I(x - r)\bigr). \tag{1} $$

Maximum Entropy Matching (MEM) and the PAIRS method described later are valid for all types of discretely sampled signals of arbitrary dimension, but here we will discuss the specific case of template matching of RGB images.
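For integer displacements, Equation 1 amounts to an exhaustive search over a table of similarity values. The following is a minimal C sketch of that search, assuming the similarities have already been computed and stored row-major in an array. The names `Displacement` and `find_best_match` are illustrative and not taken from the original implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: an integer displacement r = (row, col). */
typedef struct {
    size_t row;
    size_t col;
} Displacement;

/* Returns the displacement with the highest similarity (Equation 1).
 * sim holds rows*cols similarity values in row-major order.
 * Ties are resolved in favor of the first position found. */
Displacement find_best_match(const int *sim, size_t rows, size_t cols)
{
    assert(sim != NULL && rows > 0 && cols > 0);

    Displacement best = {0, 0};
    int best_value = sim[0];

    for (size_t r = 0; r < rows; ++r) {
        for (size_t c = 0; c < cols; ++c) {
            int value = sim[r * cols + c];
            if (value > best_value) {
                best_value = value;
                best.row = r;
                best.col = c;
            }
        }
    }
    return best;
}
```

The cost of this search is negligible compared to computing the similarity values themselves, which is the part MEM aims to speed up.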


The difficult part of template matching is to find a similarity measurement that gives a displacement of the template which corresponds to the real displacement of the signal in the world around us. For many applications it is difficult even to define this ideal displacement, since the difference between the image neighborhood and the template does not consist of a pure translation. This fact makes it difficult to compare different template matching algorithms. Furthermore, the similarity measurement that should be used is application dependent. For example, rotation invariance might be wanted for one application, but not for another.

Maximum Entropy Matching does not necessarily lead to a similarity measurement that is better than others, but it aims to increase the speed of the template matching while keeping the performance. It works by comparing derived image features of the image and the template for each possible displacement of the template. The approach is based on the following statements.

1. The less data we compare for each possible template position, the faster this comparison will be.

2. The data we compare should have high entropy.

3. On average, less data needs to be compared to conclude that two objects are dissimilar than to conclude that they are similar. This statement will be called the fast dissimilarity principle.

4. The data that we use for the comparison should be chosen so that the similarity measurement will be distortion persistent.

Statement 1 is true in the sense that the time to compute a similarity measurement is usually proportional to the amount of data that is compared. For correlation-type template matching, all of the pixel data in the template is used in the matching algorithm. We will see that the amount of data that is used for comparison can be decreased substantially using the MEM approach. The compare time also depends on the way the data is compared. Not counting normalizations, the similarity measurement for correlation-type algorithms is acquired by one multiplication and one addition for each byte of pixel data (assuming each intensity value is stored as one byte). The comparison of data for the MEM approach is done by an XOR operation and a look-up table, which is faster per byte than the correlation-type approaches¹ and simple to implement in hardware.

Statement 2 is intuitively appealing. To increase the speed of the matching algorithm we want to use as little data as possible in the comparisons, but we wish to use as much information as possible. Therefore the data used in the comparison should have high average information, that is, high entropy. Experiments show that it is possible to reduce the amount of data used in the comparisons by a factor on the order of 10 to 100 while keeping good performance.

It is not very difficult to prove that the maximum entropy of digital data is only achieved when the following two criteria are fulfilled. One, the probability of each bit in the data being 1 is 0.50. Two, all bits should be statistically independent. When we talk about entropy we view the data to be compared as one random variable, and when we talk about independence of bits, the single bits are considered random variables.
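The two criteria follow from standard entropy inequalities. The short derivation below is a sketch in LaTeX; the symbols $b_i$ (the individual bits) and $p_i$ (the probability that bit $i$ is 1) are notation introduced here for illustration.

```latex
% b_1, ..., b_n are the bits of the compare data, p_i = P(b_i = 1).
\begin{align*}
H(b_1, \dots, b_n) &\le \sum_{i=1}^{n} H(b_i)
    && \text{(equality iff the bits are independent)} \\
H(b_i) &= -p_i \log_2 p_i - (1 - p_i) \log_2 (1 - p_i) \le 1
    && \text{(equality iff } p_i = 0.5\text{)} \\
\Rightarrow\; H(b_1, \dots, b_n) &\le n \text{ bits.}
\end{align*}
```

The upper bound of $n$ bits is therefore reached only when both conditions hold at the same time.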

Since statistical independence of the bits is necessary to achieve maximum entropy, it is natural and necessary to view the compare data as a set of bits. This view is used in MEM, where a bitset is extracted from the template and from each neighborhood in the image. The similarity measurement used to compare the image neighborhood bitset and the template bitset is simply the number of equal bits.

¹ This result is obtained using my C implementations; see later sections for implementation issues.

Lossy data compression of images is a large research area that I believe can be very useful for finding high-entropy image features that are good to use for template matching. However, the problem of finding these compare features and the problem of compressing image data are fundamentally different, since there is no demand for image reconstruction from the compare features.

Statement 3 (the fast dissimilarity principle) is an important and very general statement that is valid for all types of objects that are built up of smaller parts. The statement comes from the fact that two objects are considered similar only if all their parts are similar. If a part of object A is dissimilar to the corresponding part of object B, we can conclude that A and B are dissimilar. If the part of A and the corresponding part of B are similar, we cannot conclude anything about the similarity between the whole objects. Therefore it usually takes less data to conclude that two objects are dissimilar than to conclude that they are similar. We will see how this statement can be used to speed up the matching algorithm.

Statement 4. In template matching no image neighborhood is identical to the template. There is always some distortion present (for example noise, rotation, or a shadow on the template), and we must try to choose the data we compare so that the similarity measurement is affected as little as possible by these distortions.

The following two sections describe how to extract compare data, and then how to compare this data for fast, high-performance template matching.

2.2 Bitset creation

Maximum Entropy Matching consists of two separate parts: bitset creation and bitset comparison. In the bitset creation part, a set of bits is produced for the template and for each neighborhood in the image, directly or indirectly from the pixel data. How these bitsets are created is not determined by MEM. The optimal bitsets to extract depend on which image distortions are expected for the intended application. One example of a bitset creation algorithm is PAIRS. We demand three things from the bitset creation algorithm.

1. The created bitsets should have high entropy.

2. The created bitsets should be resistant to the image distortions that appear for the intended application.

3. The bitset creation time should be short.

In order to compare two different bitset creation algorithms we must have measurements of "high entropy", "distortion resistance" and "bitset creation time". We will suggest possible ways of measuring these quantities.

It is difficult to estimate the entropy of a bitset consisting of more than a few bits. If the extracted bitset only has, say, 8 bits, we can estimate the full discrete probability distribution using a database of image neighborhoods. The entropy is then computed by its definition from the probability distribution. This is possible for a 1-byte bitset, which has only 256 possible states. But for a 4-byte bitset we have $2^{32} \approx 4 \cdot 10^9$ possible states, and an explicit estimation of the full probability distribution is not feasible. There are other ways to estimate entropy. In [3] a method for estimating the entropy of one-dimensional information sequences is applied to gray-scale images. The method uses pattern matching to estimate the entropy. More about pattern matching in information theory can be found in [4].

Since the entropy is difficult to estimate, we can instead use a measurement of how close to maximum entropy the data is. Assuming we have a database of image neighborhoods, we can find the probability of each bit being set to 1. These probabilities should be close to 0.50 to achieve high entropy. Also, we can measure how independent the bits are by estimating the correlation $\rho$:

$$ \rho_{ij} = \frac{E\bigl((b_i - E(b_i))(b_j - E(b_j))\bigr)}{\sqrt{V(b_i)\,V(b_j)}} \tag{2} $$

$E$ denotes expectation value, $V$ denotes variance, and $b_k$ denotes the $k$'th bit in the bitset. Since we are dealing with binary distributions, we can fortunately conclude that if $\rho_{ij} = 0$ the $i$'th and the $j$'th bits are independent.

We can construct a measurement of how much the bitsets deviate from having maximum entropy by studying how much they deviate from the assumption of 50 per cent probability of a bit being set to one and from the desired independence of the bits.

If the bits in a bitset have a probability of being 1 equal to 0.50 and they are independent, the number of ones in the bitset will follow a binomial distribution. Therefore we can define another measurement of how close to maximum entropy the bitsets are as the deviation from a binomial distribution of the number of ones in the bitsets. These two ways of measuring how close to maximum entropy the bitsets are will be exemplified.

Distortion persistence of the bits can be measured by performing experiments on a number of templates subject to controlled distortions. A bitset is created from the template before and after the distortion. The number of bits that are equal in the two bitsets is a measurement of how persistent the bitset creation algorithm is to the applied distortion.

The bitset creation time can be measured for a specific computer. However, if bitset creation method A is faster than method B on computer X, A is not necessarily faster on computer Y. Also, the implementations are often not trivial to optimize. So it is not always possible to determine which bitset creation method is generally the fastest.

2.3 Bitset comparison

The bitset comparison part of MEM is not application dependent. For each image neighborhood and for the template, a set of bits is generated somehow. The similarity measure in the template matching algorithm is simply the number of equal bits in the template bitset and the image neighborhood bitset. The bitsets consist of a whole number of bytes² for practical reasons.

It is possible to use the fast dissimilarity principle (MEM Statement 3 in Section 2.1) to decrease the bitset compare time. This is done by first comparing only an initial number of bytes of the neighborhood and the template bitsets. If the number of equal bits in these parts of the bitsets is below a certain threshold, the bitsets are considered dissimilar and the similarity value is set to zero. If the number of equal bits is not below the threshold, the whole bitsets are compared and the similarity measure is the number of equal bits of the whole bitsets. This algorithm and its implementation are described in detail in the next section.

² A byte is here assumed to be 8 bits.


3 PAIRS and the details of the bitset comparison algorithm

3.1 PAIRS

PAIRS is an algorithm to create bitsets of an arbitrary number of bytes from neighborhoods in RGB images. PAIRS can easily be modified to deal with other kinds of signals of arbitrary inner and outer dimension. The PAIRS method is based on random pairs of pixels within a neighborhood of an image. Each bit in a bitset is created from a certain pair of pixels. A bit is set to 1 if the first pixel value is larger than the second in the pair. Otherwise the bit is set to 0. Both pixel values of a pair are taken from the same color band. The random pairs are chosen according to Algorithm 1, which is presented in C-like pseudo-code.

---------- Algorithm 1 ----------
// Computes a list of pairs to be used
// for bitset creation.

INPUT VARIABLES
n       Number of bytes in each bitset
N       Side length of the neighborhood in pixels
colors  Number of colors (3 for RGB images)

OUTPUT VARIABLES
list    List of pixel pair coordinates used to
        form the bitsets, size: 8 x n where each
        element contains the coordinates of
        the pixel pair

FUNCTIONS CALLED
rand    rand(low,high) returns a random integer
        between low and high.

ALGORITHM
For i1=0 to n*8-1 {
    index1x = rand(0,N-1); index1y = rand(0,N-1);
    index2x = rand(0,N-1); index2y = rand(0,N-1);
    index3 = rand(0,colors-1);
    Store all five index variables in list[i1].
}
---------------------------------

When the list of pixel pairs has been created according to Algorithm 1, or a pre-computed list has been loaded from a file, the actual bitsets are created according to Algorithm 2.

---------- Algorithm 2 ----------
// Creates bitsets from image neighborhoods.

INPUT VARIABLES
im      An RGB image,
        size: imSize1 x imSize2 x 3
list    A list of pixel pairs created
        by Algorithm 1

OUTPUT VARIABLES
bs      Image bitset,
        size: (imSize1-N+1) x
        (imSize2-N+1) x 8*n

ALGORITHM
For i1=0 to imSize1-N, i2=0 to imSize2-N
/* For all image neighborhoods */
{
    For i3=0 to 8*n-1
    /* For all bits in the bitset */
    {
        Get index1x, index1y, index2x, index2y and
        index3 from list[i3].

        If im[i1+index1x, i2+index1y, index3] >
           im[i1+index2x, i2+index2y, index3] {
            bs[i1,i2,i3] = 1;
        }
        Else {
            bs[i1,i2,i3] = 0;
        }
    }
}
---------------------------------

Note that Algorithm 2 is used to create both the image bitset and the template bitset. The size of the resulting template bitset will be $1 \times 1 \times n$, or simply $n$, if we neglect the singleton dimensions.
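Algorithms 1 and 2 can be sketched in C as shown below, here for a single neighborhood. This is an illustration under stated assumptions rather than the implementation used for the measurements: the image is assumed to be stored interleaved (`im[(r * width + c) * colors + band]`), the bits are packed least significant bit first, and `rand() % N` stands in for `rand(0,N-1)` (it has a slight modulo bias). The names `PixelPair`, `make_pair_list` and `create_bitset` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One entry of the pair list produced by Algorithm 1. */
typedef struct {
    int x1, y1;   /* first pixel, offset from the neighborhood corner */
    int x2, y2;   /* second pixel */
    int color;    /* color band shared by both pixels */
} PixelPair;

/* Algorithm 1: fills list[0 .. 8*n-1] with random pixel pairs inside an
 * N x N neighborhood. Call srand() first for a reproducible list. */
void make_pair_list(PixelPair *list, size_t n, int N, int colors)
{
    assert(list != NULL && N > 0 && colors > 0);
    for (size_t i = 0; i < 8 * n; ++i) {
        list[i].x1 = rand() % N;
        list[i].y1 = rand() % N;
        list[i].x2 = rand() % N;
        list[i].y2 = rand() % N;
        list[i].color = rand() % colors;
    }
}

/* Algorithm 2 for one neighborhood with upper left corner (row, col).
 * Writes n bytes to bitset; bit k is bit (k % 8) of byte k / 8. */
void create_bitset(const unsigned char *im, size_t width, int colors,
                   size_t row, size_t col,
                   const PixelPair *list, size_t n, unsigned char *bitset)
{
    assert(im != NULL && list != NULL && bitset != NULL && colors > 0);
    memset(bitset, 0, n);
    for (size_t k = 0; k < 8 * n; ++k) {
        const PixelPair *p = &list[k];
        size_t first = ((row + (size_t)p->x1) * width + (col + (size_t)p->y1))
                       * (size_t)colors + (size_t)p->color;
        size_t second = ((row + (size_t)p->x2) * width + (col + (size_t)p->y2))
                        * (size_t)colors + (size_t)p->color;
        if (im[first] > im[second]) {
            bitset[k / 8] |= (unsigned char)(1u << (k % 8));
        }
    }
}
```

Calling `create_bitset` for every neighborhood position with the same pair list gives the image bitset; calling it once on the template gives the template bitset.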

3.2 Motivation for PAIRS

The PAIRS method for bitset creation described in the previous section has been developed because it is a good compromise between the desired properties of MEM bitsets as described in Section 2.2. The entropy of these bitsets is high, the similarity measurement is resistant to certain kinds of distortion, and the bitset creation time is low. I believe other ways to create bitsets can prove better than PAIRS for some applications, but PAIRS is fast and rather simple to implement, and it works on the original input intensity data. The method has proved useful in applications and is used to demonstrate the maximum entropy approach to template matching. Matching using bitsets created with PAIRS compares very well with correlation approaches according to experiments with controlled distortions, see Section 4. When high invariance against certain types of distortions, such as rotation, is needed, I believe higher-level image features should be used when forming the bitsets.

3.3 The bitset comparison algorithm

The previous section described the PAIRS way to create bitsets. The bitset comparison algorithm is used to match the template bitset with the image bitsets. The algorithm does not depend on which bitset creation method is used. There are two versions of the bitset comparison algorithm (Algorithm 3), with or without sort out. If sort out is not used, the similarity measurement between two bitsets is simply the number of equal bits. Sort out can be used to increase the speed of the algorithm by setting the similarity to zero if the number of equal bits in the first n1 bytes is less than a certain limit. The principle behind this is the fast dissimilarity principle discussed in Section 2.1. The average number of bytes that have to be compared can be reduced substantially by using sort out. Notice that a drawback of sort out is that the execution time becomes dependent on the input data.

---------- Algorithm 3 ----------
// Computes a similarity measurement between
// the template bitset and the image bitsets.
// Index ranges such as 0:n-1 count bytes.

INPUT VARIABLES
im_bs    Image bitset, size: s1 x s2 x 8*n
temp_bs  Template bitset, size: 8*n

ADDITIONAL INPUT VARIABLES FOR SORT OUT VERSION
n1       Number of bytes to use for initial
         sort out.
thres    Threshold

OUTPUT VARIABLES
s        The similarity measurement,
         size: s1 x s2

FUNCTIONS CALLED
simil    simil(bs1, bs2) computes the number
         of equal bits in bitset bs1 and bs2.

ALGORITHM (without sort out)
For i1=0 to s1-1, i2=0 to s2-1 {
    s[i1,i2] = simil(im_bs[i1,i2,0:n-1],
                     temp_bs[0:n-1]);
}

ALGORITHM (with sort out)
For i1=0 to s1-1, i2=0 to s2-1 {
    sortout_sim = simil(im_bs[i1,i2,0:n1-1],
                        temp_bs[0:n1-1]);
    if sortout_sim<thres {
        s[i1,i2] = 0;
    }
    else {
        s[i1,i2] = sortout_sim +
                   simil(im_bs[i1,i2,n1:n-1],
                         temp_bs[n1:n-1]);
    }
}
---------------------------------
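The sort out version of Algorithm 3 for a single image position could look as follows in C. This is a sketch, not the measured implementation: `simil` is written as a plain bit loop for clarity rather than speed, and `simil_sort_out` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Number of equal bits in two byte arrays of length len.
 * A plain bit loop is used here for clarity, not speed. */
static int simil(const unsigned char *a, const unsigned char *b, size_t len)
{
    int equal = 0;
    for (size_t i = 0; i < len; ++i) {
        unsigned char diff = (unsigned char)(a[i] ^ b[i]);
        for (int bit = 0; bit < 8; ++bit) {
            equal += !((diff >> bit) & 1);
        }
    }
    return equal;
}

/* Similarity with sort out for one image position.
 * im_bs and temp_bs are n-byte bitsets. The first n1 bytes are compared
 * first, and the position is rejected (similarity 0) if fewer than
 * thres bits agree; otherwise the remaining bytes are compared too. */
int simil_sort_out(const unsigned char *im_bs, const unsigned char *temp_bs,
                   size_t n, size_t n1, int thres)
{
    assert(im_bs != NULL && temp_bs != NULL && n1 <= n);

    int sortout_sim = simil(im_bs, temp_bs, n1);
    if (sortout_sim < thres) {
        return 0;
    }
    return sortout_sim + simil(im_bs + n1, temp_bs + n1, n - n1);
}
```

Most image positions are dissimilar to the template, so most calls return after comparing only `n1` bytes; this is where the speed gain comes from.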

3.4 Implementation issues

The implementations of the algorithms presented in this text do not follow the exact syntax presented, but they are functionally equivalent. To compare the speed of MEM-PAIRS with correlation-type approaches, Algorithms 1 through 3 and NCC have been implemented in C and run on a general-purpose computer of the type UltraSPARC-II, 333 MHz. Execution times mentioned in this text refer to runs on this computer. The details of the NCC algorithm are explained in Section 4.1. Algorithm 1 is not time critical since the list of pixel pairs can be pre-computed. The bitsets created by Algorithm 2 are stored as arrays of unsigned char's, that is, as arrays of bytes. This is not necessarily the fastest way, but it is flexible. Some effort has been made to optimize the innermost loop of the algorithm. The bitset creation time $t_{cr}$ has been measured to 0.40 µs/byte. This time does not depend much on the number of bytes per bitset or the size of the image.

Algorithm 3 is about counting the number of equal bits in two arrays of bytes. An XOR operation is performed between each of the bytes in the image array and the corresponding template array byte. The number of zeros in the resulting byte (which is the number of equal bits of the two compared bytes) is computed with a 256-item lookup table. The number of equal bits from each byte in the arrays is summed, and the result is the similarity measurement between the bitsets. The time to compare two bytes in the bitsets, $t_{cmp}$, has been measured to 0.025 µs.
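The XOR and lookup-table comparison described above can be sketched in C as follows. The table and function names are hypothetical, and the table is filled at run time here for brevity; a pre-computed constant table would serve equally well.

```c
#include <assert.h>
#include <stddef.h>

/* zero_count[v] = number of 0 bits in the byte value v. */
static unsigned char zero_count[256];

/* Fills the 256-entry lookup table. Call once before count_equal_bits(). */
void init_zero_count_table(void)
{
    for (int v = 0; v < 256; ++v) {
        int ones = 0;
        for (int bit = 0; bit < 8; ++bit) {
            ones += (v >> bit) & 1;
        }
        zero_count[v] = (unsigned char)(8 - ones);
    }
}

/* Number of equal bits in two n-byte bitsets: XOR marks differing bits
 * with 1, so the zeros of the XOR result are the equal bits. */
int count_equal_bits(const unsigned char *a, const unsigned char *b, size_t n)
{
    assert(a != NULL && b != NULL);
    int equal = 0;
    for (size_t i = 0; i < n; ++i) {
        equal += zero_count[a[i] ^ b[i]];
    }
    return equal;
}
```

Per byte, the inner loop performs one XOR, one table read and one addition, with no multiplications.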

The NCC algorithm has also been implemented in C. Some effort has been made to optimize the code. The time it takes to compare two bytes (two intensity values) is 0.120 µs, denoted $t_{cmpN}$. The compare time per byte is based on the amount of input data, even though the bytes are converted to double's internally to do the multiplication in the NCC algorithm.

3.5 Speed

In this section we will compare the speed of MEM-PAIRS³ and NCC for template matching. Let $t_m$ and $t_{mN}$ denote the total time to match the template with the image for PAIRS and NCC, respectively. Let $s$ denote the size of the template (its side length in pixels), $s_x$ the width of the search space, $s_y$ the height of the search space, and $n$ the number of bytes used in the bitsets. By search space is meant the rectangular set of tested template positions. If no sort out is used in the PAIRS method and both the bitsets have to be created to do the matching, the match time for PAIRS is

$$ t_m = (n s_x s_y + n)\, t_{cr} + n s_x s_y\, t_{cmp}. \tag{3} $$

Usually the time to create the template can be neglected and then we have

$$ t_m = n s_x s_y\, (t_{cr} + t_{cmp}). \tag{4} $$

The match time for NCC is

$$ t_{mN} = 3 s_x s_y s^2\, t_{cmpN}. \tag{5} $$

We exemplify using the following values: $s = 16$, $s_x = s_y = 41$, $n = 64$, $t_{cr} = 0.400 \cdot 10^{-6}$ s, $t_{cmp} = 0.025 \cdot 10^{-6}$ s, and $t_{cmpN} = 0.120 \cdot 10^{-6}$ s. These values are used in the comparative test between PAIRS and NCC in Section 4. The resulting match times for this example are:

$t_m = 46$ ms

$t_{mN} = 155$ ms

We see that PAIRS is about 3 times faster in this case. We note that the time to compare the bitsets, $t_{cmp}$, is much shorter than the time to create them, $t_{cr}$. For certain applications the bitsets could be pre-computed, or a large number of templates could be used on one image. We will see later that if optical flow estimation is done by PAIRS template matching, the total time to create the bitsets is shorter than the total compare time, since each bitset is used in comparisons many times. For an application where the bitset creation time is of no importance, the actual match time for PAIRS would instead be

$t_m = 2.7$ ms

if the same parameters as above are used. For this case PAIRS is about 60 times faster than NCC.
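The arithmetic behind these figures can be checked with a few lines of C. The function names below are illustrative; they simply evaluate Equations 4 and 5.

```c
#include <assert.h>

/* Equation 4: PAIRS match time, template bitset creation neglected.
 * Passing t_cr = 0 models pre-computed bitsets. */
double pairs_match_time(int n, int sx, int sy, double t_cr, double t_cmp)
{
    assert(n > 0 && sx > 0 && sy > 0);
    return (double)n * sx * sy * (t_cr + t_cmp);
}

/* Equation 5: NCC match time for an s x s RGB template. */
double ncc_match_time(int s, int sx, int sy, double t_cmpN)
{
    assert(s > 0 && sx > 0 && sy > 0);
    return 3.0 * sx * sy * s * s * t_cmpN;
}
```

With the example values, `pairs_match_time(64, 41, 41, 0.400e-6, 0.025e-6)` evaluates to about 45.7 ms and `ncc_match_time(16, 41, 41, 0.120e-6)` to about 154.9 ms. Setting `t_cr` to zero gives about 2.7 ms.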

³ The MEM-PAIRS template matching algorithm will from now on often be abbreviated to just "PAIRS" or the "PAIRS method".

Figure 1: This figure shows how the bitsets can be created sparsely in the image if several template bitsets are used in the matching. (Panels: Image, Template; the sampling distance $d_{sparse}$ is marked.)

For the application of tracking an object in a video sequence, only one template is matched with a certain image. Therefore the bitsets created from the image will be used only once, so the bitset creation will take most of the time. There is a remedy to speed up the process of creating the bitsets. Have a look at Figure 1. The figure depicts how we can reduce the total bitset creation time by creating fewer image neighborhood bitsets and more template bitsets. The image bitsets are only created at every $d_{sparse}$'th pixel position in the x- and y-direction, as shown on the left side of the figure for the case $d_{sparse} = 3$. The right-hand side of the figure shows a $10 \times 10$ template that we wish to match with the image. We view this template as a collection of $d_{sparse}^2 = 9$ templates of size $8 \times 8$ with the upper left corner positioned within the gray square. One of these $8 \times 8$ templates is marked in the figure. The $8 \times 8$ templates are denoted by the position $(k, l)$ of their upper left corner in the $10 \times 10$ neighborhood. The position indices $k$ and $l$ run from 0 to $d_{sparse} - 1$. The assumption we use now is that if the $(k, l)$-template matches the image in position $(i, j)$, then the $10 \times 10$ template matches the image in position $(i - k, j - l)$. This assumption is reasonable when $d_{sparse}$ is small compared to the template size. The assumption makes it possible to find a similarity measurement for all possible positions of the template in the image even though the image bitsets are created sparsely. The algorithms for bitset creation and comparison become somewhat more complicated to implement, but the total match time decreases substantially for many applications. We will create $d_{sparse}^2$ times fewer image bitsets and $d_{sparse}^2$ template bitsets instead of only one. Equation 3 can be modified for the case of sparsely sampled image bitsets to

$$ t_m = \frac{n s_x s_y}{d_{sparse}^2}\, t_{cr} + n d_{sparse}^2\, t_{cr} + n s_x s_y\, t_{cmp}. \tag{6} $$

The terms in the sum are, from left to right, the time to create the image bitsets, the time to create the template bitsets, and the time to compare them. The equation neglects the edge effects that occur if $s_x$ or $s_y$ is not a multiple of $d_{sparse}$. For the example parameters given above and with $d_{sparse} = 4$, the total match time would be

$t_m = 2.7 + 0.4 + 2.7$ ms $= 5.8$ ms

which is about 27 times faster than NCC. Later we will see how the use of sparsely sampled bitsets affects the performance of template matching.
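Equation 6 can be evaluated in the same way. The C function below is a sketch with a hypothetical name; it returns the sum of the three terms and ignores edge effects, as the equation does.

```c
#include <assert.h>

/* Equation 6: PAIRS match time with image bitsets created at every
 * d_sparse'th pixel position. Edge effects are neglected. */
double pairs_sparse_match_time(int n, int sx, int sy, int d_sparse,
                               double t_cr, double t_cmp)
{
    assert(n > 0 && sx > 0 && sy > 0 && d_sparse > 0);

    double positions = (double)sx * sy;
    double d2 = (double)d_sparse * d_sparse;

    double image_creation = n * positions / d2 * t_cr;
    double template_creation = n * d2 * t_cr;
    double comparison = n * positions * t_cmp;

    return image_creation + template_creation + comparison;
}
```

For $d_{sparse} = 1$ the expression reduces to Equation 3. Increasing $d_{sparse}$ shrinks the first term quadratically while the second term grows quadratically, so the creation time has a minimum at a moderate sampling distance.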


4 A comparison between PAIRS and normalized cross-correlation

4.1 Test setup

In this section we describe a performance test of PAIRS matching and normalized cross-correlation (NCC). The PAIRS matching algorithm uses Algorithms 1 and 2 to create the bitsets, and Algorithm 3 without sort out to match the bitsets. The NCC method uses a similarity measurement between the image neighborhood $I$ and the template $T$ as defined below.

$$ s = \sum_{k=0}^{2} \frac{\sum_{i=0}^{N-1} \sum_{j=0}^{N-1} I(i,j,k)\, T(i,j,k)}{\sqrt{\sum_{i=0}^{N-1} \sum_{j=0}^{N-1} I^2(i,j,k)\, T^2(i,j,k)}} \tag{7} $$

$i$ and $j$ are spatial coordinates, and $k$ is the color index. Since RGB images are assumed, $k$ runs from 0 to 2. The similarity measurement can be described as standard gray-scale normalized cross-correlation done separately for each of the three color bands; the resulting similarity is the sum of the similarity measurements from the color bands.

The approach in this test is to define a template as a neighborhood of an image. Then controlled distortions are applied to the template and the template is matched with the original image. The RGB test images are chosen as random parts of images from a database containing 20000 images. The database is Photo Library 1 and 2 in Corel's product Corel GALLERY Magic, see their homepage [1] for more information. 5000 test images of size $56 \times 56$ are chosen from the database and the templates are taken from the central neighborhoods of these images. The size of the templates is $16 \times 16$. This implies that the number of possible positions of the template in the image is $41^2$. When other sizes of the templates are used, the image sizes are adjusted so that the number of possible positions of the template is the same. Figure 2 shows 20 of these images and the regions from which the templates are taken. On the order of 50 per cent of the images initially chosen were not used since the energy of the template area was too low. (The image was rejected if the average standard deviation of the three color bands in the template area was less than 20. The intensity values are between 0 and 255.)

The distortions that are applied to the templates are defined in the subsequent sections. After the template is distorted it is matched with the image using the PAIRS and the NCC method. If the result is within a distance of 2 pixels from the correct position it is considered a hit. All the 5000 images are used for each of the 26 different cases of distortions. A new list of pairs (as defined by Algorithm 1) is produced every 100'th image. The number of misses for NCC and PAIRS is recorded. The tests are performed for different numbers of bytes in the bitsets formed by PAIRS. These different methods will be denoted PAIRSX or just PX, where X is the number of bytes used to form the bitsets. PAIRSX+ denotes PAIRS with a number of bytes greater than or equal to X.

4.2 Generation of image distortions

The template is always shifted a subpixel distance in both directions before any other distortion is applied. This is a natural thing to do, since integer shifts are no more common than other shifts in real applications. The sizes of the x- and y-shifts are chosen randomly from a uniform distribution. The shift is performed using bicubic interpolation.

These images are from Corel GALLERY Magic and are protected by copyright laws. Used under license.

Figure 2: Examples of 20 images used in a comparative test between PAIRS and NCC. The template regions are marked.

The distortions are applied only to the template. If more than one distortion is used they are applied in the same order as they appear below. The distortions are applied to a template larger than the final distorted template so that the geometric distortions such as rotation and zoom can be performed without introducing undetermined intensity values in certain regions. The different types and strengths of the distortions are denoted by a LABEL (for example NOISE or ROT) followed by the distortion strength parameter $P_{LABEL}$. For example, "ROT10" denotes a rotation distortion with a distortion strength parameter ($P_{ROT}$) of 10. (For this example, it means a rotation with a maximum angle of 10°.)

4.2.1 Gaussian noise, NOISE

The distortion called NOISE is generated by adding zero-mean Gaussian noise to the image. The signal energy of the image is computed as the sum of the squares of all pixel values for all colors. $P_{NOISE}$ is the noise-to-signal energy ratio⁴.

⁴ Unconventionally, the noise-to-signal ratio is used instead of the signal-to-noise ratio, since we want a greater distortion parameter to correspond to a greater distortion strength.

Figure 3: Perspective change for the PERSP distortion. (Labels in the figure: camera positions C and C', origin O, axes x, y, z and x', y', z', angles θ and φ.)

4.2.2 Rotation of the image, ROT

The template is rotated by an angle drawn from a uniform distribution between $-\phi_{max}$ and $\phi_{max}$. Bicubic interpolation is used. $P_{ROT}$ equals $\phi_{max}$.

4.2.3 Scaling of the image, ZOOM

When this distortion is used, the template is scaled by a factor drawn from a uniform distribution between $1 - P_{ZOOM}$ and $1 + P_{ZOOM}$. Bicubic interpolation is used. When the scale factor is less than 1, a suitable low-pass filter is applied before the interpolation to avoid aliasing.

4.2.4 Perspective change, PERSP

Have a look at Figure 3. We assume that the undistorted template comes from a flat surface represented by the rectangle in the figure. The camera is located at C on the z-axis, which is perpendicular to the surface. The distorted template is then the image that would be acquired if we moved the camera to C' and still aimed it at the origin O. The distance from the origin is not changed. The z'-axis is obtained by rotating the unprimed coordinate system by an angle $\phi$ around the z-axis followed by an angle $\theta$ around the x-axis. The angle $\phi$ is picked from a uniform distribution between 0 and $2\pi$, and $\theta$ from a uniform distribution between 0 and $P_{PERSP}$.

4.2.5 Salt and pepper noise, SALT

This type of noise is created by randomly setting some intensity values to 0 or 255. The probability of setting the intensity to 0 is the same as the probability of setting it to 255. This probability is denoted P_SALT.
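The SALT distortion can be sketched as follows, assuming that each intensity value is treated independently: with probability P_SALT it becomes 0, with the same probability it becomes 255, and otherwise it is left unchanged. The function name and the flat-list representation are illustrative assumptions.

```python
import random


def salt_and_pepper(pixels, p_salt, rng=None):
    """Set each value to 0 or to 255 with probability p_salt each (hypothetical helper)."""
    rng = rng or random.Random()
    out = []
    for v in pixels:
        u = rng.random()
        if u < p_salt:
            out.append(0)        # "pepper"
        elif u < 2 * p_salt:
            out.append(255)      # "salt"
        else:
            out.append(v)        # unchanged
    return out
```

With this reading, a fraction of roughly 2 · P_SALT of all intensity values is altered, for example about 4 % for SALT0.02.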

4.2.6 A gamma correction of the intensity values, GAMMA

This distortion is applied independently to all intensity values. In this case the input intensity values i_in are integers between 0 and 255 and the output values i_out are

$$ i_{out} = \mathrm{round}\left( \left( \frac{i_{in}}{255} \right)^{\gamma} \cdot 255 \right) \tag{8} $$

The value of γ is chosen from the distribution P_GAMMA^{U(−1,1)}, where U(−1,1) is a uniform distribution between −1 and 1.
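Equation 8 and the choice of γ can be illustrated with a short sketch. The function names are assumptions, and Python's built-in `round` (which rounds exact halves to the nearest even integer) stands in for the unspecified rounding rule of the original implementation.

```python
import random


def sample_gamma(p_gamma, rng=None):
    """Draw gamma = p_gamma ** u with u uniform in [-1, 1]."""
    rng = rng or random.Random()
    return p_gamma ** rng.uniform(-1.0, 1.0)


def gamma_distort(pixels, gamma):
    """Apply Equation 8 to each 8-bit intensity value independently."""
    return [round((v / 255) ** gamma * 255) for v in pixels]
```

Since γ is always positive, the mapping never reverses the order of two intensity values, so it leaves 0 and 255 fixed and only bends the values in between. For GAMMA1.5, γ lies between 1/1.5 and 1.5.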

4.2.7 NODIST and STD

The label NODIST denotes no distortion except for the subpixel shift. This case can be used as a reference for the minimum miss rate that is possible. STD (standard distortion) is a mixture of the distortions defined above and consists of NOISE0.01, ROT5, ZOOM0.05, PERSP10, SALT0.02 and GAMMA1.5. STD is a standard test case with distortions that are considered relevant for many applications, including tracking and optical flow estimation.

4.3 Relevance of the distortions

NOISE Noise in imaging systems is often modeled well as additive Gaussian noise.

ROT, ZOOM, PERSP Template matching for tracking must be able to handle small rotations, image scaling and perspective changes due to object and camera motion.

SALT Salt-and-pepper noise occurs in some image measurement systems. Furthermore, this distortion can be used to model object occlusion.

GAMMA Cameras are often not calibrated. This distortion is relevant when the template and the image are acquired from different cameras. Also, it reflects how well the template matching algorithm can handle different lighting conditions for the template and the image.

The strength of the distortions is chosen quite high, since we need a large number of misses in the template matching tests in order to get statistically reliable data. However, the distortions are not so large that they become irrelevant for practical applications.

4.4 Results

The results of the comparative test are shown in Figures 4 through 10 and the numerical figures are presented in Table 1. The miss rate is on the y-axis and the number of bytes for the PAIRS method is on the logarithmic x-axis of the graphs. The horizontal line in the figures corresponds to the miss rate of the NCC matching. If the line is missing, it is outside the range of the y-axis. The other curve in the figures corresponds to the miss rate of the PAIRS method for different numbers of bytes in the bitsets.

Figure 4 shows the miss rate for NODIST and for the standard test case, STD. The only distortion applied in the NODIST case is the subpixel shift. Contrary to expectations, PAIRS32+ performs significantly better than NCC in the NODIST case. PAIRS32+ also performs better in the STD test case, where the miss rate for NCC is about 7 times greater than for PAIRS256 (0.046 versus 0.007 in Table 1).

For the case of Gaussian noise, PAIRS64 performs about as well as NCC for a low amount of noise. For very high noise levels, NCC outperforms PAIRS. See Figure 5. A noise-to-signal energy ratio higher than 0.01 is not common for high-quality video sequences from scenes with good lighting conditions.

The performance of PAIRS relative to NCC can be characterized in the same way for all three types of geometrical distortions that have been tested: rotation (ROT),

            NCC    P16    P32    P64    P128   P256   P512   P1024
NODIST      0.008  0.009  0.005  0.003  0.004  0.002  0.002  0.002
STD         0.046  0.065  0.034  0.019  0.011  0.007  0.005  0.006
NOISE0.001  0.008  0.018  0.009  0.005  0.004  0.004  0.005  0.004
NOISE0.01   0.008  0.043  0.021  0.012  0.009  0.006  0.005  0.005
NOISE0.1    0.012  0.140  0.084  0.047  0.030  0.022  0.018  0.015
NOISE1      0.032  0.499  0.311  0.203  0.140  0.103  0.077  0.067
ROT5        0.011  0.014  0.006  0.005  0.003  0.003  0.002  0.002
ROT10       0.039  0.054  0.030  0.018  0.011  0.008  0.008  0.007
ROT15       0.102  0.151  0.097  0.066  0.056  0.047  0.043  0.041
ROT20       0.213  0.272  0.205  0.163  0.150  0.132  0.128  0.123
ZOOM0.05    0.010  0.015  0.006  0.004  0.003  0.003  0.003  0.003
ZOOM0.10    0.013  0.022  0.008  0.004  0.003  0.003  0.002  0.002
ZOOM0.15    0.028  0.055  0.028  0.018  0.013  0.012  0.012  0.011
ZOOM0.20    0.040  0.082  0.049  0.032  0.024  0.021  0.020  0.017
PERSP10     0.006  0.012  0.004  0.003  0.002  0.001  0.001  0.001
PERSP30     0.012  0.016  0.008  0.004  0.004  0.003  0.003  0.003
PERSP40     0.021  0.034  0.017  0.011  0.008  0.006  0.006  0.005
PERSP50     0.056  0.084  0.051  0.036  0.030  0.025  0.024  0.021
SALT0.02    0.019  0.014  0.005  0.003  0.002  0.002  0.001  0.001
SALT0.04    0.037  0.019  0.007  0.003  0.002  0.002  0.001  0.001
SALT0.06    0.062  0.026  0.007  0.004  0.002  0.003  0.002  0.002
SALT0.08    0.082  0.038  0.011  0.005  0.003  0.002  0.001  0.001
GAMMA1.5    0.016  0.009  0.005  0.002  0.002  0.002  0.002  0.002
GAMMA2.0    0.039  0.011  0.004  0.002  0.001  0.002  0.002  0.002
GAMMA3.0    0.144  0.012  0.007  0.004  0.002  0.002  0.002  0.002
GAMMA5.0    0.252  0.022  0.011  0.009  0.008  0.006  0.006  0.005

Table 1: The miss rate of template matching experiments with different types of distortions and similarity measurements. PX denotes the PAIRS algorithm with X bytes in the bitsets.

[Plots, panels NODIST and STD. x-axis: number of bytes in the bitsets, 16 to 1024, logarithmic. y-axis: miss rate, 0 to 0.10.]

Figure 4: The miss rate for the NODIST and the STD distortion.

[Plots, panels NOISE0.001, NOISE0.01, NOISE0.1 and NOISE1. x-axis: number of bytes in the bitsets, 16 to 1024, logarithmic. y-axis: miss rate, 0 to 0.10.]

Figure 5: The miss rate for different levels of Gaussian noise. The noise-to-signal ratios of 0.001, 0.01, 0.1 and 1 correspond to SNRs of 30 dB, 20 dB, 10 dB and 0 dB, respectively.

[Plots, panels ROT5, ROT10, ROT15 and ROT20. x-axis: number of bytes in the bitsets, 16 to 1024, logarithmic. y-axis: miss rate, 0 to 0.10 for ROT5 and ROT10, 0 to 0.20 for ROT15 and ROT20.]

Figure 6: The miss rate for different amounts of rotation of the template. Note the different scales of the y-axis. The NCC miss rate is outside the y-axis range for the ROT20 distortion.

[Plots, panels ZOOM0.05, ZOOM0.10, ZOOM0.15 and ZOOM0.20. x-axis: number of bytes in the bitsets, 16 to 1024, logarithmic. y-axis: miss rate, 0 to 0.10.]

Figure 7: The miss rate for different amounts of rescaling of the template.

[Plots, panels PERSP10, PERSP30, PERSP40 and PERSP50. x-axis: number of bytes in the bitsets, 16 to 1024, logarithmic. y-axis: miss rate, 0 to 0.10.]

Figure 8: The miss rate for different amounts of perspective change applied to the template.

[Plots, panels SALT0.02, SALT0.04, SALT0.06 and SALT0.08. x-axis: number of bytes in the bitsets, 16 to 1024, logarithmic. y-axis: miss rate, 0 to 0.10.]

Figure 9: The miss rate after salt and pepper noise has distorted the template.

rescaling (ZOOM) and perspective change (PERSP). See Figures 6 to 8. PAIRS32+ performs at least as well as NCC in all of these distortion cases except ZOOM0.20 (PAIRS32 ties NCC for ZOOM0.15). For ZOOM0.20, PAIRS64+ works better.

PAIRS16+ clearly outperforms NCC when the template is distorted with salt and pepper noise. See Figure 9. This can be explained by the fact that if a pixel value is set to 255, corresponding to “white” or maximum intensity, it will have a larger effect on the total similarity measurement in NCC than in PAIRS, since large intensity values affect the NCC similarity measurement more than small intensity values do. This is not the case for PAIRS, where each pixel value (statistically) has the same amount of influence on the similarity measurement, independent of the magnitude of its intensity value.

The results from the γ-distortion are shown in Figure 10. As expected, PAIRS is far superior to NCC for this type of distortion. This is due to the fact that only the information about which of two intensity values is greater is used when forming a bit in a bitset. This information stays intact for any strictly increasing intensity value transform, at least if we neglect quantization effects. Thus, the bitsets are (nearly)

[Plots, panels GAMMA1.5, GAMMA2.0, GAMMA3.0 and GAMMA5.0. x-axis: number of bytes in the bitsets, 16 to 1024, logarithmic. y-axis: miss rate, 0 to 0.10.]

Figure 10: The miss rate for different amounts of γ-distortion of the pixel values. The miss rate of NCC for GAMMA3.0 and GAMMA5.0 is outside the y-axis range and therefore not shown in the figure.

[Bar charts, panels PAIRS16, PAIRS64, PAIRS1024 and NCC. x-axis: distance in pixels, 0 to 3. y-axis: relative frequency, 0 to 1.]

Figure 11: The distribution of distances between the match position and the correct position. The bar furthest to the right in each graph indicates the relative frequency of all distances greater than 3.0 pixels.

invariant to arbitrary strictly increasing intensity transforms. This fact is especially useful when the image and the template are acquired with different cameras.

In the above experiments a “miss” has been defined as a matching position more than 2.0 pixels from the correct position. This threshold was set rather arbitrarily. Experiments with other thresholds seem to give the same qualitative results on how PAIRS and NCC compare. Figure 11 shows the distribution of distances from the match position to the correct position for NCC and PAIRS for the standard test case (STD). Most match positions are less than one pixel from the correct position. For PAIRS1024 the relative frequency of matches less than 1.0 pixels from the correct position is 0.984. The corresponding values for PAIRS16, PAIRS64, and NCC are 0.859, 0.949, and 0.911, respectively.
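The miss criterion can be written down as a short sketch. It assumes that the distance is the Euclidean distance between the match position and the correct position, which the text suggests but does not state explicitly; the function name and the coordinate-pair representation are illustrative.

```python
import math


def miss_rate(match_positions, correct_positions, threshold=2.0):
    """Fraction of matches lying more than `threshold` pixels from the correct position."""
    misses = 0
    for (mx, my), (cx, cy) in zip(match_positions, correct_positions):
        # Euclidean distance is an assumption; a match exactly at the threshold is not a miss.
        if math.hypot(mx - cx, my - cy) > threshold:
            misses += 1
    return misses / len(match_positions)
```

Changing `threshold` corresponds to the experiments with other thresholds mentioned above.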

To summarize the results, PAIRS performs better for most types of distortions, except for Gaussian noise with very high energy. PAIRS64 performs better than NCC for 23 out of the 26 different test cases. PAIRS clearly outperforms NCC for the SALT and GAMMA distortions even when few bytes are used in the bitsets. The distributions of distances between the match position and the correct position are similar for

            NCC    P16    P32    P64    P128   P256   P512   P1024
STD4        0.526  0.532  0.462  0.429  0.401  0.394  0.387  0.384
STD8        0.152  0.111  0.067  0.048  0.039  0.035  0.034  0.033
STD16       0.046  0.065  0.034  0.019  0.011  0.007  0.005  0.006
STD32       0.022  0.089  0.042  0.020  0.009  0.004  0.003  0.002
STD64       0.064  0.243  0.145  0.087  0.056  0.040  0.031  0.023

Table 2: The miss rate for NCC and PAIRS for different template sizes. The standard test case of distortions has been used. STDX denotes this distortion and the template size X.

PAIRS and NCC. Note that these results are valid for this type of images, templates and distortions. Other choices of test images could give different results.

4.5 Other template sizes

For the tests described above, the template size was kept constant and equal to 16, since this is a reasonable size for the intended applications. Figure 12 and Table 2 show the results of the STD test case when different template sizes were used. The cases are denoted STDX, where X equals the template size. The test images have been chosen from an image database as previously explained. Different sets of images are used for the five different tests, since the images that are sorted out due to low variance depend on the size of the template. The number of possible positions of the template in the image is still 412.

The first thing to notice from the figure is that both NCC and PAIRS work best for template sizes 16 and 32. For smaller templates the uniqueness and the structure of the templates decrease, which increases the miss rate. The reason why a larger template size also increases the miss rate is that the geometrical distortions affect the outer pixels in a large template more than in a small template. It is also interesting to compare the number of bytes needed in the PAIRS algorithm with the total number of bytes in the template, which is what the NCC algorithm uses for the comparisons. The PAIRS algorithm performs as well as NCC when 32, 64 and 128 bytes are used for the template sizes 16, 32 and 64, respectively. The ratio between the amount of data used in the bitsets and the amount of data in the template is 0.04, 0.02, and 0.01 for these cases. That is, for larger templates we can reduce the amount of data more than for small templates and still get the same performance as NCC. This is not surprising, since the larger the template, the more redundant data we have.
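The quoted data ratios can be reproduced with a few lines of arithmetic. The sketch below assumes square templates with three color channels of one byte each; the number of channels is not stated in this section and is chosen here because it reproduces the quoted values. The function name is illustrative.

```python
def bitset_to_template_ratio(template_size, bitset_bytes, channels=3):
    """Bytes in the bitset divided by bytes in a square template (channels is an assumption)."""
    template_bytes = template_size * template_size * channels
    return bitset_bytes / template_bytes


# Template sizes paired with the bitset sizes at which PAIRS matches NCC.
for size, nbytes in [(16, 32), (32, 64), (64, 128)]:
    print(size, nbytes, round(bitset_to_template_ratio(size, nbytes), 2))
```

Under this assumption a 16×16 template holds 768 bytes, so a 32-byte bitset uses roughly 4 % of the data that NCC compares.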

The results show that for this test case PAIRS128+ performs better than NCC for all template sizes.

4.6 Performance using other images

How well PAIRS compares to NCC depends on the set of images used in the test. The images previously used were picked randomly from a large database of photographs. Only the images with a template region with low variance were discarded. The other set of images is taken from a video sequence. The sequence shows two cars driving through an intersection. The video was acquired for the WITAS project from a radio-controlled helicopter and is called REVINGE2D. The position of one of the cars has been tracked for 500 frames and the set of test templates consists of 20×20

[Plots, panels STD4, STD8, STD16, STD32 and STD64. x-axis: number of bytes in the bitsets, 16 to 1024, logarithmic. y-axis: miss rate, 0 to 1.0 for STD4 and 0 to 0.20 for the other panels.]

Figure 12: The miss rate for NCC and PAIRS for different template sizes. The standard test case of distortions (STD) was used. Note the different scale for STD4.

Figure 13: Frames 0, 299 and 499 from the sequence REVINGE2D. The template positions are marked.

neighborhoods around the tracked position. The test images are 60×60 neighborhoods around that position. Possibly these images are more relevant for the specific application of tracking cars than the randomly chosen images. Figure 13 shows three of these frames with the template positions marked. Altogether, 5000 matchings are done for each type of applied distortion. Each image is reused 10 times. To limit the experiments, only one distortion strength is used per type of distortion. The distortions used are: NODIST, STD, NOISE0.1, ROT15, ZOOM0.20, PERSP50, SALT0.15 and GAMMA5.0. The results are presented in Table 3 and in Figures 14 and 15.

The results from these tests are difficult to interpret, since it is difficult to know how well they generalize to other sequences. They should have some relevance for the specific application of vehicle tracking. When we compare these results with the ones from the Corel image database, we see that PAIRS does not compare as well with NCC for a low number of bytes in the bitsets. However, PAIRS512+ performs better than or as well as NCC for all the distortions tested. Note also that many results are close to zero and therefore statistically uncertain. It is not clear why PAIRS with a low number of bytes does not compare as well with NCC for these images as it does for the previous images. Possibly PAIRS generally compares better with NCC for bad templates, but a definite conclusion cannot be drawn from these limited tests. Further testing where the templates are classified on a scale from “bad” to “good” would possibly provide interesting results.

            NCC    P16    P32    P64    P128   P256   P512   P1024
NODIST      0.000  0.005  0.001  0.000  0.000  0.000  0.000  0.000
STD         0.003  0.053  0.014  0.003  0.001  0.000  0.000  0.000
NOISE0.1    0.000  0.052  0.011  0.002  0.000  0.000  0.000  0.000
ROT15       0.012  0.121  0.050  0.032  0.020  0.014  0.010  0.009
ZOOM0.20    0.038  0.134  0.080  0.052  0.043  0.038  0.034  0.033
PERSP50     0.038  0.113  0.060  0.042  0.033  0.025  0.024  0.023
SALT0.15    0.017  0.219  0.048  0.005  0.000  0.000  0.000  0.000
GAMMA5.0    0.295  0.004  0.001  0.000  0.000  0.000  0.000  0.000

Table 3: The miss rate for NCC and PAIRS for different distortions. The set of test images is taken from the REVINGE2D sequence.

[Plots, panels NODIST, STD, NOISE0.1 and ROT15. x-axis: number of bytes in the bitsets, 16 to 1024, logarithmic. y-axis: miss rate, 0 to 0.10.]

Figure 14: The miss rate for the NODIST, STD, NOISE0.1, and ROT15 distortions. The test images are taken from the sequence REVINGE2D.

[Plots, panels ZOOM0.20, PERSP50, SALT0.15 and GAMMA5.0. x-axis: number of bytes in the bitsets, 16 to 1024, logarithmic. y-axis: miss rate, 0 to 0.10.]

Figure 15: The miss rate for the ZOOM0.20, PERSP50, SALT0.15, and GAMMA5.0 distortions. The test images are taken from the sequence REVINGE2D.

[Plots, panels Graph 1, Graph 2 and Graph 3. x-axis: bit index, 0 to 60. y-axis: relative frequency of a zero, 0.35 to 0.65.]

Figure 16: The probabilities for a zero of the first 64 bits in three different populations of bitsets. The population in Graph 1 is generated by Algorithm 1, the population corresponding to Graph 2 is created by a slightly modified version of Algorithm 1, and the last graph corresponds to a population created by a random number generator, which ideally has maximum possible entropy.

5 Statistics of PAIRS bitsets

5.1 Statistics of acquired bitsets

The bitsets used in Maximum Entropy Matching should of course have maximum possible entropy, which is fulfilled when the probability of each bit being set to 1 is 0.50 and the bits b_i are statistically independent. Given a specific template bitset b_temp,i that we match with an image bitset b_im,i, the number of equal bits is the number of zeros in

$$ b_{match,i} = \mathrm{XOR}(b_{im,i},\, b_{temp,i}) \tag{9} $$

If the image bitsets have maximum entropy, the number of equal bits m follows a binomial distribution. We continue by studying a population of bitsets which are created randomly using the same database as in Section 4. Random 16×16 neighborhoods are chosen, from which 64-byte bitsets are created.
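Equation 9 and the binomial model can be sketched as follows. Bitsets are represented here as Python integers, which is an assumption made for brevity, and the function names are illustrative. Under maximum entropy each bit matches with probability 0.50, so the number of equal bits m out of N follows a binomial distribution with parameters N and 0.5.

```python
import math


def count_equal_bits(b_im, b_temp, n_bits):
    """Number of equal bits: the number of zeros in XOR(b_im, b_temp) over n_bits bits."""
    b_match = b_im ^ b_temp
    return n_bits - bin(b_match).count("1")


def equal_bits_probability(n_bits, m):
    """P(m equal bits) for a maximum-entropy image bitset: Binomial(n_bits, 0.5)."""
    return math.comb(n_bits, m) * 0.5 ** n_bits
```

For a 64-byte bitset, N = 512, so the expected number of equal bits at a non-matching position is 256 under this model.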

First we investigate the probability of each bit being set to 0, which should be close to 0.50 to obtain high entropy. Figure 16, Graph 1 shows the relative frequency of the event that the bit is set to 0 for the first 64 bits in the bitset. 10000 bitsets were used. The average of the relative frequencies for the first 64 bits is 0.544 and the standard deviation is 0.034. These values are not closer to 0.50 because the pixel values in a pair are sometimes equal due to the limited resolution, and then the bit is set to 0 according to Algorithm 2. This problem is made worse by the fact that the images in the Corel image database are compressed, so that neighboring pixel values are more likely to be exactly equal. This problem can be solved by adding an if-statement to Algorithm 2. When the pixel values in a pair are equal, the bit can be set to 1 if the pixel value is even. This makes the probabilities closer to 0.50. The result when Algorithm 2 has been modified can be seen in Figure 16, Graph 2. For this case the average is 0.501 and the standard deviation is 0.007. We call these bitsets REAL. Graph 3 is included for reference. The bits in the bitsets used for the result in this graph have been created by a random number generator with the probability of a bit being set to 0 equal to 0.50. The average and the standard deviation are 0.502 and 0.005 for this case. This set of bitsets is called OPTIMAL. 10000 bitsets have been used for all three cases. We do not expect much better results by modifying Algorithm 2 in the

[Images, panels REAL and OPTIMAL. Both axes: bit index, 10 to 60.]

Figure 17: The absolute value of the correlation matrices of the first 64 bits in the bitsets called REAL and OPTIMAL.

way described above, but it makes the bitsets follow our theory better. For all practical applications we can say that the probability of a bit being set to 0 is exactly equal to 0.50 if we modify Algorithm 2, which is assumed in the rest of this section.
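The modification of Algorithm 2 can be sketched as below. Algorithm 2 itself is not reproduced in this section, so the convention that a bit is 1 when the first pixel of the pair is greater than the second is an assumption; only the tie-breaking rule (set the bit to 1 when the two equal values are even) is taken from the text. The function name is illustrative.

```python
def pair_bit(first, second):
    """Form one bit from a pixel pair, breaking ties by the parity of the pixel value."""
    if first == second:
        # Modified rule from the text: equal values give 1 if the value is even, else 0.
        return 1 if first % 2 == 0 else 0
    # Assumed convention for unequal values: 1 if the first pixel is greater.
    return 1 if first > second else 0
```

If even and odd intensity values are about equally common among tied pairs, the ties contribute zeros and ones in roughly equal numbers, which is consistent with the average moving from 0.544 to 0.501.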

We know that the probability of a bit being 0 is 0.50. To investigate how close to maximum entropy the bitsets are, we now study how dependent the bits are. Since two bits are pairwise independent if they are uncorrelated, we study the correlation matrix ρ_ij, see Equation 2. The absolute value of the correlation matrix of the first 64 bits for REAL and OPTIMAL can be seen in Figure 17. To be able to compare different populations of bitsets we need a measurement of how dependent the bits are. We can use the dependence value

$$ p_{dep} = \sqrt{\frac{\left(\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}\rho_{ij}^{2}\right) - N}{N^{2}-N}} \tag{10} $$

as such a measurement. N is the number of bits. p_dep is simply the root mean square value of the off-diagonal elements in the matrix ρ. We also define an independence value as

$$ p_{ind} = 1 - p_{dep} \tag{11} $$
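Equations 10 and 11 can be illustrated with a small pure-Python sketch. It assumes that ρ_ij is the ordinary (Pearson) correlation coefficient between bit i and bit j over the population, which is how Equation 2 is read here, and that every bit takes both values in the population so that no variance is zero. The function names and the list-of-lists representation are illustrative.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    var_y = sum((y - my) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)


def dependence_value(bitsets):
    """Equation 10: RMS of the off-diagonal correlations; bitsets is a list of 0/1 lists."""
    n_bits = len(bitsets[0])
    columns = [[b[i] for b in bitsets] for i in range(n_bits)]
    total = sum(correlation(columns[i], columns[j]) ** 2
                for i in range(n_bits) for j in range(n_bits))
    # The diagonal contributes n_bits ones; clamp to guard against rounding below zero.
    return math.sqrt(max(0.0, total - n_bits) / (n_bits ** 2 - n_bits))


def independence_value(bitsets):
    """Equation 11."""
    return 1.0 - dependence_value(bitsets)
```

Two identical bits give a dependence value of 1, while a population in which all bit combinations are equally frequent gives 0.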

To verify the correlation between entropy and the independence value p_ind, bitsets consisting of 8 bits are formed randomly in the same way as the REAL set of bitsets was formed. This is done for a number of different template sizes and compared to the estimated relative entropy H_R, defined as

$$ H_R = \frac{\sum_{i=0}^{N-1} -p_i \log_2 p_i}{\log_2 N} \tag{12} $$

p_dep, p_ind and H_R are limited to the interval [0, 1]. Since the list of pixel pairs formed by Algorithm 1 influences the relative entropy and the independence value significantly for small bitsets, we study the average of these values for many different lists of pairs. Figure 18 shows the average relative entropy and the average independence value when 50 different lists are used. 10000 bitsets have been used for each list and template size. We can see that the estimated entropy and the independence value are closely related. It is difficult to say how well this result generalizes to bitsets with much more data than one byte. We make the assumption that when two populations of bitsets are compared, the one with the highest independence value also has the highest entropy. A proof of the validity of this assumption is not available.
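The estimated relative entropy of Equation 12 could be computed along the following lines. The symbols are not defined further in the text, so this sketch assumes that p_i is the relative frequency of the i-th possible bitset value and that the normalizing N is the number of possible values (2^8 = 256 for 8-bit bitsets, so that log2 N = 8). Bitsets are represented as integers, and the function name is illustrative.

```python
import math
from collections import Counter


def estimated_relative_entropy(bitsets, n_bits):
    """Entropy of the observed bitset values divided by its maximum, n_bits."""
    counts = Counter(bitsets)
    total = len(bitsets)
    # Values that never occur contribute nothing (p * log2 p tends to 0 as p tends to 0).
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / n_bits   # log2(2 ** n_bits) == n_bits
```

With this reading, a population in which all values are equally frequent has relative entropy 1 and a population with a single repeated value has relative entropy 0, matching the stated interval [0, 1].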

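[Plot. x-axis: template size, 2 to 16. y-axis: 0 to 1. Curves: independence value and relative entropy.]

Figure 18: The average independence value and the average relative entropy as functions of the template size.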

It is interesting to study the distribution of the number of equal bits when a template bitset is compared to an image neighborhood corresponding to a miss...

6 Comments

This paper was never completed. Hopefully this part of the paper is still of some value. This paper lacks some important things, such as the applications of tracking cars and estimating optical flow. A section about possible improvements of the algorithm should also be added. However, the existing main sections are fairly complete except for Section 5.

References

[1] Corel’s homepage, August 2000, www.corel.com.

[2] P. Aschwanden, W. Guggenbühl. Experimental Results from a Comparative Study on Correlation-Type Registration Algorithms. Robust Computer Vision, Förstner, Rudwiedel (Eds.), Wichmann 1992, pp. 268-289.

[3] Salvatore D. Morgera, Jihad M. Hallik. A Fast Algorithm for Entropy Estimation of Grey-level Images. Workshop on Physics and Computation, 1994. PhysComp ’94, Proceedings.

[4] Aaron D. Wyner, Jacob Ziv, Abraham J. Wyner. On the Role of Pattern Matching in Information Theory. IEEE Transactions on Information Theory, Vol. 44, No. 6, October 1998.
