Digital Image Processing Lecture 25: Object Recognition Prof. Charlene Tsai.

Review

Matching: specified by the mean vector of each class.

Optimum statistical classifiers: a probabilistic approach. The Bayes classifier for Gaussian pattern classes is specified by the mean vector and covariance matrix of each class.

Neural network

Foundation

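The probability that $\mathbf{x}$ comes from class $\omega_i$ is $p(\omega_i|\mathbf{x})$.

The average loss (risk) incurred in assigning $\mathbf{x}$ to class $\omega_j$ is

$$r_j(\mathbf{x}) = \sum_{k=1}^{W} L_{kj}\, p(\omega_k|\mathbf{x})$$

where $L_{kj}$ is the loss incurred if $\mathbf{x}$ actually came from $\omega_k$ but was assigned to $\omega_j$.

Using basic probability theory, $p(A|B)\,p(B) = p(B|A)\,p(A)$, we get

$$r_j(\mathbf{x}) = \frac{1}{p(\mathbf{x})} \sum_{k=1}^{W} L_{kj}\, p(\mathbf{x}|\omega_k)\, P(\omega_k)$$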

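Foundation (cont'd)

Because $1/p(\mathbf{x})$ is positive and common to all $r_j(\mathbf{x})$, it can be dropped without affecting the comparison among the $r_j(\mathbf{x})$:

$$r_j(\mathbf{x}) = \sum_{k=1}^{W} L_{kj}\, p(\mathbf{x}|\omega_k)\, P(\omega_k) \qquad \text{(Eqn#1)}$$

The classifier assigns $\mathbf{x}$ to the class with the smallest average loss; this is the Bayes classifier. That is, $\mathbf{x}$ is assigned to $\omega_i$ if

$$\sum_{k=1}^{W} L_{ki}\, p(\mathbf{x}|\omega_k)\, P(\omega_k) < \sum_{q=1}^{W} L_{qj}\, p(\mathbf{x}|\omega_q)\, P(\omega_q) \quad \text{for all } j;\ j \neq i$$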
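A minimal Python sketch of the minimum-risk rule, assuming the likelihoods $p(\mathbf{x}|\omega_k)$ for one fixed pattern and the priors $P(\omega_k)$ are already known. The function names, the loss table, and all numbers are illustrative assumptions and do not come from the lecture. Classes are indexed from 0 in the code.

```python
def average_risks(loss, likelihoods, priors):
    """Return r_j = sum_k loss[k][j] * p(x|w_k) * P(w_k) for every class j.

    loss[k][j] is the loss when the true class is k and the assigned class is j.
    """
    n_classes = len(priors)
    return [
        sum(loss[k][j] * likelihoods[k] * priors[k] for k in range(n_classes))
        for j in range(n_classes)
    ]


def bayes_decide(loss, likelihoods, priors):
    """Return the index of the class with the smallest average risk."""
    risks = average_risks(loss, likelihoods, priors)
    return min(range(len(risks)), key=risks.__getitem__)


# Hypothetical two-class problem, evaluated at a single pattern x.
likelihoods = [0.6, 0.3]            # assumed values of p(x|w_0), p(x|w_1)
priors = [0.5, 0.5]                 # assumed values of P(w_0), P(w_1)
asymmetric_loss = [[0, 1], [5, 0]]  # missing class 1 costs five times more
zero_one_loss = [[0, 1], [1, 0]]    # same cost for every wrong decision

print(bayes_decide(asymmetric_loss, likelihoods, priors))
print(bayes_decide(zero_one_loss, likelihoods, priors))
```

With the asymmetric loss table the pattern is assigned to the second class, even though the first class has the larger product of likelihood and prior; with the 0-1 loss it goes to the first class.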

The Loss Function (Lij)

The loss is 0 for a correct decision and the same nonzero value (say 1) for any incorrect decision:

$$L_{ij} = 1 - \delta_{ij} \qquad \text{(Eqn#2)}$$

where $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \neq j$.

Bayes Classifier

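Substituting eqn#2 into eqn#1 yields

$$r_j(\mathbf{x}) = \sum_{k=1}^{W} (1 - \delta_{kj})\, p(\mathbf{x}|\omega_k)\, P(\omega_k) = p(\mathbf{x}) - p(\mathbf{x}|\omega_j)\, P(\omega_j)$$

The classifier assigns $\mathbf{x}$ to class $\omega_i$ if, for all $j \neq i$,

$$p(\mathbf{x}|\omega_i)\, P(\omega_i) > p(\mathbf{x}|\omega_j)\, P(\omega_j), \quad j = 1, 2, \ldots, W;\ j \neq i$$

$p(\mathbf{x})$ is common to all classes, so it is dropped.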

Decision Function

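Using the Bayes classifier for a 0-1 loss function, the decision function for class $\omega_j$ is

$$d_j(\mathbf{x}) = p(\mathbf{x}|\omega_j)\, P(\omega_j), \quad j = 1, 2, \ldots, W$$

Now the questions are: How to get $P(\omega_j)$? How to estimate $p(\mathbf{x}|\omega_j)$?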

Using Gaussian Distribution

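The most prevalent form (assumed) for $p(\mathbf{x}|\omega_j)$ is the Gaussian probability density function.

Now consider a 1D problem with 2 pattern classes (W=2):

$$d_j(x) = p(x|\omega_j)\, P(\omega_j) = \frac{1}{\sqrt{2\pi}\,\sigma_j}\, e^{-\frac{(x - m_j)^2}{2\sigma_j^2}}\, P(\omega_j), \quad j = 1, 2$$

where $m_j$ is the mean and $\sigma_j^2$ is the variance of class $\omega_j$.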
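The 1D Gaussian decision function can be sketched in a few lines of Python. This is an illustration under stated assumptions: the helper names and the class parameters (means 0 and 2, unit standard deviation, two choices of priors) are hypothetical and are not taken from the slides.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """1D Gaussian density with the given mean and standard deviation."""
    return math.exp(-((x - mean) ** 2) / (2 * sigma ** 2)) / (
        math.sqrt(2 * math.pi) * sigma
    )


def decision_1d(x, mean, sigma, prior):
    """d_j(x) = p(x|w_j) * P(w_j) for one class."""
    return gaussian_pdf(x, mean, sigma) * prior


def classify_1d(x, params):
    """Return the 1-based index of the class with the largest d_j(x).

    params is a list of (mean, sigma, prior) tuples, one per class.
    """
    scores = [decision_1d(x, m, s, p) for (m, s, p) in params]
    return 1 + max(range(len(scores)), key=scores.__getitem__)


# Hypothetical parameters: (mean, sigma, prior) for classes 1 and 2.
params_equal = [(0.0, 1.0, 0.5), (2.0, 1.0, 0.5)]
params_skewed = [(0.0, 1.0, 0.9), (2.0, 1.0, 0.1)]

print(classify_1d(1.2, params_equal))
print(classify_1d(1.2, params_skewed))
```

In this sketch the point x = 1.2 is assigned to class 2 when the priors are equal, but to class 1 when class 1 is given a prior of 0.9, which shows how the priors move the decision boundary.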

Example

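Where is the decision boundary if
1. $P(\omega_1) = P(\omega_2)$
2. $P(\omega_1) > P(\omega_2)$
3. $P(\omega_1) < P(\omega_2)$

with $d_j(x) = p(x|\omega_j)\, P(\omega_j)$?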
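One way to reason about this question is to work a special case. Assume, which the slide does not state, that both classes are 1D Gaussians with the same standard deviation $\sigma$ and different means. Setting $d_1(x_0) = d_2(x_0)$ and taking logarithms gives a closed form for the boundary $x_0$:

```latex
\begin{aligned}
\ln P(\omega_1) - \frac{(x_0 - m_1)^2}{2\sigma^2}
  &= \ln P(\omega_2) - \frac{(x_0 - m_2)^2}{2\sigma^2} \\
2 x_0 (m_1 - m_2) - (m_1^2 - m_2^2)
  &= 2\sigma^2 \ln\frac{P(\omega_2)}{P(\omega_1)} \\
x_0 &= \frac{m_1 + m_2}{2}
  + \frac{\sigma^2}{m_1 - m_2}\,\ln\frac{P(\omega_2)}{P(\omega_1)}
\end{aligned}
```

Under this assumption, equal priors put the boundary at the midpoint of the two means, and unequal priors shift it toward the mean of the less likely class, enlarging the region assigned to the more likely class.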

N-D Gaussian

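For the jth pattern class,

$$p(\mathbf{x}|\omega_j) = \frac{1}{(2\pi)^{n/2}\, |\mathbf{C}_j|^{1/2}}\, e^{-\frac{1}{2} (\mathbf{x} - \mathbf{m}_j)^T \mathbf{C}_j^{-1} (\mathbf{x} - \mathbf{m}_j)}$$

where

$$\mathbf{m}_j = \frac{1}{N_j} \sum_{\mathbf{x} \in \omega_j} \mathbf{x}$$

$$\mathbf{C}_j = \frac{1}{N_j} \sum_{\mathbf{x} \in \omega_j} \mathbf{x}\mathbf{x}^T - \mathbf{m}_j \mathbf{m}_j^T$$

Here $N_j$ is the number of pattern vectors from class $\omega_j$. Remember this from Principal Component Analysis?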
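The two estimates can be sketched directly from their definitions. The code below is a minimal illustration using exact rational arithmetic from the Python standard library; the function names and the four 3D sample points are assumptions chosen for the example and are not given in the slides.

```python
from fractions import Fraction as F


def mean_vector(samples):
    """m = (1/N) * sum of the sample vectors."""
    n_samples = len(samples)
    dim = len(samples[0])
    return [sum(F(x[i]) for x in samples) / n_samples for i in range(dim)]


def covariance_matrix(samples):
    """C = (1/N) * sum(x x^T) - m m^T, computed entry by entry."""
    n_samples = len(samples)
    dim = len(samples[0])
    m = mean_vector(samples)
    return [
        [
            sum(F(x[i]) * x[j] for x in samples) / n_samples - m[i] * m[j]
            for j in range(dim)
        ]
        for i in range(dim)
    ]


# Hypothetical training patterns for a single class.
samples = [(0, 0, 0), (1, 0, 0), (1, 0, 1), (1, 1, 0)]

m = mean_vector(samples)
C = covariance_matrix(samples)
print(m)
print(C)
```

For these assumed points the mean is (3/4, 1/4, 1/4) and every entry of the covariance matrix is a multiple of 1/16.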

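N-D Gaussian (cont'd)

Working with the logarithm of the decision function:

$$d_j(\mathbf{x}) = \ln\left[ p(\mathbf{x}|\omega_j)\, P(\omega_j) \right] = \ln p(\mathbf{x}|\omega_j) + \ln P(\omega_j)$$

$$d_j(\mathbf{x}) = \ln P(\omega_j) - \frac{n}{2} \ln 2\pi - \frac{1}{2} \ln |\mathbf{C}_j| - \frac{1}{2} (\mathbf{x} - \mathbf{m}_j)^T \mathbf{C}_j^{-1} (\mathbf{x} - \mathbf{m}_j)$$

If all covariance matrices are equal (common covariance $\mathbf{C}_j = \mathbf{C}$), then

$$d_j(\mathbf{x}) = \ln P(\omega_j) + \mathbf{x}^T \mathbf{C}^{-1} \mathbf{m}_j - \frac{1}{2} \mathbf{m}_j^T \mathbf{C}^{-1} \mathbf{m}_j$$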
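The step from the general logarithmic form to the common-covariance form can be filled in by expanding the quadratic term. The expansion uses only the symmetry of $\mathbf{C}^{-1}$:

```latex
\begin{aligned}
(\mathbf{x} - \mathbf{m}_j)^T \mathbf{C}^{-1} (\mathbf{x} - \mathbf{m}_j)
  &= \mathbf{x}^T \mathbf{C}^{-1} \mathbf{x}
   - 2\,\mathbf{x}^T \mathbf{C}^{-1} \mathbf{m}_j
   + \mathbf{m}_j^T \mathbf{C}^{-1} \mathbf{m}_j \\
d_j(\mathbf{x})
  &= \ln P(\omega_j)
   + \mathbf{x}^T \mathbf{C}^{-1} \mathbf{m}_j
   - \tfrac{1}{2}\,\mathbf{m}_j^T \mathbf{C}^{-1} \mathbf{m}_j
   \;+\; \underbrace{\left[ -\tfrac{n}{2}\ln 2\pi - \tfrac{1}{2}\ln|\mathbf{C}|
   - \tfrac{1}{2}\,\mathbf{x}^T \mathbf{C}^{-1} \mathbf{x} \right]}_{\text{same for every } j}
\end{aligned}
```

The bracketed terms do not depend on the class index, so they do not change which $d_j(\mathbf{x})$ is largest and can be dropped. What remains is linear in $\mathbf{x}$.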

For C=I

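If $\mathbf{C} = \mathbf{I}$ (identity matrix) and $P(\omega_j)$ is 1/W, we get

$$d_j(\mathbf{x}) = \mathbf{x}^T \mathbf{m}_j - \frac{1}{2} \mathbf{m}_j^T \mathbf{m}_j, \quad j = 1, 2, \ldots, W$$

which is the minimum distance classifier. Gaussian pattern classes satisfying these conditions are spherical clouds of identical shape in N-D.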
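The name "minimum distance classifier" follows from the identity $\|\mathbf{x} - \mathbf{m}_j\|^2 = \mathbf{x}^T\mathbf{x} - 2\left(\mathbf{x}^T\mathbf{m}_j - \tfrac{1}{2}\mathbf{m}_j^T\mathbf{m}_j\right)$: since $\mathbf{x}^T\mathbf{x}$ is the same for every class, the largest decision value corresponds to the smallest Euclidean distance to a class mean. The sketch below checks this numerically; the function names, means, and test points are hypothetical, and classes are indexed from 0.

```python
def linear_score(x, m):
    """d_j(x) = x^T m_j - 0.5 * m_j^T m_j."""
    return sum(xi * mi for xi, mi in zip(x, m)) - 0.5 * sum(mi * mi for mi in m)


def squared_distance(x, m):
    """Squared Euclidean distance between x and m."""
    return sum((xi - mi) ** 2 for xi, mi in zip(x, m))


def classify_by_score(x, means):
    """Pick the class with the largest decision value."""
    return max(range(len(means)), key=lambda j: linear_score(x, means[j]))


def classify_by_distance(x, means):
    """Pick the class whose mean is closest to x."""
    return min(range(len(means)), key=lambda j: squared_distance(x, means[j]))


# Hypothetical class means and test patterns in 2D.
means = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
points = [(1.0, 1.0), (3.0, 0.5), (0.5, 2.5), (2.5, 2.0)]

for p in points:
    print(p, classify_by_score(p, means), classify_by_distance(p, means))
```

Both rules pick the same class for every test point in this example.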

Example in Gonzalez (pg709)

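$$\mathbf{m}_1 = \frac{1}{4} \begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix}, \qquad \mathbf{m}_2 = \frac{1}{4} \begin{bmatrix} 1 \\ 3 \\ 3 \end{bmatrix}$$

$$\mathbf{C}_1 = \mathbf{C}_2 = \frac{1}{16} \begin{bmatrix} 3 & 1 & 1 \\ 1 & 3 & -1 \\ 1 & -1 & 3 \end{bmatrix}$$

[Figure: decision boundary between the two classes]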

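Example in Gonzalez (cont'd)

Assuming $P(\omega_1) = P(\omega_2) = 1/2$ and dropping $\ln P(\omega_j)$, which is common to all classes, we get

$$d_j(\mathbf{x}) = \mathbf{x}^T \mathbf{C}^{-1} \mathbf{m}_j - \frac{1}{2} \mathbf{m}_j^T \mathbf{C}^{-1} \mathbf{m}_j$$

where

$$\mathbf{C}^{-1} = \begin{bmatrix} 8 & -4 & -4 \\ -4 & 8 & 4 \\ -4 & 4 & 8 \end{bmatrix}$$

so that

$$d_1(\mathbf{x}) = 4x_1 - 1.5 \quad \text{and} \quad d_2(\mathbf{x}) = -4x_1 + 8x_2 + 8x_3 - 5.5$$

The decision surface is

$$d_1(\mathbf{x}) - d_2(\mathbf{x}) = 8x_1 - 8x_2 - 8x_3 + 4 = 0$$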
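The arithmetic of this example can be checked with exact rational numbers. The sketch below assumes the two means and the common covariance matrix of the textbook example, with the negative off-diagonal entries written out explicitly; `inverse` and `linear_decision` are hypothetical helper names, not functions from any library.

```python
from fractions import Fraction as F


def inverse(matrix):
    """Invert a small square matrix of Fractions by Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular.
    """
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        list(row) + [F(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Bring a row with a nonzero entry in this column into pivot position.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def linear_decision(c_inv, m):
    """Return (weights, bias) such that d(x) = weights . x + bias.

    weights = C^-1 m and bias = -0.5 * m^T C^-1 m (C^-1 is symmetric).
    """
    dim = len(m)
    weights = [sum(c_inv[i][k] * m[k] for k in range(dim)) for i in range(dim)]
    bias = -F(1, 2) * sum(mi * wi for mi, wi in zip(m, weights))
    return weights, bias


m1 = [F(3, 4), F(1, 4), F(1, 4)]
m2 = [F(1, 4), F(3, 4), F(3, 4)]
C = [
    [F(3, 16), F(1, 16), F(1, 16)],
    [F(1, 16), F(3, 16), F(-1, 16)],
    [F(1, 16), F(-1, 16), F(3, 16)],
]

C_inv = inverse(C)
w1, b1 = linear_decision(C_inv, m1)
w2, b2 = linear_decision(C_inv, m2)

# Decision surface d1(x) - d2(x) = 0.
boundary_w = [a - b for a, b in zip(w1, w2)]
boundary_b = b1 - b2
print(boundary_w, boundary_b)
```

The boundary coefficients come out as 8, -8, -8 with constant term 4, that is, $8x_1 - 8x_2 - 8x_3 + 4 = 0$.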

Neural Network

A neural network simulates brain activity: its elemental computing elements are treated as neurons.

This line of research dates back to the early 1940s.

The perceptron learns a linear decision function that separates two training sets.

Perceptron for 2 Pattern Classes

$$d(\mathbf{x}) = \sum_{i=1}^{n} w_i x_i + w_{n+1}$$

Perceptron for 2 Pattern Classes (cont'd)

The coefficients $w_i$ are the weights, which are analogous to synapses in the human neural system.

When $d(\mathbf{x}) > 0$, the output is +1 and the pattern $\mathbf{x}$ belongs to $\omega_1$. The reverse is true when $d(\mathbf{x}) < 0$.

This is as far as we go. This concept has been adopted in many real systems where the underlying distributions are unknown.
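The slides stop before showing how the weights are learned. As a hedged illustration only, the sketch below uses the classical fixed-increment perceptron update, which is an assumption here and is not taken from the lecture. The function names, the learning rate, the treatment of $d(\mathbf{x}) = 0$ as output -1, and the four training patterns are likewise hypothetical.

```python
def perceptron_output(weights, x):
    """Return +1 if d(x) = sum_i w_i x_i + w_{n+1} is positive, else -1.

    weights holds n + 1 values; the last one is the bias term w_{n+1}.
    """
    d = sum(w * xi for w, xi in zip(weights, x)) + weights[-1]
    return 1 if d > 0 else -1


def train_perceptron(samples, labels, rate=1.0, max_epochs=100):
    """Fixed-increment rule: adjust the weights only on misclassified patterns."""
    n = len(samples[0])
    weights = [0.0] * (n + 1)
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(samples, labels):
            if perceptron_output(weights, x) != target:
                # Move the decision function toward the correct side for x.
                for i in range(n):
                    weights[i] += rate * target * x[i]
                weights[-1] += rate * target
                errors += 1
        if errors == 0:
            break  # every training pattern is classified correctly
    return weights


# Hypothetical linearly separable data: +1 for class 1, -1 for class 2.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [1, 1, -1, -1]

weights = train_perceptron(samples, labels)
print(weights)
print([perceptron_output(weights, x) for x in samples])
```

For linearly separable training sets such as this one, the loop stops once all patterns fall on the correct side of the learned linear decision function; for non-separable data it simply runs until `max_epochs`.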
