1
Pattern Recognition:Statistical and Neural
Lonnie C. Ludeman
Lecture 14
Oct 14, 2005
Nanjing University of Science & Technology
2
Lecture 14 Topics
1. Review structures of the Optimal Classifier

2. Define linear functions, hyperplanes, boundaries, unit normals, and various distances

3. Use linear discriminant functions for defining classifiers, with examples
3
Motivation!
Motivation!
Motivation!
4
Optimum Decision Rules: 2-Class Gaussian

Case 1: K1 ≠ K2  (Quadratic Processing)

  decide C1 if  −(x − M1)^T K1^{−1} (x − M1) + (x − M2)^T K2^{−1} (x − M2)  >  T1,
  otherwise decide C2

Case 2: K1 = K2 = K  (Linear Processing)

  decide C1 if  (M1 − M2)^T K^{−1} x  >  T2,
  otherwise decide C2

Review 1
5
Optimum Decision Rules: 2-Class Gaussian (cont)

Case 3: K1 = K2 = K = σ² I  (Linear Processing)

  decide C1 if  (M1 − M2)^T x  >  T3,
  otherwise decide C2

Review 2
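The equal-covariance linear rule reviewed above (Case 2: decide C1 if (M1 − M2)^T K^{−1} x > T2) can be sketched numerically. This is a minimal sketch: the means, covariance, and equal priors are illustrative choices, and T2 is taken as the standard minimum-probability-of-error threshold for equal-covariance Gaussians.

```python
import numpy as np

# Illustrative class parameters (not from the lecture).
M1 = np.array([2.0, 2.0])
M2 = np.array([0.0, 0.0])
K  = np.eye(2)                 # common covariance, Case 2
P1 = P2 = 0.5                  # equal priors

w = np.linalg.solve(K, M1 - M2)            # K^{-1} (M1 - M2)
# Standard MPE threshold for the equal-covariance case.
T2 = 0.5 * w @ (M1 + M2) - np.log(P1 / P2)

def classify(x):
    # decide C1 if (M1 - M2)^T K^{-1} x > T2, else C2
    return 'C1' if w @ x > T2 else 'C2'

print(classify(np.array([1.8, 1.9])))      # near M1 -> C1
print(classify(np.array([0.1, -0.2])))     # near M2 -> C2
```

With K = I the rule reduces to Case 3: a pure inner product (M1 − M2)^T x compared to a threshold.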
6
M-Class General Gaussian: MPE and MAP

Case 1: K1 ≠ K2 ≠ … ≠ KM  (arbitrary covariances)

  Qj(x) = (x − Mj)^T Kj^{−1} (x − Mj) − 2 ln P(Cj) + ln |Kj|

  Select class Cj if Qj(x) is MINIMUM

Case 2: K1 = K2 = … = KM = K

  Lj(x) = Mj^T K^{−1} x − ½ Mj^T K^{−1} Mj + ln P(Cj)

  Select class Cj if Lj(x) is MAXIMUM

Review 3
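The Case 2 M-class rule above (maximize Lj(x)) can be sketched directly. The three class means and uniform priors below are made-up illustrations; the common covariance is the identity.

```python
import numpy as np

# Illustrative parameters for a 3-class problem with a common covariance K.
means  = [np.array([0.0, 0.0]),
          np.array([3.0, 0.0]),
          np.array([0.0, 3.0])]
priors = [1/3, 1/3, 1/3]
K_inv  = np.linalg.inv(np.eye(2))

def L(j, x):
    # L_j(x) = M_j^T K^{-1} x - 0.5 M_j^T K^{-1} M_j + ln P(C_j)
    Mj = means[j]
    return Mj @ K_inv @ x - 0.5 * Mj @ K_inv @ Mj + np.log(priors[j])

def classify(x):
    # Select the class index j with MAXIMUM L_j(x).
    return int(np.argmax([L(j, x) for j in range(len(means))]))

print(classify(np.array([2.9, 0.2])))   # closest to mean [3, 0] -> 1
print(classify(np.array([0.1, 2.8])))   # closest to mean [0, 3] -> 2
```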
7
M-Class General Gaussian: Bayes

The Bayes decision rule is determined from a set of yi(x) defined by

  Ck :  X ~ N(Mk, Kk),  with prior probability P(Ck)

where

  p(x|Ck) = (2π)^{−N/2} |Kk|^{−1/2} exp( −½ (x − Mk)^T Kk^{−1} (x − Mk) )

Review 4
8
  yi(x) = Σ_{j=1}^{M} Cij (2π)^{−N/2} |Kj|^{−1/2} exp( −½ (x − Mj)^T Kj^{−1} (x − Mj) ) P(Cj)

Taking the ln of yi(x) for this case does not simplify to a linear or quadratic processor.

The structure of the optimum classifier uses a sum of exp(quadratic form) terms and thus is a special form of nonlinear processing using quadratic forms.

Review 5
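The structure above (each yi(x) a cost-weighted sum of Gaussian "exp(quadratic form)" terms, with the rule picking the action of minimum yi(x)) can be sketched as follows. The means, covariances, 0/1 cost matrix, and priors are illustrative, not from the lecture.

```python
import numpy as np

# Illustrative 2-class parameters.
means  = [np.array([0.0, 0.0]), np.array([4.0, 4.0])]
covs   = [np.eye(2), 2.0 * np.eye(2)]
priors = [0.5, 0.5]
C      = np.array([[0.0, 1.0],    # C[i, j]: cost of deciding i when j is true
                   [1.0, 0.0]])
N      = 2                        # dimension of x

def gauss(x, M, K):
    # p(x|C) = (2 pi)^{-N/2} |K|^{-1/2} exp(-0.5 (x-M)^T K^{-1} (x-M))
    d = x - M
    quad = d @ np.linalg.solve(K, d)
    return np.exp(-0.5 * quad) / ((2*np.pi)**(N/2) * np.sqrt(np.linalg.det(K)))

def y(i, x):
    # Sum of cost-weighted exp(quadratic form) terms over the classes j.
    return sum(C[i, j] * gauss(x, means[j], covs[j]) * priors[j]
               for j in range(len(means)))

def bayes_classify(x):
    # Pick the action i with MINIMUM y_i(x).
    return int(np.argmin([y(i, x) for i in range(C.shape[0])]))

print(bayes_classify(np.array([0.2, -0.1])))   # near means[0] -> 0
print(bayes_classify(np.array([3.9, 4.1])))    # near means[1] -> 1
```

For 0/1 costs this reduces to the MAP rule of Review 3; general Cij is what prevents the ln from simplifying.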
9
Reasons for studying linear, quadratic, and other special forms of nonlinear processing:

Gaussian assumptions lead to linear and quadratic processing.

If Gaussian, we can find or learn a usable decision rule, and the rule is optimum.

If non-Gaussian, we can still find or learn a usable decision rule; however, the rule is NOT necessarily optimum.
10
Linear Functions

One variable:     f(x1) = w1x1 + w2

Two variables:    f(x1, x2) = w1x1 + w2x2 + w3

Three variables:  f(x1, x2, x3) = w1x1 + w2x2 + w3x3 + w4
11
w1x1 + w2 = 0                          Constant

w1x1 + w2x2 + w3 = 0                   Line

w1x1 + w2x2 + w3x3 + w4 = 0            Plane

w1x1 + w2x2 + w3x3 + w4x4 + w5 = 0     ?      Answer: Hyperplane
12
Hyperplanes

An n-dimensional hyperplane:

  w1x1 + w2x2 + … + wnxn + w_{n+1} = 0

Define

  x  = [x1, x2, …, xn]^T
  w0 = [w1, w2, …, wn]^T

An alternative representation of a hyperplane is

  w0^T x + w_{n+1} = 0
13
Hyperplanes as Boundaries for Regions

Hyperplane boundary:  w0^T x + w_{n+1} = 0

R+ = { x : w0^T x + w_{n+1} > 0 }    Positive side of the hyperplane boundary

R− = { x : w0^T x + w_{n+1} < 0 }    Negative side of the hyperplane boundary
14
15
Definitions

(1) Unit normal to the hyperplane w0^T x + w_{n+1} = 0:

  u = w0 / ||w0||
16
(2) Distance from a point y to the hyperplane:

  D(y) = | w0^T y + w_{n+1} | / ||w0||

(3) Distance from the origin to the hyperplane:

  D(0) = | w_{n+1} | / ||w0||
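Definitions (1)-(3) can be checked numerically with the standard formulas u = w0/||w0||, D(y) = |w0^T y + w_{n+1}|/||w0||, and D(0) = |w_{n+1}|/||w0||. The hyperplane 3x1 + 4x2 − 10 = 0 below is an illustrative choice.

```python
import numpy as np

# Illustrative hyperplane: 3*x1 + 4*x2 - 10 = 0, so ||w0|| = 5.
w0  = np.array([3.0, 4.0])
wn1 = -10.0

u = w0 / np.linalg.norm(w0)                      # (1) unit normal

def dist_to_point(y):
    # (2) distance from point y to the hyperplane
    return abs(w0 @ y + wn1) / np.linalg.norm(w0)

dist_to_origin = abs(wn1) / np.linalg.norm(w0)   # (3) origin distance

print(u)                                     # [0.6 0.8]
print(dist_to_point(np.array([3.0, 4.0])))   # |9 + 16 - 10| / 5 = 3.0
print(dist_to_origin)                        # 10 / 5 = 2.0
```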
17
(4) Linear Discriminant Functions

  d(x) = w^T y

where

  y = [x1, x2, …, xn, 1]^T          Augmented pattern vector
  w = [w1, w2, …, wn, w_{n+1}]^T    Weight vector
18
Linear Decision Rule: 2-Class Case Using a Single Linear Discriminant Function

given:  d(x) = w1x1 + w2x2 + … + wnxn + w_{n+1}

for a vector x:  if d(x) > 0 decide C1;  if d(x) < 0 decide C2
(on the boundary d(x) = 0, decide randomly)

No claim of optimality!!!
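The single-discriminant rule can be sketched with the augmented pattern vector of definition (4). The weights below (the line x1 + x2 − 1 = 0) are illustrative, and, as the slide stresses, no optimality is claimed.

```python
import numpy as np

# Illustrative weight vector [w1, w2, w3] for the line x1 + x2 - 1 = 0.
w = np.array([1.0, 1.0, -1.0])

def d(x):
    # d(x) = w^T [x1, ..., xn, 1]^T using the augmented pattern vector
    return w @ np.append(x, 1.0)

def classify(x):
    v = d(x)
    if v > 0:
        return 'C1'
    if v < 0:
        return 'C2'
    return 'boundary'     # the slide's rule decides randomly here

print(classify(np.array([2.0, 0.5])))   # d = 1.5 > 0 -> C1
print(classify(np.array([0.2, 0.3])))   # d = -0.5 < 0 -> C2
```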
19
20
Linear Decision Rule: 2-Class Case Using Two Linear Discriminant Functions

given two discriminant functions d1(x) and d2(x), define the decision rule by their signs,

except on the boundaries d1(x) = 0 and d2(x) = 0, where we decide randomly between C1 and C2
21
Decision regions (2-class case) using two linear discriminant functions and AND logic
22
Decision regions (2-class case) using two linear discriminant functions (continued)
23
Decision regions (2-class case) alternative formulation using two linear discriminant functions
24
Decision regions (2-class case) using alternative form of two linear discriminant functions
equivalent to
25
Decision regions (3-class case) using two linear discriminant functions
26
Decision regions (4-class case) using two linear discriminant functions
27
Decision region R1 (M-class case) using K linear discriminant functions
28
Example: Piecewise linear boundaries
Given the following discriminant functions
29
Example (continued)

Define the following decision rule:

If    d1(x) > 0 AND d2(x) > 0
OR
      d3(x) > 0 AND d4(x) > 0 AND d5(x) > 0 AND d6(x) > 0

then decide x comes from class C1;
on the boundaries decide randomly; otherwise decide C2.

Show the decision regions in the two-dimensional pattern space.
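The AND/OR combination rule above can be sketched in code. The six discriminant functions from the "Given" slide are not recoverable from this transcript, so the d1..d6 below are hypothetical stand-ins (a quadrant and a square); only the combination logic matches the stated rule.

```python
# Hypothetical stand-in discriminants (the lecture's own d1..d6 were on a
# slide figure and are not reproduced here).
d1 = lambda x:  x[0]            # x1 > 0
d2 = lambda x:  x[1]            # x2 > 0
d3 = lambda x: -x[0] - 2        # x1 < -2
d4 = lambda x: -x[1] - 2        # x2 < -2
d5 = lambda x:  x[0] + 6        # x1 > -6
d6 = lambda x:  x[1] + 6        # x2 > -6

def classify(x):
    # C1 if (d1 > 0 AND d2 > 0) OR (d3 > 0 AND d4 > 0 AND d5 > 0 AND d6 > 0),
    # otherwise C2.  Boundaries (any di = 0) are ignored in this sketch.
    region_a = d1(x) > 0 and d2(x) > 0
    region_b = all(d(x) > 0 for d in (d3, d4, d5, d6))
    return 'C1' if (region_a or region_b) else 'C2'

print(classify((1.0, 1.0)))      # inside region A  -> C1
print(classify((-4.0, -4.0)))    # inside region B  -> C1
print(classify((-1.0, 3.0)))     # in neither       -> C2
```

The resulting C1 region is a union of two convex pieces, i.e. a piecewise linear boundary, which is the point of the example.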
30
Solution:
31
Lecture 14 Summary
1. Reviewed structures of Optimal Classifier
2. Defined Linear functions, hyperplanes, boundaries, unit normals,various distances
3. Used Linear Discriminant functions for defining classifiers- Examples
32
End of Lecture 14