Page 1

An Introduction to Classification Techniques and Their Applications:

k-means Clustering and Decision Trees

Chiung Yao Fang

Department of Computer Science and Information Engineering

National Taiwan Normal University

Taipei, Taiwan

Page 2

Outline

• k-means Clustering
– Application: Infant Face Detection

– Experimental Results

• Decision trees
– Application: Infant Facial Expression Recognition

– Experimental Results

• Conclusion

Page 3

Infant Face Detection

• System setup

[Figure: the system setup, with a camera observing the infant.]

Page 4

Color Space Selection

[Figure: the distribution of infant skin color in the RGB, HSI, and YCrCb color spaces, and the resulting H, Cr, Cb model.]
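The color-space step can be reproduced with OpenCV; a minimal sketch that builds one H, Cr, Cb feature vector per pixel, assuming the model simply stacks the three components (the file name is hypothetical, and HSV's V channel stands in for the intensity I):

import cv2
import numpy as np

frame = cv2.imread("frame.png")                     # BGR image, as OpenCV loads it

hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)        # H, S, V channels
ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)    # Y, Cr, Cb channels

# Keep only the three components used for the skin-color model: H, Cr, Cb.
h = hsv[:, :, 0].astype(np.float32)
cr = ycrcb[:, :, 1].astype(np.float32)
cb = ycrcb[:, :, 2].astype(np.float32)
features = np.stack([h, cr, cb], axis=-1).reshape(-1, 3)   # one 3-D feature per pixel

These per-pixel features are the kind of input that the k-means clustering on the following pages can partition into skin and non-skin clusters.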

Page 5

An Example

Page 6

k-Means Clustering

• Given N input feature data $x^t$, $t = 1, \ldots, N$
• Find k reference vectors which represent the input feature data
• Reference vectors $m_j$, $j = 1, \ldots, k$
• Use the nearest reference vector:

  $\|x^t - m_i\| = \min_j \|x^t - m_j\|$

• Reconstruction error:

  $E(\{m_i\}_{i=1}^{k} \mid X) = \sum_t \sum_i b_i^t \|x^t - m_i\|^2$,

  where $b_i^t = 1$ if $\|x^t - m_i\| = \min_j \|x^t - m_j\|$, and $b_i^t = 0$ otherwise.
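A minimal NumPy sketch of this procedure; the function name, array layout, and the random initialization are our own choices, not taken from the slides.

import numpy as np

def k_means(X, k, n_iters=100, seed=0):
    """Plain k-means on an (N, d) array X: returns the k reference vectors m,
    the hard assignments b, and the reconstruction error E."""
    rng = np.random.default_rng(seed)
    # Initialization: k randomly selected instances (one of the options on page 11).
    m = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Label each sample with its nearest reference vector (the b_i^t labels).
        dists = np.linalg.norm(X[:, None, :] - m[None, :, :], axis=2)
        b = dists.argmin(axis=1)
        # Move each reference vector to the mean of the samples assigned to it.
        new_m = np.array([X[b == i].mean(axis=0) if np.any(b == i) else m[i]
                          for i in range(k)])
        if np.allclose(new_m, m):
            break
        m = new_m
    error = float(np.sum((X - m[b]) ** 2))   # E = sum_t sum_i b_i^t ||x^t - m_i||^2
    return m, b, error

For infant face detection, X would hold the per-pixel H, Cr, Cb features from page 4, and the skin cluster would be selected among the k reference vectors.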

Page 7

Page 8

k-means Clustering

Page 9

Experimental Results

[Figure: clustering results on an example image for k = 4, 5, 6, 7, and 8, shown alongside the original image.]

Page 10

Experimental Results

[Figure: clustering results on another input image for k = 3, 4, 5, and 6.]

Page 11

k-means Clustering

• Methods to initialize the $m_i$ (see the sketch after this list):
– Randomly select k instances
– Calculate the mean of all the data and add small random vectors to it
– Calculate the principal component, partition the data into k groups along it, and then take the means of these groups
– …
• How do we decide k?
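The three listed initialization options can be sketched as follows (NumPy; the function and method names are ours):

import numpy as np

def init_references(X, k, method="random", seed=0):
    """Return k initial reference vectors for the (N, d) data array X."""
    rng = np.random.default_rng(seed)
    if method == "random":
        # Randomly select k instances.
        return X[rng.choice(len(X), size=k, replace=False)].astype(float)
    if method == "mean_jitter":
        # Mean of all the data plus small random vectors.
        return X.mean(axis=0) + 0.01 * X.std(axis=0) * rng.standard_normal((k, X.shape[1]))
    if method == "principal_component":
        # Order the data along the first principal component, partition it into
        # k groups, and take the mean of each group.
        Xc = X - X.mean(axis=0)
        _, _, vt = np.linalg.svd(Xc, full_matrices=False)
        order = np.argsort(Xc @ vt[0])
        return np.array([X[g].mean(axis=0) for g in np.array_split(order, k)])
    raise ValueError(method)

Choosing k itself is typically done by trying several values and checking the result, which is what the face-region decision rule on page 12 does.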

Page 12

Face Region Decision

                      object size
  object density      fit             large           small
  high                set k           increase k      next image
  low                 increase k      increase k      next image
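Read as code, the table is a small dispatch on the detected object's size and density; a sketch, with our own interpretation of the three actions:

def adjust_k(object_size, object_density, k):
    """Face-region decision rule from the table above.

    object_size is 'fit', 'large', or 'small'; object_density is 'high' or 'low'.
    Returns the action from the table and the (possibly updated) value of k."""
    if object_size == "small":
        return "next image", k               # no face-sized cluster in this frame
    if object_size == "large":
        return "increase k", k + 1           # clusters too coarse: split further
    if object_density == "high":             # object_size == "fit"
        return "set k", k                    # keep the current k for this region
    return "increase k", k + 1               # fit but sparse: refine the clustering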

Page 13

Demo

Page 14

Infant Facial Expression Recognition

Five infant facial expressions: crying, gazing, laughing, yawning, and vomiting.

Three poses of the infant head: front, turn left, and turn right.

Total: 15 classes.

[Figure: example images of the five expressions under the three head poses (front, turn left, turn right).]

Page 15

Moments

Three types of moments are calculated: the Hu moments [Hu1962], the R moments [Liu2008], and the Zernike moments [Zhi2008].

Given an image I, let f be its image function. The digital (p, q)th moment of I is given by

  $m_{pq} = \sum_{(x, y) \in I} x^p y^q f(x, y)$.

The central (p, q)th moments of I can be defined as

  $\mu_{pq} = \sum_{(x, y) \in I} (x - \bar{x})^p (y - \bar{y})^q f(x, y)$,

where $\bar{x} = m_{10}/m_{00}$ and $\bar{y} = m_{01}/m_{00}$.

The normalized central moments of I are

  $\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}}$, where $\gamma = \frac{p + q}{2} + 1$.
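The three definitions translate directly into NumPy; a sketch (the function name and the coordinate convention, x for columns and y for rows, are ours):

import numpy as np

def normalized_central_moments(f, max_order=3):
    """Normalized central moments eta_pq of a 2-D image array f, following the
    definitions above (raw moments m_pq and central moments mu_pq as helpers)."""
    h, w = f.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)

    def m(p, q):
        return np.sum((x ** p) * (y ** q) * f)

    m00 = m(0, 0)
    x_bar, y_bar = m(1, 0) / m00, m(0, 1) / m00

    def mu(p, q):
        return np.sum(((x - x_bar) ** p) * ((y - y_bar) ** q) * f)

    eta = {}
    for p in range(max_order + 1):
        for q in range(max_order + 1):
            if 2 <= p + q <= max_order:
                gamma = (p + q) / 2.0 + 1.0
                eta[(p, q)] = mu(p, q) / (m00 ** gamma)
    return eta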

Page 16

Hu Moment

• Hu moments are translation, scale, and rotation invariant.
• The seven Hu moments are computed from the normalized central moments:

  $H_1 = \eta_{20} + \eta_{02}$
  $H_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$
  $H_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$
  $H_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$
  $H_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]$
  $H_6 = (\eta_{20} - \eta_{02})[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$
  $H_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]$
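OpenCV implements these same seven invariants, so in practice they can be obtained in two calls; a short sketch (the file name is hypothetical):

import cv2

# Grayscale face region cropped by the detector (hypothetical file name).
img = cv2.imread("infant_face.png", cv2.IMREAD_GRAYSCALE)

# cv2.moments returns the raw, central, and normalized central moments;
# cv2.HuMoments applies the seven formulas above to them.
hu = cv2.HuMoments(cv2.moments(img)).flatten()    # array of H1 ... H7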

Page 17

Example: Hu Moments

[Figure: Hu-moment values for a yawning sequence and a crying sequence.]

• If the infant facial expressions are different, then the values of the Hu moments are also different.

Page 18

R Moment

• Liu (2008) proposed ten R moments, which improve the scale invariance of the Hu moments.
• The R moments are ratios of the Hu moments:

  $R_1 = \sqrt{H_2}/H_1$
  $R_2 = (H_1 + \sqrt{H_2})/(H_1 - \sqrt{H_2})$
  $R_3 = \sqrt{H_3}/\sqrt{H_4}$
  $R_4 = \sqrt{H_3}/\sqrt{|H_5|}$
  $R_5 = \sqrt{H_4}/\sqrt{|H_5|}$
  $R_6 = |H_6|/(H_1 H_3)$
  $R_7 = |H_6|/(H_1\sqrt{|H_5|})$
  $R_8 = |H_6|/(H_3\sqrt{H_2})$
  $R_9 = |H_6|/\sqrt{H_2\,|H_5|}$
  $R_{10} = |H_5|/(H_3 H_4)$
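Given the seven Hu moments, the ten ratios above are one-liners; a sketch based on our reconstruction of the formulas:

import numpy as np

def r_moments(H):
    """The ten R moments computed from the Hu moments H = [H1, ..., H7]."""
    H1, H2, H3, H4, H5, H6 = (float(h) for h in H[:6])   # H7 is not used
    s2, s5 = np.sqrt(H2), np.sqrt(abs(H5))
    return np.array([
        s2 / H1,                           # R1
        (H1 + s2) / (H1 - s2),             # R2
        np.sqrt(H3) / np.sqrt(H4),         # R3
        np.sqrt(H3) / s5,                  # R4
        np.sqrt(H4) / s5,                  # R5
        abs(H6) / (H1 * H3),               # R6
        abs(H6) / (H1 * s5),               # R7
        abs(H6) / (H3 * s2),               # R8
        abs(H6) / np.sqrt(H2 * abs(H5)),   # R9
        abs(H5) / (H3 * H4),               # R10
    ])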

Page 19

Example: R Moments

[Figure: R-moment and Hu-moment values for a crying sequence.]

• R moments and Hu moments may have different properties.

Page 20

Zernike Moment

The Zernike moments of order p with repetition q for an image function f are

  $C_{pq} = \frac{2p+2}{N^2}\sum_{s=1}^{N/2}\sum_{v} R_{pq}(2s/N)\cos(q\theta_{sv})\,f(s,v)$   (real part)

  $S_{pq} = \frac{2p+2}{N^2}\sum_{s=1}^{N/2}\sum_{v} R_{pq}(2s/N)\sin(q\theta_{sv})\,f(s,v)$   (imaginary part)

where (s, v) indexes the polar samples of the N×N image, the radial polynomial is

  $R_{pq}(r) = \sum_{s=0}^{(p-|q|)/2} \frac{(-1)^s\,(p-s)!}{s!\left(\frac{p+|q|}{2}-s\right)!\left(\frac{p-|q|}{2}-s\right)!}\, r^{p-2s}$,

and the moment magnitude is

  $Z_{pq} = \left(C_{pq}^2 + S_{pq}^2\right)^{1/2}$.

To simplify the index, we use Z1, Z2, …, Z10 to represent $Z_{8,0}, Z_{8,2}, \ldots, Z_{9,9}$, respectively.
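The magnitude $|Z_{pq}|$ can be computed from the radial polynomial above. The sketch below uses a common Cartesian unit-disk approximation of the Zernike integral rather than the slide's polar (s, v) sampling, so its normalization differs from the formulation above; the function names are ours.

import numpy as np
from math import factorial

def radial_poly(p, q, r):
    """Zernike radial polynomial R_pq(r), as defined above."""
    q = abs(q)
    out = np.zeros_like(r, dtype=float)
    for s in range((p - q) // 2 + 1):
        c = (-1) ** s * factorial(p - s) / (
            factorial(s) * factorial((p + q) // 2 - s) * factorial((p - q) // 2 - s))
        out += c * r ** (p - 2 * s)
    return out

def zernike_magnitude(img, p, q):
    """|Z_pq| of a grayscale image mapped onto the unit disk."""
    assert abs(q) <= p and (p - abs(q)) % 2 == 0
    h, w = img.shape
    y, x = np.mgrid[-1:1:complex(0, h), -1:1:complex(0, w)]
    r = np.sqrt(x ** 2 + y ** 2)
    theta = np.arctan2(y, x)
    inside = r <= 1.0
    # V*_pq(x, y) = R_pq(r) e^{-j q theta}; its real and imaginary sums give C_pq and S_pq.
    kernel = radial_poly(p, q, r) * np.exp(-1j * q * theta) * inside
    dA = (2.0 / w) * (2.0 / h)                  # pixel area on the [-1, 1]^2 square
    z = (p + 1) / np.pi * np.sum(img * kernel) * dA
    return abs(z)

# Z1 ... Z10 on this page correspond to (p, q) = (8,0), (8,2), (8,4), (8,6), (8,8),
# (9,1), (9,3), (9,5), (9,7), (9,9).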

Page 21

Example: Zernike Moments

[Figure: Zernike-moment values for a crying sequence.]

Page 22

Correlation Coefficients

• A facial expression is a sequential change of the values of the moments.
• The correlation coefficients between two moment sequences may be used to represent a facial expression.
• Let $A_i = \{A_{iI_1}, A_{iI_2}, \ldots, A_{iI_n}\}$, i = 1, 2, …, m, where $A_{iI_k}$ denotes the ith moment of frame $I_k$, k = 1, 2, …, n. The correlation coefficient between $A_i$ and $A_j$ can be defined as

  $r_{A_iA_j} = \frac{S_{A_iA_j}}{S_{A_i} S_{A_j}}$,

  where

  $S_{A_i}^2 = \frac{1}{n-1}\sum_{k=1}^{n}\left(A_{iI_k} - \bar{A}_i\right)^2$,

  $S_{A_iA_j} = \frac{1}{n-1}\sum_{k=1}^{n}\left(A_{iI_k} - \bar{A}_i\right)\left(A_{jI_k} - \bar{A}_j\right)$,

  and $\bar{A}_i$ is the mean of the elements in $A_i$.
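Since these are the usual Pearson correlation coefficients, NumPy computes the whole matrix at once; a sketch (the array layout is our assumption):

import numpy as np

def moment_correlations(M):
    """Pairwise correlation coefficients r_{AiAj} between moment sequences.

    M is an (m, n) array: row i holds the ith moment A_i computed on each of
    the n frames I_1, ..., I_n of one expression sequence."""
    # np.corrcoef uses the same (n - 1)-normalized covariances as the slide.
    return np.corrcoef(M)

# Example: seven Hu-moment sequences over a 40-frame clip (random stand-in data).
M = np.random.rand(7, 40)
R = moment_correlations(M)        # R[i, j] = r_{AiAj}; R is 7 x 7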

Page 23

Correlation Coefficients

• The correlation coefficients between the seven Hu moment sequences of a yawning sequence:

        H1    H2      H3      H4      H5      H6       H7
  H1    1     0.8778  0.9481  -0.033  -0.571  -0.8052  0.8907
  H2          1       0.9474  0.1887  -0.4389 -0.8749  0.9241
  H3                  1       0.1410  -0.6336 -0.9044  0.9719
  H4                          1       0.0568  -0.3431  0.2995
  H5                                  1       0.7138   -0.6869
  H6                                          1        -0.9727
  H7                                                   1

Page 24

Decision Trees

• Properties of decision trees
– An efficient nonparametric method
– A hierarchical model
– A divide-and-conquer strategy
– Supervised learning

Page 25

Tree Uses Nodes, and Leaves

Page 26

Entropy

• For node m, $N_m$ instances reach m, and $N_m^i$ of them belong to class $C_i$:

  $\hat{P}(C_i \mid x, m) \equiv p_m^i = \frac{N_m^i}{N_m}$

• Node m is pure if $p_m^i$ is 0 or 1.
• The measure of impurity is the entropy

  $\mathcal{I}_m = -\sum_{i=1}^{K} p_m^i \log_2 p_m^i$,

  where $0 \log_2 0 \equiv 0$.
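A direct transcription of the entropy measure (the function name is ours):

import numpy as np

def node_entropy(class_counts):
    """Entropy I_m of a node, given the per-class counts N_m^i."""
    counts = np.asarray(class_counts, dtype=float)
    p = counts / counts.sum()
    p = p[p > 0]                         # drop empty classes, so 0 log 0 counts as 0
    return float(-np.sum(p * np.log2(p)))

print(node_entropy([10, 0]))             # pure node        -> 0.0
print(node_entropy([5, 5]))              # 50/50 two-class  -> 1.0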

Page 27

Example of Entropy

• In a two-class problem:
– If p1 = 1 and p2 = 0, all examples are of class C1 and the entropy is 0.
– If p1 = p2 = 0.5, the entropy is 1.

  $\phi(p, 1-p) = -p \log_2 p - (1-p)\log_2(1-p)$

[Figure: the entropy function for a two-class problem.]

Page 28

Best Split

• If node m is pure, generate a leaf and stop; otherwise split and continue recursively.
• Impurity after the split, where $N_{mj}$ of the $N_m$ instances take branch j and $N_{mj}^i$ of them belong to class $C_i$:

  $\hat{P}(C_i \mid x, m, j) \equiv p_{mj}^i = \frac{N_{mj}^i}{N_{mj}}$

  $\mathcal{I}'_m = -\sum_{j=1}^{n} \frac{N_{mj}}{N_m} \sum_{i=1}^{K} p_{mj}^i \log_2 p_{mj}^i$

• Find the variable and split that minimize the impurity.
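The same idea in code: weight each branch's entropy by the fraction of instances that take it; a sketch (names and data layout are ours):

import numpy as np

def split_impurity(branch_class_counts):
    """Impurity I'_m after a split; branch_class_counts[j][i] = N^i_mj."""
    counts = np.asarray(branch_class_counts, dtype=float)
    N_m = counts.sum()
    impurity = 0.0
    for branch in counts:
        N_mj = branch.sum()
        if N_mj == 0:
            continue
        p = branch[branch > 0] / N_mj
        impurity -= (N_mj / N_m) * np.sum(p * np.log2(p))
    return float(impurity)

# Two candidate binary splits of the same 20 two-class instances:
print(split_impurity([[9, 1], [1, 9]]))   # fairly clean split  -> about 0.47
print(split_impurity([[5, 5], [5, 5]]))   # uninformative split -> 1.0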

Page 29

A Decision Tree

• Decision trees are used to classify the infant facial expressions.
• Example: the signs of three correlation coefficients for ten training instances.

  r_H1H2   r_H1H3   r_H2H3
    -        +        +
    +        +        -
    +        +        +
    +        +        -
    -        +        +
    -        -        +
    -        -        -
    +        -        -
    -        -        -
    +        -        +

• The root node tests whether r_H1H3 > 0: instances with a positive coefficient take the Yes branch, the others take the No branch.

Page 30

A Decision Tree

• The correlation coefficients between two attributes $A_i$ and $A_j$ are used to split the training instances.
• Let the training instances in S be split into two subsets $S_1$ and $S_2$ by the correlation coefficient; the measure function is

  $E_{A_iA_j}(S) = -\frac{N_{S_1}}{N_S}\sum_{h=1}^{K}\frac{N_{S_1}^h}{N_{S_1}}\log_2\frac{N_{S_1}^h}{N_{S_1}} - \frac{N_{S_2}}{N_S}\sum_{h=1}^{K}\frac{N_{S_2}^h}{N_{S_2}}\log_2\frac{N_{S_2}^h}{N_{S_2}}$

• The best correlation coefficient selected by the system is

  $r^*_{A_iA_j}(S^*) = \arg\min_{A_i, A_j} E_{A_iA_j}(S)$
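A sketch of the measure $E_{A_iA_j}(S)$ and the arg-min search; we assume, as in the example trees that follow, that the split tests whether $r_{A_iA_j} > 0$ (the names and data layout are ours):

import numpy as np
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels (0 log 0 counted as 0)."""
    p = np.array(list(Counter(labels).values()), dtype=float)
    p /= p.sum()
    return float(-np.sum(p * np.log2(p)))

def split_measure(instances, i, j):
    """E_{AiAj}(S) for instances = [(corr, label), ...], where corr[i, j] = r_{AiAj}."""
    S1 = [lab for corr, lab in instances if corr[i, j] > 0]
    S2 = [lab for corr, lab in instances if corr[i, j] <= 0]
    N_S = len(instances)
    return sum(len(Sh) / N_S * entropy(Sh) for Sh in (S1, S2) if Sh)

def best_pair(instances, n_moments):
    """The attribute pair (A_i, A_j) minimizing E_{AiAj}(S)."""
    pairs = [(i, j) for i in range(n_moments) for j in range(i + 1, n_moments)]
    return min(pairs, key=lambda ij: split_measure(instances, *ij))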

Page 31

Decision tree construction

Step 1: Initially, put all the training instances into the root node $S_R$, regard $S_R$ as an internal decision node, and insert $S_R$ into a decision-node queue.

Step 2: Select an internal decision node S from the decision-node queue and calculate the entropy of node S. If the entropy of node S is larger than a threshold $T_s$, go to Step 3; otherwise, label node S as a leaf node and go to Step 4.

Step 3: Find the best correlation coefficient to split the training instances in node S. Split the training instances in S into two nodes $S_1$ and $S_2$ by that correlation coefficient and add $S_1$ and $S_2$ to the decision-node queue. Go to Step 2.

Step 4: If the queue is not empty, go to Step 2; otherwise stop the algorithm.
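The four steps map onto a short queue-driven loop; a sketch under the same assumptions as above (each training instance is a (correlation-matrix, label) pair, and internal nodes test the sign of one coefficient):

import numpy as np
from collections import Counter, deque

def entropy(labels):
    """Entropy of a list of class labels (0 log 0 counted as 0)."""
    p = np.array(list(Counter(labels).values()), dtype=float)
    p /= p.sum()
    return float(-np.sum(p * np.log2(p)))

def build_tree(instances, n_moments, T_s=0.1):
    """Decision-tree construction following Steps 1-4 above."""
    root = {"instances": instances}
    queue = deque([root])                                    # Step 1
    while queue:                                             # Step 4
        node = queue.popleft()                               # Step 2
        S = node.pop("instances")
        labels = [lab for _, lab in S]
        if entropy(labels) <= T_s:
            node["leaf"] = Counter(labels).most_common(1)[0][0]
            continue
        # Step 3: pick the pair (i, j) whose sign test r_{AiAj} > 0 minimizes E(S).
        def measure(pair):
            a, b = pair
            S1 = [lab for c, lab in S if c[a, b] > 0]
            S2 = [lab for c, lab in S if c[a, b] <= 0]
            return sum(len(g) / len(S) * entropy(g) for g in (S1, S2) if g)
        pairs = [(i, j) for i in range(n_moments) for j in range(i + 1, n_moments)]
        i, j = min(pairs, key=measure)
        yes = [(c, lab) for c, lab in S if c[i, j] > 0]
        no = [(c, lab) for c, lab in S if c[i, j] <= 0]
        if not yes or not no:            # the best test cannot separate S: make a leaf
            node["leaf"] = Counter(labels).most_common(1)[0][0]
            continue
        node["test"] = (i, j)                                # "is r_{AiAj} > 0 ?"
        node["yes"], node["no"] = {"instances": yes}, {"instances": no}
        queue.extend([node["yes"], node["no"]])              # back to Step 2
    return root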

Page 32

Experimental Results

Training: 59 sequences. Testing: 30 sequences.
Five infant facial expressions: crying, laughing, dazing, yawning, and vomiting.
Three different poses of the infant head: front, turn left, and turn right.
Fifteen classes are classified.

[Figure: example frames of the five expressions under the three head poses (turn left, front, turn right).]

Page 33

Feature type: Hu moments. Internal nodes: 16. Leaf nodes: 17. Height: 8.

[Figure: the learned decision tree; each internal node tests whether one correlation coefficient r_{HiHj} is greater than 0, and each leaf is labeled with an expression (crying, laughing, dazing, yawning, or vomiting).]

Page 34

Experimental Results

• The classification results of the Hu-moment decision tree.

[Figure: example testing sequences and their classification results (laughing, laughing, dazing, vomiting).]

Page 35

Feature type: R moments. Internal nodes: 15. Leaf nodes: 17. Height: 10.

[Figure: the learned decision tree; each internal node tests whether one correlation coefficient r_{RiRj} is greater than 0, and each leaf is labeled with an expression.]

Page 36

Experimental Results

• The classification results of the R-moment decision tree.

[Figure: example testing sequences and their classification results (vomiting, dazing, yawning, dazing).]

Page 37

Feature type: Zernike moments. Internal nodes: 19. Leaf nodes: 20. Height: 7.

[Figure: the learned decision tree; each internal node tests whether one correlation coefficient r_{ZiZj} is greater than 0, and each leaf is labeled with an expression.]

Page 38

Experimental Results

• The classification results of the Zernike-moment decision tree.

[Figure: example testing sequences and their classification results (crying, crying, vomiting, crying).]

Page 39

Experimental Results

Comparison of the results:

• The correlation coefficients of the moments are useful attributes for classifying the infant facial expressions.
• The tree built from the Hu moments is compact (height 8, 16 + 17 nodes) and achieves the highest classification rate (90%).

  Feature type      Tree height   Nodes (internal + leaf)   Training sequences   Testing sequences   Classification rate
  Hu moments        8             16 + 17                   59                   30                  90%
  R moments         10            15 + 17                   59                   30                  80%
  Zernike moments   7             19 + 20                   59                   30                  87%

Page 40

Conclusions

• k-means Clustering
– Example: Infant Face Detection
• Decision trees
– Example: Infant Facial Expression Recognition

Page 41