


Page 1

ENN: Extended Nearest Neighbor Method for Pattern Recognition

These lecture notes are based on the following paper: B. Tang and H. He, "ENN: Extended Nearest Neighbor Method for Pattern Recognition," IEEE Computational Intelligence Magazine, vol. 10, no. 3, pp. 52-60, Aug. 2015

Prof. Haibo He, Electrical Engineering
University of Rhode Island, Kingston, RI 02881

Computational Intelligence and Self-Adaptive Systems (CISA) Laboratory
http://www.ele.uri.edu/faculty/he/

Email: [email protected]

Page 2

Extended Nearest Neighbor for Pattern Recognition

1. Limitations of K-Nearest Neighbors (KNN)
2. "Two-way communication": Extended Nearest Neighbors (ENN)
3. Experimental Analysis
4. Conclusion

Page 3

Pattern Recognition

✓ Parametric Classifier
  § Class-wise density estimation, including naive Bayes, Gaussian mixture, etc.

✓ Non-Parametric Classifier
  § Nearest Neighbors
  § Neural Network
  § Support Vector Machine

Nonparametric nature
Easy implementation
Power
Robustness
Consistency

Page 4

Limitations of traditional KNN

Scale-sensitive problem: the class 1 samples dominate their near neighborhood with higher density (i.e., a more concentrated distribution), while the class 2 samples are distributed in regions with lower density (i.e., a more spread-out distribution). Those class 2 samples that are close to the region of class 1 may easily be misclassified.

Page 5

ENN: A New Approach

Define a generalized class-wise statistic T_i for each class:

S_i denotes the set of samples in class i, and NN_r(x, S) denotes the r-th nearest neighbor of x in S.
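Following the paper's definition, the statistic can be written as

$$T_i \;=\; \frac{1}{n_i\,k} \sum_{x \in S_i} \sum_{r=1}^{k} I_r(x, S), \qquad I_r(x, S) = \begin{cases} 1, & \text{if } NN_r(x, S) \in S_i,\\ 0, & \text{otherwise,} \end{cases}$$

where n_i = |S_i|, S = S_1 ∪ … ∪ S_N is the pooled training set, and k is the number of nearest neighbors considered.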

T_i measures the coherence of data from the same class: 0 ≤ T_i ≤ 1, with T_i = 1 when all the nearest neighbors of class-i data are also from class i, and T_i = 0 when all the nearest neighbors are from other classes.

Page 6

Intra-class coherence:

Given an unknown sample Z to be classified, we assign it in turn to class 1 and class 2, respectively, to obtain two new generalized class-wise statistics T_i^j, where j = 1, 2. Then, the sample Z is classified according to:

ENN Classification Rule: Maximum Gain of Intra-class Coherence.

For N-class classification:  
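Following the paper, the decision rule can be written as

$$f_{\mathrm{ENN}}(Z) \;=\; \arg\max_{j \in \{1,\dots,N\}} \; \sum_{i=1}^{N} \left( T_i^{\,j} - T_i \right),$$

where T_i^j is the class-wise statistic of class i recomputed with Z tentatively assigned to class j, so the winning class is the one that yields the largest gain in overall intra-class coherence.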

Page 7

To avoid recalculating the generalized class-wise statistics in the testing stage, an equivalent version of ENN is proposed:

Page 8

The equivalent version gives the same result as the original one, but avoids the recalculation of T_i^j.

Page 9

How this simple rule works better than KNN

The ENN method makes a prediction in a "two-way communication" style: it considers not only who the nearest neighbors of the test sample are, but also who considers the test sample as their nearest neighbor.
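To make this concrete, here is a minimal NumPy sketch of the naive ENN rule described above. The function and variable names (class_statistics, enn_classify) are illustrative choices, not the authors' reference implementation; their Matlab code is linked on the Online Resources slide.

```python
import numpy as np

def class_statistics(X, y, k):
    """Generalized class-wise statistic T_i for every class: the fraction of
    the k nearest neighbors of class-i samples that also belong to class i."""
    # pairwise squared Euclidean distances within the pooled set S = S_1 ∪ ... ∪ S_N
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)            # a sample is not its own neighbor
    nbrs = np.argsort(d, axis=1)[:, :k]    # indices of the k nearest neighbors
    return {c: np.mean(y[nbrs[y == c]] == c) for c in np.unique(y)}

def enn_classify(X_train, y_train, z, k):
    """Assign z to the class whose hypothetical membership yields the largest
    total intra-class coherence ("two-way communication")."""
    best_class, best_score = None, -np.inf
    for c in np.unique(y_train):
        X_aug = np.vstack([X_train, z])    # temporarily add the test sample ...
        y_aug = np.append(y_train, c)      # ... labeled as class c
        score = sum(class_statistics(X_aug, y_aug, k).values())
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```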

Page 10

Experimental Results and Analysis

Page 11

Sampling methods

Synthetic Data Set: 3-dimensional Gaussian data with 3 classes.

Four models with different class variances are considered; the error rates (%) for Models 2 and 3 are:

Model 2 (σ₁² = 5, σ₂² = 20, σ₃² = 5):

           Class 1          Class 2          Class 3
           KNN     ENN      KNN     ENN      KNN     ENN
  k = 3    32      31.9     39.3    34.4     31.4    30.5
  k = 5    31.2    29.7     40.5    33.7     28.6    26.7
  k = 7    28.5    28.3     40.8    33.6     25      24.3

Model 3 (σ₁² = 5, σ₂² = 5, σ₃² = 20):

           Class 1          Class 2          Class 3
           KNN     ENN      KNN     ENN      KNN     ENN
  k = 3    33.2    31       27      26.8     38.8    33.7
  k = 5    30.3    27.3     24      23.2     40.2    33.5
  k = 7    26.7    25.1     20.8    20.8     40.6    33
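As a rough illustration of how such a comparison could be set up, the sketch below draws 3-dimensional Gaussian data with the Model 2 variances and compares KNN with the enn_classify sketch given earlier. The class means and sample size are assumptions made purely for illustration (the slide specifies only the variances), so the resulting error rates will not reproduce the table.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0, 0.0],      # assumed mean of class 1 (not from the slide)
                  [3.0, 3.0, 3.0],      # assumed mean of class 2 (not from the slide)
                  [6.0, 0.0, 3.0]])     # assumed mean of class 3 (not from the slide)
variances = [5.0, 20.0, 5.0]            # Model 2: sigma_1^2 = 5, sigma_2^2 = 20, sigma_3^2 = 5
n_per_class = 50                        # assumed sample size

def sample(n):
    """Draw n points per class from the three Gaussians."""
    X = np.vstack([rng.normal(m, np.sqrt(v), size=(n, 3))
                   for m, v in zip(means, variances)])
    y = np.repeat([0, 1, 2], n)
    return X, y

X_train, y_train = sample(n_per_class)
X_test, y_test = sample(n_per_class)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
knn_err = 1.0 - knn.score(X_test, y_test)

# enn_classify is the illustrative sketch defined earlier in these notes
enn_pred = np.array([enn_classify(X_train, y_train, z, 5) for z in X_test])
enn_err = np.mean(enn_pred != y_test)
print(f"KNN error: {knn_err:.3f}   ENN error: {enn_err:.3f}")
```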

Page 12

• MNIST Handwritten Digit Recognition

 

Sampling methods

Real-life Data Sets:

Data Examples

Page 13

Sampling methods

Real-life Data Sets:

A t-test shows that ENN significantly improves the classification performance on 17 out of 20 data sets, in comparison with KNN.

• 20 data sets from the UCI Machine Learning Repository

 

Page 14


ENN    

Summary: Three versions of ENN

Page 15

ENN.V1  


Summary: Three versions of ENN

Page 16


Summary: Three versions of ENN

ENN.V2  

Page 17


Summary: Three versions of ENN

Online Resources

Supplementary materials and Matlab source code implementation are available at:
http://www.ele.uri.edu/faculty/he/research/ENN/ENN.html

Page 18

Conclusion

1. A new ENN classification methodology based on the maximum gain of intra-class coherence.
2. "Two-way communication": ENN considers not only who the nearest neighbors of the test sample are, but also who considers the test sample as their nearest neighbor.
3. Important and useful for many other machine learning and data mining problems, such as density estimation, clustering, and regression, among others.

B. Tang and H. He, "ENN: Extended Nearest Neighbor Method for Pattern Recognition," IEEE Computational Intelligence Magazine, vol. 10, no. 3, pp. 52-60, Aug. 2015.