Boosting ---one of combining models


Page 1: Boosting ---one of combining models

Boosting---one of combining models

Xin Li

Machine Learning Course

Page 2: Boosting ---one of combining models

Outline

Introduction and background of Boosting and Adaboost

Adaboost Algorithm introduction
Adaboost Algorithm example
Experiment results

Page 3: Boosting ---one of combining models

Boosting

Definition of Boosting[1]:

Boosting refers to a general method of producing a very accurate prediction rule by combining rough and moderately inaccurate rules-of-thumb.

Intuition:

1) No learner is always the best;

2) Construct a set of base-learners which, when combined, achieves higher accuracy

Page 4: Boosting ---one of combining models

Boosting(cont’d)

3) Different learners may:

--- Be trained by different algorithms

--- Use different modalities(features)

--- Focus on different subproblems

--- ……

4) A weak learner is a “rough and moderately inaccurate” predictor, but one that can predict better than chance.

Page 5: Boosting ---one of combining models

Background of Adaboost [2]

Page 6: Boosting ---one of combining models

Outline

Introduction and background of Boosting and Adaboost

Adaboost Algorithm introduction
Adaboost Algorithm example
Experiment results

Page 7: Boosting ---one of combining models

Schematic illustration of the boosting classifier

Page 8: Boosting ---one of combining models

Adaboost

1. Initialize the data weighting coefficients $\{w_n\}$ by setting $w_n^{(1)} = 1/N$ for $n = 1, \ldots, N$.

2. For $m = 1, \ldots, M$:

(a) Fit a classifier $y_m(x)$ to the training data by minimizing the weighted error function

$$J_m = \sum_{n=1}^{N} w_n^{(m)} \, I(y_m(x_n) \neq t_n)$$

where $I(y_m(x_n) \neq t_n)$ is the indicator function, equal to 1 when $y_m(x_n) \neq t_n$ and 0 otherwise.

Page 9: Boosting ---one of combining models

Adaboost(cont’d)

(b) Evaluate the quantities

$$\epsilon_m = \frac{\sum_{n=1}^{N} w_n^{(m)} \, I(y_m(x_n) \neq t_n)}{\sum_{n=1}^{N} w_n^{(m)}}$$

and then use these to evaluate

$$\alpha_m = \ln\left\{\frac{1 - \epsilon_m}{\epsilon_m}\right\}$$

Page 10: Boosting ---one of combining models

Adaboost(cont’d)

(c) Update the data weighting coefficients

$$w_n^{(m+1)} = w_n^{(m)} \exp\{\alpha_m \, I(y_m(x_n) \neq t_n)\}$$

3. Make predictions using the final model, which is given by

$$Y_M(x) = \operatorname{sign}\left(\sum_{m=1}^{M} \alpha_m y_m(x)\right)$$
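The three-step algorithm above (initialize uniform weights; repeatedly fit a weighted classifier, compute its error rate and alpha, and up-weight its mistakes; predict by the sign of the alpha-weighted vote) can be sketched in code. This is a minimal illustration under assumptions, not the presentation's own implementation: the 1-D inputs, the pool of decision stumps, and the function names are all invented for the sketch.

```python
import math

def stump(threshold, sign):
    """Decision stump on 1-D inputs: predicts `sign` if x > threshold, else -sign."""
    return lambda x: sign if x > threshold else -sign

def adaboost(xs, ts, stumps, M):
    """Run M rounds of Adaboost as on the slides; return (alpha, stump) pairs."""
    N = len(xs)
    w = [1.0 / N] * N                                  # step 1: uniform weights
    model = []
    for _ in range(M):                                 # step 2: for m = 1..M
        # (a) pick the stump minimizing the weighted error J_m
        errs = [sum(wi for wi, x, t in zip(w, xs, ts) if h(x) != t)
                for h in stumps]
        best = min(range(len(stumps)), key=lambda i: errs[i])
        h = stumps[best]
        # (b) weighted error rate epsilon_m and alpha_m = ln{(1-eps)/eps}
        eps = errs[best] / sum(w)
        eps = min(max(eps, 1e-12), 1 - 1e-12)          # numerical guard
        alpha = math.log((1 - eps) / eps)
        # (c) up-weight only the misclassified points, as in the update rule
        w = [wi * math.exp(alpha) if h(x) != t else wi
             for wi, x, t in zip(w, xs, ts)]
        model.append((alpha, h))
    return model

def predict(model, x):
    """Step 3: sign of the alpha-weighted vote of the base classifiers."""
    s = sum(alpha * h(x) for alpha, h in model)
    return 1 if s >= 0 else -1
```

A usage sketch: fit three rounds over a small stump pool on four 1-D points with targets in {-1, +1}, then check the combined classifier on the training data.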

Page 11: Boosting ---one of combining models

Prove Adaboost

Consider the exponential error function defined by

$$E = \sum_{n=1}^{N} \exp\{-t_n f_m(x_n)\}$$

where $t_n \in \{-1, +1\}$ are the training-set target values and

$$f_m(x) = \frac{1}{2} \sum_{l=1}^{m} \alpha_l y_l(x)$$

is a classifier defined in terms of a linear combination of base classifiers $y_l(x)$. Separating out the contribution of the newest base classifier $y_m(x)$:

$$E = \sum_{n=1}^{N} \exp\{-t_n f_{m-1}(x_n) - \tfrac{1}{2} t_n \alpha_m y_m(x_n)\} = \sum_{n=1}^{N} w_n^{(m)} \exp\{-\tfrac{1}{2} t_n \alpha_m y_m(x_n)\}$$

where $w_n^{(m)} = \exp\{-t_n f_{m-1}(x_n)\}$.

Page 12: Boosting ---one of combining models

Prove Adaboost(cont’d)

Let $T_m$ denote the set of data points that are correctly classified by $y_m(x)$, and $M_m$ denote the misclassified points. Then

$$E = \sum_{n=1}^{N} w_n^{(m)} \exp\{-\tfrac{1}{2} t_n \alpha_m y_m(x_n)\} = e^{-\alpha_m/2} \sum_{n \in T_m} w_n^{(m)} + e^{\alpha_m/2} \sum_{n \in M_m} w_n^{(m)}$$

$$= (e^{\alpha_m/2} - e^{-\alpha_m/2}) \sum_{n=1}^{N} w_n^{(m)} \, I(y_m(x_n) \neq t_n) + e^{-\alpha_m/2} \sum_{n=1}^{N} w_n^{(m)}$$

so minimizing $E$ with respect to $y_m(x)$ is equivalent to minimizing the weighted error function

$$J_m = \sum_{n=1}^{N} w_n^{(m)} \, I(y_m(x_n) \neq t_n)$$
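A step the derivation leaves implicit: minimizing the exponential error with respect to $\alpha_m$ recovers the formula $\alpha_m = \ln\{(1-\epsilon_m)/\epsilon_m\}$ used in the algorithm. A sketch in the same notation, writing $A$ and $B$ for the total weight of the correctly and incorrectly classified points:

```latex
% Let A = \sum_{n \in T_m} w_n^{(m)} and B = \sum_{n \in M_m} w_n^{(m)}, so that
%   E(\alpha_m) = e^{-\alpha_m/2} A + e^{\alpha_m/2} B.
% Setting the derivative with respect to \alpha_m to zero:
\frac{\partial E}{\partial \alpha_m}
  = -\tfrac{1}{2} e^{-\alpha_m/2} A + \tfrac{1}{2} e^{\alpha_m/2} B = 0
\;\Longrightarrow\;
e^{\alpha_m} = \frac{A}{B}
\;\Longrightarrow\;
\alpha_m = \ln\frac{A}{B} = \ln\frac{1 - \epsilon_m}{\epsilon_m},
% using \epsilon_m = B / (A + B), the definition of the weighted error rate.
```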

Page 13: Boosting ---one of combining models

Outline

Introduction and background of Boosting and Adaboost

Adaboost Algorithm introduction
Adaboost Algorithm example
Experiment results

Page 14: Boosting ---one of combining models

A toy example[2]

Training set: 10 points (represented by plus or minus)

Original Status: Equal Weights for all training samples

Page 15: Boosting ---one of combining models

A toy example(cont’d)

Round 1: Three “plus” points are not correctly classified; they are given higher weights.

Page 16: Boosting ---one of combining models

A toy example(cont’d)

Round 2: Three “minus” points are not correctly classified; they are given higher weights.

Page 17: Boosting ---one of combining models

A toy example(cont’d)

Round 3: One “minus” and two “plus” points are not correctly classified; they are given higher weights.

Page 18: Boosting ---one of combining models

A toy example(cont’d)

Final classifier: integrate the three “weak” classifiers to obtain a final strong classifier.
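The round-1 weight arithmetic behind the toy example can be checked numerically. Assuming, as the slides show, that 3 of the 10 equally weighted points are misclassified in the first round (the particular indices used here are hypothetical; the slides only show counts):

```python
import math

# Weights start uniform over the 10 toy points; the three misclassified
# "plus" points are then up-weighted by exp(alpha_1) = (1 - eps_1)/eps_1.
N = 10
w = [1.0 / N] * N
mis = [0, 1, 2]                                # hypothetical misclassified indices

eps1 = sum(w[i] for i in mis) / sum(w)         # weighted error: 3/10
alpha1 = math.log((1 - eps1) / eps1)           # alpha_1 = ln(7/3)
w = [wi * math.exp(alpha1) if i in mis else wi for i, wi in enumerate(w)]
```

Each misclassified point's weight grows from 0.1 to 0.1 × 7/3 ≈ 0.233, while the seven correctly classified points keep weight 0.1, so the next round's learner must pay attention to the mistakes.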

Page 19: Boosting ---one of combining models

Revisit Bagging

Page 20: Boosting ---one of combining models

Bagging vs Boosting

Bagging: the construction of complementary base-learners is left to chance and to the instability of the learning methods.

Boosting: actively seeks to generate complementary base-learners, by training the next base-learner on the mistakes of the previous learners.
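The contrast can be made concrete with a small sketch of the two data-selection strategies (an illustration only, with a fixed seed and hypothetical mistake indices, not either algorithm in full):

```python
import random

random.seed(0)
idx = list(range(10))                  # indices of the training points

# Bagging: each base-learner sees an independent bootstrap resample;
# complementary learners arise by chance and by learner instability.
bootstrap_samples = [random.choices(idx, k=len(idx)) for _ in range(3)]

# Boosting: every base-learner sees all points, but reweighted so the next
# learner focuses on the previous one's mistakes (indices hypothetical).
w = [1.0 / len(idx)] * len(idx)
mistakes = {2, 7}
eps = sum(w[i] for i in mistakes)              # weighted error of learner 1
w = [wi * (1 - eps) / eps if i in mistakes else wi for i, wi in enumerate(w)]
```

The reweighting factor (1 - eps)/eps is exactly exp(alpha) from the Adaboost update, so the two misclassified points dominate the next learner's objective.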

Page 21: Boosting ---one of combining models

Outline

Introduction and background of Boosting and Adaboost

Adaboost Algorithm introduction
Adaboost Algorithm example
Experiment results (Good Parts Selection)

Page 22: Boosting ---one of combining models

Browse all birds

Page 23: Boosting ---one of combining models

Curvature Descriptor

Page 24: Boosting ---one of combining models

Adaboost with CPM

Page 25: Boosting ---one of combining models

Adaboost with CPM(cont’d)

Page 26: Boosting ---one of combining models

Adaboost with CPM(cont’d)

Page 27: Boosting ---one of combining models

Adaboost without CPM(cont’d)

The Alpha Values

Other Statistical Data: zero rate: 0.6167; covariance: 0.9488; median: 1.6468

2.521895 0 2.510827 0.714297 0 0

1.646754 0 0 0 0 0

2.134926 0 2.167948 0 2.526712 0

0.279277 0 0 0 0.0635 2.322823

0 0 2.516785 0 0 0

0 0.04174 0 0.207436 0 0

0 0 1.30396 0 0 0.951666

0 2.513161 2.530245 0 0 0

0 0 0 0.041627 2.522551 0

0.72565 0 2.506505 1.303823 0 1.611553

Page 28: Boosting ---one of combining models

Parameter Discussion

For the error bound, the definition depends on the specific method used to calculate the error:

1) two-class separation [3]:

$$\epsilon_t = \sum_{i=1}^{N} p_i^t \, |h_t(x_i) - y_i|$$

2) one vs several classes [3]:

$$\epsilon_t = \sum_{i=1}^{N} p_i^t \, [h_t(x_i) \neq y_i]$$
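A small numeric check of the two error definitions, using a uniform distribution $p_t$ over four points and hypothetical hypothesis outputs (all values here are invented for the illustration):

```python
# Hypothetical round-t data: uniform distribution p over 4 points.
p = [0.25, 0.25, 0.25, 0.25]
y = [1, 0, 1, 0]                       # true labels

# 1) Real-valued two-class hypothesis h_t : X -> [0, 1]:
#    the error is the p-weighted absolute deviation from the labels.
h_real = [0.9, 0.2, 0.4, 0.1]
eps_real = sum(pi * abs(hi - yi) for pi, hi, yi in zip(p, h_real, y))

# 2) Discrete (one-vs-several-classes) hypothesis:
#    the error is the p-weighted count of wrong predictions.
h_disc = [1, 0, 0, 0]
eps_disc = sum(pi * (hi != yi) for pi, hi, yi in zip(p, h_disc, y))
```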

Page 29: Boosting ---one of combining models

The error bound figure

Page 30: Boosting ---one of combining models

Thanks a lot! Enjoy Machine Learning!

Page 31: Boosting ---one of combining models

Reference

[1] Yoav Freund and Robert Schapire, “A Short Introduction to Boosting.”

[2] Robert Schapire, “The Boosting Approach to Machine Learning,” Princeton University.

[3] Yoav Freund and Robert Schapire, “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting.”

[4] Pengyu Hong, Statistical Machine Learning lecture notes.