Ensemble Learning - Penn Engineering, cis519/fall2017/lectures/08_Ensemble...

Ensemble Learning

Transcript of Ensemble Learning - Penn Engineering, cis519/fall2017/lectures/08_Ensemble...

Page 1:

Ensemble Learning

Page 2:

Ensemble Learning

Consider a set of classifiers h1, ..., hL

Idea: construct a classifier H(x) that combines the individual decisions of h1, ..., hL
• e.g., could have the member classifiers vote, or
• e.g., could use different members for different regions of the instance space
• Works well if the members each have low error rates

Successful ensembles require diversity
• Classifiers should make different mistakes
• Can have different types of base learners

[Based on slide by Léon Bottou]

Page 3:

Practical Application: Netflix Prize

Goal: predict how a user will rate a movie
• based on the user's ratings for other movies
• and other people's ratings
• with no other information about the movies

This application is called "collaborative filtering"

Netflix Prize: $1M to the first team to do 10% better than Netflix's system (2007-2009)

Winner: BellKor's Pragmatic Chaos, an ensemble of more than 800 rating systems

[Based on slide by Léon Bottou]

Page 4:

Combining Classifiers: Averaging

• Final hypothesis is a simple vote of the members

[Diagram: input x feeds each member h1, ..., hL; their votes combine into H(x)]
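The simple-vote combiner can be sketched in a few lines; the member classifiers h1, h2, h3 below are hypothetical threshold rules standing in for trained models.

```python
from collections import Counter

def majority_vote(classifiers, x):
    """H(x): simple (unweighted) vote over the member classifiers."""
    votes = [h(x) for h in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical members: threshold classifiers over scalar inputs.
h1 = lambda x: 1 if x > 0 else -1
h2 = lambda x: 1 if x > 1 else -1
h3 = lambda x: 1 if x > -1 else -1

print(majority_vote([h1, h2, h3], 0.5))  # two of three members vote +1, so H = 1
```

Note that the ensemble output differs from h2 here: the vote overrules a member that makes a mistake, which is exactly why diverse members with low error rates help.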

Page 5:

Combining Classifiers: Weighted Average

• Coefficients of the individual members are trained using a validation set

[Diagram: input x feeds h1, ..., hL; their outputs, weighted by w1, ..., wL, combine into H(x)]
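A minimal sketch of the weighted combination; the members and the weights below are hypothetical (in practice the w_i would be fit on a validation set).

```python
def weighted_vote(classifiers, weights, x):
    """H(x) = sign(sum_i w_i * h_i(x))."""
    score = sum(w * h(x) for w, h in zip(weights, classifiers))
    return 1 if score >= 0 else -1

# Hypothetical members and validation-tuned weights:
members = [lambda x: 1 if x > 0 else -1,
           lambda x: 1 if x > 2 else -1]
weights = [0.8, 0.2]

print(weighted_vote(members, weights, 1.0))  # 0.8*(+1) + 0.2*(-1) = 0.6 > 0, so H = 1
```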

Page 6:

Combining Classifiers: Gating

• Coefficients of individual members depend on the input
• Train the gating function via a validation set

[Diagram: input x feeds h1, ..., hL and a gating function that produces the weights w1, ..., wL; the weighted outputs combine into H(x)]
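Gating differs from the weighted average only in that the weights are a function of x. A minimal sketch, with a hypothetical hand-written gate (a trained gating function would replace it):

```python
def gated_vote(classifiers, gating_fn, x):
    """Like the weighted average, but the coefficients depend on the input x."""
    weights = gating_fn(x)
    score = sum(w * h(x) for w, h in zip(weights, classifiers))
    return 1 if score >= 0 else -1

# Hypothetical gate: trust the first member on negative inputs,
# the second on non-negative inputs.
gate = lambda x: [1.0, 0.0] if x < 0 else [0.0, 1.0]
members = [lambda x: -1, lambda x: 1]

print(gated_vote(members, gate, 3.0))   # gate routes to the second member: +1
print(gated_vote(members, gate, -3.0))  # gate routes to the first member: -1
```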

Page 7:

Combining Classifiers: Stacking

• Predictions of the 1st layer are used as input to the 2nd layer
• Train the 2nd layer on a validation set

[Diagram: input x feeds 1st-layer classifiers h1, ..., hL; their predictions feed a 2nd-layer combiner C, which outputs H(x)]
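The key data-flow step in stacking is building the 2nd-layer training set: member predictions on held-out validation data become the feature vectors. A sketch with hypothetical members:

```python
def stacked_predictions(first_layer, x):
    """1st layer: the member predictions form the feature vector for the 2nd layer."""
    return [h(x) for h in first_layer]

def build_second_layer_data(first_layer, validation_set):
    """The 2nd-layer combiner is trained on validation data, not the training
    set, so it sees honest (non-overfit) member predictions."""
    return [(stacked_predictions(first_layer, x), y) for x, y in validation_set]

# Hypothetical members and a tiny validation set of (x, y) pairs:
members = [lambda x: 1 if x > 0 else -1,
           lambda x: 1 if x > 2 else -1]
val = [(-1, -1), (1, 1), (3, 1)]

print(build_second_layer_data(members, val))
# → [([-1, -1], -1), ([1, -1], 1), ([1, 1], 1)]
```

Any learner (e.g., logistic regression) can then be fit on these (prediction-vector, label) pairs to play the role of the combiner C.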

Page 8:

How to Achieve Diversity

Cause of the Mistake         Diversification Strategy
Pattern was difficult        Hopeless
Overfitting                  Vary the training sets
Some features are noisy      Vary the set of input features

[Based on slide by Léon Bottou]

Page 9:

Manipulating the Training Data

Bootstrap replication:
• Given n training examples, construct a new training set by sampling n instances with replacement
• Excludes ~37% (about 1/e) of the training instances on average

Bagging:
• Create bootstrap replicates of the training set
• Train a classifier (e.g., a decision tree) for each replicate
• Estimate classifier performance using the out-of-bootstrap data
• Average the output of all classifiers

Boosting: (in just a minute...)

[Based on slide by Léon Bottou]
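A quick sketch of bootstrap replication over instance indices; the out-of-bootstrap fraction approaches 1/e ≈ 0.368 as n grows, since each instance is missed with probability (1 − 1/n)^n.

```python
import random

def bootstrap_replicate(n, rng):
    """Draw n indices with replacement; the out-of-bootstrap set is
    whatever was never drawn."""
    drawn = [rng.randrange(n) for _ in range(n)]
    oob = set(range(n)) - set(drawn)
    return drawn, oob

rng = random.Random(0)
drawn, oob = bootstrap_replicate(10000, rng)
print(len(oob) / 10000)  # close to 1/e ≈ 0.368
```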

Page 10:

Manipulating the Features

Random Forests
• Construct decision trees on bootstrap replicas
  - Restrict the node decisions to a small subset of features, picked randomly for each node
• Do not prune the trees
  - Estimate tree performance on the out-of-bootstrap data
• Average the output of all trees

[Based on slide by Léon Bottou]
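The per-node feature restriction is the step that distinguishes a random forest from plain bagged trees. A sketch of that step alone; the sqrt default is a common convention, not something the slide specifies.

```python
import random

def node_feature_candidates(n_features, rng, k=None):
    """At each tree node, consider only a random subset of k features
    (a common default is k ≈ sqrt(n_features))."""
    k = k or max(1, int(n_features ** 0.5))
    return rng.sample(range(n_features), k)

rng = random.Random(0)
print(node_feature_candidates(100, rng))  # 10 distinct feature indices in [0, 100)
```

Because each node sees a different random subset, trees grown on the same data still split differently, which supplies the diversity the ensemble needs.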

Page 11:

Boosting

Page 12:

AdaBoost [Freund & Schapire, 1997]

• A meta-learning algorithm with great theoretical and empirical performance
• Turns a base learner (i.e., a "weak hypothesis") into a high-performance classifier
• Creates an ensemble of weak hypotheses by repeatedly emphasizing mispredicted instances

Page 13:

INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model h_t on X, y with instance weights w_t
4:   Compute the weighted training error rate of h_t:
       ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
5:   Choose α_t = (1/2) ln((1 − ε_t) / ε_t)
6:   Update all instance weights:
       w_{t+1,i} = w_{t,i} exp(−α_t y_i h_t(x_i))   ∀ i = 1, ..., n
7:   Normalize w_{t+1} to be a distribution:
       w_{t+1,i} = w_{t+1,i} / Σ_{j=1}^n w_{t+1,j}   ∀ i = 1, ..., n
8: end for
9: Return the hypothesis H(x) = sign(Σ_{t=1}^T α_t h_t(x))

AdaBoost  

• Size of each point represents the instance's weight
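The algorithm can be implemented directly as below; the comments refer to its numbered lines. `stump_learner` is a hypothetical weak learner (a 1-D threshold stump) standing in for any base learner that accepts instance weights.

```python
import math

def stump_learner(X, y, w):
    """Hypothetical weak learner: best weighted threshold stump on 1-D inputs."""
    best, best_err = None, float("inf")
    for thr in sorted(set(X)):
        for sign in (1, -1):
            err = sum(wi for xi, yi, wi in zip(X, y, w)
                      if (sign if xi > thr else -sign) != yi)
            if err < best_err:
                best_err, best = err, (thr, sign)
    thr, sign = best
    return lambda x, thr=thr, sign=sign: sign if x > thr else -sign

def adaboost(X, y, T, base_learner=stump_learner):
    n = len(X)
    w = [1.0 / n] * n                                   # line 1: uniform weights
    alphas, hyps = [], []
    for t in range(T):                                  # line 2
        h = base_learner(X, y, w)                       # line 3: weighted training
        eps = sum(wi for xi, yi, wi in zip(X, y, w)     # line 4: weighted error
                  if h(xi) != yi)
        eps = min(max(eps, 1e-12), 1 - 1e-12)           # guard the log at eps = 0 or 1
        alpha = 0.5 * math.log((1 - eps) / eps)         # line 5
        w = [wi * math.exp(-alpha * yi * h(xi))         # line 6: emphasize mistakes
             for xi, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]                        # line 7: normalize
        alphas.append(alpha)
        hyps.append(h)
    def H(x):                                           # line 9: weighted vote
        return 1 if sum(a * h(x) for a, h in zip(alphas, hyps)) >= 0 else -1
    return H

# Tiny separable example: a single stump already fits it, so H recovers y.
X, y = [0.0, 1.0, 2.0, 3.0], [-1, -1, 1, 1]
H = adaboost(X, y, T=3)
print([H(x) for x in X])  # → [-1, -1, 1, 1]
```

Note the sign of the exponent in line 6: correctly classified instances (y_i h_t(x_i) = +1) are down-weighted, misclassified ones are up-weighted, which is the "repeatedly emphasizing mispredicted instances" behavior described above.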

Page 14:


AdaBoost, iteration t = 1

Page 15:


AdaBoost, iteration t = 1 (with + and − decision regions)

Page 16:

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!


AdaBoost

•  β_t measures the importance of h_t
•  If ε_t ≤ 0.5, then β_t ≥ 0 (can trivially guarantee)

19

Page 17: Ensemble Learning - Penn Engineering (cis519/fall2017/lectures/08_Ensemble...)


AdaBoost

•  β_t measures the importance of h_t
•  If ε_t ≤ 0.5, then β_t ≥ 0 (β_t grows as ε_t gets smaller)

20
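A quick numeric check of this relationship (my own illustration, not from the slides): plugging a few error rates into the step-5 formula shows β_t = 0 at ε_t = 0.5 and β_t growing without bound as ε_t → 0.

```python
import math

def beta(eps):
    # Step 5 of the pseudocode: beta_t = 1/2 * ln((1 - eps_t) / eps_t)
    return 0.5 * math.log((1 - eps) / eps)

for eps in (0.5, 0.4, 0.25, 0.1, 0.01):
    print(f"eps_t = {eps:<4}  ->  beta_t = {beta(eps):.3f}")
# A random-guessing learner (eps_t = 0.5) gets weight 0;
# an almost-perfect learner (eps_t = 0.01) gets weight ~2.3.
```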

Page 18: Ensemble Learning - Penn Engineering (cis519/fall2017/lectures/08_Ensemble...)


AdaBoost

•  Weights of correct predictions are multiplied by e^{−β_t} ≤ 1
•  Weights of incorrect predictions are multiplied by e^{β_t} ≥ 1

21
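These two multipliers are easy to check numerically (my own sketch, not from the slides). A consequence worth noting: after the step-6 update and step-7 renormalization, the points h_t misclassified always carry exactly half of the total weight, so h_t itself looks like a coin flip to the next round.

```python
import math

eps = 0.2                                  # weighted error of h_t
beta = 0.5 * math.log((1 - eps) / eps)     # step 5

down = math.exp(-beta)   # multiplier for correctly classified points
up = math.exp(beta)      # multiplier for misclassified points
assert down <= 1 <= up   # correct points shrink, mistakes grow

# Total (unnormalized) weight on each group after the step-6 update:
wrong = eps * up         # misclassified mass, scaled up
right = (1 - eps) * down # correctly classified mass, scaled down
print(wrong / (wrong + right))   # half the weight lands on the mistakes
```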

Page 19: Ensemble Learning - Penn Engineering (cis519/fall2017/lectures/08_Ensemble...)


AdaBoost

•  Weights of correct predictions are multiplied by e^{−β_t} ≤ 1
•  Weights of incorrect predictions are multiplied by e^{β_t} ≥ 1

22

Page 20: Ensemble Learning - Penn Engineering (cis519/fall2017/lectures/08_Ensemble...)

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!


AdaBoost

t = 1

23

Disclaimer: Note that the resized points in the illustration above are not necessarily to scale with βt
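Step 5's formula is what drives the point resizing in the animation: a smaller weighted error ε_t yields a larger vote β_t, and each mistake multiplies an instance's weight by e^{β_t} while each correct prediction multiplies it by e^{−β_t}. A minimal sketch tabulating these quantities for a few error rates (illustrative values only, no weak learner involved):

```python
import math

def beta(eps):
    # Step 5 of AdaBoost: lower weighted error => larger vote for h_t
    return 0.5 * math.log((1 - eps) / eps)

for eps in (0.1, 0.25, 0.4, 0.5):
    b = beta(eps)
    # Step 6: a mistake scales an instance's weight by e^b,
    # a correct prediction scales it by e^(-b)
    print(f"eps={eps:.2f}  beta={b:.3f}  "
          f"up={math.exp(b):.2f}x  down={math.exp(-b):.2f}x")
```

Note that ε_t = 0.5 (random guessing) gives β_t = 0, so such a weak learner contributes nothing and leaves the weights unchanged.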

Page 21: Ensemble Learning

INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model h_t on X, y with instance weights w_t
4:   Compute the weighted training error rate of h_t:
       ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
5:   Choose β_t = (1/2) ln((1 − ε_t) / ε_t)
6:   Update all instance weights:
       w_{t+1,i} = w_{t,i} exp(−β_t y_i h_t(x_i))   ∀i = 1, ..., n
7:   Normalize w_{t+1} to be a distribution:
       w_{t+1,i} = w_{t+1,i} / Σ_{j=1}^n w_{t+1,j}   ∀i = 1, ..., n
8: end for
9: Return the hypothesis
       H(x) = sign(Σ_{t=1}^T β_t h_t(x))
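The pseudocode above can be turned into a runnable sketch. The following is a minimal pure-Python version that assumes labels in {−1, +1} and uses an exhaustive one-feature threshold stump as the weak learner (the stump is an illustrative choice, not part of the algorithm itself):

```python
import math

def train_stump(X, y, w):
    """Pick the threshold stump (feature, cutoff, polarity) with the
    lowest weighted error; any weak learner that accepts instance
    weights could be substituted here."""
    d = len(X[0])
    best = None  # (error, feature, threshold, polarity)
    for j in range(d):
        for thr in sorted({x[j] for x in X}):
            for pol in (+1, -1):
                preds = [pol if x[j] >= thr else -pol for x in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    err, j, thr, pol = best
    return (lambda x: pol if x[j] >= thr else -pol), err

def adaboost(X, y, T):
    n = len(X)
    w = [1.0 / n] * n                          # step 1: uniform weights
    ensemble = []
    for _ in range(T):                         # step 2
        h, eps = train_stump(X, y, w)          # steps 3-4: weighted error
        eps = max(eps, 1e-10)                  # guard log(0) for a perfect stump
        beta = 0.5 * math.log((1 - eps) / eps) # step 5
        w = [wi * math.exp(-beta * yi * h(xi)) # step 6: reweight instances
             for wi, xi, yi in zip(w, X, y)]
        z = sum(w)
        w = [wi / z for wi in w]               # step 7: normalize
        ensemble.append((beta, h))
    def H(x):                                  # step 9: weighted vote
        s = sum(beta * h(x) for beta, h in ensemble)
        return 1 if s >= 0 else -1
    return H

# Toy usage on a separable 1-D set; AdaBoost recovers the labels
X = [[0.0], [1.0], [2.0], [3.0]]
y = [-1, -1, 1, 1]
H = adaboost(X, y, T=5)
print([H(x) for x in X])
```

Because the toy set is separable, the first stump already has ε_1 ≈ 0 and dominates the vote; on harder data the reweighting in step 6 forces later stumps to focus on the points the earlier ones got wrong.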


AdaBoost

t = 2

24

Page 22: Ensemble Learning

INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model h_t on X, y with instance weights w_t
4:   Compute the weighted training error rate of h_t:
       ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
5:   Choose β_t = (1/2) ln((1 − ε_t) / ε_t)
6:   Update all instance weights:
       w_{t+1,i} = w_{t,i} exp(−β_t y_i h_t(x_i))   ∀i = 1, ..., n
7:   Normalize w_{t+1} to be a distribution:
       w_{t+1,i} = w_{t+1,i} / Σ_{j=1}^n w_{t+1,j}   ∀i = 1, ..., n
8: end for
9: Return the hypothesis
       H(x) = sign(Σ_{t=1}^T β_t h_t(x))


AdaBoost

t = 2

25

Page 23: Ensemble Learning


INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model h_t on X, y with instance weights w_t
4:   Compute the weighted training error rate of h_t:
       ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
5:   Choose β_t = (1/2) ln((1 − ε_t) / ε_t)
6:   Update all instance weights:
       w_{t+1,i} = w_{t,i} exp(−β_t y_i h_t(x_i))   ∀i = 1, ..., n
7:   Normalize w_{t+1} to be a distribution:
       w_{t+1,i} = w_{t+1,i} / Σ_{j=1}^n w_{t+1,j}   ∀i = 1, ..., n
8: end for
9: Return the hypothesis
       H(x) = sign(Σ_{t=1}^T β_t h_t(x))


AdaBoost

t = 2

26

Page 24: Ensemble Learning


INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model h_t on X, y with instance weights w_t
4:   Compute the weighted training error rate of h_t:
       ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
5:   Choose β_t = (1/2) ln((1 − ε_t) / ε_t)
6:   Update all instance weights:
       w_{t+1,i} = w_{t,i} exp(−β_t y_i h_t(x_i))   ∀i = 1, ..., n
7:   Normalize w_{t+1} to be a distribution:
       w_{t+1,i} = w_{t+1,i} / Σ_{j=1}^n w_{t+1,j}   ∀i = 1, ..., n
8: end for
9: Return the hypothesis
       H(x) = sign(Σ_{t=1}^T β_t h_t(x))

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!


AdaBoost

t = 2

(figure: weighted +/– training points with the current decision boundary)

•  β_t measures the importance of h_t
•  If ε_t < 0.5, then β_t > 0  (β_t grows as ε_t gets smaller)

27
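The claim on this slide can be checked numerically from step 5 of the algorithm: β_t = ½ ln((1 − ε_t)/ε_t) is zero at ε_t = 0.5 (a chance-level learner gets no vote) and increases as ε_t shrinks.

```python
import math

def beta(eps):
    # beta_t = 1/2 * ln((1 - eps_t) / eps_t), step 5 of the algorithm
    return 0.5 * math.log((1 - eps) / eps)

print(round(beta(0.5), 3))  # 0.0   -> no better than chance, zero weight
print(round(beta(0.3), 3))  # 0.424
print(round(beta(0.1), 3))  # 1.099 -> smaller error, larger vote
```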

Page 25:


AdaBoost

t = 2

(figure: weighted +/– training points with the current decision boundary)

•  Weights of correct predictions are multiplied by e^(−β_t) ≤ 1
•  Weights of incorrect predictions are multiplied by e^(β_t) ≥ 1

28
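These two multipliers fall directly out of step 6, since y_i h_t(x_i) = +1 on a correct prediction and −1 on a mistake. A quick check with an example error rate of ε_t = 0.2 (an illustrative value, not from the slides):

```python
import math

eps = 0.2                               # example weighted error rate
beta = 0.5 * math.log((1 - eps) / eps)  # = ln(2) ~ 0.693
correct_mult = math.exp(-beta)          # applied when y_i == h_t(x_i)
incorrect_mult = math.exp(beta)         # applied when y_i != h_t(x_i)
print(round(correct_mult, 3), round(incorrect_mult, 3))  # 0.5 2.0
```

So a stump with 20% weighted error halves the weight of each point it gets right and doubles the weight of each point it gets wrong, before renormalization in step 7.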

Page 26:


AdaBoost

t = 2

(figure: reweighted +/– training points with the current decision boundary)

•  Weights of correct predictions are multiplied by e^(−β_t) ≤ 1
•  Weights of incorrect predictions are multiplied by e^(β_t) ≥ 1

29

Page 27:


AdaBoost

t = 3

(figure: reweighted +/– training points with the decision boundary for round 3)

30

Page 28:


AdaBoost  

t = 3

[Figure: toy dataset of + and – points with the boosted decision boundary at this round]

31  

Page 29: Ensemble(Learning - Penn Engineeringcis519/fall2017/lectures/08_Ensemble... · Combining(Classifiers:((Gang(• Coefficients(of(individual(members(depend(on(input • Train(gang(funcIon(viavalidaon(set


AdaBoost  

t = 3

[Figure: toy dataset of + and – points with the boosted decision boundary at this round]

•  β_t measures the importance of h_t
•  If ε_t ≤ 0.5, then β_t ≥ 0  (β_t grows as ε_t gets smaller)

32  
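Plugging a few error rates into step 5 makes the behavior of β_t concrete (the ε values below are chosen only for illustration):

```python
import math

def beta(eps):
    # Step 5 of the pseudocode: beta_t = 0.5 * ln((1 - eps_t) / eps_t)
    return 0.5 * math.log((1 - eps) / eps)

print(beta(0.1))  # strong weak-learner -> large weight (~1.0986)
print(beta(0.4))  # marginal learner   -> small weight (~0.2027)
print(beta(0.5))  # no better than chance -> weight exactly 0
```

So a classifier at chance level contributes nothing to H(x), and the contribution grows as its weighted error shrinks.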


AdaBoost  

t = 3

[Figure: toy dataset of + and – points with the boosted decision boundary at this round]

•  Weights of correct predictions are multiplied by e^{−β_t} ≤ 1
•  Weights of incorrect predictions are multiplied by e^{β_t} ≥ 1

33  
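One round of steps 6–7 on a toy weight vector (numbers are illustrative, not from the slides) shows the effect: the misclassified instance gains probability mass and the correct ones lose it.

```python
import numpy as np

w = np.full(4, 0.25)                  # uniform weights over 4 instances
y = np.array([1.0, 1.0, -1.0, -1.0])
h = np.array([1.0, 1.0, -1.0, 1.0])  # h_t misclassifies only the last instance
eps = w[h != y].sum()                 # step 4: weighted error = 0.25
beta = 0.5 * np.log((1 - eps) / eps)  # step 5
w = w * np.exp(-beta * y * h)         # step 6: scale by e^{-beta} or e^{+beta}
w = w / w.sum()                       # step 7: renormalize to a distribution
print(w)  # misclassified instance now holds 1/2 the mass; each correct one 1/6
```

The next weak learner therefore concentrates on the point the current one got wrong.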


AdaBoost  

t = 3

[Figure: toy dataset of + and – points, reweighted after the round-3 update]

•  Weights of correct predictions are multiplied by e^{−β_t} ≤ 1
•  Weights of incorrect predictions are multiplied by e^{β_t} ≥ 1

34  


AdaBoost  

t = 4

[Figure: weighted + and – training points with the decision boundary at this iteration]

35  
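The pseudocode above can be sketched as a short, self-contained Python program. This is our own illustrative implementation (names like `train_stump` and `adaboost` are not from the slides), using exhaustive threshold stumps as the weak learner:

```python
# Minimal AdaBoost sketch following the slide pseudocode (illustrative, not the
# course's reference code). Weak learner: exhaustive one-feature threshold stump.
import numpy as np

def train_stump(X, y, w):
    """Pick the (feature, threshold, polarity) stump minimizing weighted error."""
    n, d = X.shape
    best = None  # (error, feature, threshold, polarity)
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(X[:, j] >= thr, pol, -pol)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, T):
    n = len(y)
    w = np.full(n, 1.0 / n)                        # step 1: uniform weights
    stumps, betas = [], []
    for t in range(T):                             # step 2
        err, j, thr, pol = train_stump(X, y, w)    # steps 3-4
        err = np.clip(err, 1e-10, 1 - 1e-10)       # guard the log below
        beta = 0.5 * np.log((1 - err) / err)       # step 5
        pred = np.where(X[:, j] >= thr, pol, -pol)
        w = w * np.exp(-beta * y * pred)           # step 6: reweight instances
        w /= w.sum()                               # step 7: renormalize
        stumps.append((j, thr, pol))
        betas.append(beta)
    def H(Xq):                                     # step 9: weighted vote
        s = sum(b * np.where(Xq[:, j] >= thr, pol, -pol)
                for b, (j, thr, pol) in zip(betas, stumps))
        return np.sign(s)
    return H

# Usage on a tiny 1-D dataset (labels flip at x = 4):
X = np.array([[1.], [2.], [3.], [4.], [5.], [6.]])
y = np.array([-1, -1, -1, 1, 1, 1])
H = adaboost(X, y, T=5)
print((H(X) == y).mean())  # → 1.0
```

Note that `beta_t` is positive exactly when `eps_t < 1/2`, so misclassified points (where `y_i h_t(x_i) = -1`) get their weights increased in step 6, which is what drives later rounds toward the hard examples.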



AdaBoost  

t = 4

[Figure: weighted + and – training points with the decision boundary at this iteration]

36  



AdaBoost  

t = 4

[Figure: weighted + and – training points with the decision boundary at this iteration]

37  



AdaBoost  

t = 5

[Figure: weighted + and – training points with the decision boundary at this iteration]

38  



AdaBoost  

t = 5

[Figure: weighted + and – training points with the decision boundary at this iteration]

39  
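Step 5's hypothesis weight beta_t = (1/2) ln((1 - eps_t)/eps_t) is worth a quick numeric check (our own illustration, not from the slides): it grows as the weak learner improves, is zero for a coin-flip learner, and goes negative for eps_t > 1/2.

```python
# How the AdaBoost hypothesis weight depends on the weighted error rate.
import math

def beta(eps):
    # Step 5 of the pseudocode.
    return 0.5 * math.log((1 - eps) / eps)

for eps in (0.1, 0.3, 0.5, 0.7):
    print(f"eps = {eps}: beta = {beta(eps):+.3f}")
# eps = 0.1 gives a large positive weight, eps = 0.5 gives zero,
# and eps = 0.7 gives a negative weight (the vote is inverted).
```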


[The AdaBoost pseudocode is repeated on this slide, identical to the listing above.]

AdaBoost

[Figure: training points (+/−) and the current decision boundary at iteration t = 5]

40
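Steps 6–7 are where the reweighting happens. A concrete run of the update, assuming a hypothetical first-round learner that misclassifies two of six uniformly weighted points (so ε_t = 1/3):

```python
import numpy as np

n = 6
w = np.full(n, 1.0 / n)                      # step 1: uniform weights, 1/6 each
# hypothetical mistakes of h_t on points 2 and 3:
mistakes = np.array([False, False, True, True, False, False])

eps = w[mistakes].sum()                      # step 4: eps = 1/3
beta = 0.5 * np.log((1 - eps) / eps)         # step 5: beta = (1/2) ln 2
yh = np.where(mistakes, -1.0, 1.0)           # y_i * h_t(x_i) is -1 exactly on mistakes
w = w * np.exp(-beta * yh)                   # step 6: mistakes grow by e^beta = sqrt(2)
w = w / w.sum()                              # step 7: renormalize

print(w)                   # misclassified points now weigh 1/4 each, the rest 1/8 each
print(w[mistakes].sum())   # ~0.5: under the new weights h_t looks like a coin flip
```

That last line is the key property of the update: after reweighting, the learner just trained has weighted error exactly 1/2, so the next round is forced to learn something new.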

Page 38: Ensemble Learning - Penn Engineering, cis519/fall2017/lectures/08_Ensemble...

[The AdaBoost pseudocode is repeated on this slide, identical to the listing above.]

AdaBoost

[Figure: training points (+/−) and the current decision boundary at iteration t = 6]

41

Page 39: Ensemble Learning - Penn Engineering, cis519/fall2017/lectures/08_Ensemble...

[The AdaBoost pseudocode is repeated on this slide, identical to the listing above.]

AdaBoost

[Figure: the combined decision boundary after all T rounds, t = T]

42
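The final vote H(x) = sign(Σ β_t h_t(x)) trusts accurate rounds more. The coefficient β_t from step 5 grows as the weighted error shrinks, is zero at ε_t = 1/2 (a coin-flip learner gets no say), and would be negative for ε_t > 1/2. A quick table for some illustrative error rates:

```python
import numpy as np

# Hypothetical weighted error rates for successive weak learners
for eps in (0.49, 0.45, 0.30, 0.10, 0.01):
    beta = 0.5 * np.log((1 - eps) / eps)   # step 5 of AdaBoost
    print(f"eps = {eps:.2f}  ->  beta = {beta:.3f}")
```

So a nearly perfect round (ε_t = 0.01) casts a far heavier ballot than a barely-better-than-chance one (ε_t = 0.49).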

Page 40: Ensemble Learning - Penn Engineering, cis519/fall2017/lectures/08_Ensemble...

[The AdaBoost pseudocode is repeated on this slide, identical to the listing above.]

AdaBoost

[Figure: the combined decision boundary after all T rounds, t = T]

43

Page 41: Ensemble Learning - Penn Engineering, cis519/fall2017/lectures/08_Ensemble...

[Figure residue and the repeated AdaBoost pseudocode, identical to the listing above.]

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

INPUT: training data X, y = {(xi, yi)}ni=1,

the number of iterations T1: Initialize a vector of n uniform weights w1 =

⇥1n , . . . ,

1n

2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt

4: Compute the weighted training error rate of ht:

✏t =X

i:yi 6=ht(xi)

wt,i

5: Choose �t =12 ln

⇣1�✏t✏t

6: Update all instance weights:

wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n

7: Normalize wt+1 to be a distribution:

wt+1,i =wt+1,iPnj=1 wt+1,j

8i = 1, . . . , n

8: end for

9: Return the hypothesis

H(x) = sign

TX

t=1

�tht(x)

!

AdaBoost

[Figure: the weak classifiers learned at each round are combined into the final boosted classifier at t = T]

•  Final model is a weighted combination of members
   –  Each member weighted by its importance

44  


AdaBoost    [Freund  &  Schapire,  1997]  

45  

INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T

1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:     Train model h_t on X, y with instance weights w_t
4:     Compute the weighted training error rate of h_t:
           \epsilon_t = \sum_{i : y_i \neq h_t(x_i)} w_{t,i}
5:     Choose \beta_t = \frac{1}{2} \ln\left(\frac{1 - \epsilon_t}{\epsilon_t}\right)
6:     Update all instance weights:
           w_{t+1,i} = w_{t,i} \exp(-\beta_t y_i h_t(x_i))    \forall i = 1, ..., n
7:     Normalize w_{t+1} to be a distribution:
           w_{t+1,i} = \frac{w_{t+1,i}}{\sum_{j=1}^n w_{t+1,j}}    \forall i = 1, ..., n
8: end for
9: Return the hypothesis
       H(x) = \mathrm{sign}\left( \sum_{t=1}^T \beta_t h_t(x) \right)
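The loop above maps directly onto code. The following is a minimal NumPy sketch (not the lecture's reference implementation; all function names are illustrative), using decision stumps as the weak learners h_t:

```python
import numpy as np

def train_stump(X, y, w):
    """Pick the (feature, threshold, polarity) decision stump that
    minimizes the weighted training error under instance weights w."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thresh in np.unique(X[:, j]):
            for polarity in (1, -1):
                pred = np.where(polarity * (X[:, j] - thresh) >= 0, 1, -1)
                err = w[pred != y].sum()          # weighted error (line 4)
                if err < best_err:
                    best, best_err = (j, thresh, polarity), err
    return best, best_err

def stump_predict(stump, X):
    j, thresh, polarity = stump
    return np.where(polarity * (X[:, j] - thresh) >= 0, 1, -1)

def adaboost(X, y, T):
    """AdaBoost with decision stumps; labels y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                       # line 1: uniform weights
    members = []
    for t in range(T):
        stump, eps = train_stump(X, y, w)         # lines 3-4
        eps = np.clip(eps, 1e-12, 1 - 1e-12)      # guard the log when eps = 0
        beta = 0.5 * np.log((1 - eps) / eps)      # line 5
        w = w * np.exp(-beta * y * stump_predict(stump, X))  # line 6
        w = w / w.sum()                           # line 7: normalize
        members.append((beta, stump))
    def H(X_new):                                 # line 9: weighted vote
        score = sum(b * stump_predict(s, X_new) for b, s in members)
        return np.sign(score)
    return H
```

On a linearly separable toy set the ensemble fits the training data within a few rounds; on harder data, the weight vector w concentrates on the points the current members misclassify.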


AdaBoost  

46  


w_t is a vector of weights over the instances at iteration t

All points start with equal weight


AdaBoost  

47  


We  need  a  way  to  weight  instances  differently  when  learning  the  model...    


Training a Model with Weighted Instances

•  For algorithms like logistic regression, can simply incorporate weights w into the cost function
   –  Essentially, weigh the cost of misclassification differently for each instance

•  For algorithms that don't directly support instance weights (e.g., ID3 decision trees, etc.), use weighted bootstrap sampling
   –  Form training set by resampling instances with replacement according to w

48

J_{reg}(\theta) = -\sum_{i=1}^{n} w_i \left[ y_i \log h_\theta(x_i) + (1 - y_i) \log\left(1 - h_\theta(x_i)\right) \right] + \lambda \left\| \theta_{[1:d]} \right\|_2^2
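Both options fit in a few lines of NumPy. This is an illustrative sketch, not the course's code; the {0, 1} label convention, the unregularized intercept theta[0], and the names lam and rng are assumptions:

```python
import numpy as np

def weighted_logistic_cost(theta, X, y, w, lam):
    """Regularized logistic-regression cost with per-instance weights w
    (the J_reg on this slide); labels y are in {0, 1} and the intercept
    theta[0] is left unregularized."""
    h = 1.0 / (1.0 + np.exp(-X @ theta))               # h_theta(x_i)
    ce = -np.sum(w * (y * np.log(h) + (1 - y) * np.log(1 - h)))
    return ce + lam * np.sum(theta[1:] ** 2)           # lambda * ||theta_[1:d]||_2^2

def weighted_bootstrap(X, y, w, rng):
    """Emulate instance weights for learners that don't support them:
    resample n instances with replacement, drawing i with probability w[i]."""
    idx = rng.choice(len(y), size=len(y), replace=True, p=w)
    return X[idx], y[idx]
```

With uniform weights w_i = 1/n (scaled by n) the weighted cost reduces to the ordinary logistic-regression cost, and the bootstrap reduces to a standard bootstrap sample.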


Base Learner Requirements

•  AdaBoost works best with "weak" learners
   –  Should not be complex
   –  Typically high-bias classifiers
   –  Works even when the weak learner has an error rate just slightly under 0.5 (i.e., just slightly better than random)

•  Can prove training error goes to 0 in O(log n) iterations

•  Examples:
   –  Decision stumps (1-level decision trees)
   –  Depth-limited decision trees
   –  Linear classifiers

49


AdaBoost  

50  

INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:    Train model h_t on X, y with instance weights w_t
4:    Compute the weighted training error rate of h_t:
         ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
5:    Choose β_t = (1/2) ln( (1 − ε_t) / ε_t )
6:    Update all instance weights:
         w_{t+1,i} = w_{t,i} exp(−β_t y_i h_t(x_i))    ∀i = 1, ..., n
7:    Normalize w_{t+1} to be a distribution:
         w_{t+1,i} = w_{t+1,i} / Σ_{j=1}^n w_{t+1,j}    ∀i = 1, ..., n
8: end for
9: Return the hypothesis
         H(x) = sign( Σ_{t=1}^T β_t h_t(x) )

The error (Step 4) is the sum of the weights of all misclassified instances.
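The pseudocode translates almost line-for-line into NumPy. A minimal sketch, assuming labels in {−1, +1}; the `train_weak` hook is an assumption of this sketch: any learner that returns a {−1, +1} predictor together with its weighted training error will do.

```python
import numpy as np

def adaboost(X, y, T, train_weak):
    """AdaBoost per the pseudocode above; labels y must be in {-1, +1}.

    train_weak(X, y, w) is assumed to return (h, eps): a predictor
    h(Z) -> {-1, +1} and its weighted training error eps (Steps 3-4)."""
    n = len(y)
    w = np.full(n, 1.0 / n)                    # Step 1: uniform weights
    hs, betas = [], []
    for t in range(T):                         # Step 2
        h, eps = train_weak(X, y, w)           # Steps 3-4
        if eps == 0:                           # perfect weak learner: stop
            hs.append(h)
            betas.append(1.0)
            break
        beta = 0.5 * np.log((1 - eps) / eps)   # Step 5
        w = w * np.exp(-beta * y * h(X))       # Step 6
        w = w / w.sum()                        # Step 7: normalize
        hs.append(h)
        betas.append(beta)
    def H(Z):                                  # Step 9: weighted vote
        return np.sign(sum(b * h(Z) for b, h in zip(betas, hs)))
    return H
```

Any weak learner fits the hook; a threshold stump over one feature is enough to try it out.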


AdaBoost  

51  

5: Choose β_t = (1/2) ln( (1 − ε_t) / ε_t )

•  β_t measures the importance of h_t
•  If ε_t ≤ 0.5, then β_t ≥ 0
   o  Trivial requirement; otherwise, flip h_t's predictions
•  β_t grows as h_t's error shrinks
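A quick numeric check of the Step-5 formula makes both bullets concrete: β_t is exactly 0 at the random-guessing error 0.5, and grows as the error shrinks.

```python
import math

def beta(eps):
    """Step 5: weight given to a weak learner with weighted error eps."""
    return 0.5 * math.log((1 - eps) / eps)

# beta is 0 at random-guessing error, and grows as the error shrinks
print(beta(0.1), beta(0.4), beta(0.5))
```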


AdaBoost  

52  

6: Update all instance weights:
      w_{t+1,i} = w_{t,i} exp(−β_t y_i h_t(x_i))    ∀i = 1, ..., n

This is the same as:

      w_{t+1,i} = w_{t,i} × e^{−β_t}  if h_t(x_i) = y_i    (factor will be ≤ 1)
      w_{t+1,i} = w_{t,i} × e^{β_t}   if h_t(x_i) ≠ y_i    (factor will be ≥ 1)

Essentially, this emphasizes misclassified instances.
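The equivalence is easy to verify numerically: for labels and predictions in {−1, +1}, the vectorized Step-6 update and the two-case form produce identical weights (the data here is random filler, not from the slides).

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.choice([-1, 1], size=8)       # true labels
pred = rng.choice([-1, 1], size=8)    # weak learner predictions h_t(x_i)
w = np.full(8, 1.0 / 8)
beta = 0.7

# Step 6 as written: w_i * exp(-beta * y_i * h_t(x_i))
vectorized = w * np.exp(-beta * y * pred)
# Two-case form: shrink correct instances, grow misclassified ones
cases = np.where(pred == y, w * np.exp(-beta), w * np.exp(beta))
assert np.allclose(vectorized, cases)
```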


AdaBoost  

53  

7: Normalize w_{t+1} to be a distribution:
      w_{t+1,i} = w_{t+1,i} / Σ_{j=1}^n w_{t+1,j}    ∀i = 1, ..., n

Make w_{t+1} sum to 1.


AdaBoost  

54  

INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:    Train model h_t on X, y with instance weights w_t
4:    Compute the weighted training error rate of h_t:

         ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}

5:    Choose β_t = (1/2) ln( (1 − ε_t) / ε_t )
6:    Update all instance weights:

         w_{t+1,i} = w_{t,i} exp( −β_t y_i h_t(x_i) ),   ∀i = 1, ..., n

7:    Normalize w_{t+1} to be a distribution:

         w_{t+1,i} = w_{t+1,i} / Σ_{j=1}^n w_{t+1,j},   ∀i = 1, ..., n

8: end for
9: Return the hypothesis

       H(x) = sign( Σ_{t=1}^T β_t h_t(x) )

Member classifiers with less error are given more weight in the final ensemble hypothesis

Final prediction is a weighted combination of each member's prediction
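The algorithm above can be sketched in a few lines of numpy. This is an illustrative implementation, not code from the slides: it uses exhaustive search over axis-aligned decision stumps as the weak learner, and all function names are ours. A small clamp on ε_t guards the logarithm when a stump is (nearly) perfect or worthless.

```python
import numpy as np

def best_stump(X, y, w):
    """Exhaustive search for the axis-aligned decision stump with minimum
    weighted error; a stump predicts sign(pol * (x_j - thr))."""
    best = (np.inf, 0, 0.0, 1)                    # (error, feature, threshold, polarity)
    for j in range(X.shape[1]):
        # a sentinel threshold above the max yields the two constant stumps
        for thr in np.r_[np.unique(X[:, j]), X[:, j].max() + 1.0]:
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, T):
    """AdaBoost with decision stumps as the weak learner; labels y in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                       # step 1: uniform initial weights
    ensemble = []
    for _ in range(T):                            # step 2: for t = 1..T
        err, j, thr, pol = best_stump(X, y, w)    # steps 3-4: train h_t, weighted error
        err = min(max(err, 1e-12), 1 - 1e-12)     # numerical guard for the log
        beta = 0.5 * np.log((1 - err) / err)      # step 5: member weight
        pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
        w = w * np.exp(-beta * y * pred)          # step 6: up-weight mistakes
        w = w / w.sum()                           # step 7: renormalize to a distribution
        ensemble.append((beta, j, thr, pol))
    return ensemble

def predict(ensemble, X):
    """Step 9: sign of the beta-weighted vote of the member stumps."""
    score = sum(b * np.where(p * (X[:, j] - t) >= 0, 1, -1)
                for b, j, t, p in ensemble)
    return np.sign(score)
```

No single stump can classify an interval-shaped 1-D concept, but the boosted combination of stumps can, which makes it a convenient smoke test.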


Dynamic Behavior of AdaBoost

• If a point is repeatedly misclassified...
  – Each time, its weight is increased
  – Eventually it will be emphasized enough to generate a hypothesis that correctly predicts it

• Successive member hypotheses focus on the hardest parts of the instance space
  – Instances with highest weight are often outliers

55  
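A stylized illustration of this dynamic: suppose one "hard" point is misclassified by every round's hypothesis while all other points are classified correctly, and assume (for simplicity, not realism) a constant weighted error ε = 0.3 so that β is the same each round. The hard point's normalized weight quickly comes to dominate the distribution.

```python
import numpy as np

n, T, eps = 10, 5, 0.3                     # assumed sizes and per-round error
beta = 0.5 * np.log((1 - eps) / eps)       # AdaBoost member weight for eps = 0.3
w = np.full(n, 1.0 / n)                    # uniform initial weights
hard = 0                                   # index of the always-misclassified point
for t in range(T):
    yh = np.ones(n)
    yh[hard] = -1.0                        # y_i * h_t(x_i) = -1 only for the hard point
    w = w * np.exp(-beta * yh)             # mistakes are up-weighted
    w = w / w.sum()                        # renormalize to a distribution
    print(t + 1, w[hard])
```

After only five rounds the hard point carries the majority of the total weight, which is exactly the mechanism that forces a later hypothesis to get it right.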


AdaBoost and Overfitting

• VC theory originally predicted that AdaBoost would always overfit as T grew large
  – The hypothesis keeps growing more complex

• In practice, AdaBoost often did not overfit, contradicting VC theory

• Also, AdaBoost does not explicitly regularize the model

56  


Explaining Why AdaBoost Works

[Figure: training ("Train") and test ("Test") error curves for AdaBoost on OCR data with C4.5 as the base learner]

• Empirically, boosting resists overfitting
• Note that it continues to drive down the test error even AFTER the training error reaches zero

57  [Figure from Schapire: "Explaining AdaBoost"]


Explaining Why AdaBoost Works

58

[Figure: cumulative margin distributions at T = 5, T = 100, and T = 1000, alongside training ("Train") and test ("Test") error of AdaBoost on OCR data with C4.5 as the base learner]
[Figures from Schapire: "Explaining AdaBoost"]

• The "margins explanation" shows that boosting tries to increase the confidence in its predictions over time
  – Improves generalization performance
  – Effectively, boosting maximizes the margin!
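The margin in question is easy to compute: for example i, margin_i = y_i · Σ_t β_t h_t(x_i) / Σ_t β_t, which lies in [−1, +1]; a larger value means a more confident (and correct) weighted vote. The β values and member predictions below are made up purely for illustration.

```python
import numpy as np

betas = np.array([0.42, 0.65, 0.30])       # assumed member weights beta_t
H = np.array([[ 1,  1, -1],                # rows = examples i, cols = members h_t(x_i)
              [ 1, -1,  1],
              [-1, -1, -1]])
y = np.array([1, 1, -1])                   # true labels

# normalized margin of the ensemble's vote on each example
margins = y * (H @ betas) / betas.sum()
print(margins)
```

The third example, on which every member agrees with the label, attains the maximum margin of 1; the second, where the members nearly cancel, is correct but barely confident.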


AdaBoost in Practice

Strengths:
• Fast and simple to program
• No parameters to tune (besides T)
• No assumptions on the weak learner

When boosting can fail:
• Given insufficient data
• Overly complex weak hypotheses
• Can be susceptible to noise
• When there are a large number of outliers

59


Boosted Decision Trees

[Figure: error rates on 27 benchmark data sets]

• Boosted decision trees are one of the best "off-the-shelf" classifiers
  – i.e., no parameter tuning
• Limit member hypothesis complexity by limiting tree depth
• Gradient boosting methods are typically used with trees in practice

60  [Figure from Freund and Schapire: "A Short Introduction to Boosting"]

"AdaBoost with trees is the best off-the-shelf classifier in the world" - Breiman, 1996
(Also, see results by Caruana & Niculescu-Mizil, ICML 2006)

“AdaBoost  with  trees  is  the  best  off-­‐the-­‐shelf  classifier  in  the  world”  -­‐Breiman,  1996  (Also,  see  results  by  Caruana  &  Niculescu-­‐Mizil,  ICML  2006)