Transcript of Ensemble Learning – Penn Engineering, cis519/fall2017/lectures/08_Ensemble
Ensemble Learning
Ensemble Learning
Consider a set of classifiers h1, ..., hL
Idea: construct a classifier H(x) that combines the individual decisions of h1, ..., hL
• e.g., could have the member classifiers vote, or
• e.g., could use different members for different regions of the instance space
• Works well if the members each have low error rates

Successful ensembles require diversity
• Classifiers should make different mistakes
• Can have different types of base learners
4 [Based on slide by Léon Bottou]
Practical Application: Netflix Prize
Goal: predict how a user will rate a movie
• Based on the user's ratings for other movies
• and other people's ratings
• with no other information about the movies
This application is called "collaborative filtering"

Netflix Prize: $1M to the first team to do 10% better than Netflix's system (2007–2009)
Winner: BellKor's Pragmatic Chaos – an ensemble of more than 800 rating systems
5 [Based on slide by Léon Bottou]
Combining Classifiers: Averaging
• Final hypothesis is a simple vote of the members
6
[Diagram: input x is fed to members h1, h2, ..., hL, whose votes are combined into H(x)]
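A minimal sketch of the simple-vote combiner in Python (the toy threshold members are assumptions for illustration):

```python
from collections import Counter

def majority_vote(members, x):
    """Combine member classifiers h1, ..., hL by a simple vote."""
    votes = [h(x) for h in members]
    # Return the most common prediction among the members
    return Counter(votes).most_common(1)[0][0]

# Toy members: threshold classifiers on a scalar input
h1 = lambda x: 1 if x > 0 else -1
h2 = lambda x: 1 if x > 1 else -1
h3 = lambda x: 1 if x > -1 else -1

print(majority_vote([h1, h2, h3], 0.5))  # two of three members vote +1
```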
Combining Classifiers: Weighted Average
• Coefficients of individual members are trained using a validation set
7
[Diagram: input x is fed to members h1, h2, ..., hL; their outputs are combined with learned weights w1, w2, ..., wL into H(x)]
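One simple way to train the coefficients, sketched in Python: weight each member by its accuracy on a held-out validation set (this particular choice is an assumption; the slide only says the coefficients are trained on a validation set):

```python
def validation_weights(members, X_val, y_val):
    # Weight each member by its validation accuracy, normalized to sum to 1
    accs = [sum(h(x) == y for x, y in zip(X_val, y_val)) / len(y_val)
            for h in members]
    total = sum(accs)
    return [a / total for a in accs]

def weighted_vote(members, weights, x):
    score = sum(w * h(x) for w, h in zip(weights, members))
    return 1 if score >= 0 else -1

# Toy members and a toy validation set
h1 = lambda x: 1 if x > 0 else -1
h2 = lambda x: 1 if x > 1 else -1
h3 = lambda x: 1 if x > -1 else -1
X_val = [-2, -1, 0, 1, 2]
y_val = [-1, -1, -1, 1, 1]
w = validation_weights([h1, h2, h3], X_val, y_val)  # h1 is most accurate
```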
Combining Classifiers: Gating
• Coefficients of individual members depend on input
• Train gating function via validation set
8
[Diagram: input x is fed both to members h1, h2, ..., hL and to a gating function that produces input-dependent weights w1, w2, ..., wL; the weighted combination gives H(x)]
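A mixture-of-experts-style sketch of gating in Python. The gate here is a toy linear-softmax with hand-set parameters (an assumption for illustration; in practice the gate's parameters would be trained on a validation set):

```python
import math

def softmax(zs):
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def gated_predict(members, gate_params, x):
    # The gating function maps the input x to member weights w1..wL
    weights = softmax([a * x + b for (a, b) in gate_params])
    score = sum(w * h(x) for w, h in zip(weights, members))
    return 1 if score >= 0 else -1

# Two "experts": one always predicts -1, one always predicts +1.
# The gate routes negative inputs to the first, positive to the second.
experts = [lambda x: -1, lambda x: +1]
gate = [(-5.0, 0.0), (5.0, 0.0)]
print(gated_predict(experts, gate, 2.0), gated_predict(experts, gate, -2.0))  # → 1 -1
```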
Combining Classifiers: Stacking
• Predictions of 1st layer used as input to 2nd layer
• Train 2nd layer on validation set
9
[Diagram: input x is fed to the 1st-layer members h1, h2, ..., hL; their predictions feed a 2nd-layer combiner that outputs H(x)]
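A stacking sketch in Python: the 1st layer's predictions become features, and a tiny perceptron plays the 2nd layer (the perceptron is a minimal stand-in, an assumption; any trainable combiner such as logistic regression would do):

```python
def stack_features(members, X):
    # 1st layer: each member's prediction becomes one feature
    return [[h(x) for h in members] for x in X]

def train_second_layer(feats, y, epochs=20, lr=0.1):
    # 2nd layer: perceptron trained on the members' predictions
    # for a held-out validation set
    w, b = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        for f, yi in zip(feats, y):
            pred = 1 if sum(wi * fi for wi, fi in zip(w, f)) + b >= 0 else -1
            if pred != yi:
                w = [wi + lr * yi * fi for wi, fi in zip(w, f)]
                b += lr * yi
    return w, b

def stacked_predict(members, w, b, x):
    f = [h(x) for h in members]
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b >= 0 else -1

h1 = lambda x: 1 if x > 0 else -1
h2 = lambda x: 1 if x > 1 else -1
h3 = lambda x: 1 if x > -1 else -1
X_val, y_val = [-2, -1, 0, 1, 2], [-1, -1, -1, 1, 1]
w, b = train_second_layer(stack_features([h1, h2, h3], X_val), y_val)
```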
How to Achieve Diversity

Cause of the Mistake     | Diversification Strategy
-------------------------|-------------------------------
Pattern was difficult    | Hopeless
Overfitting              | Vary the training sets
Some features are noisy  | Vary the set of input features

10 [Based on slide by Léon Bottou]
Manipulating the Training Data
Bootstrap replication:
• Given n training examples, construct a new training set by sampling n instances with replacement
• Excludes ~37% of the training instances on average (each instance is left out with probability (1 − 1/n)^n ≈ 1/e)

Bagging:
• Create bootstrap replicates of the training set
• Train a classifier (e.g., a decision tree) for each replicate
• Estimate classifier performance using out-of-bootstrap data
• Average the output of all classifiers

Boosting: (in just a minute...)
11 [Based on slide by Léon Bottou]
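The bootstrap-plus-bagging procedure can be sketched in a few lines of Python. A 1-D decision stump stands in for the decision trees the slide mentions (an assumption, for brevity):

```python
import random

def bootstrap_replicate(data, rng):
    # Sample n instances with replacement; each original instance is
    # left out with probability (1 - 1/n)^n ≈ 37%
    return [rng.choice(data) for _ in data]

def train_stump(data):
    # Base learner: best 1-D threshold classifier on (x, y) pairs
    best = None
    for thr in sorted({x for x, _ in data}):
        for sign in (+1, -1):
            err = sum((sign if x > thr else -sign) != y for x, y in data)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    _, thr, sign = best
    return lambda x, t=thr, s=sign: s if x > t else -s

def bagging(data, n_models, seed=0):
    rng = random.Random(seed)
    models = [train_stump(bootstrap_replicate(data, rng))
              for _ in range(n_models)]
    # Final hypothesis averages (here: majority-votes) the members
    return lambda x: 1 if sum(h(x) for h in models) >= 0 else -1

data = [(-2, -1), (-1, -1), (0, -1), (1, 1), (2, 1)]
H = bagging(data, 11)
```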
Manipulating the Features
Random Forests:
• Construct decision trees on bootstrap replicas
  – Restrict the node decisions to a small subset of features picked randomly for each node
• Do not prune the trees
  – Estimate tree performance on out-of-bootstrap data
• Average the output of all trees
12 [Based on slide by Léon Bottou]
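A sketch of the random-forest idea in Python. To stay short, a depth-1 tree (single node) stands in for a full unpruned tree, so the per-node feature restriction becomes a per-tree restriction; this simplification is an assumption:

```python
import random

def train_node_stump(data, feature_ids):
    # Node decision restricted to a random subset of features,
    # as in random forests
    best = None
    for f in feature_ids:
        for thr in sorted({x[f] for x, _ in data}):
            for sign in (+1, -1):
                err = sum((sign if x[f] > thr else -sign) != y
                          for x, y in data)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    _, f, thr, sign = best
    return lambda x, f=f, t=thr, s=sign: s if x[f] > t else -s

def random_forest(data, n_trees, k_features, seed=0):
    rng = random.Random(seed)
    d = len(data[0][0])
    trees = []
    for _ in range(n_trees):
        replica = [rng.choice(data) for _ in data]    # bootstrap replica
        feats = rng.sample(range(d), k_features)      # random feature subset
        trees.append(train_node_stump(replica, feats))
    # Average (majority-vote) the output of all trees
    return lambda x: 1 if sum(t(x) for t in trees) >= 0 else -1

# Feature 0 carries the label; feature 1 is noise
data = [((-2, 5), -1), ((-1, 1), -1), ((0, 2), -1), ((1, 7), 1), ((2, 3), 1)]
F = random_forest(data, 7, 1)
```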
Boosting
14

AdaBoost [Freund & Schapire, 1997]
• A meta-learning algorithm with great theoretical and empirical performance
• Turns a base learner (i.e., a "weak hypothesis") into a high-performance classifier
• Creates an ensemble of weak hypotheses by repeatedly emphasizing mispredicted instances
15
INPUT: training data X, y = {(xi, yi)} for i = 1, ..., n, and the number of iterations T
1: Initialize a vector of n uniform weights w1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model ht on X, y with instance weights wt
4:   Compute the weighted training error rate of ht:
        εt = Σ_{i : yi ≠ ht(xi)} wt,i
5:   Choose βt = (1/2) ln((1 − εt) / εt)
6:   Update all instance weights:
        wt+1,i = wt,i exp(−βt yi ht(xi))   ∀ i = 1, ..., n
7:   Normalize wt+1 to be a distribution:
        wt+1,i = wt+1,i / Σ_{j=1}^n wt+1,j   ∀ i = 1, ..., n
8: end for
9: Return the hypothesis
        H(x) = sign( Σ_{t=1}^T βt ht(x) )
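The pseudocode transcribes almost line for line into Python. Decision stumps serve as the weak learner here (an assumption; any base learner that accepts instance weights works), with a small clamp on εt to keep the logarithm finite:

```python
import math

def train_weighted_stump(X, y, w):
    # Weak learner: 1-D decision stump minimizing the *weighted* error
    best = None
    for thr in sorted(set(X)):
        for sign in (+1, -1):
            err = sum(wi for xi, yi, wi in zip(X, y, w)
                      if (sign if xi > thr else -sign) != yi)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    _, thr, sign = best
    return lambda x, t=thr, s=sign: s if x > t else -s

def adaboost(X, y, T):
    n = len(X)
    w = [1.0 / n] * n                               # line 1: uniform weights
    hs, betas = [], []
    for _ in range(T):                              # line 2
        h = train_weighted_stump(X, y, w)           # line 3
        eps = sum(wi for xi, yi, wi in zip(X, y, w) if h(xi) != yi)  # line 4
        eps = min(max(eps, 1e-10), 1 - 1e-10)       # guard the log when eps is 0 or 1
        beta = 0.5 * math.log((1 - eps) / eps)      # line 5
        w = [wi * math.exp(-beta * yi * h(xi))      # line 6: emphasize mistakes
             for xi, yi, wi in zip(X, y, w)]
        Z = sum(w)
        w = [wi / Z for wi in w]                    # line 7: normalize
        hs.append(h)
        betas.append(beta)
    # line 9: sign of the beta-weighted vote
    return lambda x: 1 if sum(b * h(x) for b, h in zip(betas, hs)) >= 0 else -1

X, y = [-2, -1, 1, 2], [-1, -1, 1, 1]
H = adaboost(X, y, 3)
```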
AdaBoost
• Size of point represents the instance’s weight
16
AdaBoost
t = 1
17
AdaBoost
t = 1 [figure: the + and − decision regions of h1]
18
AdaBoost
t = 1 [figure: the + and − decision regions of h1]
• βt measures the importance of ht
• If εt ≤ 0.5, then βt ≥ 0 (can trivially guarantee)
19
AdaBoost
• βt measures the importance of ht
• If εt ≤ 0.5, then βt ≥ 0 (and βt grows as εt gets smaller)

20
AdaBoost
• Weights of correct predictions are multiplied by e^(−βt) ≤ 1
• Weights of incorrect predictions are multiplied by e^(βt) ≥ 1

21
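For a concrete instance of the two multipliers: with εt = 0.2 we get βt = ½ ln 4 = ln 2, so correctly classified examples are scaled down by e^(−βt) = ½ while mistakes are scaled up by e^(βt) = 2, before renormalization.

```python
import math

# Weight multipliers for a weak learner with 20% weighted error.
beta = 0.5 * math.log((1 - 0.2) / 0.2)   # = ln 2
print(round(math.exp(-beta), 6), round(math.exp(beta), 6))  # → 0.5 2.0
```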
AdaBoost

23
Disclaimer: note that the resized points in the illustration above are not necessarily to scale with βt
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
AdaBoost (t = 2)
[figure: +/– weighted training examples]
24
AdaBoost (t = 2)
[figure: +/– weighted training examples]
25
AdaBoost (t = 2)
[figure: +/– weighted training examples]
26
AdaBoost (t = 2)
[figure: +/– weighted training examples]
• β_t measures the importance of h_t
• If ε_t < 0.5, then β_t > 0  (β_t grows as ε_t gets smaller)
27
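The behavior of β_t = (1/2) ln((1 − ε_t)/ε_t) from step 5 is easy to tabulate; a quick sketch:

```python
import math

def beta(eps):
    """Hypothesis weight from step 5 of the pseudocode."""
    return 0.5 * math.log((1 - eps) / eps)

# beta grows as the weighted error shrinks, is exactly 0 at eps = 0.5
# (a chance-level hypothesis gets no vote), and is negative for eps > 0.5.
for eps in (0.1, 0.25, 0.4, 0.5):
    print(f"eps={eps}: beta={beta(eps):.3f}")
```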
AdaBoost
t = 2

• Weights of correct predictions are multiplied by e^{−β_t} ≤ 1
• Weights of incorrect predictions are multiplied by e^{β_t} ≥ 1

28
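The two bullets above follow directly from step 6 of the algorithm: y_i h_t(x_i) = +1 for a correct prediction and −1 for an incorrect one, so the single update rule w ← w · exp(−β_t y_i h_t(x_i)) splits into the two multiplicative factors. A quick numeric check (β_t = 0.8 is an arbitrary value chosen for illustration):

```python
import math

beta = 0.8                   # arbitrary example value of beta_t
y = 1                        # true label
h_correct, h_wrong = 1, -1   # a correct and an incorrect prediction

# step 6 of the algorithm: w <- w * exp(-beta * y * h(x))
factor_correct = math.exp(-beta * y * h_correct)   # y*h = +1 -> e^{-beta}
factor_wrong   = math.exp(-beta * y * h_wrong)     # y*h = -1 -> e^{+beta}

assert math.isclose(factor_correct, math.exp(-beta))   # < 1: down-weighted
assert math.isclose(factor_wrong, math.exp(beta))      # > 1: up-weighted
assert factor_correct < 1 < factor_wrong
```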
AdaBoost
t = 2

• Weights of correct predictions are multiplied by e^{−β_t} ≤ 1
• Weights of incorrect predictions are multiplied by e^{β_t} ≥ 1

29
AdaBoost
t = 3

30
AdaBoost
t = 3

31
AdaBoost
t = 3

• β_t measures the importance of h_t
• If ε_t ≤ 0.5, then β_t ≥ 0 (and β_t grows as ε_t gets smaller)

32
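The bullet above can be checked numerically from step 5, β_t = ½ ln((1 − ε_t)/ε_t): a coin-flip learner (ε_t = 0.5) gets zero weight, better learners get increasingly positive weight, and a worse-than-chance learner gets negative weight (its vote is effectively flipped):

```python
import math

def beta(eps):
    # step 5 of the algorithm: beta_t = 0.5 * ln((1 - eps) / eps)
    return 0.5 * math.log((1 - eps) / eps)

assert math.isclose(beta(0.5), 0.0)        # chance-level learner: zero weight
assert 0 < beta(0.4) < beta(0.1)           # beta grows as the error shrinks
assert beta(0.6) < 0                       # worse than chance: vote flipped
```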
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
AdaBoost (t = 3)

[Figure: weighted training points (+/−) and the decision boundary of h_3]

• Weights of correct predicIons are mulIplied by e^(−β_t) ≤ 1
• Weights of incorrect predicIons are mulIplied by e^(β_t) ≥ 1

33
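The two multiplicative factors can be checked numerically. This small sketch (the function name `update_factors` is just for illustration) computes both factors from the weighted error ε of one round; for ε = 0.2 we get β_t = ln 2, so correct weights are halved and incorrect weights are doubled before renormalization.

```python
import math

def update_factors(eps):
    """Multiplicative weight-update factors for one AdaBoost round
    with weighted error eps (assumed 0 < eps < 0.5)."""
    beta = 0.5 * math.log((1 - eps) / eps)
    # Correctly classified: y_i h_t(x_i) = +1  ->  factor e^(-beta) <= 1
    # Misclassified:        y_i h_t(x_i) = -1  ->  factor e^(+beta) >= 1
    return math.exp(-beta), math.exp(beta)

correct, incorrect = update_factors(0.2)
```

The product of the two factors is always 1, which makes clear that the update only shifts relative mass toward the mistakes; the absolute scale is fixed afterward by the normalization step.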
AdaBoost (t = 3)

[Figure: reweighted training points (+/−) after round 3]

• Weights of correct predicIons are mulIplied by e^(−β_t) ≤ 1
• Weights of incorrect predicIons are mulIplied by e^(β_t) ≥ 1

34
AdaBoost (t = 4)

[Figure: weighted training points (+/−) and the decision boundary of h_4]

35
AdaBoost (t = 4)

[Figure: reweighted training points (+/−) after round 4]

36
AdaBoost
[Figure: training set with + and – points and the boosted decision boundary after iteration t = 4]
37
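The reweighting in steps 5–7 has a useful property worth checking numerically: with β_t chosen as in step 5, the misclassified points always carry exactly half the total weight after normalization. A small worked example with hypothetical numbers (four points, one misclassified, so ε_t = 0.25):

```python
import numpy as np

w = np.full(4, 0.25)                       # step 1-style uniform weights
correct = np.array([True, True, True, False])  # h_t misses one point
eps = w[~correct].sum()                    # step 4: eps = 0.25
beta = 0.5 * np.log((1 - eps) / eps)       # step 5: beta = 0.5 ln 3
margin = np.where(correct, 1, -1)          # y_i * h_t(x_i)
w = w * np.exp(-beta * margin)             # step 6
w = w / w.sum()                            # step 7
# The single misclassified point now holds weight 1/2;
# each correct point holds 1/6.
```

This is why boosting keeps making progress: the new distribution makes h_t itself look like a coin flip (weighted error 1/2), forcing h_{t+1} to do something different.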
AdaBoost
[Figure: training set with + and – points and the boosted decision boundary after iteration t = 5]
38
AdaBoost
[Figure: training set with + and – points and the boosted decision boundary after iteration t = 5, continued]
39
AdaBoost
[Figure: training set with + and – points and the boosted decision boundary after iteration t = 5, continued]
40
AdaBoost
[Figure: training set with + and – points and the boosted decision boundary after iteration t = 6]
41
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
AdaBoost

[Figure: training points and the ensemble decision boundary at t = T]

42
AdaBoost

[Figure: training points and the ensemble decision boundary at t = T]

43
AdaBoost

[Figure: training points and the final ensemble decision boundary at t = T]

• Final model is a weighted combination of the members
– Each member is weighted by its importance

44
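The bullet above can be made concrete: in step 5 of the algorithm, a member's weight α_t = ½ ln((1 − ε_t)/ε_t) grows as its weighted error shrinks and vanishes for a coin-flip learner. A quick numeric check (the error values below are arbitrary examples, not from the slides):

```python
import math

def alpha(eps):
    # AdaBoost member weight (step 5): half the log-odds of being correct
    return 0.5 * math.log((1 - eps) / eps)

print(round(alpha(0.1), 3))  # accurate member: 1.099
print(round(alpha(0.3), 3))  # weaker member:   0.424
print(round(alpha(0.5), 3))  # random guessing: 0.0
```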
AdaBoost [Freund & Schapire, 1997]
45
INPUT: training data X, y = {(x_i, y_i)}_{i=1}^n, the number of iterations T
1: Initialize a vector of n uniform weights w_1 = [1/n, ..., 1/n]
2: for t = 1, ..., T
3:   Train model h_t on X, y with instance weights w_t
4:   Compute the weighted training error rate of h_t:
       ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}
5:   Choose α_t = (1/2) ln((1 − ε_t) / ε_t)
6:   Update all instance weights:
       w_{t+1,i} = w_{t,i} exp(−α_t y_i h_t(x_i))   ∀ i = 1, ..., n
7:   Normalize w_{t+1} to be a distribution:
       w_{t+1,i} = w_{t+1,i} / Σ_{j=1}^n w_{t+1,j}   ∀ i = 1, ..., n
8: end for
9: Return the hypothesis
     H(x) = sign( Σ_{t=1}^T α_t h_t(x) )
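The pseudocode above can be sketched in Python. The one-feature decision-stump weak learner and the toy dataset are illustrative assumptions, not part of the original algorithm:

```python
import numpy as np

def train_stump(X, y, w):
    """Pick the (feature, threshold, polarity) stump minimizing weighted error (step 3-4)."""
    n, d = X.shape
    best, best_err = None, float("inf")
    for j in range(d):
        for thresh in np.unique(X[:, j]):
            for polarity in (1, -1):
                pred = np.where(polarity * (X[:, j] - thresh) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best = err, (j, thresh, polarity)
    return best, best_err

def stump_predict(stump, X):
    j, thresh, polarity = stump
    return np.where(polarity * (X[:, j] - thresh) >= 0, 1, -1)

def adaboost(X, y, T):
    n = len(y)
    w = np.full(n, 1.0 / n)                     # 1: uniform initial weights
    ensemble = []
    for t in range(T):                          # 2: boosting rounds
        stump, eps = train_stump(X, y, w)       # 3-4: weak model, weighted error
        eps = max(eps, 1e-10)                   # guard against log(1/0)
        alpha = 0.5 * np.log((1 - eps) / eps)   # 5: member importance
        pred = stump_predict(stump, X)
        w = w * np.exp(-alpha * y * pred)       # 6: up-weight the mistakes
        w = w / w.sum()                         # 7: renormalize to a distribution
        ensemble.append((alpha, stump))
    return ensemble                             # 8: end for

def predict(ensemble, X):
    # 9: sign of the alpha-weighted vote
    score = sum(alpha * stump_predict(s, X) for alpha, s in ensemble)
    return np.sign(score)

# Toy demo on a small separable dataset (illustrative values only)
X = np.array([[1., 5.], [2., 3.], [3., 8.], [6., 2.], [7., 6.], [8., 3.]])
y = np.array([-1, -1, -1, 1, 1, 1])
ensemble = adaboost(X, y, T=5)
```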
AdaBoost
46
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
wt is a vector of weights over the instances at iteraIon t!
All points start with equal weight
AdaBoost
47
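The pseudocode above can be sketched directly in NumPy. This is a minimal illustration, not the course's reference implementation: the decision-stump weak learner and all function names below are assumptions (the slides do not fix a base learner here), and labels are taken to be in {−1, +1}.

```python
import numpy as np

def train_stump(X, y, w):
    """Weighted decision stump: best (feature, threshold, polarity) under w."""
    best = None
    n, d = X.shape
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] >= thr, 1, -1)
                err = w[pred != y].sum()          # weighted error of this stump
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best[1:], best[0]

def stump_predict(stump, X):
    j, thr, sign = stump
    return sign * np.where(X[:, j] >= thr, 1, -1)

def adaboost_train(X, y, T):
    n = len(y)
    w = np.full(n, 1.0 / n)                       # step 1: uniform weights
    stumps, betas = [], []
    for t in range(T):                            # step 2
        stump, eps = train_stump(X, y, w)         # steps 3-4
        beta = 0.5 * np.log((1 - eps) / max(eps, 1e-12))  # step 5 (guard eps=0)
        pred = stump_predict(stump, X)
        w = w * np.exp(-beta * y * pred)          # step 6: reweight instances
        w /= w.sum()                              # step 7: normalize
        stumps.append(stump)
        betas.append(beta)
    return stumps, betas

def adaboost_predict(stumps, betas, X):
    # step 9: H(x) = sign(sum_t beta_t * h_t(x))
    agg = sum(b * stump_predict(s, X) for s, b in zip(stumps, betas))
    return np.sign(agg)
```

On a 1-D toy set such as y = [+1, +1, −1, −1, +1, +1], no single stump is perfect, but a few boosting rounds combine stumps into a correct piecewise classifier.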
We need a way to weight instances differently when learning the model...
Training a Model with Weighted Instances
• For algorithms like logistic regression, we can simply incorporate the weights w into the cost function
  – Essentially, weigh the cost of misclassification differently for each instance
• For algorithms that don't directly support instance weights (e.g., ID3 decision trees), use weighted bootstrap sampling
  – Form the training set by resampling instances with replacement according to w

48
J_reg(θ) = −Σ_{i=1}^n w_i [ y_i log h_θ(x_i) + (1 − y_i) log(1 − h_θ(x_i)) ] + λ ‖θ_{[1:d]}‖²₂
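Both options from the slide can be sketched in a few lines. These are illustrative helpers under stated assumptions: `weighted_logreg_cost` takes y in {0, 1} and leaves the bias term θ[0] unregularized; `weighted_bootstrap` expects w to sum to 1.

```python
import numpy as np

def weighted_logreg_cost(theta, X, y, w, lam):
    """The weighted, regularized logistic cost J_reg from the slide."""
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))        # h_theta(x_i)
    ce = -(w * (y * np.log(h) + (1 - y) * np.log(1 - h))).sum()
    return ce + lam * np.sum(theta[1:] ** 2)      # skip the bias in the penalty

def weighted_bootstrap(X, y, w, rng=None):
    """For learners with no native weight support: resample n instances
    with replacement, drawing instance i with probability w[i]."""
    rng = np.random.default_rng(rng)
    idx = rng.choice(len(y), size=len(y), replace=True, p=w)
    return X[idx], y[idx]
```

With uniform weights and θ = 0, the cost reduces to the usual n·ln 2, which is a quick sanity check on the weighting.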
Base Learner Requirements
• AdaBoost works best with "weak" learners
  – Should not be complex
  – Typically high-bias classifiers
  – Works even when the weak learner has an error rate just slightly under 0.5 (i.e., just slightly better than random)
• Can prove training error goes to 0 in O(log n) iterations
• Examples:
  – Decision stumps (1-level decision trees)
  – Depth-limited decision trees
  – Linear classifiers

49
AdaBoost
(same pseudocode as above; this slide highlights step 4)

ε_t = Σ_{i : y_i ≠ h_t(x_i)} w_{t,i}

Error is the sum of the weights of all misclassified instances

50
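Step 4 spelled out with toy numbers (purely illustrative): the weighted error is simply the total weight sitting on the instances h_t got wrong.

```python
import numpy as np

w    = np.array([0.1, 0.2, 0.3, 0.4])   # current instance weights (sum to 1)
y    = np.array([ 1,  -1,   1,  -1])    # true labels
pred = np.array([ 1,   1,   1,  -1])    # h_t's predictions: wrong on instance 1
eps = w[pred != y].sum()                # eps_t = 0.2, the weight on instance 1
```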
AdaBoost
(same pseudocode as above; this slide highlights step 5)

β_t = (1/2) ln((1 − ε_t) / ε_t)

• β_t measures the importance of h_t
• If ε_t ≤ 0.5, then β_t ≥ 0
  (trivial to satisfy; otherwise, flip h_t's predictions)
• β_t grows as h_t's error shrinks

51
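The bullets above are easy to verify numerically: β_t is zero at ε_t = 0.5, positive below it, negative above it, and grows as the error shrinks.

```python
import math

def beta(eps):
    # beta_t = (1/2) ln((1 - eps_t) / eps_t), as in step 5 of the pseudocode
    return 0.5 * math.log((1 - eps) / eps)

# beta(0.5) is exactly 0; beta(0.2) ~ 0.693 exceeds beta(0.4) ~ 0.203,
# and an error above 0.5 gives a negative weight (flip the predictions).
```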
AdaBoost
52
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
INPUT: training data X, y = {(xi, yi)}ni=1,
the number of iterations T1: Initialize a vector of n uniform weights w1 =
⇥1n , . . . ,
1n
⇤
2: for t = 1, . . . , T3: Train model ht on X, y with instance weights wt
4: Compute the weighted training error rate of ht:
✏t =X
i:yi 6=ht(xi)
wt,i
5: Choose �t =12 ln
⇣1�✏t✏t
⌘
6: Update all instance weights:
wt+1,i = wt,i exp (��tyiht(xi)) 8i = 1, . . . , n
7: Normalize wt+1 to be a distribution:
wt+1,i =wt+1,iPnj=1 wt+1,j
8i = 1, . . . , n
8: end for
9: Return the hypothesis
H(x) = sign
TX
t=1
�tht(x)
!
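For concreteness, here is a minimal NumPy sketch of the pseudocode above, using exhaustive threshold stumps as the weak learners. The stump learner and all function names are illustrative, not from the slides:

```python
import numpy as np

def train_stump(X, y, w):
    """Pick the (feature, threshold, polarity) stump with lowest weighted error."""
    n, d = X.shape
    best = (np.inf, 0, 0.0, 1)  # (error, feature, threshold, polarity)
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, T):
    """AdaBoost as in the pseudocode; labels y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                        # step 1: uniform weights
    betas, stumps = [], []
    for _ in range(T):                             # step 2
        err, j, thr, pol = train_stump(X, y, w)    # steps 3-4
        err = max(err, 1e-10)                      # guard against division by zero
        beta = 0.5 * np.log((1 - err) / err)       # step 5
        pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
        w = w * np.exp(-beta * y * pred)           # step 6: reweight instances
        w /= w.sum()                               # step 7: renormalize
        betas.append(beta)
        stumps.append((j, thr, pol))
    return betas, stumps

def predict(betas, stumps, X):
    """Step 9: sign of the beta-weighted vote of the members."""
    score = np.zeros(len(X))
    for beta, (j, thr, pol) in zip(betas, stumps):
        score += beta * np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
    return np.sign(score)
```

On a tiny separable dataset (e.g. X = [[0],[1],[2],[3]], y = [-1,-1,1,1]) a few rounds suffice to fit the training set exactly.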
This is the same as:

   w_{t+1,i} = w_{t,i} × e^{−β_t}   if h_t(x_i) = y_i    (factor ≤ 1)
   w_{t+1,i} = w_{t,i} × e^{β_t}    if h_t(x_i) ≠ y_i    (factor ≥ 1)

(assuming ε_t < 1/2, so that β_t > 0). Essentially, this emphasizes misclassified instances.
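As a quick numeric check of those two factors (ε_t = 0.2 is just an illustrative value, not from the slides):

```python
import math

eps = 0.2                              # weighted error of the round
beta = 0.5 * math.log((1 - eps) / eps)  # step 5: beta_t = 0.5 ln((1-eps)/eps)
down = math.exp(-beta)                 # factor for correctly classified points
up = math.exp(beta)                    # factor for misclassified points
print(down, up)                        # ≈ 0.5 and 2.0
```

So with 20% weighted error, each correct point's weight is halved and each mistake's weight is doubled (before renormalization).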
AdaBoost
53
The normalization in step 7 makes w_{t+1} sum to 1.
AdaBoost
54
Member classifiers with lower error are given more weight in the final ensemble hypothesis.
The final prediction is a weighted combination of each member's prediction.
Dynamic Behavior of AdaBoost
• If a point is repeatedly misclassified...
  – Each time, its weight is increased
  – Eventually it will be emphasized enough to generate a hypothesis that correctly predicts it
• Successive member hypotheses focus on the hardest parts of the instance space
  – Instances with the highest weight are often outliers
55
AdaBoost and Overfitting
• VC theory originally predicted that AdaBoost would always overfit as T grew large
  – The hypothesis keeps growing more complex
• In practice, AdaBoost often did not overfit, contradicting VC theory
• Also, AdaBoost does not explicitly regularize the model
56
[Figure: training and test error curves for AdaBoost on OCR data, with C4.5 as the base learner]
Explaining Why AdaBoost Works
• Empirically, boosting resists overfitting
• Note that it continues to drive down the test error even AFTER the training error reaches zero

57 [Figure from Schapire: "Explaining AdaBoost"; panels show T = 5, T = 100, and T = 1000]
Explaining Why AdaBoost Works

58 [Figure: training and test error of AdaBoost on OCR data with C4.5 as the base learner; figures from Schapire: "Explaining AdaBoost"]

• The "margins explanation" shows that boosting tries to increase the confidence in its predictions over time
  – This improves generalization performance
  – Effectively, boosting maximizes the margin!
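The margin of a point can be made concrete: normalizing the weighted vote by Σ_t β_t gives a number in [−1, 1] measuring how strongly the members agree with the true label. A small sketch with made-up coefficients and member predictions (all values here are hypothetical):

```python
import numpy as np

# Hypothetical round coefficients beta_t and member predictions on 3 points;
# preds[t, i] is member t's prediction on point i, in {-1, +1}.
betas = np.array([0.8, 0.5, 0.3])
preds = np.array([[ 1,  1, -1],
                  [ 1, -1, -1],
                  [ 1,  1,  1]])
y = np.array([1, 1, -1])

# Normalized margin: y * (sum_t beta_t h_t(x)) / (sum_t beta_t), in [-1, 1].
margins = y * (betas @ preds) / betas.sum()
print(margins)   # the first point, where every member agrees, has margin 1.0
```

A margin near +1 means a confident, unanimous correct vote; boosting tends to push these margins up even after the training error hits zero.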
AdaBoost in Practice
Strengths:
• Fast and simple to program
• No parameters to tune (besides T)
• No assumptions on the weak learner

When boosting can fail:
• Given insufficient data
• Overly complex weak hypotheses
• Can be susceptible to noise
• When there are a large number of outliers
59
Error Rates on 27 Benchmark Data Sets
Boosted Decision Trees
• Boosted decision trees are one of the best "off-the-shelf" classifiers
  – i.e., they need essentially no parameter tuning
• Limit member hypothesis complexity by limiting tree depth
• Gradient boosting methods are typically used with trees in practice

60 [Figure from Freund and Schapire: "A Short Introduction to Boosting"]

"AdaBoost with trees is the best off-the-shelf classifier in the world" -Breiman, 1996
(Also, see results by Caruana & Niculescu-Mizil, ICML 2006)
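A minimal sketch of depth-limited gradient-boosted trees, assuming scikit-learn is available (the synthetic dataset and all parameter values are illustrative, not from the slides):

```python
# Gradient boosting with shallow trees: max_depth limits member complexity,
# as the slide suggests.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = GradientBoostingClassifier(n_estimators=200, max_depth=2,
                                 learning_rate=0.1, random_state=0)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(acc)
```

Even with trees of depth 2, the boosted ensemble typically performs well out of the box, which is the "off-the-shelf" property the slide highlights.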