A decision-theoretic generalization of on-line learning and an application to boosting [1]
From Regret Learning to AdaBoost
Xing Wang
Department of Computer Science, TAMU
Date: May 6, 2015
Table of Contents
1 AdaBoost Algorithm
2 Upper Bound for Adaboost Algorithm
3 Experiment Evaluation
  Experiment 1
  Experiment 2
4 Generalization Analysis
External Regret Learning
Initialize $\sum_{i=1}^{N} w_i^1 = 1$, $w_i^1 \in [0, 1]$;
for $t = 1 \dots T$ do
    get $p^t = w^t / \sum_i w_i^t$;
    receive loss vector $l^t \in [0, 1]^N$; suffer loss $p^t \cdot l^t$;
    update weight $w_i^{t+1} = w_i^t \, \beta^{l_i^t}$;
end
Algorithm 1: PW Algorithm
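A minimal NumPy sketch of the PW update above; the loss matrix, horizon, and the choice $\beta = 0.9$ are illustrative placeholders, not values from the paper:

import numpy as np

def pw_algorithm(losses, beta=0.9):
    """Run the PW update on a T x N loss matrix; losses[t, i] lies in [0, 1]."""
    T, N = losses.shape
    w = np.full(N, 1.0 / N)              # w^1_i, summing to 1
    total_loss = 0.0
    for t in range(T):
        p = w / w.sum()                  # p^t = w^t / sum_i w^t_i
        total_loss += p @ losses[t]      # suffer loss p^t . l^t
        w = w * beta ** losses[t]        # w^{t+1}_i = w^t_i * beta^{l^t_i}
    return total_loss

# Example: 5 experts, 100 rounds of random losses in [0, 1].
rng = np.random.default_rng(0)
print(pw_algorithm(rng.uniform(size=(100, 5))))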
From Regret to Adaptive Boosting
input: $N$ labeled samples $(x_1, y_1), \dots, (x_N, y_N)$; distribution $D$ over the $N$ samples; weak learning algorithm WeakLearn
Initialize $w_i^1 = D(i)$;
for $t = 1 \dots T$ do
    provide WeakLearn with distribution $p^t = w^t / \sum_i w_i^t$ over the samples, get a hypothesis $h_t : X \to [0, 1]$;
    calculate the error of $h_t$: $\varepsilon_t = \sum_{i=1}^{N} p_i^t \, |h_t(x_i) - y_i|$;
    set $\beta_t = \varepsilon_t / (1 - \varepsilon_t)$, update weight vector $w_i^{t+1} = w_i^t \, \beta_t^{1 - |h_t(x_i) - y_i|}$;
end
Algorithm 2: AdaBoost

$$h_f(x) = \begin{cases} 1 & \text{if } \sum_{t=1}^{T} h_t(x) \log(1/\beta_t) \ge \frac{1}{2} \sum_{t=1}^{T} \log(1/\beta_t) \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$
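A compact Python sketch of Algorithm 2 together with the final vote (1). The weak_learn callable is a hypothetical interface standing in for WeakLearn; it is assumed to return a hypothesis $h_t$ mapping samples to $[0, 1]$:

import numpy as np

def adaboost(X, y, D, weak_learn, T):
    """Algorithm 2; labels y and hypothesis outputs h_t(x) lie in [0, 1]."""
    w = D.astype(float).copy()                    # w^1_i = D(i)
    hyps, betas = [], []
    for t in range(T):
        p = w / w.sum()                           # distribution p^t over samples
        h = weak_learn(X, y, p)                   # weak hypothesis h_t : X -> [0, 1]
        err = np.sum(p * np.abs(h(X) - y))        # eps_t
        beta = err / (1.0 - err)                  # beta_t = eps_t / (1 - eps_t)
        w = w * beta ** (1.0 - np.abs(h(X) - y))  # weight update of Algorithm 2
        hyps.append(h)                            # (assumes 0 < eps_t < 0.5)
        betas.append(beta)

    def h_final(x):
        # Final hypothesis (1): weighted vote against half the total log-weight.
        votes = sum(np.log(1.0 / b) * h(x) for h, b in zip(hyps, betas))
        threshold = 0.5 * sum(np.log(1.0 / b) for b in betas)
        return (votes >= threshold).astype(int)

    return h_final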
Error Bound for AdaBoost
Theorem: The error of the final hypothesis $h_f$, $\varepsilon = \sum_{i : h_f(x_i) \ne y_i} D(i)$, is bounded by
$$\varepsilon \le \prod_{t=1}^{T} 2\sqrt{\varepsilon_t (1 - \varepsilon_t)}.$$

Figure 1: $2\sqrt{\varepsilon_t (1 - \varepsilon_t)}$
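A quick numerical illustration of the bound (the per-round errors below are made-up values): whenever every $\varepsilon_t$ stays bounded away from 0.5, the product shrinks geometrically with $T$.

import numpy as np

eps = np.array([0.3, 0.25, 0.35, 0.2])          # illustrative weak-learner errors
bound = np.prod(2 * np.sqrt(eps * (1 - eps)))   # prod_t 2*sqrt(eps_t(1 - eps_t))
print(bound)                                    # about 0.61 after only 4 rounds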
Theorem proof, part 1
$$w_i^{T+1} = D(i) \prod_{t=1}^{T} \beta_t^{1 - |h_t(x_i) - y_i|} \qquad (2)$$

$h_f(x) = 1$ if $\sum_{t=1}^{T} h_t(x) \log(1/\beta_t) \ge \frac{1}{2} \sum_{t=1}^{T} \log(1/\beta_t)$. Then $h_f$ making a mistake ($h_f(x_i) \ne y_i$) is equivalent to

$$\prod_{t=1}^{T} \beta_t^{-|h_t(x_i) - y_i|} \ge \left( \prod_{t=1}^{T} \beta_t \right)^{-1/2} \qquad (3)$$

Plugging (3) into (2) for the mislabeled samples, we have

$$\sum_{i=1}^{N} w_i^{T+1} \ge \sum_{i : h_f(x_i) \ne y_i} w_i^{T+1} \ge \left( \sum_{i : h_f(x_i) \ne y_i} D(i) \right) \left( \prod_{t=1}^{T} \beta_t \right)^{1/2} = \varepsilon \left( \prod_{t=1}^{T} \beta_t \right)^{1/2} \qquad (4)$$
Theorem proof, part 2
$$\sum_{i=1}^{N} w_i^{t+1} = \sum_{i=1}^{N} w_i^t \, \beta_t^{1 - |h_t(x_i) - y_i|} \le \sum_{i=1}^{N} w_i^t \left( 1 - (1 - \beta_t)(1 - |h_t(x_i) - y_i|) \right) = \left( \sum_{i=1}^{N} w_i^t \right) \left( 1 - (1 - \varepsilon_t)(1 - \beta_t) \right) \qquad (5)$$

where the inequality uses $\beta^{\alpha} \le 1 - (1 - \beta)\alpha$ for $\beta \ge 0$ and $\alpha \in [0, 1]$, and $\varepsilon_t = \sum_{i=1}^{N} w_i^t \, |h_t(x_i) - y_i| \big/ \sum_{j=1}^{N} w_j^t$.

$$\sum_{i=1}^{N} w_i^{T+1} \le \left( \sum_{i=1}^{N} w_i^1 \right) \prod_{t=1}^{T} \left( 1 - (1 - \varepsilon_t)(1 - \beta_t) \right) \le \prod_{t=1}^{T} \left( 1 - (1 - \varepsilon_t)(1 - \beta_t) \right) \qquad (6)$$
Theorem proof, part 3
Based on (4), $\varepsilon \left( \prod_{t=1}^{T} \beta_t \right)^{1/2} \le \sum_{i=1}^{N} w_i^{T+1}$, and (6), $\sum_{i=1}^{N} w_i^{T+1} \le \prod_{t=1}^{T} (1 - (1 - \varepsilon_t)(1 - \beta_t))$, we get:

$$\varepsilon \le \prod_{t=1}^{T} \frac{1 - (1 - \varepsilon_t)(1 - \beta_t)}{\sqrt{\beta_t}} \qquad (7)$$

The right-hand side attains its minimum at $\beta_t = \varepsilon_t / (1 - \varepsilon_t)$; plugging in this value finishes the proof: $\varepsilon \le 2^T \prod_{t=1}^{T} \sqrt{\varepsilon_t (1 - \varepsilon_t)}$.
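For completeness, the minimization can be spelled out: each factor of (7) equals $\varepsilon_t \beta_t^{-1/2} + (1 - \varepsilon_t)\beta_t^{1/2}$, and setting its derivative in $\beta_t$ to zero gives

$$-\tfrac{1}{2}\varepsilon_t \beta_t^{-3/2} + \tfrac{1}{2}(1 - \varepsilon_t)\beta_t^{-1/2} = 0 \;\Longrightarrow\; \beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t},$$

and substituting back yields $\varepsilon_t \sqrt{(1 - \varepsilon_t)/\varepsilon_t} + (1 - \varepsilon_t)\sqrt{\varepsilon_t/(1 - \varepsilon_t)} = 2\sqrt{\varepsilon_t (1 - \varepsilon_t)}$, which gives the stated bound.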
Experiment settings
Two datasets:
  DRIVE [2]: retinal images, blood vessel vs. background
  UCI [4]: Japanese credit screening dataset
Decision tree as the weak learner:
  package from sklearn
  max depth of 4
  initial sample weight $w_i^1 = 0.5 / |\{ j : l_j = l_i \}|$
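A sketch of the weak-learner setup just described, using scikit-learn's DecisionTreeClassifier with max_depth=4 and the class-balanced initial weights; loading of the DRIVE or UCI data is omitted and X, y are placeholders:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def initial_weights(y):
    """w^1_i = 0.5 / |{j : l_j == l_i}|: each class receives total weight 0.5."""
    w = np.empty(len(y), dtype=float)
    for label in np.unique(y):
        mask = (y == label)
        w[mask] = 0.5 / mask.sum()
    return w

def fit_weak_learner(X, y, sample_weight):
    """One boosting round's weak hypothesis: a depth-4 decision tree."""
    tree = DecisionTreeClassifier(max_depth=4)
    tree.fit(X, y, sample_weight=sample_weight)
    return tree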
Retinal blood vessel / background classification
20 training images, a total of 4,541,006 pixels, of which 569,415 are blood vessel pixels.
Two shape features, energy and symmetry, derived from the daisy graph [3].
Evaluation results on the Retina Image

Figure 2: $\varepsilon_t \to 0.5$, $\beta_t \to 1$

There is little update on the sample weights.
$\log(1/\beta_t) \to 0$, so the corresponding classifiers contribute less.
$2\sqrt{\varepsilon_t (1 - \varepsilon_t)} \to 1$, so there is no reduction of the bound $2^T \prod_{t=1}^{T} \sqrt{\varepsilon_t (1 - \varepsilon_t)}$.
Credit Screening
UCI Japanese Credit Screening: http://goo.gl/4gBRXb, 532 samples.
Features used: 2, 3, 8, 11, 14, 15 (six continuous features).
Class labels: +(296) / -(357)
Evaluation results on the Credit Screening

$\varepsilon_t$ of each round is below 0.4.
$\varepsilon$ on the training set converges to 0 after 40 rounds.
PAC framework and VC dimension
Based on [5], with probability $1 - \delta$,

$$\text{error}_{\text{true}}(h) \le \text{error}_{\text{train}}(h) + \sqrt{\frac{\ln(H) + \ln(1/\delta)}{2m}} \qquad (8)$$

$H$ is the VC dimension of the hypothesis class
$m$ is the number of samples
$\delta = 0.05$ for the later analysis
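The complexity term of (8) is easy to evaluate; in the sketch below, the VC dimension H = 100 is an illustrative value, while m = 532 and $\delta = 0.05$ come from the slides:

import numpy as np

def pac_gap(H, m, delta=0.05):
    """Complexity term of bound (8): sqrt((ln H + ln(1/delta)) / (2m))."""
    return np.sqrt((np.log(H) + np.log(1.0 / delta)) / (2.0 * m))

print(pac_gap(H=100, m=532))   # roughly 0.085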
VC dimension
The VC dimension of a hypothesis class is the largest number of samples such that every assignment of labels to those samples can be separated by some hypothesis in the class.
Example: in one dimension, with the hypothesis class $\{+/- : x > a\}$:
  there exist two samples that are always separable;
  for any three samples, there exists a label assignment that is not separable.
VC dimension of decision tree
The VC dimension of a decision tree of depth $k$ on an $n$-dimensional space is bounded by:
  Lower bound: $2^{k-1}(n + 1)$
  Upper bound [5]: $2(2n)^{2^k - 1}$
VC dimension of the Adaboost
Let $H$ be the class of hypotheses given by the WeakLearner, with VC dimension $d \ge 2$; then the VC dimension of the hypothesis class produced by AdaBoost after $T$ rounds is at most

$$2(d + 1)(T + 1)\log_2(e(T + 1)) \qquad (9)$$
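Combining (9) with the complexity term of (8) shows how the generalization gap grows with the number of rounds T; the weak-learner VC dimension d = 30 below is an illustrative value, and m = 532 is the credit-screening sample count:

import numpy as np

def boosted_vc_dim(d, T):
    """Upper bound (9) on the VC dimension of the boosted hypothesis class."""
    return 2 * (d + 1) * (T + 1) * np.log2(np.e * (T + 1))

def pac_gap(H, m, delta=0.05):
    """Complexity term of bound (8)."""
    return np.sqrt((np.log(H) + np.log(1.0 / delta)) / (2.0 * m))

for T in (5, 10, 20, 40, 80):
    print(T, round(pac_gap(boosted_vc_dim(d=30, T=T), m=532), 3))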
Mean of Leave-one-out generalization test
Figure 3: Generalization test on Credit Screening
The optimal iteration number given by the PAC framework is less than the number of iterations actually needed.
Consistent with the paper's results.
Thanks, Q&A
References I
[1] Freund, Yoav, and Robert E. Schapire. "A decision-theoretic generalization of on-line learning and an application to boosting." Journal of Computer and System Sciences 55.1 (1997): 119-139.

[2] J.J. Staal, M.D. Abramoff, M. Niemeijer, M.A. Viergever, B. van Ginneken. "Ridge-based vessel segmentation in color images of the retina." IEEE Transactions on Medical Imaging, 2004, vol. 23, pp. 501-509.

[3] Huajun, Ying, Wang Xing, and Liu Jyh-Charn. "Statistical pattern analysis of blood vessel features on retina images and its application to blood vessel mapping algorithms." EMBC 2014.
References II
[4] Lichman, M. (2013). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

[5] Luke Zettlemoyer. PAC-Learning, VC Dimension. UW, 2012.
Iteration statistic
There are cases where boosting iterates fewer than 40 times; the iteration ends because the error rate does not change.