Visual Tracking with Online Multiple Instance...
Transcript of Visual Tracking with Online Multiple Instance...
![Page 1: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/1.jpg)
Visual Tracking with Online Multiple
Instance Learning
Kelsie Zhao
Boris Babenko, Ming-Hsuan Yang, Serge Belongie
![Page 2: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/2.jpg)
Content
• Goal
• Background: Tracking by Detection
• Previous Work
• New Tracking Solution • MILTrack
• Online MILBoost
• Experiments & Results
![Page 3: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/3.jpg)
Goal
Track one arbitrary object in video, given its
location in first frame
![Page 4: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/4.jpg)
Background: Tracking by detection
• Frame 1 is labeled, tracker location known
![Page 5: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/5.jpg)
Background: Tracking by detection
• Crop one positive and some negative patches
near tracker
Negative
Positive x2
x1 x3
![Page 6: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/6.jpg)
Background: Tracking by detection
• Use patches to train the classifier
Negative
Positive
Classifier
{(x1, 1), (x2, 0), (x3, 0)}
x2
x1 x3
![Page 7: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/7.jpg)
Background: Tracking by detection
• Frame 2 comes
Classifier
![Page 8: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/8.jpg)
Background: Tracking by detection
• Calculate classifier response within a range of
the old tracker location
Classifier
X
Old location
![Page 9: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/9.jpg)
Background: Tracking by detection
• Find the maximum response location
Classifier
X
Old location
X
New location
![Page 10: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/10.jpg)
Background: Tracking by detection
• Move tracker
Frame 1 Frame 2
![Page 11: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/11.jpg)
Background: Tracking by detection
• Repeat
Negative
Positive
Frame 2
![Page 12: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/12.jpg)
Background: Tracking by detection
• Problem: If tracker location is not precise, might
select bad training examples
![Page 13: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/13.jpg)
Background: Tracking by detection
• Problem: If tracker location is not precise, might
select bad training examples
Model start to degrade!
![Page 14: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/14.jpg)
Background: Tracking by detection
• Problem: If tracker location is not precise, might
select bad training examples
• How to select good training
examples?
Model start to degrade!
![Page 15: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/15.jpg)
Previous Work
• Solution 1: multiple positive examples around
tracker location
x4
x2 x5
x1
x3
{(x1, 1), (x2, 1), (x3, 1), (x4, 0), (x5, 0)}
Classifier
![Page 16: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/16.jpg)
Previous Work
• Solution 1
Might confuse classifier!
x4
x2 x5 x1 x3
{(x1, 1), (x2, 1), (x3, 1), (x4, 0), (x5, 0)}
Classifier
![Page 17: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/17.jpg)
Previous Work
• Solution 2: Multiple Instance Learning (MIL)
[Keeler ‘90, Dietterich et al. ‘97]
![Page 18: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/18.jpg)
Previous Work: Multiple Instance Learning
• Multiple examples in one bag
[Keeler ‘90, Dietterich et al. ‘97]
x13
x12
x11
x21 x31
![Page 19: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/19.jpg)
Previous Work: Multiple Instance Learning
X3 X2
X1
• Multiple examples in one bag
[Keeler ‘90, Dietterich et al. ‘97]
x13
x12
x11
x21 x31
![Page 20: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/20.jpg)
Previous Work: Multiple Instance Learning
(X3 , 0) (X2 , 0)
(X1 , 1)
• Multiple examples in one bag
• One bag one label
[Keeler ‘90, Dietterich et al. ‘97]
x13
x12
x11
x21 x31
![Page 21: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/21.jpg)
Previous Work: Multiple Instance Learning
(X3 , 0) (X2 , 0)
(X1 , 1)
• Multiple examples in one bag
• One bag one label
• Bag Positive if at least one example is Positive
[Keeler ‘90, Dietterich et al. ‘97]
x13
x12
x11
x21 x31
![Page 22: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/22.jpg)
Previous Work: Multiple Instance Learning
{(X1, 1), (X2, 0), (X3, 0)}
Classifier
(X3 , 0) (X2 , 0)
(X1 , 1)
[Keeler ‘90, Dietterich et al. ‘97]
x13
x12
x11
x21 x31
![Page 23: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/23.jpg)
Previous Work: Multiple Instance Learning
MIL training input:
𝑿𝟏, 𝑦1 … 𝑿𝒏, 𝑦𝑛 ,
where,
𝑿𝒊 = 𝑥𝑖1 …𝑥𝑖𝑚 ,
𝑦𝑖 = 𝑚𝑎𝑥𝑗 𝑦𝑖𝑗
(X3 , 0) (X2 , 0)
(X1 , 1)
x13
x12
x11
x21 x31
[Keeler ‘90, Dietterich et al. ‘97]
![Page 24: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/24.jpg)
x13
x12
x11
Previous Work: Multiple Instance Learning
MIL training input:
𝑿𝟏, 𝑦1 … 𝑿𝒏, 𝑦𝑛 ,
where,
𝑿𝒊 = 𝑥𝑖1 …𝑥𝑖𝑚 ,
𝑦𝑖 = 𝑚𝑎𝑥𝑗 𝑦𝑖𝑗
(X3 , 0) (X2 , 0)
(X1 , 1)
Bag babel is 1
if at least one instance is 1 [Keeler ‘90, Dietterich et al. ‘97]
x21 x31
![Page 25: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/25.jpg)
Now we have training examples!
How to train the classifier?
![Page 26: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/26.jpg)
Previous Work: MILBoost
• MIL + boosting
Train a boosting classifier that maximizes log
likelihood of bags
𝑙𝑜𝑔𝐿 = log (𝑝 𝑦𝑖 𝑿𝒊))
𝑖
where,
𝑝 𝑦𝑖 𝑿𝒊) = 1 − (1 − 𝑝 𝑦𝑖 𝑥𝑖𝑗))
𝑗
[Viola et al. ‘05]
![Page 27: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/27.jpg)
Previous Work: MILBoost
• MIL + boosting
Train a boosting classifier that maximizes log
likelihood of bags
𝑙𝑜𝑔𝐿 = log (𝑝 𝑦𝑖 𝑿𝒊))
𝑖
where,
𝑝 𝑦𝑖 𝑿𝒊) = 1 − (1 − 𝑝 𝑦𝑖 𝑥𝑖𝑗))
𝑗
[Viola et al. ‘05]
1 ~1
![Page 28: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/28.jpg)
Previous Work: MILBoost
• Problem: need all training examples
[Viola et al. ‘05]
![Page 29: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/29.jpg)
Previous Work: MILBoost
But in tracking, only current frame available
[Viola et al. ‘05]
![Page 30: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/30.jpg)
Previous Work: MILBoost
But in tracking, only current frame available
Need an online training algorithm for MIL
[Viola et al. ‘05]
![Page 31: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/31.jpg)
Main Contribution of this paper
• Online-MILBoost: Online training for MIL-based classifier
• MILTrack New tracking solution using Online-MILBoost
![Page 32: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/32.jpg)
MILTrack workflow
New frame comes in:
1. Crop out a set of image patches 𝑋𝑠 = {𝑥| < 𝑠1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
X
Babenko et al., 09
1: s = 35 in authors’ experiment
![Page 33: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/33.jpg)
MILTrack workflow
New frame comes in:
1. Crop out a set of image patches 𝑋𝑠 = {𝑥| < 𝑠1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
X
Babenko et al., 09
1: s = 35 in authors’ experiment
![Page 34: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/34.jpg)
MILTrack workflow
New frame comes in:
1. Crop out a set of image patches 𝑋𝑠 = {𝑥| < 𝑠1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
X
1: s = 35 in authors’ experiment
Babenko et al., 09
![Page 35: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/35.jpg)
MILTrack workflow
New frame comes in:
1. Crop out a set of image patches 𝑋𝑠 = {𝑥| < 𝑠1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
X
Babenko et al., 09
1: s = 35 in authors’ experiment
![Page 36: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/36.jpg)
MILTrack workflow
New frame comes in:
1. Crop out a set of image patches 𝑋𝑠 = {𝑥| < 𝑠1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
Babenko et al., 09
1: s = 35 in authors’ experiment
![Page 37: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/37.jpg)
MILTrack workflow
New frame comes in:
1. Crop out a set of image patches
𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
2. Use MIL classifier to find new tracker location
𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))
Babenko et al., 09
![Page 38: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/38.jpg)
MILTrack workflow
New frame comes in:
1. Crop out a set of image patches
𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
2. Use MIL classifier to find new tracker location
𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))
3. 1) Crop positive examples 𝑋𝑟 = {𝑥| < 𝑟1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
Babenko et al., 09
1: r = 5 in authors’ experiment
![Page 39: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/39.jpg)
MILTrack workflow
(X1 , 1)
Babenko et al., 09
New frame comes in:
1. Crop out a set of image patches
𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
2. Use MIL classifier to find new tracker location
𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))
3. 1) Crop positive examples 𝑋𝑟 = {𝑥| < 𝑟1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
1: r = 5 in authors’ experiment
x13
x12
x11
![Page 40: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/40.jpg)
MILTrack workflow
Babenko et al., 09
New frame comes in:
1. Crop out a set of image patches
𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
2. Use MIL classifier to find new tracker location
𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))
3. 1) Crop positive examples
𝑋𝑟 = {𝑥| < 𝑟 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
3. 2) Crop Negative examples 𝑋𝑟, 𝛽 = 𝑥 𝑟 𝑡𝑜 𝛽1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑎𝑤𝑎𝑦 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
1: 𝛽 = 50 in authors’ experiment
![Page 41: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/41.jpg)
MILTrack workflow
(X3 , 0)
x31
(X2 , 0)
x21
Babenko et al., 09
New frame comes in:
1. Crop out a set of image patches
𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
2. Use MIL classifier to find new tracker location
𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))
3. 1) Crop positive examples
𝑋𝑟 = {𝑥| < 𝑟 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
3. 2) Crop Negative examples 𝑋𝑟, 𝛽 = 𝑥 𝑟 𝑡𝑜 𝛽1 𝑝𝑖𝑥𝑒𝑙𝑠 𝑎𝑤𝑎𝑦 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
1: 𝛽 = 50 in authors’ experiment
![Page 42: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/42.jpg)
MILTrack workflow
New frame comes in:
1. Crop out a set of image patches
𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
2. Use MIL classifier to find new tracker location
𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))
3. Crop positive and negative examples near new
object location
4. Online MILBoost: Update MIL classifier with positive and
negative example bags
(X3 , 0) (X2 , 0)
(X1 , 1)
Babenko et al., 09
x31 x21
x13
x12
x11
Classifier
![Page 43: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/43.jpg)
Online-MILBoost:
f1
f2
f3
…
Image patch x
Babenko et al., 09
![Page 44: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/44.jpg)
Online-MILBoost:
• ℎ𝑘: a weak classifier using one feature
Image patch x
Babenko et al., 09
f1
f2
f3
…
f9
…
ℎ1(𝑥)
ℎ2(𝑥)
![Page 45: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/45.jpg)
Online-MILBoost:
• ℎ𝑘: a weak classifier using one feature
ℎ𝑘(𝑥) = log𝑝(𝑦 = 1|𝑓𝑘(𝑥))
𝑝(𝑦 = 0|𝑓𝑘(𝑥))
Image patch x
Babenko et al., 09
f1
f2
f3
…
ft
…
ℎ𝑘(𝑥)
![Page 46: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/46.jpg)
Online-MILBoost:
• ℎ𝑘: a weak classifier using one feature
ℎ𝑘(𝑥) = log𝑝(𝑦 = 1|𝑓𝑘(𝑥))
𝑝(𝑦 = 0|𝑓𝑘(𝑥))
with, 𝑝 𝑓𝑘 𝑥 𝑦 = 1 ~ 𝒩(𝜇1, 𝜎1) 𝑝 𝑓𝑘 𝑥 𝑦 = 0 ~ 𝒩(𝜇0, 𝜎0) 𝑝 𝑦 = 1 = 𝑝 𝑦 = 0
Image patch x
Babenko et al., 09
f1
f2
f3
…
ft
…
ℎ𝑘(𝑥)
![Page 47: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/47.jpg)
Online-MILBoost:
• 𝑯(𝒙) : the MIL classifier made from weak
classifiers
𝑯 𝒙 = ℎ𝑘(𝑥)
𝐾
𝑘=1
Babenko et al., 09 K = 50 in authors’ experiment
Image patch x
ℎ1(𝑥)
ℎ𝑘(𝑥)
…
…
f1
f2
f3
…
ft
…
![Page 48: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/48.jpg)
Online-MILBoost:
• Always keep a pool of M >> K weak classifier
candidates
h3
h1
h2 …
hM
Babenko et al., 09 M = 250 & K = 50 in authors’ experiment
![Page 49: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/49.jpg)
Online-MILBoost:
• Update all M weak classifiers with positive and
negative bags
X3
{(X1, 1), (X2, 0), (X3, 0)}
X2 X1 x13
x12
x11
x21 x31
Classifier h1 Classifier h2 Classifier hM …
Babenko et al., 09
![Page 50: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/50.jpg)
Online-MILBoost:
• Pick best K weak classifiers to form 𝑯(𝒙), where
ℎ𝑘 = argmaxℎ ∈ {ℎ1
… ℎ𝑀}log 𝐿(𝐻𝑘 − 1 + ℎ)
𝑯 𝒙 = ℎ𝑘(𝑥)
𝐾
𝑘=1
where 𝐻𝑘 − 1 is the classifier made up of the first
𝑘 − 1 weak classifiers
Babenko et al., 09
![Page 51: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/51.jpg)
Online-MILBoost:
• Prediction : 𝑝 𝑦 = 1 𝑥 = 𝜎(𝑯(𝒙))
𝜎 𝑥 = 1
1 + 𝑒 − 𝑥
Babenko et al., 09
![Page 52: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/52.jpg)
Online-MILBoost:
• Prediction : 𝑝 𝑦 = 1 𝑥 = 𝜎(𝑯(𝒙))
𝜎 𝑥 = 1
1 + 𝑒 − 𝑥
Babenko et al., 09
ℎ1 𝑥 = 2, ℎ2 𝑥 = 1.8, ℎ3 𝑥 = 0.6
f1
f2
f3
ℎ1(𝑥)
ℎ2(𝑥)
ℎ3(𝑥)
![Page 53: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/53.jpg)
Online-MILBoost:
• Prediction : 𝑝 𝑦 = 1 𝑥 = 𝜎(𝑯(𝒙))
𝜎 𝑡 = 1
1 + 𝑒 − 𝑡
Babenko et al., 09
𝑯 𝒙 = ℎ𝑘(𝑥)
𝐾
𝑘=1
= 4.4
f1
f2
f3
ℎ1(𝑥)
ℎ2(𝑥)
ℎ3(𝑥)
𝑝 𝑦 = 1 𝑥 = 𝜎 𝑯 𝒙 = 0.99
ℎ1 𝑥 = 2, ℎ2 𝑥 = 1.8, ℎ3 𝑥 = 0.6
![Page 54: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/54.jpg)
MILTrack workflow
New frame comes in:
1. Crop out a set of image patches
𝑋𝑠 = {𝑥| < 𝑠 𝑝𝑖𝑥𝑒𝑙𝑠 𝑓𝑟𝑜𝑚 𝑡𝑟𝑎𝑐𝑘𝑒𝑟 𝑙𝑜𝑐𝑎𝑡𝑖𝑜𝑛}
2. Use MIL classifier to find new tracker location
𝑙𝑛𝑒𝑤 = 𝑙(argmax𝑥 ∈ 𝑋𝑠 𝑝(𝑦 = 1|𝑥))
3. Crop positive and negative examples near new object location
4. Online MILBoost:
Update MIL classifier with positive and negative
example bags
Babenko et al., 09
![Page 55: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/55.jpg)
Experiments
Datesets: 8 publicly available videos, • Grayscale, 320 x 240 pixels
• Ground truth labeled every 5 frames by hand
−
Babenko et al., 2009
Coke can Girl Occluded face
David
Occluded face 2
Tiger 1 Sylvester Tiger 2
![Page 56: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/56.jpg)
Experiments
Compared with: • OAB1
Online AdaBoost w/ 1 positive example per frame
• OAB5
Online AdaBoost w/ 45 positive examples per frame
• SemiBoost
Label in 1st frame only.
• FragTrack
Static appearance model
Babenko et al., 2009
![Page 57: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/57.jpg)
Experiments
Evaluation criterion:
Tracker position error (pixels) w.r.t. Ground truth
Babenko et al., 2009
X X
![Page 59: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/59.jpg)
Results
Position error versus Frame #, Video David
Babenko et al., 2009
![Page 61: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/61.jpg)
Results
Position error versus Frame #,
Video Occluded Face
Babenko et al., 2009
![Page 62: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/62.jpg)
Results
Average Center location errors (pixels)
Best
Second Best
Babenko et al., 2009
![Page 63: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/63.jpg)
Conclusion
• Online MILBoost: Online algorithm to update MIL-based classifier
• Performance of “MILTrack” is stable
Babenko et al., 2009
![Page 64: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/64.jpg)
Discussion
• Why it can handle occlusion?
• Possible improvements
• Motion Model
• Features
• Part based representation
Babenko et al., 2009
![Page 65: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/65.jpg)
Note…
Wu et al., 2013
![Page 66: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/66.jpg)
Note…
Wu et al., 2013
![Page 67: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/67.jpg)
Note…
Wu et al., 2013 Project 2 use this!
![Page 68: Visual Tracking with Online Multiple Instance Learningvision.stanford.edu/teaching/cs231b_spring1415/slides/...Need an online training algorithm for MIL [Viola et al. ‘05] Main Contribution](https://reader035.fdocuments.us/reader035/viewer/2022071107/5fe14f06791a510130684445/html5/thumbnails/68.jpg)
Thank You!
Q&A