Transcript of *On Feature Combination for Multiclass Object Classification* by Peter Gehler and Sebastian Nowozin
Xerox Research Centre Europe
On Feature Combination for Multiclass Object Classification
Peter Gehler and Sebastian Nowozin
Reading group October 15, 2009
Introduction
This paper is about kernel selection (feature selection).

Example: flower classification

Features: colour and shape → 2 kernels

Problem: how to combine these 2 kernels (an SVM takes a single kernel as input!)

- Simple: take the average
- Smarter: a weighted sum, with as many weights as there are kernels
- Even smarter: different weights for each class
Combining kernels – baseline method
Compute average over all kernels:
Given: distance matrices d_l(x_i, x_j). Goal: compute one single kernel to use with SVMs.

Recipe:

1. Compute RBF kernels: k_l(x_i, x_j) = exp(-g_l · d_l(x_i, x_j))
2. Rule of thumb: set g_l to 1/mean(d_l) or 1/median(d_l)
3. Trace-normalise each kernel k_l such that trace(k_l) = 1
4. Compute the average (or product) over all kernels k_l
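The recipe above can be sketched in a few lines of NumPy (function and variable names are illustrative, not from the paper's released code):

```python
import numpy as np

def combine_kernels(distance_matrices, mode="average"):
    """Baseline combination: RBF per distance matrix, trace-normalise, then
    average (or multiply element-wise). Each input is an (n, n) array
    d_l(x_i, x_j)."""
    kernels = []
    for d in distance_matrices:
        gamma = 1.0 / np.mean(d)      # rule of thumb: 1/mean (or 1/median)
        k = np.exp(-gamma * d)        # RBF kernel k_l = exp(-g_l * d_l)
        k = k / np.trace(k)           # trace normalisation: trace(k_l) = 1
        kernels.append(k)
    if mode == "average":
        return np.mean(kernels, axis=0)
    return np.prod(kernels, axis=0)   # element-wise product combination
```

Because every normalised kernel has unit trace, the averaged kernel does too, so the kernels contribute on a comparable scale before the SVM sees them.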
Combining kernels
Combination of kernels
Decision function for SVMs:
Multiple Kernel Learning (MKL)

- Objective function from [Varma and Ray]
- Near identical to the C-SVM, but with an added l1 regularisation on the kernel weights d
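The objective on the slide survives only as an image; reconstructed from Varma and Ray's published formulation (so the exact notation is an assumption), it reads:

```latex
\min_{w,\,b,\,d,\,\xi}\;\; \frac{1}{2}\, w^{\top} w \;+\; C \sum_{i} \xi_i \;+\; \sigma \sum_{l} d_l
\qquad \text{s.t.}\quad
y_i \left( w^{\top} \phi_d(x_i) + b \right) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\qquad d_l \ge 0,
```

where $\phi_d$ is the feature map of the combined kernel $k(x, x') = \sum_l d_l\, k_l(x, x')$. Since $d_l \ge 0$, the term $\sigma \sum_l d_l = \sigma \|d\|_1$ is exactly the l1 regularisation on the weights mentioned above, which drives uninformative kernels' weights towards zero.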
Combining kernels
Combination of kernels
Decision function for SVMs:
All kernels share the same alpha and beta values
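With a fixed weight vector β, this decision function has the standard SVM form f(x) = Σ_i α_i y_i Σ_l β_l k_l(x_i, x) + b, with a single shared set of α across all kernels. A minimal sketch (names are illustrative):

```python
import numpy as np

def mkl_decision(alpha, y, b, beta, kernel_cols):
    """Evaluate f(x) = sum_i alpha_i * y_i * sum_l beta_l * k_l(x_i, x) + b.

    kernel_cols[l] is the vector of k_l(x_i, x) over all training points x_i;
    the same alpha (and one beta per kernel) is shared by every kernel."""
    combined = sum(b_l * col for b_l, col in zip(beta, kernel_cols))
    return float(alpha @ (y * combined)) + b
```

With a single kernel and β = (1,) this reduces to the ordinary SVM decision function, which is the sanity check for the shared-α formulation.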
Combining kernels
Boosting of individual kernels
Idea:
- Learn a separate SVM for each kernel, each with its own alpha and beta values
- Use a boosting-based approach to combine the individual SVMs: a linear weighted combination of “weak” classifiers

The authors propose two versions:

- LP-β: learns a single weight vector, shared across classes
- LP-B: learns a weight vector for each class
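At test time the combination itself is simple: each per-kernel SVM emits a real-valued score per class, and the learned weights mix those scores. A sketch of the LP-β variant (names are illustrative, not from the paper's code):

```python
import numpy as np

def lp_beta_decision(svm_scores, beta):
    """LP-beta style prediction for one test point.

    svm_scores: (n_kernels, n_classes) array of one-vs-rest SVM outputs,
    one row per per-kernel "weak" classifier.
    beta: (n_kernels,) weights, shared across classes (LP-B would instead
    use an (n_kernels, n_classes) weight matrix)."""
    combined = beta @ svm_scores      # weighted sum of weak-classifier scores
    return int(np.argmax(combined))   # predicted class
```

The per-class variant trades more parameters (and more risk of overfitting) for the ability to weight, say, the colour kernel highly for some classes and the shape kernel for others.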
Results
Results on Oxford flowers
- 7 kernels
- Best results when combining multiple kernels
- The baseline methods do equally well and are orders of magnitude faster
- The proposed LP methods don't do better than the baseline either (not explained why!)
- Adding “noisy” kernels: MKL is able to identify these kernels and set their weights to ~zero
- Accuracy of the “averaging” and “product” combinations goes down
Results
Results on Caltech-256 dataset
- 39 kernels
- LP-β performs best
- Using the baseline “average”, accuracies are within 5% of the best results
Results
Results on Caltech-101 dataset
- LP-β performs about 10% better than the state of the art