Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... ·...
![Page 1: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/1.jpg)
Support Vector and Kernel Machines
Nello Cristianini
BIOwulf [email protected]
http://www.support-vector.net/tutorial.html
ICML 2001
www.support-vector.net
A Little History
- SVMs introduced in COLT-92 by Boser, Guyon, Vapnik. Greatly developed ever since.
- Initially popularized in the NIPS community, now an important and active field of all Machine Learning research.
- Special issues of Machine Learning Journal, and Journal of Machine Learning Research.
- Kernel Machines: large class of learning algorithms, SVMs a particular instance.
A Little History
- Annual workshop at NIPS
- Centralized website: www.kernel-machines.org
- Textbook (2000): see www.support-vector.net
- Now: a large and diverse community, from machine learning, optimization, statistics, neural networks, functional analysis, etc.
- Successful applications in many fields (bioinformatics, text, handwriting recognition, etc.)
- Fast expanding field, EVERYBODY WELCOME!
Preliminaries
- Task of this class of algorithms: detect and exploit complex patterns in data (e.g. by clustering, classifying, ranking, cleaning, etc. the data)
- Typical problems: how to represent complex patterns; and how to exclude spurious (unstable) patterns (= overfitting)
- The first is a computational problem; the second a statistical problem.
Very Informal Reasoning
- The class of kernel methods implicitly defines the class of possible patterns by introducing a notion of similarity between data
- Example: similarity between documents
  - By length
  - By topic
  - By language …
- Choice of similarity ⇒ choice of relevant features
More formal reasoning
- Kernel methods exploit information about the inner products between data items
- Many standard algorithms can be rewritten so that they only require inner products between data (inputs)
- Kernel functions = inner products in some feature space (potentially very complex)
- If kernel given, no need to specify what features of the data are being used
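Just in case …
- Inner product between vectors: $\langle x, z \rangle = \sum_i x_i z_i$
- Hyperplane: $\langle w, x \rangle + b = 0$
[Figure: two classes of points (x and o) separated by a hyperplane with normal vector w and offset b]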
Overview of the Tutorial
- Introduce basic concepts with extended example of Kernel Perceptron
- Derive Support Vector Machines
- Other kernel based algorithms
- Properties and Limitations of Kernels
- On Kernel Alignment
- On Optimizing Kernel Alignment
Parts I and II: overview
- Linear Learning Machines (LLM)
- Kernel Induced Feature Spaces
- Generalization Theory
- Optimization Theory
- Support Vector Machines (SVM)
Modularity
- Any kernel-based learning algorithm is composed of two modules:
  - A general purpose learning machine
  - A problem specific kernel function
- Any K-B algorithm can be fitted with any kernel
- Kernels themselves can be constructed in a modular way
- Great for software engineering (and for analysis)
IMPORTANT CONCEPT
1-Linear Learning Machines
- Simplest case: classification. Decision function is a hyperplane in input space
- The Perceptron Algorithm (Rosenblatt, 1957)
- Useful to analyze the Perceptron algorithm, before looking at SVMs and Kernel Methods in general
Basic Notation
- Input space: $x \in X$
- Output space: $y \in Y = \{-1, +1\}$
- Hypothesis: $h \in H$
- Real-valued: $f : X \to \mathbb{R}$
- Training set: $S = \{(x_1, y_1), \ldots, (x_i, y_i), \ldots\}$
- Test error: $\varepsilon$
- Dot product: $\langle x, z \rangle$
Perceptron
- Linear separation of the input space:
$$f(x) = \langle w, x \rangle + b$$
$$h(x) = \mathrm{sign}(f(x))$$
[Figure: x and o points separated by a hyperplane with normal vector w and offset b]
Perceptron Algorithm
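Update rule (ignoring threshold):
- if $y_i \langle w_k, x_i \rangle \le 0$ then
$$w_{k+1} \leftarrow w_k + \eta\, y_i x_i$$
$$k \leftarrow k + 1$$
[Figure: x and o points separated by a hyperplane with normal vector w and offset b]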
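The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not the tutorial's own code: the function name `perceptron`, the `max_epochs` cap and the toy data are assumptions made for the example, and the threshold is ignored as on the slide.

```python
def perceptron(points, labels, eta=1.0, max_epochs=100):
    """Primal perceptron without threshold. Returns (w, number of mistakes)."""
    w = [0.0] * len(points[0])
    mistakes = 0
    for _ in range(max_epochs):
        clean_pass = True
        for x, y in zip(points, labels):
            activation = sum(wj * xj for wj, xj in zip(w, x))
            if y * activation <= 0:  # misclassified, or exactly on the boundary
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                mistakes += 1
                clean_pass = False
        if clean_pass:  # a full pass with no update: stop
            break
    return w, mistakes


# Hypothetical toy data, linearly separable through the origin.
X = [(2.0, 1.0), (1.0, 3.0), (-1.0, -2.0), (-2.0, -1.0)]
y = [1, 1, -1, -1]
w, mistakes = perceptron(X, y)
print(w, mistakes)
```

The loop only changes `w` when a point is misclassified, which is the sense in which the algorithm is mistake driven.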
Observations
- Solution is a linear combination of training points:
$$w = \sum_i \alpha_i y_i x_i, \qquad \alpha_i \ge 0$$
- Only used informative points (mistake driven)
- The coefficient of a point in the combination reflects its ‘difficulty’
Observations - 2
- Mistake bound: $M \le \left(\frac{R}{\gamma}\right)^2$
- Coefficients are non-negative
- Possible to rewrite the algorithm using this alternative representation
[Figure: separable x and o points, with the margin γ marked]
Dual Representation
The decision function can be re-written as follows:
$$f(x) = \langle w, x \rangle + b = \sum_i \alpha_i y_i \langle x_i, x \rangle + b$$
$$w = \sum_i \alpha_i y_i x_i$$
IMPORTANT CONCEPT
Dual Representation
- And also the update rule can be rewritten as follows:
- if $y_i \left( \sum_j \alpha_j y_j \langle x_j, x_i \rangle + b \right) \le 0$ then $\alpha_i \leftarrow \alpha_i + \eta$
- Note: in dual representation, data appears only inside dot products
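As a rough sketch of the dual update rule, the version below stores one coefficient per training point and touches the data only through a precomputed matrix of dot products. The names `dual_perceptron` and `gram` and the toy data are assumptions for illustration, and the threshold b is simply held at 0 here.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dual_perceptron(points, labels, eta=1.0, max_epochs=100):
    """Dual perceptron: learns one coefficient alpha_i per training point."""
    m = len(points)
    # All pairwise dot products; the data is not accessed again after this line.
    gram = [[dot(points[i], points[j]) for j in range(m)] for i in range(m)]
    alpha = [0.0] * m
    b = 0.0  # assumption: threshold kept fixed at zero in this sketch
    for _ in range(max_epochs):
        clean_pass = True
        for i in range(m):
            f_i = sum(alpha[j] * labels[j] * gram[j][i] for j in range(m)) + b
            if labels[i] * f_i <= 0:
                alpha[i] += eta
                clean_pass = False
        if clean_pass:
            break
    return alpha


# Hypothetical toy data, linearly separable through the origin.
X = [(2.0, 1.0), (1.0, 3.0), (-1.0, -2.0), (-2.0, -1.0)]
y = [1, 1, -1, -1]
alpha = dual_perceptron(X, y)
# Recover the primal weight vector w = sum_i alpha_i y_i x_i.
w = [sum(alpha[i] * y[i] * X[i][d] for i in range(len(X))) for d in range(2)]
print(alpha, w)
```

Because only `gram` is consulted inside the loop, replacing `dot` with another similarity function changes the feature space without changing the algorithm.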
Duality: First Property of SVMs
- DUALITY is the first feature of Support Vector Machines
- SVMs are Linear Learning Machines represented in a dual fashion:
$$f(x) = \langle w, x \rangle + b = \sum_i \alpha_i y_i \langle x_i, x \rangle + b$$
- Data appear only within dot products (in decision function and in training algorithm)
Limitations of LLMs
Linear classifiers cannot deal with:
- Non-linearly separable data
- Noisy data
- In addition, this formulation only deals with vectorial data
Non-Linear Classifiers
- One solution: creating a net of simple linear classifiers (neurons): a Neural Network (problems: local minima; many parameters; heuristics needed to train; etc.)
- Other solution: map data into a richer feature space including non-linear features, then use a linear classifier
Learning in the Feature Space
- Map data into a feature space where they are linearly separable: $x \to \phi(x)$
[Figure: the map φ takes the x and o points of input space X to their images φ(x) and φ(o) in feature space F]
Problems with Feature Space
- Working in high dimensional feature spaces solves the problem of expressing complex functions

BUT:
- There is a computational problem (working with very large vectors)
- And a generalization theory problem (curse of dimensionality)
Implicit Mapping to Feature Space
We will introduce Kernels, which:
- Solve the computational problem of working with many dimensions
- Can make it possible to use infinite dimensions efficiently in time and space
- Have other advantages, both practical and conceptual
Kernel-Induced Feature Spaces
- In the dual representation, the data points only appear inside dot products:
$$f(x) = \sum_i \alpha_i y_i \langle \phi(x_i), \phi(x) \rangle + b$$
- The dimensionality of space F is not necessarily important. We may not even know the map $\phi$
Kernels
- A function that returns the value of the dot product between the images of the two arguments:
$$K(x_1, x_2) = \langle \phi(x_1), \phi(x_2) \rangle$$
- Given a function K, it is possible to verify that it is a kernel
IMPORTANT CONCEPT
Kernels
- One can use LLMs in a feature space by simply rewriting them in dual representation and replacing dot products with kernels:
$$\langle x_1, x_2 \rangle \leftarrow K(x_1, x_2) = \langle \phi(x_1), \phi(x_2) \rangle$$
The Kernel Matrix
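- (aka the Gram matrix):
$$K = \begin{pmatrix} K(1,1) & K(1,2) & K(1,3) & \cdots & K(1,m) \\ K(2,1) & K(2,2) & K(2,3) & \cdots & K(2,m) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ K(m,1) & K(m,2) & K(m,3) & \cdots & K(m,m) \end{pmatrix}$$
IMPORTANT CONCEPT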
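A small sketch of how such a matrix might be built: entry (i, j) is the kernel evaluated on training points i and j. The helper names `kernel_matrix` and `linear_kernel` and the three sample points are assumptions for illustration; the plain dot product stands in for whichever kernel is chosen.

```python
def linear_kernel(x, z):
    """Plain dot product, the simplest possible kernel."""
    return sum(a * b for a, b in zip(x, z))


def kernel_matrix(points, kernel):
    """Return the m x m matrix with entries K[i][j] = kernel(x_i, x_j)."""
    return [[kernel(xi, xj) for xj in points] for xi in points]


# Hypothetical sample points.
points = [(1.0, 0.0), (0.0, 2.0), (1.0, 1.0)]
K = kernel_matrix(points, linear_kernel)
for row in K:
    print(row)
```

Passing the kernel as an argument keeps the learning machine and the kernel separate, in line with the modularity idea from earlier in the tutorial.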
The Kernel Matrix
- The central structure in kernel machines
- Information ‘bottleneck’: contains all necessary information for the learning algorithm
- Fuses information about the data AND the kernel
- Many interesting properties:
Mercer’s Theorem
- The kernel matrix is symmetric positive semi-definite
- Any symmetric positive semi-definite matrix can be regarded as a kernel matrix, that is, as an inner product matrix in some space
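One way to see the second statement concretely, at least for a strictly positive definite matrix, is a Cholesky factorization: if K = L Lᵀ, the rows of L are vectors whose inner products reproduce K. The sketch below is a plain textbook Cholesky routine written for this example (the function name and the 2×2 matrix are assumptions); it rejects matrices that are not positive definite.

```python
import math


def cholesky(K):
    """Return lower-triangular L with L L^T = K.

    Raises ValueError if K is not (strictly) positive definite.
    """
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


# Hypothetical symmetric positive definite matrix.
K = [[4.0, 2.0], [2.0, 3.0]]
features = cholesky(K)  # row i plays the role of a feature vector for point i
print(features)
```

A symmetric matrix such as [[1, 2], [2, 1]] has a negative eigenvalue, so the routine raises an error for it: it cannot be an inner product matrix.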
More Formally: Mercer’s Theorem
- Every (semi) positive definite, symmetric function is a kernel: i.e. there exists a mapping $\phi$ such that it is possible to write:
$$K(x_1, x_2) = \langle \phi(x_1), \phi(x_2) \rangle$$
- Pos. Def.:
$$\int K(x, z)\, f(x)\, f(z)\, dx\, dz \ge 0 \qquad \forall f \in L_2$$
Mercer’s Theorem
- Eigenvalues expansion of Mercer’s Kernels:
$$K(x_1, x_2) = \sum_i \lambda_i\, \phi_i(x_1)\, \phi_i(x_2)$$
- That is: the eigenfunctions act as features!
Examples of Kernels
- Simple examples of kernels are:
$$K(x, z) = \langle x, z \rangle^d$$
$$K(x, z) = e^{-\|x - z\|^2 / 2\sigma}$$
Example: Polynomial Kernels
$$x = (x_1, x_2); \qquad z = (z_1, z_2);$$
$$\langle x, z \rangle^2 = (x_1 z_1 + x_2 z_2)^2$$
$$= x_1^2 z_1^2 + x_2^2 z_2^2 + 2 x_1 z_1 x_2 z_2$$
$$= \langle (x_1^2, x_2^2, \sqrt{2}\, x_1 x_2),\ (z_1^2, z_2^2, \sqrt{2}\, z_1 z_2) \rangle$$
$$= \langle \phi(x), \phi(z) \rangle$$
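The identity above is easy to check numerically. In this sketch, `poly2_kernel` computes the kernel directly in the 2-dimensional input space, while `phi` builds the 3-dimensional feature vector explicitly; both names and the sample vectors are assumptions for the example.

```python
import math


def poly2_kernel(x, z):
    """Degree-2 polynomial kernel <x, z>^2, computed in input space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def phi(x):
    """Explicit feature map (x1^2, x2^2, sqrt(2) x1 x2)."""
    return (x[0] ** 2, x[1] ** 2, math.sqrt(2.0) * x[0] * x[1])


# Hypothetical sample vectors.
x, z = (1.0, 2.0), (3.0, -1.0)
implicit = poly2_kernel(x, z)
explicit = sum(a * b for a, b in zip(phi(x), phi(z)))
print(implicit, explicit)
```

The two values agree up to floating-point rounding, although the first never constructs the feature vectors.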
Example: the two spirals
- Separated by a hyperplane in feature space (Gaussian kernels)
Making Kernels
- The set of kernels is closed under some operations. If K, K’ are kernels, then:
  - K + K’ is a kernel
  - cK is a kernel, if c > 0
  - aK + bK’ is a kernel, for a, b > 0
  - etc.
- Can make complex kernels from simple ones: modularity!
IMPORTANT CONCEPT
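In code, this closure property means kernels can be combined like ordinary functions. The helper `combine_kernels` below is a hypothetical name chosen for this sketch; it builds aK + bK’ from two kernel functions and rejects non-positive weights, matching the condition a, b > 0.

```python
def linear_kernel(x, z):
    return sum(a * b for a, b in zip(x, z))


def poly2_kernel(x, z):
    return linear_kernel(x, z) ** 2


def combine_kernels(k1, k2, a=1.0, b=1.0):
    """Return the kernel a*k1 + b*k2, for positive weights a and b."""
    if a <= 0 or b <= 0:
        raise ValueError("weights must be positive")
    return lambda x, z: a * k1(x, z) + b * k2(x, z)


# Hypothetical usage: a weighted sum of a linear and a quadratic kernel.
mixed = combine_kernels(linear_kernel, poly2_kernel, a=2.0, b=0.5)
value = mixed((1.0, 2.0), (2.0, 1.0))
print(value)
```

The result of `combine_kernels` has the same call signature as its inputs, so it can be passed to any kernel-based learner or combined again.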
Second Property of SVMs:
SVMs are Linear Learning Machines that:
- Use a dual representation, AND
- Operate in a kernel-induced feature space (that is, $f$ is a linear function in the feature space implicitly defined by K):
$$f(x) = \sum_i \alpha_i y_i \langle \phi(x_i), \phi(x) \rangle + b$$
Kernels over General Structures
- Haussler, Watkins, etc.: kernels over sets, over sequences, over trees, etc.
- Applied in text categorization, bioinformatics, etc.
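A bad kernel …
- … would be a kernel whose kernel matrix is mostly diagonal: all points orthogonal to each other, no clusters, no structure …
$$\begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}$$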
No Free Kernel
- If mapping into a space with too many irrelevant features, the kernel matrix becomes diagonal
- Need some prior knowledge of the target to choose a good kernel
IMPORTANT CONCEPT
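A quick way to watch a kernel matrix turn diagonal is to shrink the width of a Gaussian kernel. This is an illustrative assumption rather than the slide's own example: the kernel is written in the form exp(-‖x − z‖² / 2σ) used earlier in the tutorial, and the points and widths are made up.

```python
import math


def gaussian_kernel(x, z, sigma):
    """Gaussian kernel exp(-||x - z||^2 / (2 * sigma))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma))


def kernel_matrix(points, sigma):
    return [[gaussian_kernel(p, q, sigma) for q in points] for p in points]


# Hypothetical sample points.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
narrow = kernel_matrix(points, sigma=0.01)  # every point looks unlike every other
wide = kernel_matrix(points, sigma=10.0)    # every point looks like every other

for row in narrow:
    print(["%.3g" % v for v in row])
```

With the narrow width the matrix is numerically the identity, so it carries no information about which points belong together; with the wide one all entries are close to 1, which is uninformative in the opposite direction.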
Other Kernel-based algorithms
- Note: other algorithms can use kernels, not just LLMs (e.g. clustering, PCA, etc.). A dual representation is often possible (in optimization problems, by the Representer theorem).
BREAK
The Generalization Problem
- The curse of dimensionality: easy to overfit in high dimensional spaces (= regularities could be found in the training set that are accidental, that is, that would not be found again in a test set)
- The SVM problem is ill posed (finding one hyperplane that separates the data: many such hyperplanes exist)
- Need a principled way to choose the best possible hyperplane
NEW TOPIC
The Generalization Problem
- Many methods exist to choose a good hyperplane (inductive principles)
- Bayes, statistical learning theory / PAC, MDL, …
- Each can be used; we will focus on a simple case motivated by statistical learning theory (it will give the basic SVM)
Statistical (Computational) Learning Theory
- Generalization bounds on the risk of overfitting (in a PAC setting: assumption of i.i.d. data; etc.)
- Standard bounds from VC theory give upper and lower bounds proportional to VC dimension
- VC dimension of LLMs proportional to dimension of space (can be huge)
Assumptions and Definitions
- Distribution D over input space X
- Train and test points drawn randomly (i.i.d.) from D
- Training error of h: fraction of points in S misclassified by h
- Test error of h: probability under D of misclassifying a point x
- VC dimension: size of largest subset of X shattered by H (every dichotomy implemented)
VC Bounds
$$\varepsilon = \tilde{O}\left(\frac{VC}{m}\right)$$
- VC = (number of dimensions of X) + 1
- Typically VC >> m, so not useful
- Does not tell us which hyperplane to choose
Margin Based Bounds
$$\varepsilon = \tilde{O}\left(\frac{(R/\gamma)^2}{m}\right)$$
$$\gamma = \min_i \frac{y_i f(x_i)}{\|f\|}$$
Note: compression bounds also exist, and online bounds.
Margin Based Bounds
- The worst case bound still holds, but if we are lucky (the margin is large) the other bound can be applied and better generalization can be achieved:
$$\varepsilon = \tilde{O}\left(\frac{(R/\gamma)^2}{m}\right)$$
- Best hyperplane: the maximal margin one
- Margin is large if the kernel is chosen well
IMPORTANT CONCEPT
Maximal Margin Classifier
- Minimize the risk of overfitting by choosing the maximal margin hyperplane in feature space
- Third feature of SVMs: maximize the margin
- SVMs control capacity by increasing the margin, not by reducing the number of degrees of freedom (dimension free capacity control).
Two kinds of margin
- Functional and geometric margin:
$$\text{funct} = \min_i\, y_i f(x_i)$$
$$\text{geom} = \min_i \frac{y_i f(x_i)}{\|f\|}$$
[Figure: separable x and o points, with the margin γ marked]
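For a linear function f(x) = ⟨w, x⟩ + b the two margins can be computed directly, taking ‖f‖ to be ‖w‖ (an assumption that holds for the linear case considered here). The function name `margins` and the toy data are made up for this sketch.

```python
import math


def margins(w, b, points, labels):
    """Return (functional margin, geometric margin) of f(x) = <w, x> + b."""
    functional = min(
        y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
        for x, y in zip(points, labels)
    )
    geometric = functional / math.sqrt(sum(wj * wj for wj in w))
    return functional, geometric


# Hypothetical separable toy data and a separating hyperplane.
X = [(2.0, 1.0), (1.0, 3.0), (-1.0, -2.0), (-2.0, -1.0)]
y = [1, 1, -1, -1]
funct, geom = margins([2.0, 1.0], 0.0, X, y)
funct_scaled, geom_scaled = margins([20.0, 10.0], 0.0, X, y)
print(funct, geom)
print(funct_scaled, geom_scaled)
```

Rescaling w by 10 multiplies the functional margin by 10 but leaves the geometric margin unchanged, which is why the functional margin can be fixed to 1 without loss of generality.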
Max Margin = Minimal Norm
- If we fix the functional margin to 1, the geometric margin equals $1/\|w\|$
- Hence, maximize the margin by minimizing the norm
Max Margin = Minimal Norm
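Distance between the two convex hulls:
$$\langle w, x^+ \rangle + b = +1$$
$$\langle w, x^- \rangle + b = -1$$
$$\langle w, (x^+ - x^-) \rangle = 2$$
$$\left\langle \frac{w}{\|w\|}, (x^+ - x^-) \right\rangle = \frac{2}{\|w\|}$$
[Figure: separable x and o points, with the margin γ marked]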
The primal problem
- Minimize: $\langle w, w \rangle$
- Subject to: $y_i \left( \langle w, x_i \rangle + b \right) \ge 1$
IMPORTANT STEP
Optimization Theory
- The problem of finding the maximal margin hyperplane: constrained optimization (quadratic programming)
- Use Lagrange theory (or Kuhn-Tucker Theory)
- Lagrangian:
$$\frac{1}{2} \langle w, w \rangle - \sum_i \alpha_i \left[ y_i \left( \langle w, x_i \rangle + b \right) - 1 \right], \qquad \alpha_i \ge 0$$
From Primal to Dual
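$$L(w) = \frac{1}{2} \langle w, w \rangle - \sum_i \alpha_i \left[ y_i \left( \langle w, x_i \rangle + b \right) - 1 \right], \qquad \alpha_i \ge 0$$
Differentiate and substitute:
$$\frac{\partial L}{\partial b} = 0, \qquad \frac{\partial L}{\partial w} = 0$$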
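The "differentiate and substitute" step can be written out as follows. This is a sketch of the standard calculation, starting from the Lagrangian $L = \frac{1}{2}\langle w, w\rangle - \sum_i \alpha_i [y_i(\langle w, x_i\rangle + b) - 1]$; the intermediate lines are filled in here and are not part of the original slide.

```latex
\begin{align*}
\frac{\partial L}{\partial w} &= w - \sum_i \alpha_i y_i x_i = 0
  \quad\Longrightarrow\quad w = \sum_i \alpha_i y_i x_i \\
\frac{\partial L}{\partial b} &= -\sum_i \alpha_i y_i = 0
  \quad\Longrightarrow\quad \sum_i \alpha_i y_i = 0 \\
L &= \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle
   - \sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle
   - b \sum_i \alpha_i y_i + \sum_i \alpha_i \\
  &= \sum_i \alpha_i
   - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle
\end{align*}
```

The term in b vanishes because of the second condition, leaving a function of the multipliers α alone in which the data enter only through inner products.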
The Dual Problem
- Maximize:
$$W(\alpha) = \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle$$
- Subject to:
$$\alpha_i \ge 0, \qquad \sum_i \alpha_i y_i = 0$$
- The duality again! Can use kernels!
IMPORTANT STEP
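To make the dual objective concrete, the sketch below evaluates W(α) on a tiny hand-made problem. The function name `dual_objective`, the three collinear points and the multipliers are all assumptions for illustration; the multipliers were worked out by hand for this particular data set rather than produced by a solver.

```python
def linear_kernel(x, z):
    return sum(a * b for a, b in zip(x, z))


def dual_objective(alpha, points, labels, kernel):
    """W(alpha) = sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j y_i y_j K(x_i, x_j)."""
    m = len(points)
    quad = sum(
        alpha[i] * alpha[j] * labels[i] * labels[j] * kernel(points[i], points[j])
        for i in range(m)
        for j in range(m)
    )
    return sum(alpha) - 0.5 * quad


# Hypothetical toy problem: three points on a line.
points = [(1.0, 0.0), (-1.0, 0.0), (3.0, 0.0)]
labels = [1, -1, 1]
alpha = [0.5, 0.5, 0.0]  # hand-derived; satisfies alpha_i >= 0, sum_i alpha_i y_i = 0

W = dual_objective(alpha, points, labels, linear_kernel)
print(W)
```

For these multipliers w = Σ αᵢ yᵢ xᵢ = (1, 0), and W equals ½⟨w, w⟩ = 0.5, so the dual value matches the Lagrangian's primal term on this example. Since `kernel` is a parameter, any other kernel function can be substituted without touching the objective.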
Convexity
- This is a Quadratic Optimization problem: convex, no local minima (second effect of Mercer’s conditions)
- Solvable in polynomial time …
- (Convexity is another fundamental property of SVMs)
IMPORTANT CONCEPT
![Page 61: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/61.jpg)
Kuhn-Tucker Theorem
Properties of the solution:
- Duality: can use kernels
$$w = \sum_i \alpha_i y_i x_i$$
- KKT conditions:
$$\alpha_i \left[ y_i \left( \langle w, x_i\rangle + b \right) - 1 \right] = 0 \quad \forall i$$
- Sparseness: only the points nearest to the hyperplane (margin = 1) have positive weight
- They are called support vectors
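The KKT conditions can be checked numerically. The following minimal sketch assumes a hypothetical three-point problem whose dual solution was worked out by hand; the data and the values of `alpha` and `b` are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy problem: three points on a line; the third lies well
# inside its class region. The dual solution below was worked out by hand.
X = np.array([[1.0, 0.0], [-1.0, 0.0], [3.0, 0.0]])
y = np.array([1.0, -1.0, 1.0])
alpha = np.array([0.5, 0.5, 0.0])
b = 0.0

w = (alpha * y) @ X                      # w = sum_i alpha_i y_i x_i
margins = y * (X @ w + b)                # y_i (<w, x_i> + b)
kkt_residual = alpha * (margins - 1.0)   # must vanish for every i
support = np.flatnonzero(alpha > 1e-12)  # indices with positive weight
```

The third point has margin 3, so its multiplier must be zero for the product to vanish. Only the two points with margin exactly 1 carry weight.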
![Page 62: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/62.jpg)
KKT Conditions Imply Sparseness
[Figure: two classes of points, marked x and o, with the margin γ indicated]
Sparseness: another fundamental property of SVMs
![Page 63: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/63.jpg)
Properties of SVMs - Summary
✓ Duality
✓ Kernels
✓ Margin
✓ Convexity
✓ Sparseness
![Page 64: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/64.jpg)
Dealing with noise
In the case of non-separable data in feature space, the margin distribution can be optimized:
$$\epsilon \le \frac{1}{m} \, \frac{R^2 + \sum_i \xi_i^2}{\gamma^2}$$
$$y_i \left( \langle w, x_i\rangle + b \right) \ge 1 - \xi_i$$
![Page 65: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/65.jpg)
The Soft-Margin Classifier
Minimize:
$$\frac{1}{2}\langle w, w\rangle + C \sum_i \xi_i$$
Or:
$$\frac{1}{2}\langle w, w\rangle + C \sum_i \xi_i^2$$
Subject to:
$$y_i \left( \langle w, x_i\rangle + b \right) \ge 1 - \xi_i$$
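To make the role of the slack variables concrete, the sketch below computes the smallest feasible slacks for a fixed hyperplane and evaluates both objectives. It assumes the usual additional condition ξ_i ≥ 0, which the slide does not state; the data and the helper name `soft_margin_objective` are hypothetical.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C, squared=False):
    """Return (objective, xi) for a given hyperplane (w, b)."""
    # Smallest slack satisfying y_i(<w, x_i> + b) >= 1 - xi_i with xi_i >= 0.
    xi = np.maximum(0.0, 1.0 - y * (X @ w + b))
    penalty = np.sum(xi ** 2) if squared else np.sum(xi)
    return 0.5 * w @ w + C * penalty, xi

# Hypothetical data: the last point sits on the wrong side of the hyperplane.
X = np.array([[2.0, 0.0], [-2.0, 0.0], [-0.5, 0.0]])
y = np.array([1.0, -1.0, 1.0])
w, b, C = np.array([1.0, 0.0]), 0.0, 10.0

obj1, xi = soft_margin_objective(w, b, X, y, C)                # C * sum xi_i
obj2, _ = soft_margin_objective(w, b, X, y, C, squared=True)   # C * sum xi_i^2
```

Points that already satisfy the margin get zero slack, so only the misplaced point contributes to the penalty term.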
![Page 66: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/66.jpg)
Slack Variables
$$\epsilon \le \frac{1}{m} \, \frac{R^2 + \sum_i \xi_i^2}{\gamma^2}$$
$$y_i \left( \langle w, x_i\rangle + b \right) \ge 1 - \xi_i$$
![Page 67: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/67.jpg)
Soft Margin-Dual Lagrangian
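- Box constraints
$$W(\alpha) = \sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j\rangle, \qquad 0 \le \alpha_i \le C, \qquad \sum_i \alpha_i y_i = 0$$
- Diagonal
$$\sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j\rangle - \frac{1}{2C}\sum_j \alpha_j \alpha_j, \qquad 0 \le \alpha_i, \qquad \sum_i \alpha_i y_i = 0$$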
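One way to read the "Diagonal" label: the extra quadratic term has the same effect as adding 1/C to the diagonal of the Gram matrix in the hard-margin dual. The sketch below checks this identity numerically on random data; the function names, seed, and data are assumptions for illustration.

```python
import numpy as np

def dual(alpha, y, K):
    """Hard-margin style dual: sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j y_i y_j K_ij."""
    return alpha.sum() - 0.5 * alpha @ (np.outer(y, y) * K) @ alpha

def dual_diagonal(alpha, y, K, C):
    """The 'Diagonal' objective: the dual above minus (1 / 2C) sum_j alpha_j^2."""
    return dual(alpha, y, K) - alpha @ alpha / (2.0 * C)

rng = np.random.default_rng(0)            # arbitrary seed, illustration only
X = rng.normal(size=(5, 2))
y = np.array([1.0, -1.0, 1.0, -1.0, 1.0])
alpha = rng.uniform(size=5)
K, C = X @ X.T, 4.0

lhs = dual_diagonal(alpha, y, K, C)
rhs = dual(alpha, y, K + np.eye(5) / C)   # same value, since y_i^2 = 1
```

In practice this means a solver for the hard-margin dual can handle the squared-slack case by modifying only the kernel matrix.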
![Page 68: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/68.jpg)
The regression case
- For regression, all the above properties are retained, introducing epsilon-insensitive loss:
[Figure: loss L plotted against y_i − ⟨w, x_i⟩ + b; axis labels 0, ε and ξ]
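A minimal sketch of an epsilon-insensitive loss, assuming the common linear form in which residuals inside the band of width ε cost nothing. The slide shows only a figure, so the exact form used here is an assumption, as is the function name.

```python
import numpy as np

def eps_insensitive_loss(residual, eps):
    """Zero inside the band |residual| <= eps, linear outside it."""
    return np.maximum(0.0, np.abs(residual) - eps)

# Hypothetical residuals between targets and predictions.
residuals = np.array([-0.3, -0.05, 0.0, 0.1, 0.5])
losses = eps_insensitive_loss(residuals, eps=0.1)
```

Residuals that fall inside the band contribute no loss, which is what keeps the regression solution sparse in the same way as the classifier.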
![Page 69: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/69.jpg)
Regression: the ε-tube
![Page 70: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/70.jpg)
Implementation Techniques
- Maximizing a quadratic function, subject to a linear equality constraint (and inequalities as well)
$$W(\alpha) = \sum_i \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j), \qquad \alpha_i \ge 0, \qquad \sum_i \alpha_i y_i = 0$$
![Page 71: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/71.jpg)
Simple Approximation
- Initially complex QP packages were used.
- Stochastic Gradient Ascent (sequentially update one weight at a time) gives excellent approximation in most cases
$$\alpha_i \leftarrow \alpha_i + \frac{1}{K(x_i, x_i)} \left( 1 - y_i \sum_j \alpha_j y_j K(x_i, x_j) \right)$$
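The one-weight-at-a-time update can be sketched as follows. This is an illustrative version under stated assumptions: it sweeps the points cyclically rather than in random order, clips each weight at zero to keep α_i ≥ 0, and ignores the bias term, so the toy data are chosen to be separable by a hyperplane through the origin. The function name `sequential_ascent` is hypothetical.

```python
import numpy as np

def sequential_ascent(K, y, epochs=50):
    """Cyclic one-weight-at-a-time ascent on the dual (no bias term)."""
    alpha = np.zeros(len(y))
    for _ in range(epochs):
        for i in range(len(y)):
            # y_i * sum_j alpha_j y_j K(x_i, x_j): current margin of point i
            margin = y[i] * np.sum(alpha * y * K[i])
            step = (1.0 - margin) / K[i, i]
            # Clipping at zero keeps alpha_i >= 0 (an added assumption).
            alpha[i] = max(0.0, alpha[i] + step)
    return alpha

# Hypothetical data, separable by a hyperplane through the origin.
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
K = X @ X.T                                  # linear kernel
alpha = sequential_ascent(K, y)
margins = y * ((alpha * y) @ K)              # y_i * f(x_i) on the training set
```

On this data only the third point ends up with a positive weight, and every training point reaches a margin of at least 1.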
![Page 72: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/72.jpg)
Full Solution: S.M.O.
- SMO: update two weights simultaneously
- Realizes gradient descent without leaving the linear constraint (J. Platt).
- Online versions exist (Li-Long; Gentile)
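The idea of updating two weights at once can be sketched with a single pair step. The code below follows the commonly described analytic two-variable update with box constraints 0 ≤ α ≤ C; it is a simplified illustration, not Platt's full algorithm (there is no pair-selection heuristic, bias update, or stopping rule), and the function name and data are hypothetical.

```python
import numpy as np

def smo_pair_step(alpha, y, K, i, j, C):
    """Jointly re-optimise alpha_i, alpha_j while keeping sum_k alpha_k y_k fixed."""
    f = (alpha * y) @ K                 # decision values; the bias cancels in E_i - E_j
    E_i, E_j = f[i] - y[i], f[j] - y[j]
    eta = K[i, i] + K[j, j] - 2.0 * K[i, j]
    if eta <= 0.0:
        return alpha                    # degenerate pair, skipped in this sketch
    # Feasible interval for alpha_j implied by 0 <= alpha <= C and the equality.
    if y[i] != y[j]:
        lo, hi = max(0.0, alpha[j] - alpha[i]), min(C, C + alpha[j] - alpha[i])
    else:
        lo, hi = max(0.0, alpha[i] + alpha[j] - C), min(C, alpha[i] + alpha[j])
    new = alpha.copy()
    new[j] = np.clip(alpha[j] + y[j] * (E_i - E_j) / eta, lo, hi)
    new[i] = alpha[i] + y[i] * y[j] * (alpha[j] - new[j])
    return new

X = np.array([[1.0, 0.0], [-1.0, 0.0]])   # hypothetical two-point problem
y = np.array([1.0, -1.0])
alpha = smo_pair_step(np.zeros(2), y, X @ X.T, i=0, j=1, C=10.0)
```

Because the change in one weight is compensated by the other, the sum of α_i y_i is the same before and after the step. That is what lets the method stay on the linear constraint.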
![Page 73: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/73.jpg)
Other “kernelized” Algorithms
- Adatron, nearest neighbour, Fisher discriminant, Bayes classifier, ridge regression, etc.
- Much work in past years has gone into designing kernel-based algorithms
- Now: more work on designing good kernels (for any algorithm)
![Page 74: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/74.jpg)
On Combining Kernels
- When is it advantageous to combine kernels?
- Too many features lead to overfitting in kernel methods as well
- Kernel combination needs to be based on principles
- Alignment
![Page 75: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/75.jpg)
Kernel Alignment
- Notion of similarity between kernels: Alignment (= similarity between Gram matrices)
$$A(K_1, K_2) = \frac{\langle K_1, K_2\rangle}{\sqrt{\langle K_1, K_1\rangle \langle K_2, K_2\rangle}}$$
IMPORTANT CONCEPT
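A minimal sketch of the alignment computation, assuming the inner product between Gram matrices is the Frobenius inner product (the sum of elementwise products). The data and the function name `alignment` are hypothetical.

```python
import numpy as np

def alignment(K1, K2):
    """A(K1, K2) = <K1, K2> / sqrt(<K1, K1> <K2, K2>), Frobenius inner product."""
    inner = lambda A, B: np.sum(A * B)
    return inner(K1, K2) / np.sqrt(inner(K1, K1) * inner(K2, K2))

# Hypothetical data: four points, two per class.
X = np.array([[1.0, 0.2], [0.8, -0.1], [-1.0, 0.3], [-0.9, -0.2]])
y = np.array([1.0, 1.0, -1.0, -1.0])
K_lin = X @ X.T                  # linear-kernel Gram matrix
K_labels = np.outer(y, y)        # Gram matrix built from the labels

a_self = alignment(K_lin, K_lin)        # a matrix is perfectly aligned with itself
a_target = alignment(K_lin, K_labels)   # how well the kernel matches the labels
```

The quantity behaves like a cosine between the two matrices viewed as vectors, so for positive semidefinite Gram matrices it lies between 0 and 1.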
![Page 76: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/76.jpg)
Many interpretations
- As measure of clustering in data
- As correlation coefficient between ‘oracles’
- Basic idea: the ‘ultimate’ kernel should be YY’, that is, it should be given by the labels vector (after all: the target is the only relevant feature!)
![Page 77: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/77.jpg)
The ideal kernel
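$$YY' = \begin{pmatrix} 1 & 1 & \cdots & -1 & -1 \\ 1 & 1 & \cdots & -1 & -1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ -1 & -1 & \cdots & 1 & 1 \\ -1 & -1 & \cdots & 1 & 1 \end{pmatrix}$$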
![Page 78: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/78.jpg)
Combining Kernels
- Alignment is increased by combining kernels that are aligned to the target and not aligned to each other.
$$A(K_1, YY') = \frac{\langle K_1, YY'\rangle}{\sqrt{\langle K_1, K_1\rangle \langle YY', YY'\rangle}}$$
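The effect of combining kernels can be illustrated with a small hand-built example. The two rank-one kernels below are hypothetical: each is partly aligned with the target YY' and they have zero alignment with each other, and the combination used here is a plain sum, which is an assumption for illustration.

```python
import numpy as np

def alignment(K1, K2):
    """Frobenius-inner-product alignment between two Gram matrices."""
    return np.sum(K1 * K2) / np.sqrt(np.sum(K1 * K1) * np.sum(K2 * K2))

y = np.array([1.0, 1.0, -1.0, -1.0])
target = np.outer(y, y)                      # YY'

# Two hypothetical rank-one kernels from orthogonal features:
# each sees only one class, so they are unaligned with each other.
v1 = np.array([1.0, 1.0, 0.0, 0.0])
v2 = np.array([0.0, 0.0, -1.0, -1.0])
K1, K2 = np.outer(v1, v1), np.outer(v2, v2)

a1 = alignment(K1, target)            # 0.5
a2 = alignment(K2, target)            # 0.5
a12 = alignment(K1, K2)               # 0.0
a_sum = alignment(K1 + K2, target)    # 1 / sqrt(2), about 0.707
```

Each kernel alone reaches an alignment of 0.5 with the target, while their sum reaches about 0.707, because the cross term between the two kernels vanishes.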
![Page 79: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/79.jpg)
Spectral Machines
- Can (approximately) maximize the alignment of a set of labels to a given kernel
- By solving this problem:
$$y = \arg\max_{y_i \in \{-1, +1\}} \frac{y' K y}{y' y}$$
- Approximated by principal eigenvector (thresholded) (see Courant-Hilbert theorem)
![Page 80: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/80.jpg)
Courant-Hilbert theorem
- A: symmetric and positive definite
- Principal eigenvalue / eigenvector characterized by:
$$\lambda = \max_v \frac{v' A v}{v' v}$$
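The characterization above, together with the thresholding idea from the Spectral Machines slide, can be sketched with NumPy's symmetric eigensolver. The kernel matrix here is a hypothetical one built close to YY', and thresholding at zero is an assumed choice of threshold.

```python
import numpy as np

y_true = np.array([1.0, 1.0, -1.0, -1.0, 1.0])
K = np.outer(y_true, y_true) + 0.5 * np.eye(5)   # hypothetical kernel close to YY'

eigvals, eigvecs = np.linalg.eigh(K)      # eigenvalues in ascending order
lam, v = eigvals[-1], eigvecs[:, -1]      # principal eigenpair
rayleigh = (v @ K @ v) / (v @ v)          # attains the maximum, equals lam

labels = np.where(v >= 0.0, 1.0, -1.0)    # threshold the eigenvector at zero
```

The sign of an eigenvector is arbitrary, so the recovered `labels` match `y_true` only up to a global flip. That is sufficient for clustering the points into two groups.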
![Page 81: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/81.jpg)
Optimizing Kernel Alignment
- One can either adapt the kernel to the labels or vice versa
- In the first case: model selection method
- Second case: clustering / transduction method
![Page 82: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/82.jpg)
Applications of SVMs
- Bioinformatics
- Machine Vision
- Text Categorization
- Handwritten Character Recognition
- Time series analysis
![Page 83: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/83.jpg)
Text Kernels
- Joachims (bag of words)
- Latent semantic kernels (ICML 2001)
- String matching kernels
- …
- See KerMIT project …
![Page 84: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/84.jpg)
Bioinformatics
- Gene Expression
- Protein sequences
- Phylogenetic Information
- Promoters
- …
![Page 85: Support Vector and Kernel Machines - University of Rochesterstefanko/Teaching/09CS446/... · 2009-01-08 · A Little History z SVMs introduced in COLT-92 by Boser, Guyon, Vapnik.](https://reader034.fdocuments.us/reader034/viewer/2022042219/5ec4dd0de891f3349e4e6b24/html5/thumbnails/85.jpg)
Conclusions:
- Much more than just a replacement for neural networks.
- General and rich class of pattern recognition methods
- Book on SVMs: www.support-vector.net
- Kernel machines website: www.kernel-machines.org
- www.NeuroCOLT.org