
1

3.6 Support Vector Machines

K. M. Koo

2

Goal of SVM: Find Maximum Margin

goal: find a separating hyperplane with maximum margin

margin: the minimum distance between a separating hyperplane and the point sets of $\omega_1$ or $\omega_2$

3

Goal of SVM: Find Maximum Margin

assume that $\omega_1, \omega_2$ are linearly separable

find the separating hyperplane with maximum margin

[Figure: two linearly separable classes $\omega_1$ and $\omega_2$, a separating hyperplane, and its margin]

4

Calculate margin

for a separating hyperplane $g(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + w_0 = 0$, the parameters $\mathbf{w}$ and $w_0$ are not uniquely determined

under the constraint $\min_{\mathbf{x}} |\mathbf{w}^T\mathbf{x} + w_0| = 1$, $\mathbf{w}$ and $w_0$ are uniquely determined

5

Calculate margin

the distance between a point $\mathbf{x}$ and the hyperplane $g(\mathbf{x}) = 0$ is given by

$$ d(\mathbf{x}) = \frac{|g(\mathbf{x})|}{\|\mathbf{w}\|} = \frac{|\mathbf{w}^T\mathbf{x} + w_0|}{\|\mathbf{w}\|} $$

thus, the margin is given by

$$ \min_{\mathbf{x}} \frac{|\mathbf{w}^T\mathbf{x} + w_0|}{\|\mathbf{w}\|} = \frac{1}{\|\mathbf{w}\|} $$

on each side of the hyperplane, i.e. $2/\|\mathbf{w}\|$ in total
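As a quick illustration (added here, not part of the original slides), a minimal Python sketch that computes this distance and the resulting margin for a given hyperplane; the names `w`, `w0`, `X` are placeholders chosen for this example:

```python
import numpy as np

def margin(w, w0, X):
    """Geometric margin: minimum distance from the points in X
    to the hyperplane w^T x + w0 = 0."""
    w = np.asarray(w, dtype=float)
    d = np.abs(X @ w + w0) / np.linalg.norm(w)  # |g(x)| / ||w||
    return d.min()

# toy usage: hyperplane x1 = 0 separating points at x1 = +/-1
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
print(margin([1.0, 0.0], 0.0, X))  # -> 1.0, so total margin 2/||w|| = 2.0
```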

6

Optimization of margin

maximization of the margin

$$ \frac{1}{\|\mathbf{w}\|} + \frac{1}{\|\mathbf{w}\|} = \frac{2}{\|\mathbf{w}\|} $$

requiring that

$$ \mathbf{w}^T\mathbf{x} + w_0 \ge 1, \quad \forall \mathbf{x} \in \omega_1 $$

$$ \mathbf{w}^T\mathbf{x} + w_0 \le -1, \quad \forall \mathbf{x} \in \omega_2 $$

7

Optimization of margin

therefore, we want to

$$ \text{minimize } \; J(\mathbf{w}) = \frac{1}{2}\|\mathbf{w}\|^2 $$

$$ \text{subject to } \; y_i(\mathbf{w}^T\mathbf{x}_i + w_0) \ge 1, \quad i = 1, 2, \ldots, N $$

separating hyperplane with maximal margin $\Leftrightarrow$ separating hyperplane with minimum $\|\mathbf{w}\|$

This is an optimization problem with inequality constraints
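The slides do not include code; as a hedged illustration of this primal problem, here is a small Python sketch using scipy's general-purpose SLSQP solver (a convenient choice for a toy set, not the method proposed later in the slides):

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data; labels y_i in {+1, -1}.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Optimization variable theta = [w_1, w_2, w_0].
def J(theta):
    w = theta[:-1]
    return 0.5 * w @ w                      # (1/2) ||w||^2

def constraints(theta):
    w, w0 = theta[:-1], theta[-1]
    return y * (X @ w + w0) - 1.0           # y_i (w^T x_i + w_0) - 1 >= 0

res = minimize(J, x0=np.zeros(X.shape[1] + 1), method="SLSQP",
               constraints=[{"type": "ineq", "fun": constraints}])
w, w0 = res.x[:-1], res.x[-1]
print(w, w0)   # expect w close to [1, 0] and w0 close to 0 for this toy set
```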

8

optimization with constraints

[Figure: contours of a cost function $J(\boldsymbol{\theta})$ and its constrained minimum; one panel for optimization with equality constraints, one for optimization with inequality constraints]

9

Lagrange Multiplier

an optimization problem under constraints can be solved by the method of Lagrange multipliers

let $f, g : U \to \mathbb{R}$ be real-valued functions on $U \subset \mathbb{R}^n$, let $\mathbf{x}_0 \in U$ and $c = g(\mathbf{x}_0)$, and let $S = g^{-1}(c)$, the level set of $g$ with value $c$. assume $\nabla g(\mathbf{x}_0) \neq 0$. if $f|_S$ has a local minimum or maximum on $S$ at $\mathbf{x}_0$, which is called a critical point of $f|_S$, then there is a real number $\lambda$, called a Lagrange multiplier, such that

$$ \nabla f(\mathbf{x}_0) = \lambda \nabla g(\mathbf{x}_0), \quad \text{equivalently } \nabla_{\mathbf{x}} L(\mathbf{x}_0, \lambda) = 0 $$
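As a small worked illustration (added here, not on the original slide): minimize $f(x, y) = x^2 + y^2$ subject to $g(x, y) = x + y = 1$.

$$ \nabla f = \lambda \nabla g \;\Rightarrow\; (2x, 2y) = \lambda(1, 1) \;\Rightarrow\; x = y = \tfrac{\lambda}{2} $$

$$ x + y = 1 \;\Rightarrow\; \lambda = 1, \quad x = y = \tfrac{1}{2}, \quad f_{\min} = \tfrac{1}{2} $$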

10

The Method of Lagrange Multiplier

$$ \text{minimize } \; J(\boldsymbol{\theta}) \quad \text{subject to } \; \mathbf{a}^T\boldsymbol{\theta} = b $$

at the constrained minimum $\boldsymbol{\theta}^*$, the gradient of $J$ is parallel to $\mathbf{a}$:

$$ \nabla J(\boldsymbol{\theta}^*) = \lambda \mathbf{a} $$

[Figure: level sets $J(\boldsymbol{\theta}) = c_1, c_2, c_3$ and the constraint line $\mathbf{a}^T\boldsymbol{\theta} = b$; the solution lies where a level set is tangent to the constraint]

11

Lagrange Multiplier

the Lagrangian is obtained as follows: for equality constraints $\mathbf{a}_i^T\boldsymbol{\theta} = b_i$,

$$ L(\boldsymbol{\theta}, \boldsymbol{\lambda}) = J(\boldsymbol{\theta}) - \sum_{i=1}^{m} \lambda_i (\mathbf{a}_i^T\boldsymbol{\theta} - b_i) $$

for inequality constraints $\mathbf{a}_i^T\boldsymbol{\theta} \ge b_i$,

$$ L(\boldsymbol{\theta}, \boldsymbol{\lambda}) = J(\boldsymbol{\theta}) - \sum_{i=1}^{m} \lambda_i (\mathbf{a}_i^T\boldsymbol{\theta} - b_i), \quad \lambda_i \ge 0, \; i = 1, \ldots, m $$

in our case (inequality constraints)

$$ L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^{N} \lambda_i \left[ y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1 \right], \quad \lambda_i \ge 0, \; i = 1, 2, \ldots, N $$

12

Convex

a subset $C \subseteq X$ is convex iff for any $\mathbf{x}, \mathbf{y} \in C$, the line segment joining $\mathbf{x}$ and $\mathbf{y}$ is also a subset of $C$, i.e. for any $\lambda \in [0, 1]$, $\lambda\mathbf{x} + (1 - \lambda)\mathbf{y} \in C$

a real-valued function $f$ on $C$ is convex iff for any two points $\mathbf{x}, \mathbf{y} \in C$ and for any $\lambda \in [0, 1]$,

$$ f(\lambda\mathbf{x} + (1 - \lambda)\mathbf{y}) \le \lambda f(\mathbf{x}) + (1 - \lambda) f(\mathbf{y}) $$

13

Convex

[Figure: examples of a convex set and a concave (non-convex) set; graphs of a convex function, a concave function, and a function that is neither convex nor concave]

14

Convex Optimization

an optimization problem is said to be convex iff the cost function as well as the constraints are convex; the optimization problem for SVM is convex

the solution to a convex problem, if it exists, is unique; that is, there is no local optimum

for a convex optimization problem, the KKT (Karush-Kuhn-Tucker) conditions are necessary and sufficient for the solution

15

KKT (Karush-Kuhn-Tucker) condition

1. The gradient of the Lagrangian with respect to the original variables is 0

$$ \frac{\partial L(\mathbf{w}, w_0, \boldsymbol{\lambda})}{\partial \mathbf{w}} = 0, \qquad \frac{\partial L(\mathbf{w}, w_0, \boldsymbol{\lambda})}{\partial w_0} = 0 $$

2. The original constraints are satisfied

3. The multipliers for the inequality constraints are nonnegative: $\lambda_i \ge 0, \; i = 1, 2, \ldots, N$

4. (Complementary KKT condition) the product of each multiplier and its constraint equals 0:

$$ \lambda_i \left[ y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1 \right] = 0, \quad i = 1, 2, \ldots, N $$

for convex optimization problems, conditions 1-4 are necessary and sufficient for the solution

16

KKT condition for the optimization of margin

recall

$$ \text{minimize } \; J(\mathbf{w}) = \frac{1}{2}\|\mathbf{w}\|^2 \quad \text{subject to } \; y_i(\mathbf{w}^T\mathbf{x}_i + w_0) \ge 1, \; i = 1, 2, \ldots, N $$

the Lagrangian is

$$ L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = \frac{1}{2}\mathbf{w}^T\mathbf{w} - \sum_{i=1}^{N} \lambda_i \left[ y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1 \right] \qquad (3.66) $$

KKT condition

$$ \frac{\partial}{\partial \mathbf{w}} L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = 0 \qquad (3.62) $$

$$ \frac{\partial}{\partial w_0} L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = 0 \qquad (3.63) $$

$$ \lambda_i \ge 0, \quad i = 1, 2, \ldots, N \qquad (3.64) $$

$$ \lambda_i \left[ y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1 \right] = 0, \quad i = 1, 2, \ldots, N \qquad (3.65) $$

17

KKT condition for the optimization of margin

Combining (3.66) with (3.62) and (3.63) gives

$$ \mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i \qquad (3.67) $$

$$ \sum_{i=1}^{N} \lambda_i y_i = 0 \qquad (3.68) $$

18

Remarks-support vector

the optimal $\mathbf{w}$ is a linear combination of the $N_s \le N$ feature vectors $\mathbf{x}_i$ which are associated with $\lambda_i \neq 0$:

$$ \mathbf{w} = \sum_{i=1}^{N_s} \lambda_i y_i \mathbf{x}_i $$

these vectors are the support vectors; they are associated with

$$ \lambda_i \neq 0 \;\Rightarrow\; y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1 = 0 $$

19

Remarks-support vector

$\lambda_i = 0$: non-support vector

$\lambda_i \neq 0$: support vector; support vectors lie on either of the two hyperplanes

$$ \mathbf{w}^T\mathbf{x} + w_0 = 1, \qquad \mathbf{w}^T\mathbf{x} + w_0 = -1 $$

i.e. they are the training vectors closest to the separating hyperplane $\mathbf{w}^T\mathbf{x} + w_0 = 0$

The resulting hyperplane classifier is insensitive to the number and position of non-support vectors

20

Remark-computation of w0

$w_0$ can be implicitly obtained from any one of the conditions satisfying strict complementarity (i.e. $\lambda_i \neq 0$):

$$ \lambda_i \left[ y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1 \right] = 0 $$

In practice, $w_0$ is computed as an average of the values obtained from all conditions of this type
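A small Python sketch (added for illustration; `lam`, `X`, `y` are assumed to hold the optimal multipliers, training vectors, and labels in {+1, -1}) showing how $\mathbf{w}$ and $w_0$ could be recovered once the $\lambda_i$ are known:

```python
import numpy as np

def recover_hyperplane(lam, X, y, tol=1e-8):
    """Recover (w, w0) from optimal multipliers lam, data X, labels y in {+1, -1}."""
    w = (lam * y) @ X                 # w = sum_i lambda_i y_i x_i   (3.67)
    sv = lam > tol                    # support vectors: lambda_i != 0
    # For each support vector, y_i (w^T x_i + w0) = 1  =>  w0 = y_i - w^T x_i;
    # average over all support vectors for numerical stability.
    w0 = np.mean(y[sv] - X[sv] @ w)
    return w, w0
```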

21

Remark-optimal hyperplane is unique

the optimal hyperplane classifier of a support vector machine is unique under two conditions: the cost function is convex, and the inequality constraints consist of linear functions (so the constraints are convex)

an optimization problem is said to be convex iff the target (or cost) function as well as the constraints are convex (the optimization problem for SVM is convex)

the solution to a convex problem, if it exists, is unique; that is, there is no local optimum

22

Computation of the optimal Lagrange multipliers

the optimization problem belongs to the convex programming family of problems (convex optimization problems)

it can be solved by considering the so-called Lagrangian duality and can be stated equivalently by its Wolfe dual representation form

Lagrangian duality:

$$ \min_{\boldsymbol{\theta}} J(\boldsymbol{\theta}) = \min_{\boldsymbol{\theta}} \max_{\boldsymbol{\lambda} \ge 0} L(\boldsymbol{\theta}, \boldsymbol{\lambda}) = \max_{\boldsymbol{\lambda} \ge 0} \min_{\boldsymbol{\theta}} L(\boldsymbol{\theta}, \boldsymbol{\lambda}) $$

Wolfe dual representation:

$$ \max_{\boldsymbol{\lambda} \ge 0} L(\boldsymbol{\theta}, \boldsymbol{\lambda}) \quad \text{subject to } \; \nabla_{\boldsymbol{\theta}} L(\boldsymbol{\theta}, \boldsymbol{\lambda}) = 0 $$

23

Wolfe dual representation form

$$ \text{maximize } \; L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = \frac{1}{2}\mathbf{w}^T\mathbf{w} - \sum_{i=1}^{N} \lambda_i \left[ y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1 \right] $$

$$ \text{subject to } \; \mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i, \qquad \sum_{i=1}^{N} \lambda_i y_i = 0, \qquad \boldsymbol{\lambda} \ge \mathbf{0} $$

24

Computation of the optimal Lagrange multipliers

once the optimal Lagrange multipliers have been computed, the optimal hyperplane is obtained; substituting the constraints into the Lagrangian, the dual problem becomes

$$ \max_{\boldsymbol{\lambda}} \left[ \sum_{i=1}^{N} \lambda_i - \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j y_i y_j \mathbf{x}_i^T \mathbf{x}_j \right] \qquad (3.75) $$

$$ \text{subject to } \; \sum_{i=1}^{N} \lambda_i y_i = 0, \quad \boldsymbol{\lambda} \ge \mathbf{0} \qquad (3.76) $$
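For concreteness, here is a minimal Python sketch (illustrative only, not from the slides) that solves the dual (3.75)-(3.76) with a general-purpose solver; in practice a dedicated QP solver would normally be preferred:

```python
import numpy as np
from scipy.optimize import minimize

def solve_dual(X, y):
    """Solve max_l sum(l) - 0.5 l^T H l  s.t.  l >= 0, y^T l = 0,
    where H_ij = y_i y_j x_i^T x_j."""
    N = X.shape[0]
    H = (y[:, None] * X) @ (y[:, None] * X).T          # H_ij = y_i y_j x_i^T x_j
    neg_dual = lambda lam: 0.5 * lam @ H @ lam - lam.sum()
    res = minimize(neg_dual, x0=np.ones(N) / N, method="SLSQP",
                   bounds=[(0.0, None)] * N,
                   constraints=[{"type": "eq", "fun": lambda lam: lam @ y}])
    return res.x

# toy usage on the four-point example used later in the slides
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
lam = solve_dual(X, y)
w = (lam * y) @ X                                      # w = sum_i lambda_i y_i x_i
print(lam, w)                                          # w should be close to [1, 0]
```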

25

Remarks

the cost function does not depend explicitly on the dimensionality of the input space; this allows for efficient generalizations in the case of nonlinearly separable classes

although the resulting optimal hyperplane is unique, there is no such guarantee for the Lagrange multipliers

26

Simple example

consider the two-class classification task that consists of the following points:

$$ \omega_1: \; [1, 1]^T, \; [1, -1]^T \qquad \omega_2: \; [-1, 1]^T, \; [-1, -1]^T $$

its Lagrangian function, with $\mathbf{w} = [w_1, w_2]^T$, is

$$ L(\mathbf{w}, w_0, \boldsymbol{\lambda}) = \frac{1}{2}\|\mathbf{w}\|^2 - \lambda_1 (w_1 + w_2 + w_0 - 1) - \lambda_2 (w_1 - w_2 + w_0 - 1) - \lambda_3 (w_1 - w_2 - w_0 - 1) - \lambda_4 (w_1 + w_2 - w_0 - 1) $$

KKT condition

$$ \frac{\partial L}{\partial w_1} = w_1 - (\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4) = 0 $$

$$ \frac{\partial L}{\partial w_2} = w_2 - (\lambda_1 - \lambda_2 - \lambda_3 + \lambda_4) = 0 $$

$$ \frac{\partial L}{\partial w_0} = 0 \;\Rightarrow\; \lambda_1 + \lambda_2 - \lambda_3 - \lambda_4 = 0 $$

$$ \lambda_i \ge 0, \quad \lambda_i \left[ y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1 \right] = 0, \quad i = 1, 2, 3, 4 $$

27

Simple example - Lagrangian duality

the Wolfe dual (optimization with the equality constraint):

$$ \max_{\boldsymbol{\lambda} \ge \mathbf{0}} \left[ \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 - \frac{1}{2}\left( (\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4)^2 + (\lambda_1 - \lambda_2 - \lambda_3 + \lambda_4)^2 \right) \right] $$

$$ \text{subject to } \; \lambda_1 + \lambda_2 - \lambda_3 - \lambda_4 = 0 $$

result: more than one solution for $\boldsymbol{\lambda}$ (e.g. $\boldsymbol{\lambda} = (\tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{4})$ is one of them), but all of them give the same hyperplane

$$ \mathbf{w} = [1, 0]^T, \quad w_0 = 0 $$
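A short numerical check (added for illustration; the two multiplier vectors below are hand-picked admissible maximizers) that different multiplier solutions give the same hyperplane $\mathbf{w} = [1, 0]^T$, $w_0 = 0$:

```python
import numpy as np

X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# two different multiplier vectors, both feasible and both maximizing the dual
for lam in (np.array([0.25, 0.25, 0.25, 0.25]),
            np.array([0.50, 0.00, 0.50, 0.00])):
    w = (lam * y) @ X                       # w = sum_i lambda_i y_i x_i
    w0 = np.mean(y[lam > 0] - X[lam > 0] @ w)
    print(lam, "->", w, w0)                 # both print w = [1. 0.], w0 = 0.0
```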

28

SVM for Non-separable Classes

in the non-separable case, the training feature vectors belong to one of the following three categories:

1. vectors that fall outside the band and are correctly classified: $y_i(\mathbf{w}^T\mathbf{x} + w_0) \ge 1$

2. vectors that fall inside the band and are correctly classified: $0 \le y_i(\mathbf{w}^T\mathbf{x} + w_0) < 1$

3. vectors that are misclassified: $y_i(\mathbf{w}^T\mathbf{x} + w_0) < 0$

[Figure: separating hyperplane $\mathbf{w}^T\mathbf{x} + w_0 = 0$ and the band defined by $\mathbf{w}^T\mathbf{x} + w_0 = \pm 1$]

29

SVM for Non-separable Classes

all three cases can be treated under a single type of constraint by introducing slack variables $\xi_i$:

$$ y_i\left[ \mathbf{w}^T\mathbf{x}_i + w_0 \right] \ge 1 - \xi_i $$

first category: $\xi_i = 0$, second category: $0 < \xi_i \le 1$, third category: $\xi_i > 1$

30

SVM for Non-separable Classes

the goal is to make the margin as large as possible while keeping the number of points with $\xi_i > 0$ as small as possible:

$$ J(\mathbf{w}, w_0, \boldsymbol{\xi}) = \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{N} I(\xi_i), \qquad I(\xi_i) = \begin{cases} 1, & \xi_i > 0 \\ 0, & \xi_i = 0 \end{cases} \qquad (3.79) $$

(3.79) is intractable because of the discontinuous function $I(\cdot)$

31

SVM for Non-separable Classes

as is common in such cases, we choose to optimize a closely related cost function:

$$ \text{minimize } \; J(\mathbf{w}, w_0, \boldsymbol{\xi}) = \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{N} \xi_i $$

$$ \text{subject to } \; y_i\left[ \mathbf{w}^T\mathbf{x}_i + w_0 \right] \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, N $$
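As a practical aside (not part of the slides), this soft-margin formulation is what standard libraries solve; a minimal sketch with scikit-learn's linear-kernel SVC, where the toy data and the value C = 1.0 are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.svm import SVC

# small, slightly overlapping two-class toy set; labels in {+1, -1}
X = np.array([[1.0, 1.0], [1.2, -0.5], [0.3, 0.2],
              [-1.0, 1.0], [-1.1, -0.8], [-0.2, -0.1]])
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1.0)   # C weights the slack term C * sum(xi_i)
clf.fit(X, y)

print(clf.coef_, clf.intercept_)    # the resulting w and w0
print(clf.support_)                 # indices of the support vectors
```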

32

SVM for Non-separable Classes

the corresponding Lagrangian is

$$ L(\mathbf{w}, w_0, \boldsymbol{\xi}, \boldsymbol{\lambda}, \boldsymbol{\mu}) = \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{N} \xi_i - \sum_{i=1}^{N} \mu_i \xi_i - \sum_{i=1}^{N} \lambda_i \left[ y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1 + \xi_i \right] $$

33

SVM for Non-separable Classes

The corresponding KKT conditions are

$$ \frac{\partial L}{\partial \mathbf{w}} = 0 \;\Rightarrow\; \mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i \qquad (3.85) $$

$$ \frac{\partial L}{\partial w_0} = 0 \;\Rightarrow\; \sum_{i=1}^{N} \lambda_i y_i = 0 \qquad (3.86) $$

$$ \frac{\partial L}{\partial \xi_i} = 0 \;\Rightarrow\; C - \mu_i - \lambda_i = 0, \quad i = 1, 2, \ldots, N \qquad (3.87) $$

$$ \lambda_i \left[ y_i(\mathbf{w}^T\mathbf{x}_i + w_0) - 1 + \xi_i \right] = 0, \quad i = 1, 2, \ldots, N \qquad (3.88) $$

$$ \mu_i \xi_i = 0, \quad i = 1, 2, \ldots, N \qquad (3.89) $$

$$ \mu_i \ge 0, \; \lambda_i \ge 0, \quad i = 1, 2, \ldots, N \qquad (3.90) $$

34

SVM for Non-separable Classes

The associated Wolfe dual representation now becomes

$$ \text{maximize } \; L(\mathbf{w}, w_0, \boldsymbol{\lambda}, \boldsymbol{\xi}, \boldsymbol{\mu}) $$

$$ \text{subject to } \; \mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i, \quad \sum_{i=1}^{N} \lambda_i y_i = 0, \quad C - \mu_i - \lambda_i = 0, \quad \lambda_i \ge 0, \; \mu_i \ge 0, \quad i = 1, 2, \ldots, N $$

35

SVM for Non-separable Classes

equivalent to

$$ \max_{\boldsymbol{\lambda}} \left[ \sum_{i=1}^{N} \lambda_i - \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j y_i y_j \mathbf{x}_i^T \mathbf{x}_j \right] $$

$$ \text{subject to } \; 0 \le \lambda_i \le C, \quad i = 1, 2, \ldots, N, \qquad \sum_{i=1}^{N} \lambda_i y_i = 0 $$

36

Remarks - differences from the linearly separable case

the Lagrange multipliers $\lambda_i$ need to be bounded above by $C$

the slack variables $\xi_i$ and their associated Lagrange multipliers $\mu_i$ do not enter the problem explicitly; they are reflected indirectly through $C$

37

Remarks - M-class problem

SVM for the M-class problem: design M separating hyperplanes (discriminant functions) $g_i(\mathbf{x})$ so that each separates class $\omega_i$ from all the others, i.e. $g_i(\mathbf{x}) > 0$ for $\mathbf{x} \in \omega_i$ and $g_i(\mathbf{x}) < 0$ otherwise

assign $\mathbf{x}$ to $\omega_i$ if

$$ i = \arg\max_{k} \, g_k(\mathbf{x}) $$
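A minimal sketch (added for illustration; a one-vs-rest scheme, with the function and variable names chosen here, not prescribed by the slides beyond the argmax rule) of how the M trained linear discriminants could be combined:

```python
import numpy as np

def ovr_predict(W, w0, X):
    """One-vs-rest decision: g_i(x) = w_i^T x + w0_i, assign x to argmax_i g_i(x).

    W  : (M, d) array, one weight vector per class
    w0 : (M,)   array, one bias per class
    X  : (n, d) array of feature vectors
    """
    G = X @ W.T + w0          # G[n, i] = g_i(x_n)
    return np.argmax(G, axis=1)

# toy usage with M = 3 hypothetical hyperplanes in 2-D
W = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])
w0 = np.zeros(3)
print(ovr_predict(W, w0, np.array([[2.0, 0.5], [-1.5, 0.2], [0.1, 3.0]])))  # -> [0 1 2]
```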