From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass...
CS6355: Structured Prediction
From Binary to Multiclass Classification

We have seen binary classification
• We have seen linear models
• Learning algorithms
  – Perceptron
  – SVM
  – Logistic Regression
• Prediction is simple
  – Given an example $\mathbf{x}$, output $= \operatorname{sgn}(\mathbf{w}^T\mathbf{x})$
  – Output is a single bit

What if we have more than two labels?

Reading for next lecture:
Erin L. Allwein, Robert E. Schapire, Yoram Singer, Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers, ICML 2000.

Multiclass classification
• Introduction
• Combining binary classifiers
  – One-vs-all
  – All-vs-all
  – Error correcting codes
• Training a single classifier
  – Multiclass SVM
  – Constraint classification

What is multiclass classification?
• An input can belong to one of K classes
• Training data: examples associated with a class label (a number from 1 to K)
• Prediction: Given a new input, predict the class label
Each input belongs to exactly one class. No more, no less.
• Otherwise, the problem is not multiclass classification
• If an input can be assigned multiple labels (think tags for emails rather than folders), it is called multi-label classification

Example applications: Images
  – Input: hand-written character; Output: which character?
  – Input: a photograph of an object; Output: which of a set of categories of objects is it?
    • E.g.: the Caltech 256 dataset
[Figure: several hand-written characters that all map to the letter A; sample photographs labeled car tire, car tire, duck, laptop]

Example applications: Language
• Input: a news article
• Output: which section of the newspaper should it be in?
• Input: an email
• Output: which folder should the email be placed into?
• Input: an audio command given to a car
• Output: which of a set of actions should be executed?

Where are we?
• Introduction
• Combining binary classifiers
  – One-vs-all
  – All-vs-all
  – Error correcting codes
• Training a single classifier
  – Multiclass SVM
  – Constraint classification

Binary to multiclass
• Can we use an algorithm for training binary classifiers to construct a multiclass classifier?
  – Answer: Decompose the prediction into multiple binary decisions
• How to decompose?
  – One-vs-all
  – All-vs-all
  – Error correcting codes

General setting
• Input $\mathbf{x} \in \Re^n$
  – The inputs are represented by their feature vectors
• Output $\mathbf{y} \in \{1, 2, \cdots, K\}$
  – These classes represent domain-specific labels
• Learning: Given a dataset $D = \{(\mathbf{x}_i, \mathbf{y}_i)\}$
  – Need a learning algorithm that uses D to construct a function that can map $\mathbf{x}$ to $\mathbf{y}$
  – Goal: find a predictor that does well on the training data and has low generalization error
• Prediction/Inference: Given an example $\mathbf{x}$ and the learned function, compute the class label for $\mathbf{x}$

1. One-vs-all classification
$\mathbf{x} \in \Re^n$, $\mathbf{y} \in \{1, 2, \cdots, K\}$
• Assumption: Each class is individually separable from all the others
• Learning: Given a dataset $D = \{(\mathbf{x}_i, \mathbf{y}_i)\}$
  – Decompose into K binary classification tasks
  – For class k, construct a binary classification task as:
    • Positive examples: Elements of D with label k
    • Negative examples: All other elements of D
  – Train K binary classifiers $\mathbf{w}_1, \mathbf{w}_2, \cdots, \mathbf{w}_K$ using any learning algorithm we have seen
• Prediction: "Winner Takes All": $\arg\max_i \mathbf{w}_i^T\mathbf{x}$
Question: What is the dimensionality of each $\mathbf{w}_i$?

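The one-vs-all recipe above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference implementation: it assumes NumPy, uses a plain perceptron as the binary learner (any binary learner would do), numbers the labels 0 to K-1 rather than 1 to K, and the helper names and toy data are made up for the example.

```python
import numpy as np

def train_perceptron(X, y_pm, epochs=20):
    """Train a binary perceptron; y_pm holds labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, y in zip(X, y_pm):
            if y * (w @ x) <= 0:        # mistake (or on the boundary): update
                w = w + y * x
    return w

def train_one_vs_all(X, y, num_classes, epochs=20):
    """Return a (K, n) matrix whose k-th row is the classifier for class k."""
    W = np.zeros((num_classes, X.shape[1]))
    for k in range(num_classes):
        y_pm = np.where(y == k, 1, -1)  # class k is positive, all others negative
        W[k] = train_perceptron(X, y_pm, epochs)
    return W

def predict_one_vs_all(W, X):
    """Winner takes all: pick the label whose classifier gives the highest score."""
    return np.argmax(X @ W.T, axis=1)

# Toy data: three well-separated clusters; the last feature is a constant bias term.
X = np.array([[ 5.0,  0.0, 1.0], [ 6.0,  1.0, 1.0],
              [-5.0,  0.0, 1.0], [-6.0, -1.0, 1.0],
              [ 0.0,  6.0, 1.0], [ 1.0,  7.0, 1.0]])
y = np.array([0, 0, 1, 1, 2, 2])

W = train_one_vs_all(X, y, num_classes=3)
predictions = predict_one_vs_all(W, X)
```

Each row of `W` has the same dimensionality as the input features, which is one way to check the answer to the question on the slide.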
Visualizing One-vs-all
From the full dataset, construct three binary classifiers, one for each class
• $\mathbf{w}_{\text{blue}}^T\mathbf{x} > 0$ for blue inputs
• $\mathbf{w}_{\text{red}}^T\mathbf{x} > 0$ for red inputs
• $\mathbf{w}_{\text{green}}^T\mathbf{x} > 0$ for green inputs
Notation: $\mathbf{w}_{\text{blue}}^T\mathbf{x}$ is the score for the blue label
Winner Take All will predict the right answer. Only the correct label will have a positive score

One-vs-all may not always work
Black points are not separable with a single binary classifier
The decomposition will not work for these cases!
• $\mathbf{w}_{\text{blue}}^T\mathbf{x} > 0$ for blue inputs
• $\mathbf{w}_{\text{red}}^T\mathbf{x} > 0$ for red inputs
• $\mathbf{w}_{\text{green}}^T\mathbf{x} > 0$ for green inputs
• Black inputs: ???

One-vs-all classification: Summary
• Easy to learn
  – Use any binary classifier learning algorithm
• Problems
  – No theoretical justification
  – Calibration issues
    • We are comparing scores produced by K classifiers trained independently. No reason for the scores to be in the same numerical range!
  – Might not always work
    • Yet, works fairly well in many cases, especially if the underlying binary classifiers are tuned, regularized

2. All-vs-all classification
Sometimes called one-vs-one
$\mathbf{x} \in \Re^n$, $\mathbf{y} \in \{1, 2, \cdots, K\}$
• Assumption: Every pair of classes is separable
• Learning: Given a dataset $D = \{(\mathbf{x}_i, \mathbf{y}_i)\}$,
  – For every pair of labels (j, k), create a binary classifier with:
    • Positive examples: All examples with label j
    • Negative examples: All examples with label k
  – Train $\binom{K}{2} = \frac{K(K-1)}{2}$ classifiers to separate every pair of labels from each other
• Prediction: More complex, each label can receive up to K-1 votes
  – How to combine the votes? Many methods
    • Majority: Pick the label with maximum votes
    • Organize a tournament between the labels

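The all-vs-all procedure with majority voting can be sketched as follows. This is an illustrative sketch under assumptions, not the lecture's own code: it uses NumPy and a simple perceptron as the binary learner, labels run from 0 to K-1, and ties in the vote are broken in favor of the smallest label (one arbitrary choice among the "many methods").

```python
import itertools
import numpy as np

def train_perceptron(X, y_pm, epochs=20):
    """Train a binary perceptron; y_pm holds labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, y in zip(X, y_pm):
            if y * (w @ x) <= 0:
                w = w + y * x
    return w

def train_all_vs_all(X, y, num_classes, epochs=20):
    """Train one classifier per unordered label pair (j, k) with j < k."""
    classifiers = {}
    for j, k in itertools.combinations(range(num_classes), 2):
        mask = (y == j) | (y == k)              # every other label is ignored
        y_pm = np.where(y[mask] == j, 1, -1)    # label j positive, label k negative
        classifiers[(j, k)] = train_perceptron(X[mask], y_pm, epochs)
    return classifiers

def predict_all_vs_all(classifiers, x, num_classes):
    """Majority vote; np.argmax breaks ties in favor of the smallest label."""
    votes = np.zeros(num_classes, dtype=int)
    for (j, k), w in classifiers.items():
        votes[j if w @ x > 0 else k] += 1
    return int(np.argmax(votes))

# Toy data: three well-separated clusters; the last feature is a constant bias term.
X = np.array([[ 5.0,  0.0, 1.0], [ 6.0,  1.0, 1.0],
              [-5.0,  0.0, 1.0], [-6.0, -1.0, 1.0],
              [ 0.0,  6.0, 1.0], [ 1.0,  7.0, 1.0]])
y = np.array([0, 0, 1, 1, 2, 2])

classifiers = train_all_vs_all(X, y, num_classes=3)
predictions = [predict_all_vs_all(classifiers, x, 3) for x in X]
```

With K = 3 this trains K(K-1)/2 = 3 classifiers, each on only the examples of its two labels.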
All-vs-all classification
• Every pair of labels is linearly separable here
  – When a pair of labels is considered, all others are ignored
• Problems
  1. $O(K^2)$ weight vectors to train and store
  2. Size of training set for a pair of labels could be very small, leading to overfitting of the binary classifiers
  3. Prediction is often ad hoc and might be unstable
     E.g.: What if two classes get the same number of votes? For a tournament, what is the sequence in which the labels compete?

3. Error correcting output codes (ECOC)
• Each binary classifier provides one bit of information
• With K labels, we only need $\log_2 K$ bits to represent the label
  – One-vs-all uses K bits (one per classifier)
  – All-vs-all uses $O(K^2)$ bits
• Can we get by with $O(\log K)$ classifiers?
  – Yes! Encode each label as a binary string
  – Or alternatively, if we do train more than $O(\log K)$ classifiers, can we use the redundancy to improve classification accuracy?

Using $\log_2 K$ classifiers
• Learning:
  – Represent each label by a bit string (i.e., its code)
  – Train one binary classifier for each bit
• Prediction:
  – Use the predictions from all the classifiers to create a $\log_2 K$ bit string that uniquely decides the output
• What could go wrong here?
  – Even if one of the classifiers makes a mistake, final prediction is wrong!
8 classes, code-length = 3
label#  Code
0       0 0 0
1       0 0 1
2       0 1 0
3       0 1 1
4       1 0 0
5       1 0 1
6       1 1 0
7       1 1 1
Example: For some example, if the three classifiers predict 0, 1 and 1, then the label is 3

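To make the fragility of the minimal code concrete, here is a small Python sketch of the 3-bit encoding in the table. The function names are illustrative only; the bit order (most significant bit first) follows the table.

```python
def label_to_code(label, num_bits):
    """Binary code of a label, most significant bit first."""
    return [(label >> b) & 1 for b in reversed(range(num_bits))]

def code_to_label(bits):
    """Read a predicted bit string back as a label."""
    label = 0
    for bit in bits:
        label = (label << 1) | bit
    return label

# The code table for 8 classes with code-length 3.
codes = {label: label_to_code(label, 3) for label in range(8)}

# The three classifiers predict 0, 1 and 1.
label = code_to_label([0, 1, 1])

# Suppose the first classifier had erred and predicted 1 instead of 0.
wrong_label = code_to_label([1, 1, 1])
```

Because every 3-bit string is the code of some label, a single wrong bit silently yields a different valid label (here 7 instead of 3); there is no redundancy left to detect the mistake.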
Error correcting output coding
Answer: Use redundancy
• Assign a binary string to each label
  – Could be random
  – Length of the code word $L \geq \log_2 K$ is a parameter
• Train one binary classifier for each bit
  – Effectively, split the data into random dichotomies
  – We need only $\log_2 K$ bits
    • Additional bits act as an error correcting code
8 classes, code-length = 5
#  Code
0  0 0 0 0 0
1  0 0 1 1 0
2  0 1 0 1 1
3  0 1 1 0 1
4  1 0 0 1 1
5  1 0 1 0 0
6  1 1 0 0 0
7  1 1 1 1 1

How to predict?
• Prediction
  – Run all L binary classifiers on the example
  – Gives us a predicted bit string of length L
  – Output = label whose code word is "closest" to the prediction
  – Closest defined using Hamming distance
    • Longer code length is better: it gives better error correction
• Example (using the same 8-class, length-5 code table as on the previous slide)
  – Suppose the binary classifiers here predict 11010
  – The closest label to this is 6, with code word 11000
One-vs-all is a special case of this scheme. How?

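The Hamming-distance decoding step can be sketched as below, using the 5-bit code table from the slides. NumPy and the function names are assumptions made for illustration; the classifiers that would produce the predicted bits are not shown.

```python
import numpy as np

# Row i is the code word of label i (8 classes, code-length 5).
CODE = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 1],
    [0, 1, 1, 0, 1],
    [1, 0, 0, 1, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
])

def bit_targets(y, code=CODE):
    """Binary training targets: column j holds the labels for the j-th classifier."""
    return code[np.asarray(y)]

def decode(predicted_bits, code=CODE):
    """Return the label whose code word is closest in Hamming distance."""
    distances = np.sum(code != np.asarray(predicted_bits), axis=1)
    return int(np.argmin(distances)), distances

# The five binary classifiers predict 11010.
label, distances = decode([1, 1, 0, 1, 0])
```

Here `distances` lists the Hamming distance from 11010 to every code word; label 6 (code word 11000) is the only one at distance 1. Note that `np.argmin` returns the first minimum, so ties would be broken toward the smaller label.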
Error correcting codes: Discussion
• Assumes that columns are independent
  – Otherwise, ineffective encoding
• Strong theoretical results that depend on code length
  – If minimal Hamming distance between two rows is d, then the prediction can correct up to $\lfloor (d-1)/2 \rfloor$ errors in the binary predictions
• Code assignment could be random, or designed for the dataset/task
• One-vs-all and all-vs-all are special cases
  – All-vs-all needs a ternary code (not binary)
Exercise: Convince yourself that this is correct

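The error-correction bound can be checked numerically for a given code matrix. The sketch below (NumPy assumed, helper names illustrative) computes the minimum Hamming distance d between rows and the guaranteed number of correctable errors, $\lfloor (d-1)/2 \rfloor$. It compares the 5-bit code from the slides, the identity code that corresponds to one-vs-all, and a hypothetical 4-class code that was chosen for this example to have d = 3.

```python
import itertools
import numpy as np

def min_hamming_distance(code):
    """Smallest Hamming distance between any two rows of the code matrix."""
    return min(int(np.sum(a != b)) for a, b in itertools.combinations(code, 2))

def guaranteed_corrections(code):
    """Number of wrong bits that nearest-code-word decoding always tolerates."""
    return (min_hamming_distance(code) - 1) // 2

# The 8-class, length-5 code from the slides.
slide_code = np.array([
    [0, 0, 0, 0, 0], [0, 0, 1, 1, 0], [0, 1, 0, 1, 1], [0, 1, 1, 0, 1],
    [1, 0, 0, 1, 1], [1, 0, 1, 0, 0], [1, 1, 0, 0, 0], [1, 1, 1, 1, 1],
])

# One-vs-all as a code: classifier k outputs 1 only for label k.
one_vs_all_code = np.eye(4, dtype=int)

# A hypothetical 4-class, length-5 code with larger row separation.
separated_code = np.array([
    [0, 0, 0, 0, 0], [0, 1, 0, 1, 1], [1, 0, 1, 0, 1], [1, 1, 1, 1, 0],
])

d_slide = min_hamming_distance(slide_code)
d_ova = min_hamming_distance(one_vs_all_code)
d_separated = min_hamming_distance(separated_code)
```

For the slide's code and for the one-vs-all identity code, d = 2, so the bound guarantees no corrections in the worst case, even though a particular prediction such as 11010 may still have a unique nearest code word. The hypothetical code with d = 3 is guaranteed to survive any single wrong bit.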
Decomposition methods: Summary
• General idea
  – Decompose the multiclass problem into many binary problems
  – We know how to train binary classifiers
  – Prediction depends on the decomposition
    • Constructs the multiclass label from the output of the binary classifiers
• Learning optimizes local correctness
  – Each binary classifier does not need to be globally correct
    • That is, the classifiers do not have to agree with each other
  – The learning algorithm is not even aware of the prediction procedure!
• Poor decomposition gives poor performance
  – Difficult local problems, can be "unnatural"
    • E.g.: For ECOC, why should the binary problems be separable?

Where are we?
• Introduction
• Combining binary classifiers
  – One-vs-all
  – All-vs-all
  – Error correcting codes
• Training a single classifier
  – Multiclass SVM
  – Constraint classification

Motivation
• Decomposition methods
  – Do not account for how the final predictor will be used
  – Do not optimize any global measure of correctness
• Goal: To train a multiclass classifier that is "global"

Recall: Margin for binary classifiers
The margin of a hyperplane for a dataset: the distance between the hyperplane and the data point nearest to it
[Figure: positive (+) and negative (-) points on either side of a hyperplane, with the margin with respect to this hyperplane marked]

Multiclass margin
Defined as the score difference between the highest scoring label and the second-highest one
[Figure: the score for a label, $\mathbf{w}_{\text{label}}^T\mathbf{x}$, shown for the labels Blue, Red, Green, and Black, with the multiclass margin marked]

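As a quick numerical illustration of this definition, the sketch below computes the multiclass margin for one input. NumPy is assumed, and the weight vectors and the input are hypothetical values picked for the example, not taken from the slides.

```python
import numpy as np

def multiclass_margin(W, x):
    """Difference between the highest and the second-highest label score."""
    scores = W @ x                    # one score w_label^T x per label
    top_two = np.sort(scores)[-2:]    # [second-highest, highest]
    return float(top_two[1] - top_two[0])

# Hypothetical weight vectors for the labels Blue, Red, Green, Black.
W = np.array([[ 1,  0],
              [ 0,  1],
              [-1,  0],
              [ 0, -1]])
x = np.array([3.0, 1.0])

scores = W @ x
margin = multiclass_margin(W, x)
```

With these values the scores are 3, 1, -3 and -1, so Blue wins and the margin is the gap of 2 between Blue and Red.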
Multiclass SVM (Intuition)
• Recall: Binary SVM
  – Maximize margin
  – Equivalently, minimize norm of weights such that the closest points to the hyperplane have a score ±1
• Multiclass SVM
  – Each label has a different weight vector (like one-vs-all)
  – Maximize multiclass margin
  – Equivalently, minimize total norm of the weights such that the true label is scored at least 1 more than the second best one

Multiclass SVM in the separable case
Recall hard binary SVM:
$$\min_{\mathbf{w}} \; \frac{1}{2}\mathbf{w}^T\mathbf{w} \quad \text{s.t.} \quad \forall i,\; y_i\,\mathbf{w}^T\mathbf{x}_i \geq 1$$
Multiclass version:
$$\min_{\mathbf{w}_1,\cdots,\mathbf{w}_K} \; \frac{1}{2}\sum_{k}\mathbf{w}_k^T\mathbf{w}_k \quad \text{s.t.} \quad \mathbf{w}_{\mathbf{y}_i}^T\mathbf{x}_i - \mathbf{w}_k^T\mathbf{x}_i \geq 1 \quad \forall (\mathbf{x}_i,\mathbf{y}_i)\in D,\; \forall k \neq \mathbf{y}_i$$
• Objective, $\text{Regularizer}(\mathbf{w}_1,\cdots,\mathbf{w}_K)$: size of the weights. Effectively, a regularizer
• Constraint, $\text{score}(\mathbf{y}_i) - \text{score}(k) \geq 1$: the score for the true label is higher than the score for any other label by at least 1
Problems with this?
• What if there is no set of weights that achieves this separation? That is, what if the data is not linearly separable?

Multiclass SVM: General case
$$\min_{\mathbf{w}_1,\cdots,\mathbf{w}_K,\,\xi} \; \frac{1}{2}\sum_{k}\mathbf{w}_k^T\mathbf{w}_k + C\sum_{(\mathbf{x}_i,\mathbf{y}_i)\in D}\xi_i$$
$$\text{s.t.} \quad \mathbf{w}_{\mathbf{y}_i}^T\mathbf{x}_i - \mathbf{w}_k^T\mathbf{x}_i \geq 1 - \xi_i \quad \forall (\mathbf{x}_i,\mathbf{y}_i)\in D,\; \forall k \neq \mathbf{y}_i; \qquad \xi_i \geq 0 \quad \forall i$$
• First term: size of the weights. Effectively, a regularizer
• First constraint: the score for the true label is higher than the score for any other label by at least $1 - \xi_i$
• $\xi_i$: slack variables. Not all examples need to satisfy the margin constraint.
• $\sum_i \xi_i$: total slack. Don't allow too many examples to violate the margin constraint
• $\xi_i \geq 0$: slack variables can only be non-negative

![Page 57: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/57.jpg)
MulticlassSVM:Generalcase
57
Solving
Isequivalenttosolving
min𝐰U,𝐰V,⋯,𝐰W
12X𝐰J
Y𝐰J + 𝐶 X max 0,max]^𝐲_
𝐰]Y𝐱J − 𝐰𝐲_
Y 𝐱J + 1�
(𝐱_,𝐲_)∈b
�
J
Why?
![Page 60: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/60.jpg)
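Multiclass SVM: General case
$$\min_{\mathbf{w}_1, \mathbf{w}_2, \cdots, \mathbf{w}_K} \frac{1}{2} \sum_{k} \mathbf{w}_k^T \mathbf{w}_k + C \sum_{(\mathbf{x}_i, \mathbf{y}_i) \in D} \max\left(0, \max_{k \neq \mathbf{y}_i} \left(\mathbf{w}_k^T \mathbf{x}_i - \mathbf{w}_{\mathbf{y}_i}^T \mathbf{x}_i + 1\right)\right)$$
• First term: size of the weights. Effectively, a regularizer
• Second term (the sum over examples): the multiclass hinge loss
• C: the tradeoff hyperparameter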
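To make the three annotated pieces concrete, here is a minimal sketch in plain Python that evaluates this objective for a given set of weight vectors. The function names (`dot`, `multiclass_hinge`, `svm_objective`), the 0-indexed labels, and the toy data are illustrative assumptions, not part of the lecture.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def multiclass_hinge(weights, x, y):
    """max(0, max_{k != y} (w_k.x - w_y.x + 1)) for a single example."""
    true_score = dot(weights[y], x)
    worst = max(dot(w, x) - true_score + 1.0
                for k, w in enumerate(weights) if k != y)
    return max(0.0, worst)


def svm_objective(weights, data, C):
    """Regularizer plus C times the total multiclass hinge loss."""
    regularizer = 0.5 * sum(dot(w, w) for w in weights)
    loss = sum(multiclass_hinge(weights, x, y) for x, y in data)
    return regularizer + C * loss


# Toy setup (assumed): K = 3 labels, inputs in two dimensions.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
data = [([2.0, 0.0], 0), ([0.0, 0.5], 1)]

print(multiclass_hinge(weights, *data[0]))  # margin satisfied, loss 0.0
print(multiclass_hinge(weights, *data[1]))  # margin violated, loss 0.5
print(svm_objective(weights, data, C=1.0))  # 2.0 + 1.0 * 0.5 = 2.5
```

The first example beats every other label by more than 1, so it contributes nothing; the second wins by only 0.5 and pays the remaining 0.5 as loss.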
![Page 61: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/61.jpg)
Multiclass SVM
• Generalizes the binary SVM algorithm
  – If we have only two classes, this reduces to the binary SVM (up to scale)
• Comes with similar generalization guarantees as the binary SVM
• Can be trained using different optimization methods
  – Stochastic sub-gradient descent can be generalized
    • Try as exercise
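One possible answer to the exercise, offered as a hedged sketch rather than the lecture's own solution. It assumes the per-example objective is $\frac{1}{2}\sum_k \mathbf{w}_k^T\mathbf{w}_k + C \cdot$ (hinge loss of that example), a constant learning rate, and 0-indexed labels; the function names and the toy data are hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def predict(weights, x):
    """Winner takes all: the label whose weight vector scores x highest."""
    return max(range(len(weights)), key=lambda k: dot(weights[k], x))


def sgd_multiclass_svm(data, num_labels, C=1.0, lr=0.1, epochs=50, seed=0):
    rng = random.Random(seed)
    n = len(data[0][0])
    weights = [[0.0] * n for _ in range(num_labels)]
    examples = list(data)
    for _ in range(epochs):
        rng.shuffle(examples)
        for x, y in examples:
            # Most violating wrong label under the current weights.
            k_star = max((k for k in range(num_labels) if k != y),
                         key=lambda k: dot(weights[k], x))
            violated = dot(weights[k_star], x) - dot(weights[y], x) + 1.0 > 0.0
            # Sub-gradient of the regularizer: shrink every weight vector.
            for w in weights:
                for j in range(n):
                    w[j] *= (1.0 - lr)
            # Sub-gradient of the hinge term, only if the margin is violated:
            # push the true label's score up and the violator's score down.
            if violated:
                for j in range(n):
                    weights[y][j] += lr * C * x[j]
                    weights[k_star][j] -= lr * C * x[j]
    return weights


# Toy data (assumed): three labels, each active on its own coordinate.
data = [([1.0, 0.0, 0.0], 0), ([2.0, 0.0, 0.0], 0),
        ([0.0, 1.0, 0.0], 1), ([0.0, 2.0, 0.0], 1),
        ([0.0, 0.0, 1.0], 2), ([0.0, 0.0, 2.0], 2)]
weights = sgd_multiclass_svm(data, num_labels=3)
predictions = [predict(weights, x) for x, _ in data]
print(predictions)
```

Compared with the binary case, the only change is that the hinge sub-gradient touches two weight vectors at once: the true label's and the highest-scoring wrong label's.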
![Page 62: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/62.jpg)
Multiclass SVM: Summary
• Training:
  – Optimize the SVM objective
• Prediction:
  – Winner takes all: $\arg\max_i \mathbf{w}_i^T \mathbf{x}$
• With K labels and inputs in $\mathbb{R}^n$, we have nK weights in all
  – Same as one-vs-all
  – But comes with guarantees!
Questions?
![Page 63: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/63.jpg)
Where are we?
• Introduction
• Combining binary classifiers
  – One-vs-all
  – All-vs-all
  – Error correcting codes
• Training a single classifier
  – Multiclass SVM
  – Constraint classification
![Page 64: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/64.jpg)
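Let us examine one-vs-all again
• Training:
  – Create K binary classifiers $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_K$
  – $\mathbf{w}_i$ separates class i from all others
• Prediction: $\arg\max_i \mathbf{w}_i^T \mathbf{x}$
• Observations:
  1. At training time, we require $\mathbf{w}_i^T \mathbf{x}$ to be positive for examples of class i.
  2. Really, all we need is for $\mathbf{w}_i^T \mathbf{x}$ to be more than all others
     The requirement of being positive is more strict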
![Page 66: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/66.jpg)
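Linear Separability with multiple classes
For examples with label i, we want $\mathbf{w}_i^T \mathbf{x} > \mathbf{w}_j^T \mathbf{x}$ for all j ≠ i
Rewrite inputs and weight vector
• Stack all weight vectors into an nK-dimensional vector
  $$\mathbf{w} = \left(\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_K\right)$$
• Define a feature vector for label i being associated to input x:
  $$\phi(\mathbf{x}, i) = \left(\mathbf{0}, \ldots, \mathbf{0}, \mathbf{x}, \mathbf{0}, \ldots, \mathbf{0}\right)$$
  x in the ith block, zeros everywhere else
This is called the Kesler construction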
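The construction can be sketched in a few lines of plain Python. The helper names `stack` and `phi`, the 0-indexed labels, and the example numbers are assumptions made for illustration. The point of the sketch is that the score of label i under the stacked vector equals the score under the original per-label weight vector.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def stack(weight_vectors):
    """Concatenate w_1, ..., w_K into one nK-dimensional vector."""
    return [value for w in weight_vectors for value in w]


def phi(x, label, num_labels):
    """x in the block for `label` (0-indexed), zeros everywhere else."""
    n = len(x)
    features = [0.0] * (n * num_labels)
    features[label * n:(label + 1) * n] = x
    return features


weight_vectors = [[1.0, 2.0], [0.5, -1.0], [-3.0, 0.0]]  # K = 3, n = 2
x = [2.0, 1.0]
w = stack(weight_vectors)

block_scores = [dot(w_i, x) for w_i in weight_vectors]
stacked_scores = [dot(w, phi(x, i, 3)) for i in range(3)]

print(phi(x, 1, 3))     # [0.0, 0.0, 2.0, 1.0, 0.0, 0.0]
print(block_scores)     # [4.0, 0.0, -6.0]
print(stacked_scores)   # same three scores
```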
![Page 67: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/67.jpg)
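Linear Separability with multiple classes
For examples with label i, we want $\mathbf{w}_i^T \mathbf{x} > \mathbf{w}_j^T \mathbf{x}$ for all j ≠ i
Equivalent requirement:
$$\mathbf{w}^T \phi(\mathbf{x}, i) > \mathbf{w}^T \phi(\mathbf{x}, j)$$
where $\phi(\mathbf{x}, i)$ has x in the ith block, zeros everywhere else
Or:
$$\mathbf{w}^T \left[\phi(\mathbf{x}, i) - \phi(\mathbf{x}, j)\right] > 0$$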
![Page 69: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/69.jpg)
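Linear Separability with multiple classes
For examples with label i, we want $\mathbf{w}_i^T \mathbf{x} > \mathbf{w}_j^T \mathbf{x}$ for all j ≠ i
Or equivalently:
$$\mathbf{w}^T \left[\phi(\mathbf{x}, i) - \phi(\mathbf{x}, j)\right] > 0$$
where $\phi(\mathbf{x}, i) - \phi(\mathbf{x}, j)$ has x in the ith block, −x in the jth block, and zeros everywhere else
That is, the following binary task in nK dimensions should be linearly separable.
For every example (x, i) in the dataset and all other labels j:
• Positive examples: $\phi(\mathbf{x}, i) - \phi(\mathbf{x}, j)$
• Negative examples: $\phi(\mathbf{x}, j) - \phi(\mathbf{x}, i)$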
![Page 72: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/72.jpg)
Constraint Classification
• Training:
  – Given a data set {(x, y)}, create a binary classification task
    • Positive examples: $\phi(\mathbf{x}, y) - \phi(\mathbf{x}, y')$
    • Negative examples: $\phi(\mathbf{x}, y') - \phi(\mathbf{x}, y)$
    for every example, for every y' ≠ y
  – Use your favorite algorithm to train a binary classifier
• Prediction: Given an nK-dimensional weight vector w and a new example x
  $\arg\max_y \mathbf{w}^T \phi(\mathbf{x}, y)$
Exercise: What does the perceptron update rule look like in terms of the φs? Interpret the update step
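The reduction and one possible reading of the exercise can be sketched as follows. This is an illustrative answer under stated assumptions (plain perceptron with no learning rate or averaging, 0-indexed labels, hypothetical helper names and toy data), not the lecture's official solution. On a mistake for the pair (y, y'), the update is w ← w + φ(x, y) − φ(x, y'), which adds x to the block of the true label and subtracts x from the block of the competing label.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def phi(x, label, num_labels):
    """Kesler feature vector: x in the block for `label`, zeros elsewhere."""
    n = len(x)
    features = [0.0] * (n * num_labels)
    features[label * n:(label + 1) * n] = x
    return features


def constraint_examples(data, num_labels):
    """Binary task: phi(x,y) - phi(x,y') is positive, its negation is negative."""
    binary = []
    for x, y in data:
        for other in range(num_labels):
            if other == y:
                continue
            diff = [a - b for a, b in zip(phi(x, y, num_labels),
                                          phi(x, other, num_labels))]
            binary.append((diff, +1))
            binary.append(([-d for d in diff], -1))
    return binary


def perceptron(binary, dim, epochs=20):
    w = [0.0] * dim
    for _ in range(epochs):
        for z, sign in binary:
            if sign * dot(w, z) <= 0:
                # Mistake: w <- w + sign * z, i.e. +x in the true label's
                # block and -x in the competing label's block.
                w = [wi + sign * zi for wi, zi in zip(w, z)]
    return w


def predict(w, x, num_labels):
    return max(range(num_labels), key=lambda y: dot(w, phi(x, y, num_labels)))


# Toy data (assumed): K = 3 labels, inputs in two dimensions.
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([-1.0, -1.0], 2)]
binary = constraint_examples(data, num_labels=3)
w = perceptron(binary, dim=6)
predictions = [predict(w, x, 3) for x, _ in data]
print(len(binary))   # 12 binary examples: 3 inputs x 2 other labels x 2 signs
print(predictions)   # [0, 1, 2]
```

Each multiclass example yields 2(K − 1) binary examples, so the binary task only ever compares the true label against one alternative at a time.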
![Page 73: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/73.jpg)
Constraint Classification
Note: The binary classification task only expresses preferences over label assignments
This approach extends to training a ranker and can use partial preferences too. More on this later…
![Page 75: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/75.jpg)
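A second look at the multiclass margin
Defined as the score difference between the highest scoring label and the second one
[Figure: score for each label (Blue, Red, Green, Black), with the multiclass margin marked between the two highest scores]
In terms of the Kesler construction:
$$\min_{y' \neq y} \mathbf{w}^T \left[\phi(\mathbf{x}, y) - \phi(\mathbf{x}, y')\right]$$
Here y is the label that has the highest score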
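Both views of the margin can be checked numerically. The sketch below assumes 0-indexed labels and uses hypothetical helper names and example numbers. It computes the margin once from the sorted per-label scores and once as the minimum over competing labels of $\mathbf{w}^T[\phi(\mathbf{x}, y) - \phi(\mathbf{x}, y')]$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def phi(x, label, num_labels):
    """Kesler feature vector: x in the block for `label`, zeros elsewhere."""
    n = len(x)
    features = [0.0] * (n * num_labels)
    features[label * n:(label + 1) * n] = x
    return features


def margin_from_scores(weight_vectors, x):
    """Highest score minus second-highest score."""
    scores = sorted((dot(w, x) for w in weight_vectors), reverse=True)
    return scores[0] - scores[1]


def margin_from_kesler(weight_vectors, x):
    """min over y' != y of w . (phi(x, y) - phi(x, y')), y the top label."""
    num_labels = len(weight_vectors)
    w = [value for w_k in weight_vectors for value in w_k]
    y = max(range(num_labels), key=lambda k: dot(w, phi(x, k, num_labels)))
    top = phi(x, y, num_labels)
    return min(
        dot(w, [a - b for a, b in zip(top, phi(x, other, num_labels))])
        for other in range(num_labels) if other != y
    )


weight_vectors = [[1.0, 2.0], [0.5, -1.0], [-3.0, 0.0]]
x = [2.0, 1.0]  # per-label scores are 4.0, 0.0 and -6.0
print(margin_from_scores(weight_vectors, x))  # 4.0
print(margin_from_kesler(weight_vectors, x))  # 4.0
```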
![Page 76: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/76.jpg)
Discussion
• The number of weights for multiclass SVM and constraint classification is still the same as one-vs-all, much less than all-vs-all, which needs K(K−1)/2 classifiers
• But both still account for all pairwise label preferences
  – Multiclass SVM via the definition of the learning objective
  – Constraint classification by constructing a binary classification problem
• Both come with theoretical guarantees for generalization
• Important idea that is applicable when we move to arbitrary structures
Questions?
![Page 77: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/77.jpg)
Training multiclass classifiers: Wrap-up
• Label belongs to a set that has more than two elements
• Methods
  – Decomposition into a collection of binary (local) decisions
    • One-vs-all
    • All-vs-all
    • Error correcting codes
  – Training a single (global) classifier
    • Multiclass SVM
    • Constraint classification
• Exercise: Which of these will work for this case?
Questions?
![Page 78: From Binary to Multiclass Classification · 2020. 1. 14. · From Binary to Multiclass Classification 1. We have seen binary classification •We have seen linear models •Learning](https://reader034.fdocuments.us/reader034/viewer/2022051822/5fec46e581110d0d251bb6a0/html5/thumbnails/78.jpg)
Next steps…
• Build up to structured prediction
  – Multiclass is really a simple structure
• Different aspects of structured prediction
  – Deciding the structure, training, inference
• Sequence models