UVA CS 6316/4501 – Fall 2016 Machine Learning Lecture 11 ...- e.g., support vector machine,...
Transcript of UVA CS 6316/4501 – Fall 2016 Machine Learning Lecture 11 ...- e.g., support vector machine,...
UVACS6316/4501–Fall2016
MachineLearning
Lecture11:ProbabilityReview
Dr.YanjunQi
UniversityofVirginia
DepartmentofComputerScience
10/17/16
Dr.YanjunQi/UVACS6316/f16
1
Announcements:Schedule
– Midterm–Nov.26Wed/3:30pm–4:45pm/opennotes
– HW4includessamplemidtermquesOons– GradingofHW1isavailableonCollab– SoluOonofHW1isavailableonCollab– GradingofHW2willbeavailablenextweek– SoluOonofHW2willbeavailablenextweek
10/17/16
Dr.YanjunQi/UVACS6316/f16
2
Wherearewe?èFivemajorsecOonsofthiscourse
q Regression(supervised)q ClassificaOon(supervised)q Unsupervisedmodelsq Learningtheoryq Graphicalmodels
10/17/16 3
Dr.YanjunQi/UVACS6316/f16
Wherearewe?èThreemajorsecOonsforclassificaOon
• We can divide the large variety of classification approaches into roughly three major types
1. Discriminative - directly estimate a decision rule/boundary - e.g., support vector machine, decision tree 2. Generative: - build a generative statistical model - e.g., naïve bayes classifier, Bayesian networks 3. Instance based classifiers - Use observation directly (no models) - e.g. K nearest neighbors
10/17/16 4
Dr.YanjunQi/UVACS6316/f16
Today:ProbabilityReview
• Thebigpicture• EventsandEventspaces• Randomvariables• Jointprobability,MarginalizaOon,condiOoning,chainrule,BayesRule,lawoftotalprobability,etc.
• StructuralproperOes• Independence,condiOonalindependence
10/17/16 5
Dr.YanjunQi/UVACS6316/f16
TheBigPicture
Modeli.e.DatageneraOng
process
ObservedData
Probability
EsOmaOon/learning/Inference/Datamining
Buthowtospecifyamodel?
10/17/16 6
Dr.YanjunQi/UVACS6316/f16
Probabilityasfrequency
• ConsiderthefollowingquesOons:– 1.WhatistheprobabilitythatwhenIflipacoinitis“heads”?
– 2.why?
– 3.WhatistheprobabilityofBlueRidgeMountainstohaveanerupOngvolcanointhenearfuture?
10/17/16 7
Message:Thefrequen*stviewisveryuseful,butitseemsthatwealsousedomainknowledgetocomeupwithprobabili*es.
Dr.YanjunQi/UVACS6316/f16
Wecancountè~1/2
ècouldnotcount
AdaptfromProf.NandodeFreitas’sreviewslides
Probabilityasameasureofuncertainty
10/17/16 8
• Imaginewearethrowingdartsatawallofsize1x1andthatalldartsareguaranteedtofallwithinthis1x1wall.
• Whatistheprobabilitythatadartwillhittheshadedarea?
Dr.YanjunQi/UVACS6316/f16
AdaptfromProf.NandodeFreitas’sreviewslides
Probabilityasameasureofuncertainty
• Probabilityisameasureofcertaintyofaneventtakingplace.
• i.e.intheexample,weweremeasuringthechancesofhi?ngtheshadedarea.
10/17/16 9
prob = #RedBoxes#Boxes
Dr.YanjunQi/UVACS6316/f16
AdaptfromProf.NandodeFreitas’sreviewslides
Today:ProbabilityReview
• Thebigpicture• Samplespace,EventandEventspaces• Randomvariables• Jointprobability,MarginalizaOon,condiOoning,chainrule,BayesRule,lawoftotalprobability,etc.
• StructuralproperOes• Independence,condiOonalindependence
10/17/16 10
Dr.YanjunQi/UVACS6316/f16
Probability
10/17/16 11
O:ElementaryEvent“Throwa2”
Odie={1,2,3,4,5,6}
TheelementsofWarecalledelementaryevents.
Dr.YanjunQi/UVACS6316/f16
Probability
• Probabilityallowsustomeasuremanyevents.• TheeventsaresubsetsofthesamplespaceO.Forexample,foradiewemayconsiderthefollowingevents:e.g.,
• Assignprobabili7estotheseevents:e.g.,
10/17/16 12
EVEN={2,4,6}GREATER={5,6}
P(EVEN)=1/2
Dr.YanjunQi/UVACS6316/f16
AdaptfromProf.NandodeFreitas’sreviewslides
SamplespaceandEvents
• O : SampleSpace,• resultofanexperiment/setofalloutcomes
• IfyoutossacointwiceO = {HH,HT,TH,TT}
• Event:asubsetofO• Firsttossishead={HH,HT}
• S:eventspace,asetofevents:• ContainstheemptyeventandO10/17/16 13
Dr.YanjunQi/UVACS6316/f16
AxiomsforProbability
• Definedover(O,S) s.t.• 1>=P(a)>=0forallainS• P(O)=1
• IfA, Baredisjoint,then• P(AUB)=p(A)+p(B)
10/17/16 14
Dr.YanjunQi/UVACS6316/f16
AxiomsforProbability
10/17/16
Dr.YanjunQi/UVACS6316/f16
15
B1
B2B3B4
B5
B6B7
P Bi( )∑• P(O)=
ORoperaOonforProbability
• Wecandeduceotheraxiomsfromtheaboveones
• Ex:P(AUB)fornon-disjointevents
10/17/16 16
Dr.YanjunQi/UVACS6316/f16
P(UnionofAandB)
A
10/17/16 17
Dr.YanjunQi/UVACS6316/f16
A
B
10/17/16 18
Dr.YanjunQi/UVACS6316/f16
P(IntersecOonofAandB)
A
B
10/17/16 19
Dr.YanjunQi/UVACS6316/f16
CondiOonalProbability
A
B
10/17/16 20
Dr.YanjunQi/UVACS6316/f16
fromProf.NandodeFreitas’sreview
CondiOonalProbability/ChainRule
• Morewaystowriteoutchainrule…
10/17/16 21
P(A,B)=p(B|A)p(A)P(A,B)=p(A|B)p(B)
Dr.YanjunQi/UVACS6316/f16
Ruleoftotalprobability=>MarginalizaOon
A
B1
B2B3B4
B5
B6B7
p A( ) = P Bi( )P A | Bi( )∑ WHY???
10/17/16 22
Dr.YanjunQi/UVACS6316/f16
Today:ProbabilityReview
• Thebigpicture• EventsandEventspaces• Randomvariables• Jointprobability,MarginalizaOon,condiOoning,chainrule,BayesRule,lawoftotalprobability,etc.
• StructuralproperOes• Independence,condiOonalindependence
10/17/16 23
Dr.YanjunQi/UVACS6316/f16
FromEventstoRandomVariable
• Concisewayofspecifyingarributesofoutcomes• Modelingstudents(GradeandIntelligence):
• O = allpossiblestudents(samplespace)• Whatareevents(subsetofsamplespace)
• Grade_A=allstudentswithgradeA• Grade_B=allstudentswithgradeB• HardWorking_Yes=…whoworkshard
• Verycumbersome
• Need“funcOons”thatmapsfromO toanarributespaceT.• P(H=YES)=P({studentϵO : H(student)=YES})10/17/16 24
Dr.YanjunQi/UVACS6316/f16
RandomVariables(RV)O
Yes
No
A
B A+
H:hardworking
G:Grade
P(H=Yes)=P({allstudentswhoisworkinghardonthecourse})
10/17/16 25
Dr.YanjunQi/UVACS6316/f16
• “funcOons”thatmapsfromO toanarributespaceT.
NotaOonDigression
• P(A)isshorthandforP(A=true)• P(~A)isshorthandforP(A=false)• SamenotaOonappliestootherbinaryRVs:P(Gender=M),P(Gender=F)
• SamenotaOonappliestomul*valuedRVs:P(Major=history),P(Age=19),P(Q=c)
• Note:uppercaselerers/namesforvariables,lowercaselerers/namesforvalues
10/17/16 26
Dr.YanjunQi/UVACS6316/f16
DiscreteRandomVariables
• Randomvariables(RVs)whichmaytakeononlyacountablenumberofdisInctvalues
• XisaRVwitharitykifitcantakeonexactlyonevalueoutof{x1,…,xk}
10/17/16 27
Dr.YanjunQi/UVACS6316/f16
ProbabilityofDiscreteRV
• ProbabilitymassfuncOon(pmf):P(X=xi)• Easyfactsaboutpmf
§ ΣiP(X=xi)=1§ P(X=xi∩X=xj)=0ifi≠j§ P(X=xiUX=xj)=P(X=xi)+P(X=xj)ifi≠j§ P(X=x1UX=x2U…UX=xk)=1
10/17/16
Dr.YanjunQi/UVACS6316/f16
28
e.g.CoinFlips
• Youflipacoin– Headwithprobability0.5
• Youflip100coins– Howmanyheadswouldyouexpect
10/17/16
Dr.YanjunQi/UVACS6316/f16
29
e.g.CoinFlipscont.
• Youflipacoin– Headwithprobabilityp– Binaryrandomvariable– Bernoullitrialwithsuccessprobabilityp
• Youflipkcoins– Howmanyheadswouldyouexpect– NumberofheadsX:discreterandomvariable– BinomialdistribuOonwithparameterskandp
10/17/16
Dr.YanjunQi/UVACS6316/f16
30
DiscreteRandomVariables
• Randomvariables(RVs)whichmaytakeononlyacountablenumberofdisInctvalues– E.g.thetotalnumberofheadsXyougetifyouflip100coins
• XisaRVwitharitykifitcantakeonexactlyonevalueoutof– E.g.thepossiblevaluesthatXcantakeonare0,1,2,…,100
x1,…,xk{ }
10/17/16
Dr.YanjunQi/UVACS6316/f16
31
e.g.,twoCommonDistribuOons
• Uniform– Xtakesvalues1,2,…,N– – E.g.pickingballsofdifferentcolorsfromabox
• Binomial– Xtakesvalues0,1,…,k
– – E.g.coinflipskOmes
X ∼U 1,..., N⎡⎣ ⎤⎦
( )P X 1i N= =
X ∼ Bin k, p( )
P X = i( ) = k
i⎛
⎝⎜⎞
⎠⎟pi 1− p( )k−i
10/17/16
Dr.YanjunQi/UVACS6316/f16
32
Today:ProbabilityReview
• Thebigpicture• EventsandEventspaces• Randomvariables• Jointprobability,MarginalizaOon,condiOoning,chainrule,BayesRule,lawoftotalprobability,etc.
• StructuralproperOes• Independence,condiOonalindependence
10/17/16 33
Dr.YanjunQi/UVACS6316/f16
e.g.,CoinFlipsbyTwoPersons
• Yourfriendandyoubothflipcoins– Headwithprobability0.5– Youflip50Omes;yourfriendflip100Omes– Howmanyheadswillbothofyouget
10/17/16
Dr.YanjunQi/UVACS6316/f16
34
JointDistribuOon
• GiventwodiscreteRVsXandY,theirjointdistribuIonisthedistribuOonofXandYtogether– E.g.P(Youget21headsANDyoufriendget70heads)
• – E.g.sum ( )P X Y 1
x yx y= ∩ = =∑ ∑
( )50 100
0 0P You get heads AND your friend get heads 1
i ji j
= ==∑ ∑
10/17/16
Dr.YanjunQi/UVACS6316/f16
35
JointDistribuOon
• GiventwodiscreteRVsXandY,theirjointdistribuIonisthedistribuOonofXandYtogether– E.g.P(Youget21headsANDyoufriendget70heads)
• – E.g.sum ( )P X Y 1
x yx y= ∩ = =∑ ∑
( )50 100
0 0P You get heads AND your friend get heads 1
i ji j
= ==∑ ∑
10/17/16
Dr.YanjunQi/UVACS6316/f16
36
CondiOonalProbability
• istheprobabilityof,giventheoccurrenceof– E.g.youget0heads,giventhatyourfriendgets61heads
•
( )P X Yx y= =
( ) ( )( )
P X YP X Y
P Yx y
x yy
= ∩ == = =
=
X x=Y y=
10/17/16
Dr.YanjunQi/UVACS6316/f16
37
LawofTotalProbability
• GiventwodiscreteRVsXandY,whichtakevaluesinand,Wehavex1,…,xm{ } y1,…, yn{ }
( ) ( )( ) ( )
P X P X Y
P X Y P Y
i i jj
i j jj
x x y
x y y
= = = ∩ =
= = = =
∑∑
10/17/16
Dr.YanjunQi/UVACS6316/f16
38
MarginalizaOon
MarginalProbability LawofTotalProbability
CondiOonalProbability
( ) ( )( ) ( )
P X P X Y
P X Y P Y
i i jj
i j jj
x x y
x y y
= = = ∩ =
= = = =
∑∑
MarginalProbability
A
B1
B2B3B4
B5
B6B7
10/17/16
Dr.YanjunQi/UVACS6316/f16
39
SimplifyNotaOon:CondiOonalProbability
( ) ( )( )
P X YP X Y
P Yx y
x yy
= ∩ == = =
=
( ))(),(|
ypyxpyxP =
Butwewillalwayswriteitthisway:
events
X=x
Y=y
10/17/16 40
Dr.YanjunQi/UVACS6316/f16
P(X=xtrue)->P(X=x)->P(x)
SimplifyNotaOon:MarginalizaOon
• Weknowp(X,Y),whatisP(X=x)?
• Wecanusethelawoftotalprobability,why? ( ) ( )
( ) ( )∑
∑=
=
y
y
yxPyP
yxPxp
|
,
A
B1
B2B3B4
B5
B6B7
10/17/16 41
Dr.YanjunQi/UVACS6316/f16
MarginalizaOonCont.• Anotherexample
( ) ( )
( ) ( )∑
∑=
=
yz
zy
zyxPzyP
zyxPxp
,
,
,|,
,,
10/17/16 42
Dr.YanjunQi/UVACS6316/f16
BayesRule
• WeknowthatP(rain)=0.5• Ifwealsoknowthatthegrassiswet,thenhowthisaffectsourbeliefaboutwhetheritrainsornot?
€
P rain |wet( ) =P(rain)P(wet | rain)
P(wet)
10/17/16 43
Dr.YanjunQi/UVACS6316/f16
BayesRule
• WeknowthatP(rain)=0.5• Ifwealsoknowthatthegrassiswet,thenhowthisaffectsourbeliefaboutwhetheritrainsornot?
€
P rain |wet( ) =P(rain)P(wet | rain)
P(wet)
€
P x | y( ) =P(x)P(y | x)
P(y)10/17/16 44
Dr.YanjunQi/UVACS6316/f16
BayesRule
• XandYarediscreteRVs…
( ) ( ) ( )( ) ( )
P Y X P XP X Y
P Y X P Xj i i
i jj k kk
y x xx y
y x x
= = == = =
= = =∑
( ) ( )( )
P X YP X Y
P Yx y
x yy
= ∩ == = =
=
10/17/16
Dr.YanjunQi/UVACS6316/f16
45
10/17/16 46
Dr.YanjunQi/UVACS6316/f16
P(A = a1 | B) =P(B | A = a1)P(A = a1)P(B | A = ai )P(A = ai )
i∑
10/17/16 47
Dr.YanjunQi/UVACS6316/f16
BayesRulecont.
• YoucancondiOononmorevariables
( ))|(
),|()|(,|zyP
zxyPzxPzyxP =
10/17/16 48
Dr.YanjunQi/UVACS6316/f16
CondiOonalProbabilityExample
10/17/16 49
Dr.YanjunQi/UVACS6316/f16
AdaptfromProf.NandodeFreitas’sreviewslides
CondiOonalProbabilityExample
10/17/16 50
Dr.YanjunQi/UVACS6316/f16
AdaptfromProf.NandodeFreitas’sreviewslides
CondiOonalProbabilityExample
10/17/16 51
Dr.YanjunQi/UVACS6316/f16
CondiOonalProbabilityExample
10/17/16 52
Dr.YanjunQi/UVACS6316/f16
10/17/16
Dr.YanjunQi/UVACS6316/f16
53
èMatrixNotaOon
Today:ProbabilityReview
• Thebigpicture• EventsandEventspaces• Randomvariables• Jointprobability,MarginalizaOon,condiOoning,chainrule,BayesRule,lawoftotalprobability,etc.
• StructuralproperOes• Independence,condiOonalindependence
10/17/16 54
Dr.YanjunQi/UVACS6316/f16
IndependentRVs
• IntuiOon:XandYareindependentmeansthatneithermakesitmoreorlessprobablethat
• DefiniOon:XandYareindependentiff( ) ( ) ( )P X Y P X P Yx y x y= ∩ = = = =
X x=Y y=
10/17/16
Dr.YanjunQi/UVACS6316/f16
55
MoreonIndependence
•
• E.g.nomarerhowmanyheadsyouget,yourfriendwillnotbeaffected,andviceversa
( ) ( )P X Y P Xx y x= = = = ( ) ( )P Y X P Yy x y= = = =
( ) ( ) ( )P X Y P X P Yx y x y= ∩ = = = =
10/17/16
Dr.YanjunQi/UVACS6316/f16
56
MoreonIndependence
• XisindependentofYmeansthatknowingYdoesnotchangeourbeliefaboutX.• P(X|Y=y)=P(X)• P(X=x,Y=y)=P(X=x)P(Y=y)
• Theaboveshouldholdforallxi,yj• Itissymmetricandwrirenas
10/17/16
Dr.YanjunQi/UVACS6316/f16
!X ⊥Y
57
CondiOonallyIndependentRVs
• IntuiOon:XandYarecondiOonallyindependentgivenZmeansthatonceZisknown,thevalueofXdoesnotaddanyaddiIonalinformaOonaboutY
• DefiniOon:XandYarecondiOonallyindependentgivenZiff
( ) ( ) ( )P X Y Z P X Z P Y Zx y z x z y z= ∩ = = = = = = =
10/17/16
Dr.YanjunQi/UVACS6316/f16
Ifholdingforallxi,yj,zk !!X ⊥Y |Z58
CondiOonallyIndependentRVs
10/17/16
Dr.YanjunQi/UVACS6316/f16
59
!!X ⊥Y |Z
!X ⊥Y
MoreonCondiOonalIndependence
( ) ( ) ( )P X Y Z P X Z P Y Zx y z x z y z= ∩ = = = = = = =
( ) ( )P X Y ,Z P X Zx y z x z= = = = = =
( ) ( )P Y X ,Z P Y Zy x z y z= = = = = =
10/17/16
Dr.YanjunQi/UVACS6316/f16
60
TodayRecap:ProbabilityReview
• Thebigpicture:data<->probabilisOcmodel• Samplespace,EventsandEventspaces• Randomvariables• Jointprobability,Marginalprobability,condiOonalprobability,
• Chainrule,BayesRule,Lawoftotalprobability,etc.
• Independence,condiOonalindependence10/17/16 61
Dr.YanjunQi/UVACS6316/f16
References
q Prof.AndrewMoore’sreviewtutorialq Prof.NandodeFreitas’sreviewslidesq Prof.CarlosGuestrinrecitaOonslides
10/17/16 62
Dr.YanjunQi/UVACS6316/f16