Transcript of MLPI Lecture 2: Monte Carlo Methods (Basics)
Lecture 2
Monte Carlo Methods
Dahua Lin
The Chinese University of Hong Kong
Monte Carlo Methods
Monte Carlo methods are a large family of computational algorithms that rely on random sampling. These methods are mainly used for:
• Numerical integration
• Stochastic optimization
• Characterizing distributions
Expectations in Statistical Analysis
Computing expectations is perhaps the most common operation in statistical analysis.
• Computing the normalization factor in a posterior distribution:
  $p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{\int p(x \mid \theta')\, p(\theta')\, d\theta'}$
Expectations in Statistical Analysis (cont'd)
• Computing marginalization:
  $p(x) = \int p(x, z)\, dz$
• Computing expectations of functions:
  $\mathbb{E}_p[f(x)]$
Computing Expectation
• Generally, an expectation can be written as $\mathbb{E}_p[f(x)]$.
• For a discrete space: $\mathbb{E}_p[f(x)] = \sum_{x \in \Omega} f(x)\, p(x)$
• For a continuous space: $\mathbb{E}_p[f(x)] = \int_{\Omega} f(x)\, p(x)\, dx$
Law of Large Numbers
• (Strong) Law of Large Numbers (LLN): Let $x_1, x_2, \ldots$ be i.i.d. random variables and $f$ be a measurable function. Then
  $\frac{1}{n} \sum_{i=1}^{n} f(x_i) \to \mathbb{E}[f(x)]$ almost surely, as $n \to \infty$.
Monte Carlo Integration
• We can use the sample mean to approximate an expectation:
  $\mathbb{E}_p[f(x)] \approx \frac{1}{n} \sum_{i=1}^{n} f(x_i), \quad x_i \sim p$
• How many samples are enough?
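As a minimal sketch of the sample-mean approximation above (the helper name `mc_expectation` is mine, not from the lecture), the following estimates $\mathbb{E}[x^2]$ for $x \sim \mathrm{Uniform}(0, 1)$, whose exact value is $1/3$:

```python
import random

def mc_expectation(f, sampler, n):
    """Approximate E[f(x)] by the sample mean over n draws from `sampler`."""
    return sum(f(sampler()) for _ in range(n)) / n

rng = random.Random(0)

# E[x^2] under Uniform(0, 1) is 1/3; by the LLN the sample mean converges to it.
est = mc_expectation(lambda x: x * x, rng.random, 100_000)
```

With a fixed seed the run is reproducible; the estimate lands close to $1/3$, and the next slide quantifies how close via the CLT.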
Variance of Sample Mean
• By the Central Limit Theorem (CLT):
  $\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^{n} f(x_i) - \mathbb{E}[f(x)]\right) \xrightarrow{d} \mathcal{N}(0, \sigma_f^2)$
  Here, $\sigma_f^2$ is the variance of $f(x)$.
• The variance of the sample mean is $\sigma_f^2 / n$. The number of samples required to attain a certain variance $\epsilon^2$ is at least $\sigma_f^2 / \epsilon^2$.
Random Number Generation
• All sampling methods rely on a stream of random numbers to construct random samples.
• "True" random numbers are difficult to obtain. A more widely used approach is to use computational algorithms to produce long sequences of apparently random numbers, called pseudorandom numbers.
Random Number Generation (cont'd)
• The sequence of pseudorandom numbers is determined by a seed.
• If a randomized simulation is based on a single random stream, it can be exactly reproduced by fixing the seed initially.
RNGs
• Linear Congruential Generator (LCG): $x_{n+1} = (a x_n + c) \bmod m$.
• Used by the built-in generators of C and Java.
• Useful for simple randomized programs.
• Not good enough for serious Monte Carlo simulation.
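The LCG recurrence fits in a few lines. The sketch below uses glibc's `rand()` constants as one common (illustrative) parameter choice; as the slide warns, such generators are fine for toy experiments but not for serious Monte Carlo work:

```python
def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Linear congruential generator: x_{n+1} = (a*x_n + c) mod m.
    The default constants are glibc's rand(); many other choices exist."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m  # scale the integer state to [0, 1)

gen = lcg(seed=42)
samples = [next(gen) for _ in range(10_000)]
mean = sum(samples) / len(samples)  # should be near 0.5 for uniform output
```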
RNGs
• Mersenne Twister (MT):
• Passes the Diehard tests; good enough for most Monte Carlo experiments
• Provided by C++11 or Boost
• Default RNG for MATLAB, NumPy, Julia, and many other numerical software packages
• Not amenable to parallel use
RNGs
• Xorshift1024:
• Proposed in 2014
• Passes BigCrush
• Incredibly simple (5 to 6 lines of C code)
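To show how small the xorshift family is, here is a 64-bit xorshift generator (a smaller cousin of xorshift1024, using Marsaglia's shift triple 13/7/17, not the full xorshift1024 state machine), transliterated into Python with explicit 64-bit masking:

```python
MASK64 = (1 << 64) - 1

def xorshift64(seed):
    """Minimal 64-bit xorshift generator; seed must be nonzero.
    Each step scrambles the state with three shift-and-xor operations."""
    x = seed & MASK64
    while True:
        x ^= (x << 13) & MASK64
        x ^= x >> 7
        x ^= (x << 17) & MASK64
        yield x / 2**64  # scale to [0, 1)

gen = xorshift64(seed=2463534242)
vals = [next(gen) for _ in range(10_000)]
```

In C the same generator really is about five lines; the masking here only emulates fixed-width integer overflow.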
Sampling a Discrete Distribution
• Linear search: $O(K)$ per sample
• Sorted search: $O(K)$ in the worst case,
• but much faster when the probability mass concentrates on a few values
• Binary search: $O(\log K)$ per sample,
• but each step is a bit more expensive
• Can we do better?
Sampling a Discrete Distribution (cont'd)
• Huffman coding:
• preprocessing + expected code length per sample
• Alias method (by A. J. Walker):
• $O(K)$ preprocessing + $O(1)$ per sample
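A sketch of Walker's alias method (in Vose's common formulation, which is my choice here, not necessarily the lecture's): $O(K)$ preprocessing builds per-bin thresholds and aliases, after which each draw costs one uniform bin choice plus one biased coin flip, i.e. $O(1)$:

```python
import random

def build_alias(probs):
    """O(K) preprocessing (Vose's variant of Walker's alias method)."""
    K = len(probs)
    scaled = [p * K for p in probs]
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    prob, alias = [0.0] * K, [0] * K
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        scaled[l] -= 1.0 - scaled[s]       # donate mass from a large bin
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:                # leftovers are exactly full bins
        prob[i] = 1.0
    return prob, alias

def alias_draw(prob, alias, rng):
    """O(1) per sample: pick a bin uniformly, then flip a biased coin."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]

rng = random.Random(0)
p = [0.1, 0.2, 0.3, 0.4]
prob, alias = build_alias(p)
counts = [0] * 4
for _ in range(100_000):
    counts[alias_draw(prob, alias, rng)] += 1
```

The empirical frequencies in `counts` should match `p` up to sampling noise.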
Transform Sampling
• Let $F$ be the cumulative distribution function (cdf) of a distribution $D$. Let $u \sim \mathrm{Uniform}(0, 1)$; then $F^{-1}(u) \sim D$.
• Why does $F^{-1}(u) \sim D$?
• How can we generate an exponentially distributed sample?
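For the exponential question on this slide, the inverse cdf is available in closed form: $F(x) = 1 - e^{-\lambda x}$ gives $F^{-1}(u) = -\log(1 - u)/\lambda$. A minimal sketch:

```python
import math
import random

def sample_exponential(rate, rng):
    """Inverse-cdf (transform) sampling for Exp(rate):
    F(x) = 1 - exp(-rate*x), so F^{-1}(u) = -log(1 - u) / rate."""
    u = rng.random()                     # u ~ Uniform(0, 1)
    return -math.log(1.0 - u) / rate

rng = random.Random(0)
xs = [sample_exponential(2.0, rng) for _ in range(100_000)]
mean = sum(xs) / len(xs)                 # should approach 1/rate = 0.5
```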
Sampling from Multivariate Normal
How do we draw a sample from a multivariate normal distribution $\mathcal{N}(\mu, \Sigma)$?
(Algorithm)
1. Perform Cholesky decomposition to get $L$ s.t. $\Sigma = L L^T$.
2. Generate a vector $z$ comprised of i.i.d. values from $\mathcal{N}(0, 1)$.
3. Let $x = \mu + L z$.
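The three steps above can be sketched for the bivariate case with a hand-rolled $2 \times 2$ Cholesky factorization (the matrix values below are illustrative, not from the lecture):

```python
import math
import random

def chol2(S):
    """Cholesky factor L (lower triangular) of a 2x2 SPD matrix S = L L^T."""
    l11 = math.sqrt(S[0][0])
    l21 = S[1][0] / l11
    l22 = math.sqrt(S[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]

def sample_mvn(mu, L, rng):
    """Steps 2-3 of the slide: z ~ N(0, I), then x = mu + L z."""
    z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    return [mu[i] + L[i][0] * z[0] + L[i][1] * z[1] for i in range(2)]

rng = random.Random(0)
mu, Sigma = [1.0, -1.0], [[2.0, 0.8], [0.8, 1.0]]
L = chol2(Sigma)
xs = [sample_mvn(mu, L, rng) for _ in range(50_000)]

m0 = sum(x[0] for x in xs) / len(xs)               # empirical mean of x_1
# empirical covariance of (x_1, x_2), centering x_2 at its true mean -1.0
cov01 = sum((x[0] - m0) * (x[1] + 1.0) for x in xs) / len(xs)
```

The empirical mean and covariance recover $\mu$ and $\Sigma_{12}$, confirming that $x = \mu + Lz$ has covariance $L L^T = \Sigma$.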
Rejection Sampling
• (Rejection sampling algorithm): To sample from a distribution $p$, which has $p(x) \le c\, q(x)$ for some proposal $q$ and constant $c > 0$:
1. Sample $x \sim q$.
2. Accept $x$ with probability $p(x) / (c\, q(x))$.
• Why does this algorithm yield the desired distribution $p$?
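A small sketch of the two-step loop above. The concrete target and envelope are my choices for illustration: target $p(x) = 6x(1-x)$ (the Beta(2,2) density, maximum $1.5$ at $x = 1/2$), proposal $q = \mathrm{Uniform}(0,1)$, and $c = 1.5$ so that $p(x) \le c\,q(x)$ on $[0, 1]$:

```python
import random

def rejection_sample(p, q_sample, q_pdf, c, rng):
    """Rejection sampling: requires p(x) <= c * q(x) everywhere.
    Repeat: draw x ~ q, accept with probability p(x) / (c * q(x))."""
    while True:
        x = q_sample(rng)
        if rng.random() < p(x) / (c * q_pdf(x)):
            return x

p = lambda x: 6.0 * x * (1.0 - x)   # Beta(2,2) density on [0, 1]
c = 1.5                             # max of p, so p(x) <= c * 1 on [0, 1]
rng = random.Random(0)
xs = [rejection_sample(p, lambda r: r.random(), lambda x: 1.0, c, rng)
      for _ in range(50_000)]
mean = sum(xs) / len(xs)            # Beta(2,2) has mean 1/2
```

Here the expected acceptance rate is $1/c = 2/3$, previewing the question on the next slide.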
Rejection Sampling (cont'd)
• What is the acceptance rate?
• What are the problems of this method?
Importance Sampling
• Basic idea: generate samples from an easier distribution $q$, which is often referred to as the proposal distribution, and then reweight the samples.
Importance Sampling (cont'd)
• Let $p$ be the target distribution and $q$ be the proposal distribution with $q(x) > 0$ whenever $p(x) > 0$, and let the importance weight be $w(x) = p(x) / q(x)$. Then
  $\mathbb{E}_p[f(x)] = \mathbb{E}_q[w(x)\, f(x)]$
• We can approximate $\mathbb{E}_p[f(x)]$ with
  $\frac{1}{n} \sum_{i=1}^{n} w(x_i)\, f(x_i), \quad x_i \sim q$
Importance Sampling (cont'd)
• By the strong law of large numbers, we have
  $\frac{1}{n} \sum_{i=1}^{n} w(x_i)\, f(x_i) \to \mathbb{E}_p[f(x)]$ almost surely.
• How do we choose a good proposal distribution $q$?
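The weighted estimator above can be sketched with a concrete (illustrative) pair of distributions: target $p = \mathcal{N}(0, 1)$, proposal $q = \mathcal{N}(0, 2^2)$, and $f(x) = x^2$, so the true value $\mathbb{E}_p[x^2] = 1$:

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

rng = random.Random(0)
n = 100_000

# Target p = N(0, 1); proposal q = N(0, 2^2); weight w(x) = p(x) / q(x).
est = 0.0
for _ in range(n):
    x = rng.gauss(0.0, 2.0)                                 # x ~ q
    w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)   # importance weight
    est += w * x * x                                        # accumulate w * f(x)
est /= n                                                    # estimates E_p[x^2] = 1
```

Note the proposal has heavier tails than the target, which keeps the weights, and hence the estimator's variance, bounded.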
Variance of Importance Sampling
• The variance of $w(x)\, f(x)$ is
  $\mathrm{Var}_q[w(x) f(x)] = \mathbb{E}_q\!\left[w(x)^2 f(x)^2\right] - \left(\mathbb{E}_p[f(x)]\right)^2$
• The 2nd term does not depend on $q$, while the 1st term has the lower bound
  $\mathbb{E}_q\!\left[w(x)^2 f(x)^2\right] \ge \left(\mathbb{E}_p[|f(x)|]\right)^2$
Optimal Proposal
• The lower bound is attained when:
  $q^*(x) = \frac{p(x)\, |f(x)|}{\int p(x')\, |f(x')|\, dx'}$
• The optimal proposal distribution is generally difficult to sample from.
• However, this analysis leads us to an insight: we can achieve high sampling efficiency by emphasizing regions where the value of $p(x)\,|f(x)|$ is high.
Adaptive Importance Sampling
• Basic idea: learn to do sampling.
• Choose the proposal from a tractable family: $\{q_\theta : \theta \in \Theta\}$.
• Objective: minimize the sample mean of $w_\theta(x)^2 f(x)^2$ (the variance term above). Update the parameter $\theta$ accordingly, e.g. by stochastic gradient steps.
Self-Normalized Weights
• In many practical cases, $p$ is known only up to a normalizing constant.
• For such cases, we may approximate $\mathbb{E}_p[f(x)]$ with
  $\sum_{i=1}^{n} \tilde{w}_i\, f(x_i), \quad \text{where } \tilde{w}_i = \frac{w(x_i)}{\sum_{j=1}^{n} w(x_j)}$
• Here, $\tilde{w}_i$ is called the self-normalized weight.
Self-Normalized Weights (cont'd)
Note:
  $\sum_{i=1}^{n} \tilde{w}_i\, f(x_i) = \frac{\frac{1}{n}\sum_{i=1}^{n} w(x_i)\, f(x_i)}{\frac{1}{n}\sum_{j=1}^{n} w(x_j)}$
By the strong law of large numbers, the numerator and denominator each converge almost surely, and the unknown normalizing constant cancels in the ratio, so the estimate converges to $\mathbb{E}_p[f(x)]$.
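A sketch of the self-normalized estimator, with both densities deliberately left unnormalized (my illustrative choice) to show that the constants cancel: unnormalized target $\tilde{p}(x) = e^{-x^2/2}$ (i.e. $\mathcal{N}(0,1)$ without its constant), proposal $\mathcal{N}(0, 2^2)$, and $f(x) = x^2$ with true value $1$:

```python
import math
import random

rng = random.Random(1)
n = 100_000

# Target known only up to a constant: p_tilde(x) = exp(-x^2/2), i.e. N(0,1).
p_tilde = lambda x: math.exp(-0.5 * x * x)
# Proposal q = N(0, 2^2), also used without its normalizing constant:
q_tilde = lambda x: math.exp(-0.5 * (x / 2.0) ** 2)

xs = [rng.gauss(0.0, 2.0) for _ in range(n)]          # draws from q
ws = [p_tilde(x) / q_tilde(x) for x in xs]            # unnormalized weights
wsum = sum(ws)
# Self-normalized weights w_i / sum_j w_j; the unknown constants cancel here.
est = sum(w * x * x for w, x in zip(ws, xs)) / wsum   # estimates E_p[x^2] = 1
```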
MCMC: Motivation
Simple strategies like transform sampling, rejection sampling, and importance sampling all rely on drawing independent samples from $p$ or from a proposal distribution over the sample space.
This can become very difficult (if not impossible) for complex distributions.
MCMC: Overview
• Markov Chain Monte Carlo (MCMC) explores the sample space through an ergodic Markov chain, whose equilibrium distribution is the target distribution $\pi$.
• Many important samplers belong to MCMC, e.g. Gibbs sampling, the Metropolis-Hastings algorithm, and slice sampling.
Markov Processes
• A Markov process is a stochastic process where the future depends only on the present and not on the past.
• A sequence of random variables $X_0, X_1, X_2, \ldots$ defined on a measurable space $\Omega$ is called a (discrete-time) Markov process if it satisfies the Markov property:
  $P(X_{n+1} \in A \mid X_0, \ldots, X_n) = P(X_{n+1} \in A \mid X_n)$
Countable Markov Chain
We first review the formulation and properties of Markov chains under a simple setting, where $\Omega$ is a countable space. We will later extend the analysis to more general spaces.
Homogeneous Markov Chains
A homogeneous Markov chain on a countable space $\Omega$, denoted by $(X_n)_{n \ge 0}$, is characterized by an initial distribution $\pi_0$ and a transition probability matrix (TPM), denoted by $P$, such that
• $P(X_0 = i) = \pi_0(i)$, and
• $P(X_{n+1} = j \mid X_n = i) = P(i, j)$.
Stochastic Matrix
The transition probability matrix $P$ is a stochastic matrix, namely it is nonnegative and its rows sum to one:
  $\sum_{j \in \Omega} P(i, j) = 1 \quad \text{for all } i \in \Omega.$
Evolution of State Distributions
Let $\pi_n$ be the distribution of $X_n$; then:
  $\pi_{n+1}(j) = \sum_{i \in \Omega} \pi_n(i)\, P(i, j)$
We can simply write $\pi_{n+1} = \pi_n P$.
Multi-step Transition Probabilities
• Consider two transition steps:
  $P^{(2)}(i, j) = \sum_{k \in \Omega} P(i, k)\, P(k, j)$
• More generally, $P^{(m+n)} = P^{(m)} P^{(n)}$.
• Let $\pi_n$ be the distribution of $X_n$; then $\pi_{n+m} = \pi_n P^{(m)}$.
Classes of States
• A state $j$ is said to be accessible from state $i$, or $i$ leads to $j$, denoted by $i \to j$, if $P^{(n)}(i, j) > 0$ for some $n$.
• States $i$ and $j$ are said to communicate with each other, denoted by $i \leftrightarrow j$, if $i \to j$ and $j \to i$.
• $\leftrightarrow$ is an equivalence relation on $\Omega$, which partitions $\Omega$ into communicating classes, where states within the same class communicate with each other.
Irreducibility
A Markov chain is said to be irreducible if it forms a single communicating class, or in other words, every state communicates with every other state.
Exercise 1
• Is this Markov chain irreducible?
• Please identify the communicating classes.
Periods of Markov Chains
• The period of a state $i$ is defined as
  $d(i) = \gcd\{n \ge 1 : P^{(n)}(i, i) > 0\}$
• A state $i$ is said to be aperiodic if $d(i) = 1$.
• Period is a class property: if $i \leftrightarrow j$, then $d(i) = d(j)$.
Aperiodic Chains
• A Markov chain is called aperiodic if all states are aperiodic.
• An irreducible Markov chain is aperiodic if there exists an aperiodic state.
• Laziness breaks periodicity: $P' = (1 - \alpha) P + \alpha I$ with $\alpha \in (0, 1)$.
First Return Time
• Suppose the chain is initially at state $i$; the first return time to state $i$ is defined to be
  $T_i = \min\{n \ge 1 : X_n = i\}$
  Note that $T_i$ is a random variable.
• We also define $f_i^{(n)} = P(T_i = n)$, the probability that the chain returns to $i$ for the first time after $n$ steps.
Recurrence
A state $i$ is said to be recurrent if it is guaranteed to have a finite hitting time, as
  $P(T_i < \infty) = 1.$
Otherwise, $i$ is said to be transient.
Recurrence (cont'd)
• $i$ is recurrent iff the chain returns to $i$ infinitely often:
  $P(X_n = i \text{ for infinitely many } n) = 1$
• Recurrence is a class property: if $i \leftrightarrow j$ and $i$ is recurrent, then $j$ is also recurrent.
• Every finite communicating class is recurrent.
• An irreducible finite Markov chain is recurrent.
Invariant Distributions
Consider a Markov chain with TPM $P$ on $\Omega$:
• A distribution $\pi$ over $\Omega$ is called an invariant distribution (or stationary distribution) if $\pi = \pi P$.
• An invariant distribution does NOT necessarily exist, nor need it be unique.
• Under certain conditions (ergodicity), there exists a unique invariant distribution $\pi$. In such cases, $\pi$ is often called an equilibrium distribution.
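For a small chain, the invariant distribution can be found numerically by iterating $\pi_{n+1} = \pi_n P$ until it stops changing (the 3-state TPM below is a made-up example, not from the lecture):

```python
def step(pi, P):
    """One step of the evolution pi_{n+1} = pi_n P for a row-vector distribution."""
    K = len(P)
    return [sum(pi[i] * P[i][j] for i in range(K)) for j in range(K)]

# A small ergodic chain (hypothetical 3-state TPM; all entries positive).
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]

pi = [1.0, 0.0, 0.0]      # arbitrary initial distribution
for _ in range(200):      # pi P^n converges to the invariant distribution
    pi = step(pi, P)

# At a fixed point, one more step changes nothing: pi = pi P.
residual = max(abs(pi[j] - step(pi, P)[j]) for j in range(3))
```

Because every entry of this $P$ is positive, the chain is ergodic and the iteration converges geometrically from any starting distribution.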
Exercise 2
• Is this chain irreducible?
• Is this chain periodic?
• Please compute the invariant distribution.
Positive Recurrence
• The expected return time of a state $i$ is defined to be $m_i = \mathbb{E}[T_i]$.
• When $i$ is transient, $m_i = \infty$. If $i$ is recurrent, $m_i$ is NOT necessarily finite.
• A recurrent state $i$ is called positive recurrent if $m_i < \infty$. Otherwise, it is called null recurrent.
Existence of Invariant Distributions
For an irreducible Markov chain, if some state is positive recurrent, then all states are positive recurrent and the chain has an invariant distribution $\pi$ given by
  $\pi(i) = \frac{1}{m_i}$
Example: 1D Random Walk
• Under what condition is this chain recurrent?
• When it is recurrent, is it positive recurrent or null recurrent?
Ergodic Markov Chains
• An irreducible, aperiodic, and positive recurrent Markov chain is called an ergodic Markov chain, or simply an ergodic chain.
• A finite Markov chain is ergodic if and only if it is irreducible and aperiodic.
• A Markov chain is ergodic if it is aperiodic and there exists $m$ such that any state can be reached from any other state within $m$ steps with positive probability.
Convergence to Equilibrium
Let $P$ be the transition probability matrix of an ergodic Markov chain; then there exists a unique invariant distribution $\pi$, and with any initial distribution $\pi_0$,
  $\|\pi_0 P^n - \pi\|_{TV} \to 0 \quad \text{as } n \to \infty.$
Particularly, $P^{(n)}(i, j) \to \pi(j)$ as $n \to \infty$ for all $i, j \in \Omega$.
Ergodic Theorem
• The ergodic theorem relates the time mean to the space mean:
• Let $(X_n)$ be an ergodic Markov chain over $\Omega$ with equilibrium distribution $\pi$, and let $f$ be a measurable function on $\Omega$; then
  $\frac{1}{n} \sum_{t=1}^{n} f(X_t) \to \mathbb{E}_\pi[f] \quad \text{almost surely.}$
Ergodic Theorem (cont'd)
• More generally, we have for any positive integer $k$ (averaging every $k$-th sample):
  $\frac{1}{n} \sum_{t=1}^{n} f(X_{tk}) \to \mathbb{E}_\pi[f] \quad \text{almost surely.}$
• The ergodic theorem is the theoretical foundation for MCMC.
Total Variation Distance
• The total variation distance between two measures $\mu$ and $\nu$ is defined as
  $\|\mu - \nu\|_{TV} = \sup_{A} |\mu(A) - \nu(A)|$
• The total variation distance is a metric. If $\Omega$ is countable, we have
  $\|\mu - \nu\|_{TV} = \frac{1}{2} \sum_{x \in \Omega} |\mu(x) - \nu(x)|$
Mixing Time
The time required by a Markov chain to get close to the equilibrium distribution is measured by the mixing time, defined as
  $t_{mix}(\epsilon) = \min\left\{ n : \max_{i \in \Omega} \left\| P^{(n)}(i, \cdot) - \pi \right\|_{TV} \le \epsilon \right\}$
In particular, $t_{mix} = t_{mix}(1/4)$ by convention.
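The definition can be made concrete on a two-state chain (the parameters $a = 0.3$, $b = 0.2$ are illustrative). For $P = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix}$, the invariant distribution is $\pi = (b, a)/(a+b)$ and convergence is geometric with rate $|1 - a - b|$, so the mixing time can be found by direct iteration:

```python
def tv_distance(mu, nu):
    """Total variation distance on a countable space: half the L1 difference."""
    return 0.5 * sum(abs(m - n) for m, n in zip(mu, nu))

a, b = 0.3, 0.2                              # illustrative transition rates
P = [[1 - a, a], [b, 1 - b]]
pi = [b / (a + b), a / (a + b)]              # invariant distribution (0.4, 0.6)

def row_step(mu):
    """One step of mu -> mu P."""
    return [mu[0] * P[0][0] + mu[1] * P[1][0],
            mu[0] * P[0][1] + mu[1] * P[1][1]]

# Start from state 0, which is the worse of the two initial states here
# (its initial TV distance to pi, 0.6, exceeds state 1's 0.4).
mu = [1.0, 0.0]
t_mix, t = None, 0
while t_mix is None:
    if tv_distance(mu, pi) <= 0.25:          # the usual threshold eps = 1/4
        t_mix = t
    mu, t = row_step(mu), t + 1
```

Tracing by hand: the TV distance shrinks by the factor $|1 - a - b| = 0.5$ each step ($0.6 \to 0.3 \to 0.15$), so $t_{mix} = 2$ for these parameters.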
Spectral Representation
• Let $P$ be a stochastic matrix over a finite space $\Omega$ with $|\Omega| = K$.
• The spectral radius of $P$, namely the maximum absolute value of all its eigenvalues, is $1$.
• Assume $P$ is ergodic and reversible with equilibrium distribution $\pi$.
• Define an inner product:
  $\langle f, g \rangle_\pi = \sum_{i \in \Omega} \pi(i)\, f(i)\, g(i)$
Spectral)Representa-on)(cont'd)
All#eigenvalues#of# #are#real#values,#given#by#.#Let# #be#the#right#
eigenvector#associated#with# .#Then#the#le:#eigenvector#is# #(element'wise+product),#and# #can#be#represented#as
56
Spectral)Gap
Let$ .$Then$the$spectral)gap$is$defined$to$be$ $and$the$absolute)spectral)gap$is$defined$to$be$ .$Then:
Here,% .
57
Bounds of Mixing Time
The mixing time can be bounded via the inverse of the absolute spectral gap $t_{rel} = 1/\gamma_*$ (the relaxation time):
  $(t_{rel} - 1) \log\frac{1}{2\epsilon} \;\le\; t_{mix}(\epsilon) \;\le\; t_{rel} \log\frac{1}{\epsilon\, \pi_{\min}}$
Generally, the goal in designing a rapidly mixing reversible Markov chain is to maximize the absolute spectral gap $\gamma_*$.
Ergodic Flow
Consider an ergodic Markov chain on a finite space $\Omega$ with transition probability matrix $P$ and equilibrium distribution $\pi$:
• The ergodic flow from a subset $A$ to another subset $B$ is defined as
  $F(A, B) = \sum_{i \in A} \sum_{j \in B} \pi(i)\, P(i, j)$
Conductance
The conductance of a Markov chain is defined as
  $\Phi = \min_{S \subset \Omega:\ \pi(S) \le 1/2} \frac{F(S, S^c)}{\pi(S)}$
Bounds of Spectral Gap
(Jerrum and Sinclair (1989)) The spectral gap is bounded by
  $\frac{\Phi^2}{2} \le \gamma \le 2\Phi$
Exercise 3
Consider an ergodic finite chain $P$ with eigenvalues $1 = \lambda_1 > \lambda_2 \ge \cdots \ge \lambda_K > -1$. To improve the mixing time, one can add a little bit of laziness as $P' = (1 - \alpha) P + \alpha I$.
Please solve for the optimal value of $\alpha$ that maximizes the absolute spectral gap $\gamma_*$.
Exercise 4
Consider a $2 \times 2$ stochastic matrix $P$, given by $P = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix}$ when $a, b \in [0, 1]$.
• Please specify the condition under which $P$ is ergodic.
• What is the equilibrium distribution when $P$ is ergodic?
• Solve for the optimal values of $a$ and $b$ that maximize the absolute spectral gap.
General Markov Chains
Next, we extend the formulation of Markov chains from countable spaces to general measurable spaces.
• First, the Markov property remains.
• But the transition probability matrix makes no sense in general.
General Markov Chains (cont'd)
• Generally, a homogeneous Markov chain, denoted by $(X_n)_{n \ge 0}$, over a measurable space $\Omega$ is characterized by
• an initial measure $\pi_0$, and
• a transition probability kernel $P$:
  $P(X_{n+1} \in A \mid X_n = x) = P(x, A)$
Stochastic Kernel
The transition probability kernel $P$ is a stochastic kernel:
• Given $x \in \Omega$, $A \mapsto P(x, A)$ is a probability measure over $\Omega$.
• Given a measurable subset $A$, $x \mapsto P(x, A)$ is a measurable function.
• When $\Omega$ is a countable space, $P$ reduces to a stochastic matrix.
Evolution of State Distributions
Suppose the distribution of $X_n$ is $\pi_n$; then
  $\pi_{n+1}(A) = \int_{\Omega} \pi_n(dx)\, P(x, A)$
Again, we can simply write this as $\pi_{n+1} = \pi_n P$.
Composition of Stochastic Kernels
• The composition of stochastic kernels $P$ and $Q$ remains a stochastic kernel, denoted by $PQ$, defined as:
  $(PQ)(x, A) = \int_{\Omega} P(x, dy)\, Q(y, A)$
• Recursive composition of $P$ for $n$ times results in a stochastic kernel denoted by $P^n$, and we have $P^{m+n} = P^m P^n$ and $\pi_{m+n} = \pi_m P^n$.
Example: Random Walk in $\mathbb{R}^n$
Here, the stochastic kernel is given by $P(x, A) = \int_A g(y - x)\, dy$, where $g$ is the density of the step distribution.
Occupation Time, Return Time, and Hitting Time
Let $A$ be a measurable set:
• The occupation time: $\eta_A = \sum_{n=1}^{\infty} \mathbb{1}(X_n \in A)$.
• The return time: $\tau_A = \min\{n \ge 1 : X_n \in A\}$.
• The hitting time: $\sigma_A = \min\{n \ge 0 : X_n \in A\}$.
• $\eta_A$, $\tau_A$, and $\sigma_A$ are all random variables.
$\psi$-irreducibility
• Define $L(x, A) = P(\tau_A < \infty \mid X_0 = x)$.
• Given a positive measure $\psi$ over $\Omega$, a Markov chain is called $\psi$-irreducible if $L(x, A) > 0$ for every $x \in \Omega$ whenever $A$ is $\psi$-positive, i.e. $\psi(A) > 0$.
• Intuitively, it means that for any $\psi$-positive set $A$, there is a positive chance that the chain enters $A$ within finite time, no matter where it begins.
$\psi$-irreducibility (cont'd)
• A Markov chain over $\Omega$ is $\psi$-irreducible if and only if either of the following statements holds for every $x \in \Omega$ and every $\psi$-positive $A$:
• $\sum_{n=1}^{\infty} P^n(x, A) > 0$
• $\mathbb{E}[\eta_A \mid X_0 = x] > 0$
$\psi$-irreducibility (cont'd)
• Typical spaces usually come with a natural measure $\psi$:
• The natural measure for a countable space is the counting measure. In this case, the notion of $\psi$-irreducibility coincides with the one introduced earlier.
• The natural measure for $\mathbb{R}$, $\mathbb{R}^n$, or a finite-dimensional manifold is the Lebesgue measure.
Transience and Recurrence
Given a Markov chain $(X_n)$ over $\Omega$, and a measurable set $A$:
• $A$ is called transient if $\mathbb{E}[\eta_A \mid X_0 = x] < \infty$ for every $x \in A$.
• $A$ is called uniformly transient if there exists $M < \infty$ such that $\mathbb{E}[\eta_A \mid X_0 = x] \le M$ for every $x \in A$.
• $A$ is called recurrent if $\mathbb{E}[\eta_A \mid X_0 = x] = \infty$ for every $x \in A$.
Transience and Recurrence (cont'd)
• Consider a $\psi$-irreducible chain; then exactly one of the following holds:
• Every $\psi$-positive subset is recurrent, in which case we call the chain recurrent.
• $\Omega$ is covered by countably many uniformly transient sets, in which case we call the chain transient.
Invariant Measures
• A measure $\mu$ is called an invariant measure w.r.t. the stochastic kernel $P$ if $\mu = \mu P$, i.e.
  $\mu(A) = \int_{\Omega} \mu(dx)\, P(x, A)$
• A recurrent Markov chain admits a unique invariant measure $\mu$ (up to a scale constant).
• Note: this measure $\mu$ can be finite or infinite.
Positive Chains
• A Markov chain is called positive if it is irreducible and admits an invariant probability measure $\pi$.
• The study of the existence of $\pi$ requires more sophisticated analysis.
• We are not going into these details, as in MCMC practice the existence of $\pi$ is usually not an issue.
Subsampled Chains
Let $P$ be a stochastic kernel and $a = (a_n)_{n \ge 1}$ be a probability vector over $\mathbb{N}$. Then $P_a$ defined as below is also a stochastic kernel:
  $P_a(x, A) = \sum_{n=1}^{\infty} a_n\, P^n(x, A)$
The chain with kernel $P_a$ is called a subsampled chain with sampling distribution $a$.
Subsampled Chains (cont'd)
• When $a = \delta_m$ (a point mass at $m$), $P_a = P^m$.
• If $\pi$ is invariant w.r.t. $P$, then $\pi$ is also invariant w.r.t. $P_a$.
Stationary Process
• A stochastic process $(X_n)$ is called stationary if
  $(X_{n+1}, \ldots, X_{n+k}) \overset{d}{=} (X_1, \ldots, X_k) \quad \text{for all } n, k.$
• A Markov chain is stationary if it has an invariant probability measure which is also its initial distribution.
Birkhoff Ergodic Theorem
• (Birkhoff Ergodic Theorem) Every irreducible stationary Markov chain $(X_n)$ is ergodic; that is, for any real-valued measurable function $f$:
  $\frac{1}{n} \sum_{t=1}^{n} f(X_t) \to \int_{\Omega} f\, d\pi \quad \text{almost surely,}$
where $\pi$ is the invariant probability measure.
Markov Chain Monte Carlo
(Markov Chain Monte Carlo): To sample from a target distribution $\pi$:
• We first construct a Markov chain with transition probability kernel $P$ such that $\pi P = \pi$.
• This is the most difficult part.
Markov Chain Monte Carlo (cont'd)
• Then we simulate the chain, usually in two stages:
• (Burn-in stage) simulate the chain and ignore all samples, until it gets close to stationary.
• (Sampling stage) collect samples from a subsampled chain with kernel $P_a$ or $P^m$.
• Approximate the expectation of the function $f$ of interest using the sample mean.
Detailed Balance
Most Markov chains in MCMC practice fall into a special family: reversible chains.
• A distribution $\pi$ over a countable space is said to be in detailed balance with $P$ if
  $\pi(i)\, P(i, j) = \pi(j)\, P(j, i) \quad \text{for all } i, j.$
• Detailed balance implies invariance.
• The converse is not true.
Reversible)Chains
An#irreducible#Markov#chain# #with#transi5on#probability#matrix# #and#an#invariant#distribu5on# :
• This&Markov&chain&is&called&reversible&if& &is&in&detailed+balance&with& .
• Under&this&condi7on,&it&has:
85
Reversible Chains on General Spaces
Over a general measurable space $\Omega$, a stochastic kernel $P$ is called reversible w.r.t. a probability measure $\pi$ if
  $\int f(x, y)\, \pi(dx)\, P(x, dy) = \int f(y, x)\, \pi(dx)\, P(x, dy)$
for any bounded measurable function $f$.
• If $P$ is reversible w.r.t. $\pi$, then $\pi$ is invariant w.r.t. $P$.
Detailed Balance on General Spaces
Suppose both $\pi$ and $P(x, \cdot)$ are absolutely continuous w.r.t. a base measure $\mu$:
  $\pi(dx) = p(x)\, \mu(dx), \quad P(x, dy) = t(x, y)\, \mu(dy)$
Then the chain is reversible if and only if
  $p(x)\, t(x, y) = p(y)\, t(y, x),$
which is called the detailed balance.
Detailed Balance on General Spaces (cont'd)
More generally, if
  $P(x, dy) = r(x)\, \delta_x(dy) + t(x, y)\, \mu(dy),$
where $r(x) = 1 - \int t(x, y)\, \mu(dy)$, then the chain is reversible iff
  $p(x)\, t(x, y) = p(y)\, t(y, x).$
Metropolis-Hastings: Overview
• In MCMC practice, the target distribution $\pi$ is usually known up to an unnormalized density $\tilde{p}$, such that $\pi(x) = \tilde{p}(x) / Z$, and the normalizing constant $Z$ is often intractable to compute.
• The Metropolis-Hastings algorithm (M-H algorithm) is a classical and popular approach to MCMC sampling, which requires only the unnormalized density $\tilde{p}$.
Metropolis-Hastings: How It Works
1. It is associated with a proposal kernel $Q$.
2. At each iteration, a candidate is generated from $Q(x, \cdot)$ given the current state $x$.
3. With a certain acceptance ratio, which depends on both $\tilde{p}$ and $Q$, the candidate is accepted.
• The acceptance ratio is determined in a way that maintains detailed balance, so the resultant chain is reversible w.r.t. $\pi$.
Metropolis Algorithm
• The Metropolis algorithm is a precursor (and a special case) of the M-H algorithm, which requires the designer to provide a symmetric kernel $Q$, i.e. $q(x, y) = q(y, x)$, where $q$ is the density of $Q$.
• Note: $\pi$ is not necessarily invariant w.r.t. $Q$.
• A Gaussian random walk is a symmetric kernel.
Metropolis Algorithm (cont'd)
• At each iteration, with current state $x$:
1. Generate a candidate $x'$ from $Q(x, \cdot)$.
2. Accept the candidate with acceptance ratio $a(x, x') = \min\{1,\ \tilde{p}(x') / \tilde{p}(x)\}$.
• The Metropolis update satisfies detailed balance. Why?
Metropolis-Hastings Algorithm
• The Metropolis-Hastings algorithm requires a proposal kernel $Q$.
• $Q$ is NOT necessarily symmetric and does NOT necessarily admit $\pi$ as an invariant measure.
Metropolis-Hastings Algorithm (cont'd)
• At each iteration, with current state $x$:
• Generate a candidate $x'$ from $Q(x, \cdot)$.
• Accept the candidate with acceptance ratio $a(x, x')$, with
  $a(x, x') = \min\left\{1,\ \frac{\tilde{p}(x')\, q(x', x)}{\tilde{p}(x)\, q(x, x')}\right\}$
• The Metropolis-Hastings update satisfies detailed balance. Why?
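A sketch of the update above with a Gaussian random-walk proposal (my illustrative choice of target and step size). Since this proposal is symmetric, $q(x', x) = q(x, x')$ cancels and the acceptance ratio reduces to the Metropolis form:

```python
import math
import random

def metropolis_hastings(p_tilde, x0, n, step, rng):
    """Random-walk M-H targeting an unnormalized density p_tilde.
    The Gaussian random-walk proposal is symmetric, so the acceptance
    ratio reduces to min(1, p_tilde(x') / p_tilde(x))."""
    x, chain = x0, []
    for _ in range(n):
        x_new = x + rng.gauss(0.0, step)                 # candidate from Q(x, .)
        if rng.random() < min(1.0, p_tilde(x_new) / p_tilde(x)):
            x = x_new                                    # accept; otherwise stay
        chain.append(x)
    return chain

# Target: N(3, 1), known only up to its normalizing constant.
p_tilde = lambda x: math.exp(-0.5 * (x - 3.0) ** 2)
rng = random.Random(0)
chain = metropolis_hastings(p_tilde, x0=0.0, n=60_000, step=1.0, rng=rng)
burned = chain[10_000:]          # discard burn-in samples
mean = sum(burned) / len(burned) # should approach the target mean, 3
```

Note the algorithm never touches the normalizing constant; only ratios of $\tilde{p}$ appear, exactly as the overview slide promised.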
Gibbs Sampling
• The Gibbs sampler was introduced by Geman and Geman (1984) for sampling from a Markov random field over images, and popularized by Gelfand and Smith (1990).
• Each state $x$ is comprised of multiple components $x = (x_1, \ldots, x_d)$.
Gibbs%Sampling%(cont'd)
At#each#itera*on,#following#a#permuta*on# #over#.#For# ,#let# ,#update# #by#
re;drawing# #condi*oned#on#all#other#components:
Here$ $indicates$the$condi&onal)distribu&on$of$ $given$all$other$components.$
96
Gibbs Sampling (cont'd)
• At each iteration, one can use either a random scan or a fixed scan.
• Different schedules can be used at different iterations to scan the components.
• The Gibbs update is a special case of the M-H update, and thus satisfies detailed balance. Why?
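A fixed-scan sketch of the update rule on a toy target of my choosing: a bivariate normal with zero means, unit variances, and correlation $\rho$, whose full conditionals are themselves Gaussian, $x_1 \mid x_2 \sim \mathcal{N}(\rho x_2,\ 1 - \rho^2)$ and symmetrically for $x_2 \mid x_1$:

```python
import math
import random

rho = 0.8                            # correlation of the toy bivariate normal
rng = random.Random(0)
sd = math.sqrt(1.0 - rho * rho)      # conditional standard deviation

x1, x2 = 0.0, 0.0
samples = []
for _ in range(60_000):
    # fixed scan: update x1 then x2, each re-drawn from its full conditional
    x1 = rng.gauss(rho * x2, sd)
    x2 = rng.gauss(rho * x1, sd)
    samples.append((x1, x2))

burned = samples[10_000:]            # discard burn-in
# With unit marginal variances, E[x1 * x2] under the target equals rho.
corr = sum(a * b for a, b in burned) / len(burned)
```

Each step samples a one-dimensional Gaussian, yet the chain's equilibrium distribution is the full bivariate target; the empirical correlation recovers $\rho$.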
Example 1: Gaussian Mixture Model
Example 1: Gibbs Sampling
Given the observations $x$, initialize the component labels $z$ and the component parameters $\theta$.
Conditioned on $x$ and $\theta$, re-draw each label:
  $p(z_i = k \mid x_i, \theta) \propto \pi_k\, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)$
Example 1: Gibbs Sampling (cont'd)
Conditioned on $x$ and $z$, re-draw the component parameters:
  $\theta_k \sim p(\theta_k \mid \{x_i : z_i = k\}),$
with the conditional given by the posterior of component $k$ under its (conjugate) prior.
Example 2: Ising Model
The normalizing constant $Z$ is usually unknown and intractable to compute.
Example 2: Gibbs Sampling
• Let $x_{\setminus i}$ denote the entire $x$ vector except for one entry $x_i$.
• How can we schedule the computation so that many updates can be done in parallel?
• Coloring
Mixture of MCMC Kernels
Let $P_1, \ldots, P_K$ be stochastic kernels with the same invariant probability measure $\pi$:
• Let $(\alpha_1, \ldots, \alpha_K)$ be a probability vector; then
  $P = \sum_{k=1}^{K} \alpha_k P_k$
remains a stochastic kernel with invariant probability measure $\pi$.
• Furthermore, if $P_1, \ldots, P_K$ are all reversible, then $P$ is reversible.
Composition of MCMC Kernels
• $P = P_1 P_2 \cdots P_K$ is also a stochastic kernel with invariant probability measure $\pi$.
• Note: $P$ is generally not reversible even when $P_1, \ldots, P_K$ are all reversible, except in special cases (e.g. when the kernels commute).