LEARNING AND INFERENCE IN GRAPHICAL MODELS
Chapter 07: Monte Carlo Methods
Dr. Martin Lauer
Machine Learning Lab, University of Freiburg
Institute of Measurement and Control Systems, Karlsruhe Institute of Technology
Learning and Inference in Graphical Models. Chapter 07 – p. 1/44
References for this chapter
◮ Christopher M. Bishop, Pattern Recognition and Machine Learning, ch. 11, Springer, 2006
◮ Christophe Andrieu, Nando de Freitas, Arnaud Doucet, and Michael I. Jordan, An Introduction to MCMC for Machine Learning, In: Machine Learning, vol. 50, no. 1–2, pp. 5–43, 2003
◮ Christian P. Robert and George Casella, Monte Carlo Statistical Methods, Springer, 1999
◮ Radford M. Neal, Slice Sampling, In: Annals of Statistics, vol. 31, no. 3, pp. 705–767, 2003
◮ Darrall Henderson, Sheldon H. Jacobson, and Alan W. Johnson, The Theory and Practice of Simulated Annealing, In: Fred Glover and Gary A. Kochenberger (eds.), Handbook of Metaheuristics, Springer, 2003
◮ Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller, Equation of State Calculations by Fast Computing Machines, In: The Journal of Chemical Physics, vol. 21, pp. 1087–1092, 1953
◮ W. Keith Hastings, Monte Carlo Sampling Methods Using Markov Chains and Their Applications, In: Biometrika, vol. 57, pp. 97–109, 1970
◮ Stuart Geman and Donald Geman, Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images, In: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 6, pp. 721–741, 1984
◮ Donald E. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Addison-Wesley, 1997
◮ William Feller, An Introduction to Probability Theory and Its Applications, vol. 1, Wiley, 1968
Monte Carlo inference
◮ many tasks in probability theory deal with terms of the form ∫_R f(x) p(x) dx
• e.g. expectation value: ∫ x p(x) dx
• e.g. variance: ∫ x² p(x) dx − (∫ x p(x) dx)²
• e.g. expected risk: ∫ risk(x) p(x) dx
• e.g. expected gain: ∫ gain(x) p(x) dx
◮ but:
• integral often not tractable analytically
• p(·) often not known explicitly
◮ hence: replace the analytical calculation by a numerical approach → Monte Carlo approach
◮ basic idea: calculate with random samples instead of pdfs

  p(·)  −sample→  {x1, …, xN} ∼ p
  ∫_R f(x) p(x) dx  ←approximates−  (1/N) ∑_{i=1}^N f(xi)

◮ as long as N is large enough, (1/N) ∑_{i=1}^N f(xi) is a good approximation for ∫_R f(x) p(x) dx
◮ but: you need a random number generator for p
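The approximation above can be sketched in a few lines. Here the target p = N(0, 1) and f(x) = x² are assumed for illustration, so the true value of ∫ f(x) p(x) dx is the variance 1:

```python
# Monte Carlo approximation of E[f(X)] = ∫ f(x) p(x) dx, a minimal sketch.
# Assumed example: p = N(0, 1), f(x) = x^2, true value = 1.
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
sample = rng.standard_normal(N)    # {x_1, ..., x_N} ~ p
estimate = np.mean(sample**2)      # (1/N) * sum_i f(x_i)
```

The estimate converges at rate O(1/√N) regardless of the dimension of x, which is the main appeal of the approach.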
Random number generators
◮ random number generators for the uniform distribution U(0, 1): many different algorithms exist, cf. the book of Knuth
◮ quantile trick:
• assume F(x) = ∫_{−∞}^x p(t) dt is known ("cumulative distribution function", cdf)
• if u is a random sample element from U(0, 1), then F⁻¹(u) is a random sample element of p
• F⁻¹ is called the "quantile function"
◮ distribution-specific transformation tricks:
• e.g. sampling from a Gaussian: assume u1, u2 are independent random samples from U(0, 1). Then v1 = √(−2 log u1) · sin(2πu2) and v2 = √(−2 log u1) · cos(2πu2) are independent random variables from N(0, 1)
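Both tricks are easy to sketch. The quantile trick is shown for the exponential distribution Exp(1) (an assumed example, where F⁻¹(u) = −log(1 − u)); the Gaussian transformation is exactly the one stated above:

```python
import numpy as np

rng = np.random.default_rng(0)

# quantile trick, assumed example Exp(1): F(x) = 1 - e^{-x}, F^{-1}(u) = -log(1-u)
u = rng.uniform(size=100_000)
exp_sample = -np.log(1.0 - u)          # sample from Exp(1)

# Gaussian transformation trick: two N(0,1) variables from two U(0,1) variables
u1 = rng.uniform(size=50_000)
u2 = rng.uniform(size=50_000)
v1 = np.sqrt(-2.0 * np.log(u1)) * np.sin(2.0 * np.pi * u2)
v2 = np.sqrt(-2.0 * np.log(u1)) * np.cos(2.0 * np.pi * u2)
```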
Random number generators
◮ what can we do if we do not find any trick? → accept-reject sampling
◮ assume:
• we want to sample from a distribution with pdf p
• we own a random number generator for a distribution with pdf q
• we know a constant M such that M · q(x) ≥ p(x) for all x
• how can we use the random number generator for q to sample from p?

(figure: densities p and q with sample points x1, x2 and values p(x1), q(x1), p(x2), q(x2))
Accept-reject sampling
(figure: densities p and M · q with a sample point x and the values p(x), M · q(x))

◮ a sample from q yields x
◮ accept x with probability p(x) / (M · q(x))
◮ otherwise, reject x

the set of all accepted sample elements yields a sample from p since

  q(x) · p(x) / (M · q(x)) = (1/M) · p(x) ∝ p(x)

on average the algorithm accepts only a fraction 1/M of all sample elements
→ choose an appropriate q so that M remains small
Extension: accept-reject sampling works even if p is only known up to a constant factor
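A minimal sketch of the scheme, with an assumed target p = Beta(2, 2) (i.e. p(x) = 6x(1−x) on [0, 1], maximal value 1.5) and a uniform proposal q = U(0, 1), so M = 1.5 suffices:

```python
import numpy as np

rng = np.random.default_rng(0)

def p(x):                         # assumed target density: Beta(2,2), max 1.5 at x = 0.5
    return 6.0 * x * (1.0 - x)

M = 1.5                           # M * q(x) >= p(x) with q = U(0,1), i.e. q(x) = 1
accepted = []
n_total = 0
while len(accepted) < 10_000:
    x = rng.uniform()             # sample from q
    n_total += 1
    if rng.uniform() < p(x) / (M * 1.0):   # accept with probability p(x) / (M q(x))
        accepted.append(x)
accepted = np.array(accepted)
```

As stated above, the acceptance ratio comes out near 1/M = 2/3 here; a poorly chosen q with large M wastes most proposals.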
Example: robot localization
◮ robot is located within an area of size 1m × 1m
◮ it has sensors to measure the distance to the corners of the field

(figure: robot at position ~x with distances d1, …, d4 to the corners ~e1, …, ~e4)

◮ Bayesian network: σ and ~x are parents of the observed distances d1, d2, d3, d4
◮ distributions:

  ~x ∼ U([0, 1] × [0, 1])
  di | ~x ∼ N(‖~x − ~ei‖, σ²)
  ~x | d1, d2, d3, d4 ∼ ?
Example: robot localization
◮ calculating the posterior

  p(~x | d1, d2, d3, d4) ∝ p(d1|~x) p(d2|~x) p(d3|~x) p(d4|~x) · p(~x)
                         ∝ exp( −(1/(2σ²)) ∑_{i=1}^4 (‖~x − ~ei‖ − di)² ) · I[0 ≤ x1, x2 ≤ 1]
                         ≤ 1

◮ apply accept-reject sampling with M = 1, q(~x) = I[0 ≤ x1, x2 ≤ 1]
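A sketch of this sampler; σ and the measured distances d_i are assumed values for the demo (noise-free measurements from a true position), not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
corners = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)   # ~e_1 .. ~e_4
true_x = np.array([0.5, 0.5])                                       # assumed true position
sigma = 0.05                                                        # assumed sensor noise
d = np.linalg.norm(true_x - corners, axis=1)                        # noise-free measurements

def p_tilde(x):
    """Unnormalized posterior, bounded above by 1, so M = 1 with q = U on the unit square."""
    r = np.linalg.norm(x - corners, axis=1) - d
    return np.exp(-np.sum(r**2) / (2.0 * sigma**2))

accepted = []
while len(accepted) < 100:
    x = rng.uniform(size=2)              # proposal from q
    if rng.uniform() < p_tilde(x):       # accept with probability p(x) / (M q(x))
        accepted.append(x)
accepted = np.array(accepted)
```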
Example: robot localization
◮ results

(plots: 100 accepted sample points in the unit square for the true positions
 ~x = (0.5, 0.5) with 3627 rejected, ~x = (0.3, 0.2) with 3087 rejected,
 and ~x = (0, 0) with 5792 rejected candidates)
◮ question: how can we create more efficient sampling schemes?
Side topic: Markov chains and their properties
Markov chains
Definitions:
◮ A Markov model or Markov chain is a Bayesian network which is organized as a chain of random variables Xi where Xi+1 solely depends on Xi.

  X1 → X2 → · · · → Xi → · · · → Xn

◮ The set of values that Xi can take is called the state set S.
◮ The transition between subsequent states is given by a transition kernel T(Xi+1 | Xi)
• T(Xi+1 | Xi) is a conditional probability if the state set is discrete
• T(Xi+1 | Xi) is a conditional density if the state set is continuous
For the moment, we focus on Markov chains with a finite, discrete state set.
◮ A Markov chain is homogeneous if the transition kernel T is invariant w.r.t. time
Markov chains
◮ A transition diagram for a homogeneous Markov chain is a directed graph with one vertex for each state and an edge between vertex u and v if T(v|u) > 0

(transition diagrams of three example chains A, B, and C over the states s1, …, s4)
Markov chains
◮ A homogeneous Markov chain is irreducible if for all states u, v there exists a sequence of states s1, …, sn with u = s1 and sn = v such that T(si+1|si) > 0.
◮ The period of a state s is given by

  gcd{ N ∈ ℕ | there exist s1, …, sN−1 with T(s1|s) > 0, T(s|sN−1) > 0,
       and T(si+1|si) > 0 for all i ∈ {1, …, N−2} }

◮ A homogeneous Markov chain is aperiodic if the period of all states is 1.

(transition diagrams of the example chains A, B, and C)
Ergodic Markov chains
◮ Given a homogeneous Markov chain with discrete, finite state set S we can arrange all transition probabilities in a transition matrix M with Mi,j = T(sj|si). Hence, each row of M is the probability vector of a categorical distribution over S.
◮ Given a transition matrix M and a categorical distribution over the state set with probability vector (row vector) ~w, we obtain the distribution of successor states as ~w · M.
◮ A categorical distribution with probability vector ~w is a stationary distribution of a Markov chain with transition matrix M if ~w · M = ~w.
◮ A homogeneous Markov chain with discrete, finite state set S is ergodic if lim_{k→∞} M^k exists, all rows in lim_{k→∞} M^k are identical, and lim_{k→∞} M^k does not contain zeros. Then each row of lim_{k→∞} M^k is the probability vector of a stationary categorical distribution over S.
Ergodic Markov chains
(transition diagrams of the example chains A, B, and C)

  MA =
    3/10  7/10  0     0
    0     0     9/10  1/10
    1/2   0     0     1/2
    0     0     1     0

  MB =
    0     1     0     0
    0     0     0     1
    1/2   0     0     1/2
    0     0     1     0

  MC =
    1     0     0     0
    0     0     1/3   2/3
    1/2   0     0     1/2
    0     0     1     0
Which of these Markov chains are ergodic?
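The definition can be checked numerically by raising the transition matrix to a high power. The sketch below tests MA; the same check suggests that only chain A is ergodic (B is periodic, C has an absorbing state, so their limits fail the "identical rows without zeros" condition):

```python
import numpy as np

# transition matrix M_A from the slides
MA = np.array([[0.3, 0.7, 0.0, 0.0],
               [0.0, 0.0, 0.9, 0.1],
               [0.5, 0.0, 0.0, 0.5],
               [0.0, 0.0, 1.0, 0.0]])

# for an ergodic chain, all rows of M^k converge to the stationary probability vector
Mk = np.linalg.matrix_power(MA, 1000)
w = Mk[0]                      # any row approximates the stationary distribution
```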
Ergodic Markov chains
◮ Theorem: if a homogeneous Markov chain with discrete, finite state set is irreducible and aperiodic, it is also ergodic.
  Proof: see literature, e.g. Feller 1968
◮ What happens if we sample a very long sequence from an ergodic Markov chain?
• the first part of the sample will depend on the initial state (burn-in phase)
• after burn-in, the sample is drawn from the stationary distribution of the Markov chain
• the sample elements are dependent on each other
Ergodic Markov chains
Good and bad mixing behavior of a Markov chain
  MA = ( 1/2  1/2 )        MB = ( 99/100   1/100 )
       ( 1/2  1/2 )             (  1/100  99/100 )

(transition diagram over the two states s1, s2)

Both Markov chains share the same stationary distribution; however, the mixing is very different. E.g. a random sample from chain A:

  s1, s1, s2, s1, s2, s2, s2, s1, s1, s2, s1, s1, s2, s2, s1, s2, …

and a sample from chain B:

  s1, s1, s1, s1, s1, s1, s1, s1, s1, s1, s1, s1, s1, s1, s1, s1, s2, s2, s2, … (a long run of s2) …, s1, s1, s1, s1, …
Markov chains with continuous state space
If S ⊆ ℝ^d we have to replace the transition matrix by a transition kernel T(Xt+1 | Xt), i.e. a conditional probability density.

E.g. T(v|u) = (1/√(2π)) · e^(−(1/2)(u + 1 − v)²)

Now, a stationary distribution is a pdf p(·) with ∫ T(v|u) · p(u) du = p(v)

Most results (especially those about ergodic chains) can be transferred from discrete state spaces to continuous state spaces. For details, cf. the book of Christian P. Robert & George Casella, Monte Carlo Statistical Methods, Springer, 1999.
Designing Markov chains
We want to create a Markov chain with a specific stationary distribution p(·). How can we design the transition kernel?

Theorem: If a transition kernel T meets the detailed balance equation

  T(v|u) · p(u) = T(u|v) · p(v)

for all states u, v ∈ S, then p is a stationary distribution of T. In this case T is called reversible.

Proof:
  ∫ T(v|u) p(u) du = ∫ T(u|v) p(v) du = (∫ T(u|v) du) · p(v) = p(v)
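The theorem can be checked numerically on a small discrete chain. The three-state target p below is an assumed example; the kernel is built Metropolis-style so that detailed balance holds by construction, and stationarity p · T = p then follows:

```python
import numpy as np

p = np.array([0.2, 0.3, 0.5])      # assumed target distribution over three states
n = len(p)
q = 1.0 / n                        # uniform proposal over all states

T = np.zeros((n, n))
for u in range(n):
    for v in range(n):
        if u != v:
            # accept a proposed move u -> v with probability min(1, p(v)/p(u))
            T[u, v] = q * min(1.0, p[v] / p[u])
    T[u, u] = 1.0 - T[u].sum()     # rejected moves stay at u
```

Off the diagonal, T(v|u)·p(u) = q·min(p(u), p(v)) is symmetric in u and v, which is exactly the detailed balance equation.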
Designing Markov chains
Theorem: If T1 and T2 are transition kernels with stationary distribution p, then T = T2 ∘ T1 is a transition kernel with stationary distribution p. T is defined as

  T(v|u) = ∫ T2(v|w) · T1(w|u) dw

(diagram: alternating transitions X1 −T1→ X′1 −T2→ X2 −T1→ X′2 −T2→ X3 → · · ·)

Proof:
  ∫ T(v|u) p(u) du = ∫∫ T2(v|w) T1(w|u) dw p(u) du
                   = ∫ T2(v|w) (∫ T1(w|u) p(u) du) dw
                   = ∫ T2(v|w) p(w) dw = p(v)
Designing Markov chains
Theorem: If T1 and T2 are transition kernels with stationary distribution p and 0 < q < 1, then T = q · T1 + (1 − q) · T2 is a transition kernel with stationary distribution p. T is defined as T(v|u) = q · T1(v|u) + (1 − q) · T2(v|u)

(diagram: at each step the chain follows T1 with probability q and T2 with probability 1 − q)

Proof:
  ∫ T(v|u) p(u) du = ∫ (q · T1(v|u) + (1 − q) · T2(v|u)) p(u) du
                   = q · ∫ T1(v|u) p(u) du + (1 − q) · ∫ T2(v|u) p(u) du = p(v)
Markov Chain Monte Carlo Sampling
Markov chain Monte Carlo
◮ task
• we want to sample from a distribution p
• standard sampling tricks are not applicable
◮ basic idea:
• design a Markov chain with stationary distribution p
• sample from the Markov chain; reject initial sample elements
• obtain a dependent sample from the target distribution p

(figure: chain trajectory starting at x0; during burn-in the distribution depends on the initial state, afterwards it is almost the stationary target distribution)

◮ approach is known as Markov chain Monte Carlo sampling (MCMC)
• Metropolis-Hastings algorithm (Metropolis, 1953), (Hastings, 1970)
• Gibbs sampling (Geman and Geman, 1984)
• Slice sampling (Neal, 2003)
Metropolis-Hastings algorithm
◮ basic idea:
• sample candidates for the successor state using a distribution q
• apply the detailed balance equation to calculate an acceptance probability
◮ principle:

(figure: target density p with current state xt, proposal densities q(·|xt) and q(·|zt))

  transition:
  • sample zt ∼ q(·|xt)
  • set xt+1 = zt with probability min{1, (p(zt) · q(xt|zt)) / (p(xt) · q(zt|xt))}
  • otherwise set xt+1 = xt

◮ the acceptance probability simplifies to min{1, p(z)/p(x)} if q is symmetric (Metropolis algorithm)
Metropolis-Hastings algorithm
The transition kernel of the Metropolis-Hastings algorithm is

  T(v|u) = q(v|u) · A(v|u) + δ(v − u) · ∫ q(w|u) · (1 − A(w|u)) dw

with A(v|u) = min{1, (p(v) · q(u|v)) / (p(u) · q(v|u))}

Lemma: The Metropolis-Hastings transition kernel meets the detailed balance equation.

Proof: → blackboard

Remark: Metropolis-Hastings also works if the target probability is only known up to a normalization constant
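A minimal sketch of the algorithm with a symmetric Gaussian random-walk proposal (so the Metropolis acceptance rule applies); the unnormalized target p̃(x) = e^(−x²/2), i.e. N(0, 1), is an assumed example:

```python
import numpy as np

rng = np.random.default_rng(0)

def p_tilde(x):                   # assumed unnormalized target: N(0, 1)
    return np.exp(-0.5 * x * x)

x = 0.0
chain = []
for step in range(60_000):
    z = x + rng.normal(scale=1.0)                        # symmetric proposal q(.|x)
    if rng.uniform() < min(1.0, p_tilde(z) / p_tilde(x)):
        x = z                                            # accept the candidate
    chain.append(x)                                      # on rejection, x is repeated

chain = np.array(chain[5_000:])                          # discard the burn-in phase
```

Note that only the ratio p̃(z)/p̃(x) enters, which is why the normalization constant of p is never needed.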
Example: robot localization revisited
◮ robot localization example solved with the Metropolis-Hastings algorithm
  sample distribution q: ~z ∼ ~x + U([−0.1, 0.1] × [−0.1, 0.1])
◮ created samples (each 200 elements):

(plots: sampled points in the unit square for the true positions ~x = (0.5, 0.5) with 93 candidates rejected, ~x = (0.3, 0.2) with 82 candidates rejected, and ~x = (0, 0) with 106 candidates rejected)
Gibbs sampling
◮ sampling from a bivariate distribution ~x = (x1, x2)
◮ Metropolis-Hastings with q(z1, z2 | x1, x2) = I[x1=z1] · p(z2|x1)

  (p(z1, z2) · q(x1, x2 | z1, z2)) / (p(x1, x2) · q(z1, z2 | x1, x2))
    = (p(z1, z2) · I[x1=z1] · p(x2|z1)) / (p(x1, x2) · I[x1=z1] · p(z2|x1))
    = (p(x1, z2) · p(x2|x1)) / (p(x1, x2) · p(z2|x1))
    = (p(z2|x1) · p(x1) · p(x2|x1)) / (p(x2|x1) · p(x1) · p(z2|x1)) = 1

  i.e. x1 is clamped while x2 ∼ p(x2|x1) is sampled; since the acceptance probability is 1, every candidate is accepted
◮ analogously: clamp x2 and sample x1 ∼ p(x1|x2)
◮ Gibbs sampling: concatenate both steps
Example: uniform distribution over a parabola
◮ sample from a uniform distribution over a frustum of a parabola
◮ p(x1, x2) ∝ I[−2 ≤ x1 ≤ 2] · I[x1² ≤ x2 ≤ 4]

  x2 | x1 ∼ U(x1², 4)
  x1 | x2 ∼ U(−√x2, √x2)

(figure: the region between the parabola x2 = x1² and the line x2 = 4, with a Gibbs trajectory xt → x̃(t) → xt+1 → x̃(t+1) → xt+2 of alternating horizontal and vertical moves)
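The two conditionals give a complete Gibbs sampler for this region; a minimal sketch (iteration counts and seed are assumed):

```python
import numpy as np

rng = np.random.default_rng(0)

# Gibbs sampling for the uniform distribution on {(x1,x2): -2 <= x1 <= 2, x1^2 <= x2 <= 4}
x1, x2 = 0.0, 1.0
samples = []
for step in range(50_000):
    x2 = rng.uniform(x1 * x1, 4.0)                  # x2 | x1 ~ U(x1^2, 4)
    x1 = rng.uniform(-np.sqrt(x2), np.sqrt(x2))     # x1 | x2 ~ U(-sqrt(x2), sqrt(x2))
    samples.append((x1, x2))
samples = np.array(samples[1_000:])                 # discard burn-in
```

For this region the exact marginal means are E[x1] = 0 (by symmetry) and E[x2] = 2.4, which the sample averages reproduce.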
Gibbs sampling
generalization to multivariate distributions p(x1, …, xd):

◮ sample x1^(t+1) ∼ p(x1 | x2^(t), …, xd^(t))
◮ sample x2^(t+1) ∼ p(x2 | x1^(t+1), x3^(t), …, xd^(t))
  ⋮
◮ sample xd^(t+1) ∼ p(xd | x1^(t+1), …, xd−1^(t+1))
Example: bearing-only tracking
◮ observing a moving object from a fixed position
◮ object moves with constant velocity
◮ for every point in time, the observer senses the angle of observation, but only sometimes the distance to the object
◮ distributions:

  ~x0 ∼ N(~a, R)
  ~v ∼ N(~b, S)
  ~yi | ~x0, ~v ∼ N(~x0 + ti~v, σ²I)
  ri = ‖~yi‖
  ~wi = ~yi / ‖~yi‖

(figure: object movement with unknown distance and observed angle of observation; plate diagram with unknown ~x0 and ~v, observed ti and ~wi, sometimes observed ri, noise level σ, and latent ~yi, repeated n times)
Example: bearing-only tracking
◮ conditional distributions for ~x0, ~v:

  ~x0 | ~v, (~yi), (ti) ∼ N( ((n/σ²)I + R⁻¹)⁻¹ · ((1/σ²) ∑(~yi − ti~v) + R⁻¹~a), ((n/σ²)I + R⁻¹)⁻¹ )

  ~v | ~x0, (~yi), (ti) ∼ N( ((1/σ²) ∑ti² I + S⁻¹)⁻¹ · ((1/σ²) ∑ti(~yi − ~x0) + S⁻¹~b), ((1/σ²) ∑ti² I + S⁻¹)⁻¹ )

◮ conditional distribution for ri:

  p(~yi | ~x0, ~v, ti) ∝ exp{ −(1/(2σ²)) ‖~x0 + ti~v − ~yi‖² }
  p(~yi | ~x0, ~v, ti, ~wi) ∝ exp{ −(1/(2σ²)) ‖~x0 + ti~v − ~yi‖² } · I[~yi ∥ ~wi]
  p(ri | ~x0, ~v, ti, ~wi) ∝ exp{ −(1/(2σ²)) ‖~x0 + ti~v − ri~wi‖² }
                           = exp{ −(1/(2σ²)) (‖~x0 + ti~v‖² − 2ri · ~wiᵀ(~x0 + ti~v) + ri²) }
                           ∝ exp{ −(1/(2σ²)) (ri − ~wiᵀ(~x0 + ti~v))² }

  ⇒ ri | ~x0, ~v, ti, ~wi ∼ N(~wiᵀ(~x0 + ti~v), σ²)
Example: bearing-only tracking
◮ Gibbs sampling with non-informative priors (R⁻¹ = S⁻¹ = 0):

  ~x0 ∼ N( (1/n) ∑(~yi − ti~v), (σ²/n) I )
  ~v ∼ N( ∑ti(~yi − ~x0) / ∑ti², (σ²/∑ti²) I )
  ri ∼ N( ~wiᵀ(~x0 + ti~v), σ² )

◮ results and Matlab demo

(plot: estimated trajectory after 1000 iterations)
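The non-informative-prior updates can be sketched directly. All ground-truth values, the set of time steps, and the choice of which distances are observed are assumptions for this demo; the unobserved distances ri are resampled from their conditional, the observed ones stay clamped:

```python
import numpy as np

rng = np.random.default_rng(0)

# assumed ground truth for the demo
x0_true = np.array([-5.0, 8.0])
v_true = np.array([1.5, -0.2])
sigma = 0.1
t = np.arange(20, dtype=float)
y_true = x0_true + t[:, None] * v_true + sigma * rng.normal(size=(20, 2))
w = y_true / np.linalg.norm(y_true, axis=1, keepdims=True)  # observed directions ~w_i
obs = np.zeros(len(t), dtype=bool)
obs[[0, -1]] = True                                         # distance observed only sometimes
r = np.where(obs, np.linalg.norm(y_true, axis=1), 1.0)      # unknown r_i initialized to 1

x0, v = np.zeros(2), np.zeros(2)
x0_draws, v_draws = [], []
for it in range(3_000):
    yi = r[:, None] * w                                     # current reconstruction of ~y_i
    # ~x0 ~ N( (1/n) sum(y_i - t_i v), (sigma^2/n) I )
    x0 = (yi - t[:, None] * v).mean(axis=0) + sigma / np.sqrt(len(t)) * rng.normal(size=2)
    # ~v ~ N( sum t_i (y_i - x0) / sum t_i^2, (sigma^2 / sum t_i^2) I )
    st2 = np.sum(t**2)
    v = np.sum(t[:, None] * (yi - x0), axis=0) / st2 + sigma / np.sqrt(st2) * rng.normal(size=2)
    # r_i ~ N( w_i^T (x0 + t_i v), sigma^2 ), only for the unobserved distances
    mu_r = np.sum(w * (x0 + t[:, None] * v), axis=1)
    r = np.where(obs, r, mu_r + sigma * rng.normal(size=len(t)))
    if it >= 1_000:
        x0_draws.append(x0)
        v_draws.append(v)

x0_mean = np.mean(x0_draws, axis=0)
v_mean = np.mean(v_draws, axis=0)
```

With purely angular observations and fully non-informative priors the overall scale would be unidentified, which is why this sketch anchors two distances.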
Gibbs sampling for Gaussian mixture
(plate diagram: hyperparameters m0, r0, a0, b0 and ~β; component parameters µj, sj for j = 1, …, k; mixture weights ~w; latent assignments Zi and observations Xi for i = 1, …, n)

  µj ∼ N(m0, r0)
  sj ∼ Γ⁻¹(a0, b0)
  ~w ∼ D(~β)
  Zi | ~w ∼ C(~w)
  Xi | Zi, µ_Zi, s_Zi ∼ N(µ_Zi, s_Zi)
Gibbs sampling for Gaussian mixture
calculate the conditionals for Gibbs sampling using the results about conjugate distributions (chapter 2)

  ~w | z1, …, zn, ~β ∼ D(β1 + n1, …, βk + nk)  with nj = |{i | zi = j}|

  µj | x1, …, xn, z1, …, zn, sj, m0, r0 ∼ N( (sj m0 + r0 ∑_{i|zi=j} xi) / (sj + nj r0), (r0 sj) / (sj + nj r0) )

  sj | x1, …, xn, z1, …, zn, µj, a0, b0 ∼ Γ⁻¹( a0 + nj/2, b0 + (1/2) ∑_{i|zi=j} (xi − µj)² )

  zi | ~w, xi, µ1, …, µk, s1, …, sk ∼ C(hi,1, …, hi,k)
  with hi,j ∝ (wj / √(2π sj)) · e^( −(1/2)(xi − µj)²/sj )
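The four conditionals translate almost line by line into a sampler. The sketch below uses k = 2, synthetic 1-D data, assumed hyperparameter values, and a thresholding initialization of the assignments; inverse-gamma draws are obtained as reciprocals of gamma draws:

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic data from two well-separated components (assumed demo setup)
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(3.0, 0.8, 200)])
n, k = len(x), 2
m0, r0, a0, b0 = 0.0, 100.0, 2.0, 1.0      # assumed, fairly vague priors
beta = np.ones(k)

z = (x > 0.5).astype(int)                   # assumed initial assignments
mu = np.zeros(k)
s = np.ones(k)                              # component variances s_j
for it in range(200):
    nj = np.array([(z == j).sum() for j in range(k)])
    w = rng.dirichlet(beta + nj)                           # ~w | z, beta
    for j in range(k):
        xj = x[z == j]
        mean = (s[j] * m0 + r0 * xj.sum()) / (s[j] + nj[j] * r0)
        var = r0 * s[j] / (s[j] + nj[j] * r0)
        mu[j] = rng.normal(mean, np.sqrt(var))             # mu_j | x, z, s_j
        b = b0 + 0.5 * np.sum((xj - mu[j]) ** 2)
        s[j] = 1.0 / rng.gamma(a0 + nj[j] / 2, 1.0 / b)    # inverse-gamma draw for s_j
    # z_i | ~w, x_i, mu, s  ~  C(h_{i,1}, ..., h_{i,k})
    h = w / np.sqrt(2 * np.pi * s) * np.exp(-0.5 * (x[:, None] - mu) ** 2 / s)
    h /= h.sum(axis=1, keepdims=True)
    z = (rng.uniform(size=(n, 1)) > np.cumsum(h, axis=1)).sum(axis=1)
```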
Gibbs sampling for Gaussian mixture
Example → Matlab demo

(plot, labeled "iteration = 116, k = 3, n = 1000": the sampled mixture after 1000 iterations of Gibbs sampling. The sample of size 1000 is taken from a uniform distribution. Priors were set close to non-informativity.)
Slice sampling
We want to sample from a distribution with density p(x).

Extend the distribution by a second variable u:

  p′(x, u) = 1 if 0 ≤ u ≤ p(x), and 0 otherwise

p′ is a pdf since it is nonnegative and ∫∫ p′(x, u) du dx = 1

(figure: density p(x) and the area 0 ≤ u ≤ p(x) on which p′(x, u) is uniform)

Apply Gibbs sampling to p′:

  u | x ∼ U(0, p(x))
  x | u ∼ U{x′ | p(x′) ≥ u}

We obtain a sample of p′. Since p is the marginal of p′, we obtain a sample of p by sampling from p′ and forgetting about the ui.
Slice sampling
Executing slice sampling on the example:

(figure: alternating horizontal and vertical Gibbs moves (x1, u1) → (x2, u1) → (x2, u2) → (x3, u2) → (x3, u3) → (x4, u3) → (x4, u4))

The crucial point in slice sampling is whether it is possible to determine the set {x′ | p(x′) ≥ u} efficiently.

Slice sampling can also be used if p is only known up to a normalization factor.
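A sketch for a case where the slice is available in closed form: with the assumed unnormalized target p̃(x) = e^(−x²/2), the set {x′ : p̃(x′) ≥ u} is the interval [−√(−2 log u), √(−2 log u)]:

```python
import numpy as np

rng = np.random.default_rng(0)

def p_tilde(x):                     # assumed unnormalized target: N(0, 1)
    return np.exp(-0.5 * x * x)

x = 0.0
samples = []
for step in range(50_000):
    u = rng.uniform(0.0, p_tilde(x))        # u | x ~ U(0, p(x))
    half = np.sqrt(-2.0 * np.log(u))        # {x': p(x') >= u} = [-half, half]
    x = rng.uniform(-half, half)            # x | u ~ uniform over the slice
    samples.append(x)
samples = np.array(samples[1_000:])         # discard burn-in, keep only the x_i
```

For targets without an analytic slice, the interval has to be found numerically (e.g. by Neal's stepping-out and shrinkage procedures).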
Simulated annealing
Can we use MCMC if we want to calculate the MAP estimator of a distribution p?

Observation: consider the sequence of densities

  p(x), (1/Z2)(p(x))², (1/Z3)(p(x))³, …

What is the limit lim_{ν→∞} (1/Zν)(p(x))^ν ?

Example:

  p(x) = 2x if 0 ≤ x ≤ 1, and 0 otherwise
  (p(x))^ν = (2x)^ν if 0 ≤ x ≤ 1, and 0 otherwise
  Zν = ∫₀¹ (2x)^ν dx = 2^ν / (ν + 1)

  lim_{ν→∞} (1/Zν)(p(x))^ν = lim_{ν→∞} { (ν + 1)x^ν if 0 ≤ x ≤ 1, 0 otherwise }
                            = δ(x − 1), i.e. ∞ at x = 1 and 0 otherwise
Simulated annealing
In general, if a density p(x) has a single global maximum at x = xmax, then the sequence (1/Zν)(p(x))^ν converges pointwise to δ(x − xmax).

Hence, the larger ν, the more probably will an MCMC sampler focus on a small surrounding of xmax.

Let us build a Metropolis-Hastings sampler with symmetric proposal distribution q(z|x). The acceptance probability is min{1, (p(z)/p(x))^ν}.

To be consistent with the literature, let us define t = 1/ν. Hence, the acceptance probability is min{1, (p(z)/p(x))^(1/t)}. t is called the temperature.

Idea: while applying the Metropolis algorithm, decrease the temperature t slowly over time → simulated annealing
Simulated annealing
Goal: find the MAP of a probability distribution with density function p

1. initialize x arbitrarily
2. initialize temperature t = 1
3. repeat
4.   sample a candidate z ∼ q(·|x)
5.   calculate the acceptance probability A = min{1, (p(z)/p(x))^(1/t)}
6.   with probability A
7.     set x ← z
8.   endif
9.   decrease the temperature t slightly
10. until convergence
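The steps above can be sketched directly, using the slides' example density p(x) = 2x on [0, 1], whose MAP is x = 1; the proposal scale and the geometric cooling rate are assumed values:

```python
import numpy as np

rng = np.random.default_rng(0)

def p_tilde(x):                 # density from the example: p(x) = 2x on [0, 1]
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

x = 0.1                         # step 1: arbitrary initialization
t_temp = 1.0                    # step 2: initial temperature
for step in range(5_000):
    z = x + rng.normal(scale=0.1)                 # step 4: symmetric proposal q(.|x)
    if p_tilde(z) <= 0.0:
        A = 0.0                                   # candidates outside [0,1] are rejected
    else:
        A = min(1.0, (p_tilde(z) / p_tilde(x)) ** (1.0 / t_temp))   # step 5
    if rng.uniform() < A:                         # steps 6-8
        x = z
    t_temp *= 0.999                               # step 9: slow geometric cooling
```

As the temperature falls, downhill moves become ever less likely and the state x settles near the maximizer x = 1.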
Simulated annealing
Simulated annealing is guaranteed to find the MAP estimate with probability 1 if
◮ the Markov chain generated by the proposal distribution q is ergodic for any choice of t > 0
◮ the cooling scheme is sufficiently slow

Proof idea:
◮ since q generates an ergodic Markov chain, it will sample from the full distribution (1/Z_{1/t})(p(x))^(1/t) after a certain burn-in period if we keep t constant
◮ the smaller t is, the more time the Markov chain will stay in the close surrounding of xmax
◮ since (1/Z_{1/t})(p(x))^(1/t) → δ(x − xmax), the Markov chain will converge to xmax

Background remark: simulated annealing was motivated by the physical annealing of solids.

→ Matlab demo: robot localization, Metropolis algorithm vs. simulated annealing
Summary
◮ Monte Carlo approximation
◮ accept-reject sampling
• example: robot localization
◮ Markov chains
• ergodic Markov chains
• design of transition kernels
◮ Metropolis-Hastings algorithm
• example: robot localization
◮ Gibbs sampling
• example: bearing-only tracking
• example: Gaussian mixture
◮ slice sampling
◮ simulated annealing