Markov Models and Applications


Page 1: Markov Models and Applications


Markov Models and Applications

• Mm1 Discrete time Markov processes

• Mm2 Continuous time Markov processes

• Mm3 M/M/1 type models

• Mm4 Advanced queueing models

• Mm5 Hidden Markov Models and their application (hps)

[email protected]

http://www.kom.auc.dk/~hps

Henrik Schiøler, Hans-Peter Schwefel

Note: slide-set will be complemented by formulas, mathematical derivations, and examples on the black-board!

Page 2: Markov Models and Applications


Motivation: Stochastic models

• Goals:
– Quantitative analysis of (communication) systems, e.g., Quality of Service
– Enhanced algorithms for information processing: 'extrapolation', error concealment, localisation, fault detection, etc.

• Stochastic impact:
– Error models
– Randomization in transmission protocols
– Abstraction of complex systems using statistics
– Human impact (e.g., traffic, mobility models)

• Frequent use of stochastic models:
– Simulation models → stochastic simulation
– Analytic models, e.g., Markovian type, stochastic Petri nets

Page 3: Markov Models and Applications


Content

1. Intro
2. Revision: Discrete Time Markov Processes
   • Definition, basic properties
   • State probabilities, steady-state analysis
   • Parameter estimation, example: mobility model
3. Hidden Markov Models
   • Definition & example
   • Efficient computation of Pr(observation)
   • Most likely state sequence
   • Parameter estimation
4. Application Examples of HMMs
   • Link error models
   • Mobility models, positioning
   • Fault detection
   • Error concealment
5. Summary & Exercises

Page 4: Markov Models and Applications


Discrete Time Markov Processes

• Definition
– State space: finite or countably infinite, w.l.o.g. E={1,2,...,N} (N=∞ also allowed)
– Transition probabilities: p_jk = Pr(transition from state j to state k)
– X_i = random variable indicating the state of the Markov process in step i
– 'Markov property': the state in step i only depends on the state in step i-1
   • Pr(X_i=s | X_{i-1}=s_{i-1}, X_{i-2}=s_{i-2}, ..., X_0=s_0) = Pr(X_i=s | X_{i-1}=s_{i-1})

• Computation of state probabilities
– Initial state probabilities (step 0): π_0
– Probability of state sequence s_0, s_1, ..., s_i: Pr(X_0=s_0, X_1=s_1, ..., X_i=s_i) = ...
– Pr(X_i=k) = Σ_j [Pr(X_{i-1}=j) · p_jk], i.e., π_i = π_{i-1} P

• State-holding time: geometric with parameter p_ii

• Parameter estimation for 'observable' discrete-time Markov chains
• Example: 2-state Markov chain (state = link behavior at packet transmission, {erroneous, ideal})
– Parameter estimation, Markov property validation, limitations (see the sketch below)
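
As a small illustration, the transition probabilities of an observable chain can be estimated by counting transitions in a recorded state sequence, and the state probabilities can then be propagated via π_i = π_{i-1} P. A minimal Matlab sketch (function and variable names are illustrative choices, not from the slides):

% Sketch: estimate the transition matrix of an 'observable' chain from a recorded
% state sequence s (values in 1..N) and propagate state probabilities.
function [P, pi_i] = dtmc_estimate_and_propagate(s, N, pi0, steps)
  C = zeros(N, N);                       % transition counts C(j,k)
  for t = 1:length(s)-1
    C(s(t), s(t+1)) = C(s(t), s(t+1)) + 1;
  end
  P = C ./ max(sum(C, 2), 1);            % row-normalise: p_jk = C(j,k) / sum_k C(j,k)
                                         % (rows of states never left stay zero)
  pi_i = pi0;                            % row vector of initial state probabilities
  for i = 1:steps
    pi_i = pi_i * P;                     % pi_i = pi_{i-1} * P
  end
end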

Page 5: Markov Models and Applications


Discrete Time Markov Processes (cntd.)

• Properties
– Homogeneity: P independent of step i
– Irreducibility: each state is reachable from any other state (in potentially multiple steps)
– Transient states, positive recurrent states
– Periodicity

• Steady-state probabilities π = lim_{i→∞} π_i (a sketch follows below)
– Limit exists and is independent of π_0 if the Markov chain is irreducible and aperiodic
– Aperiodic & positive recurrent = ergodic ⇒ π is a probability distribution

• Examples (periodicity, ergodicity, steady-state probabilities, absorbing states)

• Application example: mobility model – set-up, benefits, problems
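
A minimal Matlab sketch for computing the steady-state probabilities of such a chain, using the fact that π solves π = πP together with Σ_i π_i = 1 (the function name is an illustrative choice):

% Sketch: steady-state probabilities of a finite, irreducible, aperiodic DTMC.
% Solves pi*(P - I) = 0 together with the normalisation sum(pi) = 1.
function pi_ss = dtmc_steady_state(P)
  N = size(P, 1);
  A = [P' - eye(N); ones(1, N)];         % balance equations plus normalisation row
  b = [zeros(N, 1); 1];
  pi_ss = (A \ b)';                      % least-squares solution of the overdetermined system
end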

Page 6: Markov Models and Applications


Content

1. Intro
2. Revision: Discrete Time Markov Processes
   • Definition, basic properties
   • State probabilities, steady-state analysis
   • Parameter estimation, example: mobility model
3. Hidden Markov Models
   • Definition & example
   • Efficient computation of Pr(observation)
   • Most likely state sequence
   • Parameter estimation
4. Application Examples of HMMs
   • Link error models
   • Mobility models, positioning
   • Fault detection
   • Error concealment
5. Summary & Exercises

Page 7: Markov Models and Applications


Hidden Markov Models (HMMs): Definition

• Main property
– In each state s ∈ E, an 'observation symbol' from some alphabet V is generated probabilistically
– The underlying state cannot be observed, only the sequence O=[O_1, O_2, ..., O_T] of generated symbols

• HMM = <E, V, π_1, P, B>
– E: state space (discrete, finite/infinite), w.l.o.g. E={1,2,...,N}
– V: set of possible observation symbols (discrete for now), w.l.o.g. V={1,2,...,M}
– π_1: initial state probabilities at step 1
– P: NxN matrix of state transition probabilities p_ij = Pr(X_{k+1}=j | X_k=i)
– B: NxM matrix of symbol generation probabilities b_ij = Pr(O_k=j | X_k=i)

• Example: 2-state HMM, observations = results of a biased coin toss
• Note: the discrete-time Markov model is a special case of the HMM, namely when each column of B contains at most one non-zero element

• Exercise: Write a (Matlab) program with input (π_1, P, B, T) that generates a sequence of observations of length T (a sketch is given below)
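
A minimal Matlab sketch of such a generator (the function name hmm_generate and the inverse-transform sampling helper are illustrative choices):

% Sketch: generate T observation symbols from the HMM <E, V, pi1, P, B>.
function obs = hmm_generate(pi1, P, B, T)
  obs = zeros(1, T);
  sample = @(p) find(cumsum(p) >= rand(), 1);   % draw an index with probabilities p
  x = sample(pi1);                              % initial state drawn from pi_1
  for t = 1:T
    obs(t) = sample(B(x, :));                   % emit a symbol from row x of B
    x = sample(P(x, :));                        % move to the next state
  end
end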

Page 8: Markov Models and Applications


Hidden Markov Models (HMMs): Computations

• Problem 1: Compute the probability of observing a certain sequence o=[o_1,...,o_T] in a given HMM.
– First (inefficient) approach ('brute force'):
   • Generate all possible state sequences of length T: q=[q_1,...,q_T]
   • Sum up all Pr(o | q) weighted by Pr(q) (total probability)
   • Problem: the number of paths grows exponentially as N^T
– More efficient (quadratic in N) approach: forward procedure (see the Matlab sketch below)
   • Iterative method computing probabilities for prefixes of the observation sequence:
     α_t := [Pr(O_1=o_1,...,O_t=o_t, X_t=1), ..., Pr(O_1=o_1,...,O_t=o_t, X_t=N)]
   • At step t=1: α_1(i) = Pr(O_1=o_1, X_1=i) = π_1(i) b_{i,o_1}   [Matlab notation: α_1 = π_1 .* B(:,o_1)']
   • α_t → α_{t+1} (t=1,2,...,T-1):
     α_{t+1}(i) = (Σ_{j∈E} α_t(j) p_ji) · Pr(O_{t+1}=o_{t+1} | X_{t+1}=i)   [Matlab: α_{t+1} = (α_t P) .* B(:,o_{t+1})']
   • Finally: Pr(O=o) = Σ_{j∈E} α_T(j)
   • The computation can be illustrated in a Trellis structure
– Similarly (identifiers needed later): backward procedure
     β_t := [Pr(O_{t+1}=o_{t+1},...,O_T=o_T | X_t=1), ..., Pr(O_{t+1}=o_{t+1},...,O_T=o_T | X_t=N)]
     β_T = 1 (vector with all elements = 1);   β_t = (P * (B(:,o_{t+1}) .* β_{t+1}'))'
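
A minimal Matlab sketch of the forward procedure (the function name hmm_forward is an illustrative choice; for long sequences the α vectors would additionally be scaled to avoid numerical underflow):

% Sketch: forward procedure, Pr(O=o) for an observation sequence o (values in 1..M).
function [p, alpha] = hmm_forward(pi1, P, B, o)
  T = length(o);  N = length(pi1);
  alpha = zeros(T, N);
  alpha(1, :) = pi1 .* B(:, o(1))';                        % alpha_1 = pi_1 .* B(:,o_1)'
  for t = 1:T-1
    alpha(t+1, :) = (alpha(t, :) * P) .* B(:, o(t+1))';    % alpha_{t+1} = (alpha_t P) .* B(:,o_{t+1})'
  end
  p = sum(alpha(T, :));                                    % Pr(O=o) = sum_j alpha_T(j)
end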

Page 9: Markov Models and Applications


HMMs: Computations (cntd.)

• Problem 2: Find the 'most likely' state sequence for an observation o=[o_1,...,o_T] in a given HMM.
• I.e., find the sequence q*=[q_1*,...,q_T*] that maximizes Pr(X_1=q_1,...,X_T=q_T | O=o) (or, equivalently, the joint probability)
– Optimization via prefixes of length t (Viterbi algorithm):
     δ_t := [max_{q_1,...,q_{t-1}} {Pr(X_1=q_1,...,X_{t-1}=q_{t-1}, X_t=1, O_1=o_1,...,O_t=o_t)},
             ..., max_{q_1,...,q_{t-1}} {Pr(X_1=q_1,...,X_{t-1}=q_{t-1}, X_t=N, O_1=o_1,...,O_t=o_t)}]
– Algorithm (see the sketch below):
     δ_1 = π_1 .* B(:,o_1)'
     δ_{t+1}(j) = [max_{i=1,...,N} δ_t(i) p_ij] · b_{j,o_{t+1}},   ψ_{t+1}(j) = argmax_{i=1,...,N} δ_t(i) p_ij,   t=1,2,...,T-1
   • Maximum probability: p* = max_{i=1,...,N} δ_T(i),   q_T* = argmax_{i=1,...,N} δ_T(i)
   • State sequence by backtracking: q_t* = ψ_{t+1}(q_{t+1}*),   t=T-1,...,1
– Efficient implementations: use logarithms to replace multiplications by additions and to avoid numerical underflow
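
A minimal Matlab sketch of the Viterbi recursion and backtracking (the function name hmm_viterbi is an illustrative choice; a practical implementation would work with logarithms as noted above):

% Sketch: Viterbi algorithm, most likely state sequence for an observation o.
function [q, p_star] = hmm_viterbi(pi1, P, B, o)
  T = length(o);  N = length(pi1);
  delta = zeros(T, N);  psi = zeros(T, N);  q = zeros(1, T);
  delta(1, :) = pi1 .* B(:, o(1))';                % delta_1 = pi_1 .* B(:,o_1)'
  for t = 1:T-1
    [m, arg] = max(delta(t, :)' .* P, [], 1);      % maximise delta_t(i)*p_ij over predecessor i
    delta(t+1, :) = m .* B(:, o(t+1))';
    psi(t+1, :) = arg;
  end
  [p_star, q(T)] = max(delta(T, :));               % p* and q_T*
  for t = T-1:-1:1
    q(t) = psi(t+1, q(t+1));                       % backtracking: q_t* = psi_{t+1}(q_{t+1}*)
  end
end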

Page 10: Markov Models and Applications


HMMs: Computations (cntd.)

Problem 3: Find the 'most likely' HMM model for an observation o=[o_1,...,o_T].
• Assumption: state space E and symbol alphabet V are given
• Hence, we seek <π_1*, P*, B*> maximizing Pr_{<π_1,P,B>}(O=o)
• Iterative procedure for maximization (Baum-Welch): <π_1^(m), P^(m), B^(m)> → <π_1^(m+1), P^(m+1), B^(m+1)>

– Compute, using model <π_1^(m), P^(m), B^(m)>:
     γ_t(i) := Pr(X_t=i | O=o) = α_t(i) β_t(i) / Σ_i [α_t(i) β_t(i)]
     ξ_t(i,j) := Pr(X_t=i, X_{t+1}=j | O=o) = α_t(i) p_ij b_{j,o_{t+1}} β_{t+1}(j) / Σ_j Σ_i [α_t(i) p_ij b_{j,o_{t+1}} β_{t+1}(j)]

– 'Expectations':
   • T(i) := Σ_{t=1}^{T-1} γ_t(i) = expected number of transitions from state i in o
   • T(i,j) := Σ_{t=1}^{T-1} ξ_t(i,j) = expected number of transitions from state i to state j in o
   • S(i,k) := Σ_{t=1,...,T: o_t=k} γ_t(i) = expected number of times in state i in o while observing symbol k
   • S(i) := Σ_{t=1,...,T} γ_t(i) = expected number of times in state i in o

– Updated HMM (see the sketch below):
   • π_1^(m+1) = [γ_1(1), ..., γ_1(N)]
   • p_ij^(m+1) = T(i,j) / T(i)
   • b_ik^(m+1) = S(i,k) / S(i)

– Each update step increases Pr_{<π_1,P,B>}(O=o), but it may converge only to a local maximum
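
A minimal Matlab sketch of one re-estimation step, reusing the hmm_forward sketch from above (function and variable names are illustrative choices):

% Sketch: one Baum-Welch update step <pi1,P,B> -> <pi1u,Pu,Bu> for an observation o.
function [pi1u, Pu, Bu] = hmm_baumwelch_step(pi1, P, B, o)
  T = length(o);  N = length(pi1);  M = size(B, 2);
  [~, alpha] = hmm_forward(pi1, P, B, o);                    % forward variables (T x N)
  beta = ones(T, N);                                         % backward variables, beta_T = 1
  for t = T-1:-1:1
    beta(t, :) = (P * (B(:, o(t+1)) .* beta(t+1, :)'))';     % backward recursion
  end
  gamma = (alpha .* beta) ./ sum(alpha .* beta, 2);          % gamma_t(i) = Pr(X_t=i | O=o)
  xi = zeros(N, N, T-1);                                     % xi_t(i,j) = Pr(X_t=i, X_{t+1}=j | O=o)
  for t = 1:T-1
    x = (alpha(t, :)' * (beta(t+1, :) .* B(:, o(t+1))')) .* P;
    xi(:, :, t) = x / sum(x(:));
  end
  pi1u = gamma(1, :);                                        % pi_1^(m+1) = [gamma_1(1),...,gamma_1(N)]
  Pu = sum(xi, 3) ./ sum(gamma(1:T-1, :), 1)';               % p_ij^(m+1) = T(i,j) / T(i)
  Bu = zeros(N, M);
  for k = 1:M
    Bu(:, k) = sum(gamma(o == k, :), 1)' ./ sum(gamma, 1)';  % b_ik^(m+1) = S(i,k) / S(i)
  end
end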

Page 11: Markov Models and Applications


Content

1. Intro
2. Revision: Discrete Time Markov Processes
   • Definition, basic properties
   • State probabilities, steady-state analysis
   • Parameter estimation, example: mobility model
3. Hidden Markov Models
   • Definition & example
   • Efficient computation of Pr(observation)
   • Most likely state sequence
   • Parameter estimation
4. Application Examples of HMMs
   • Link error models
   • Mobility models, positioning
   • Fault detection
   • Error concealment
5. Summary & Exercises

Page 12: Markov Models and Applications


HMMs: Application Examples

• Link error models
– State space = different levels of link quality, observations V={error, correct}
– Equivalent to the 'biased' coin-toss example
– Extensions to multiple link states
– Advantage: more general types of burst errors

• Mobility models
– State space = product space (different classifications of user behavior, current coordinates)
– Observations = set of discrete positions of the user/device

• Positioning
– State space same as in the mobility model
– Observations now e.g. RSSI distributions

Page 13: Markov Models and Applications


HMMs: Application Examples II

• Fault-detection (Example from last semester student project)

– State space = {congested, lowly utilized} x {good wireless link, bad link}
– Observations: discrete levels of RTT measurements (per packet) and packet loss events (binary)
– Discussion of advantages/disadvantages, comparison to Bayesian Networks

• Error concealment
– E.g., transmission of speech over a noisy/lossy channel
– State space = speaker model
– Observations = received symbols, subject to loss/noise

Page 14: Markov Models and Applications


Summary

1. Intro
2. Revision: Discrete Time Markov Processes
   • Definition, basic properties
   • State probabilities, steady-state analysis
   • Parameter estimation, example: mobility model
3. Hidden Markov Models
   • Definition & example
   • Efficient computation of Pr(observation)
   • Most likely state sequence
   • Parameter estimation
4. Application Examples of HMMs
   • Link error models
   • Mobility models, positioning
   • Fault detection
   • Error concealment
5. Summary & Exercises

Page 15: Markov Models and Applications


References

• L. Rabiner, B.-H. Juang: 'Fundamentals of Speech Recognition', Prentice Hall, 1993.
– Sections 6.1-6.4

Page 16: Markov Models and Applications


Exercises 1

Hidden Markov Models: Consider the following 3-state hidden Markov model with initial state probabilities π_1=[0.2,0.3,0.5] and transition matrix P=[0.2,0.4,0.4; 0.5,0.1,0.4; 0.2,0.2,0.6]. The observations are coin-toss results (Heads=1, Tails=2) with B=[0.8,0.2; 0.5,0.5; 0.1,0.9].

a) Write a (Matlab) program that generates observation sequences of length T from the given HMM.

b) Write a program that efficiently computes the probability of a given observation sequence. Run the program for S='HHTHTTTHT'. Compare with a probability estimate obtained via simulation using the program from task a).

c) Write a program to determine the most likely state sequence and run it for the sequence in (b). (A sketch of the setup and calls follows below.)
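
Assuming the sketch functions hmm_generate, hmm_forward and hmm_viterbi from the slides above, the setup and calls for tasks a)-c) could look as follows (the encoding 1=Heads, 2=Tails is taken from the exercise):

% Illustrative setup and calls for Exercise 1, using the sketch functions above.
pi1 = [0.2 0.3 0.5];
P   = [0.2 0.4 0.4; 0.5 0.1 0.4; 0.2 0.2 0.6];
B   = [0.8 0.2; 0.5 0.5; 0.1 0.9];        % columns: 1 = Heads, 2 = Tails

obs = hmm_generate(pi1, P, B, 9);         % a) random observation sequence of length 9
o   = [1 1 2 1 2 2 2 1 2];                % 'HHTHTTTHT' encoded as 1=H, 2=T
p   = hmm_forward(pi1, P, B, o);          % b) Pr(O=o) via the forward procedure
q   = hmm_viterbi(pi1, P, B, o);          % c) most likely state sequence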

Page 17: Markov Models and Applications


Exercises 2

Localisation with HMMs: Consider a 5m x 5m square room in which 3 access points are placed in the three corners (0,5), (5,5), (5,0). Use a grid with 1m x 1m elements to discretize this geographic space. A mobile device is moving through the room, and the access points measure the received signal strength, which follows the path-loss model RSSI[dB] = Round(-6 log10(d/d0) + 13 + N), with d0=0.1m. The noise N is assumed to be normally distributed with standard deviation sigma=2.

Write Matlab functions to

a) Compute, for each grid position (i,j), the probabilities of observing an RSSI triplet (R1,R2,R3), Ri=0,...,9. (A sketch is given after part d.)

b) Determine the MLE of the trajectory of the mobile device for observation sequence [1,2,1],[2,0,4],[4,2,1],[7,3,4].

c) Assume that the mobile device moves with equal probability in any of the possible (2-4) vertical/horizontal directions, with velocity 1m per time unit. Set up the matrices P and B that describe the resulting HMM. (Use lexicographic order for the 2-dimensional coordinates and for the RSSI triplets.)

d) Determine the most likely trajectory for the above observation sequence resulting from the HMM.
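
A possible Matlab sketch for part a), assuming that the grid positions are the 25 cell centres (0.5,0.5),...,(4.5,4.5), that rounded RSSI values below 0 or above 9 are lumped into the nearest end bin, and that the noise is independent across access points (all variable names are illustrative):

% Sketch for Exercise 2 a): per-position probabilities of each RSSI level.
ap = [0 5; 5 5; 5 0];                      % access point coordinates
d0 = 0.1;  sigma = 2;
Phi = @(x, mu) 0.5 * (1 + erf((x - mu) ./ (sigma * sqrt(2))));   % normal CDF
[gx, gy] = meshgrid(0.5:4.5, 0.5:4.5);
pos = [gx(:) gy(:)];                       % the 25 grid positions
probs = zeros(size(pos, 1), 3, 10);        % probs(n,a,r+1) = Pr(R_a = r | position n)
for n = 1:size(pos, 1)
  for a = 1:3
    d = norm(pos(n, :) - ap(a, :));
    mu = -6 * log10(d / d0) + 13;          % mean RSSI before rounding
    edges = [-Inf, 0.5:1:8.5, Inf];        % rounding intervals for r = 0,...,9
    probs(n, a, :) = diff(Phi(edges, mu));
  end
end
% With independent noise per access point, the probability of a triplet (R1,R2,R3)
% at position n is probs(n,1,R1+1) * probs(n,2,R2+1) * probs(n,3,R3+1).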