Markov Chains Introduction Simulation - News · Markov Chains Introduction Simulation ... is called...

Post on 17-May-2018


Markov Chains

Introduction

Simulation

Modelling Cloud Cover Data


Finite State Markov Chains

Definition: State space $S = \{1, 2, \ldots, m\}$.

Definition: The sequence of random variables $X_1, X_2, X_3, \ldots$ is called a Markov chain if the Markov property holds:

$$P(X_n = x_n \mid X_{n-1} = x_{n-1}, X_{n-2} = x_{n-2}, \ldots) = P(X_n = x_n \mid X_{n-1} = x_{n-1})$$

where $x_n, x_{n-1}, x_{n-2}, \ldots$ are elements of $S$.


Example

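$B_1, B_2, \ldots$ are independent Bernoulli random variables with parameter $p$. Then

$$X_n = \left(\sum_{k=1}^{n} B_k \bmod 2\right) + 1, \quad n = 1, 2, \ldots$$

is a Markov chain with state space $S = \{1, 2\}$ and transition probabilities

$$\begin{aligned}
P(X_n = 1 \mid X_{n-1} = 1) &= 1 - p, \\
P(X_n = 2 \mid X_{n-1} = 1) &= p, \\
P(X_n = 1 \mid X_{n-1} = 2) &= p, \\
P(X_n = 2 \mid X_{n-1} = 2) &= 1 - p.
\end{aligned}$$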
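As a rough check of these transition probabilities, the chain can be simulated directly from its definition and the observed transitions tabulated. The following is only an illustrative sketch: the values p = 0.3 and n = 10000, the seed, and the variable names are arbitrary assumptions, not part of the original example.

```r
# Simulate X_n = (B_1 + ... + B_n mod 2) + 1 and tabulate observed transitions.
set.seed(1)                         # arbitrary seed, for reproducibility only
p <- 0.3                            # assumed Bernoulli parameter
n <- 10000                          # assumed number of steps
B <- rbinom(n, size = 1, prob = p)  # independent Bernoulli(p) draws
X <- (cumsum(B) %% 2) + 1           # states in {1, 2}

# Rows: state at time n-1; columns: state at time n.
counts <- table(from = X[-n], to = X[-1])
P.hat <- prop.table(counts, margin = 1)  # row-wise relative frequencies
print(round(P.hat, 3))
```

The off-diagonal entries of P.hat should be close to p and the diagonal entries close to 1 - p, up to simulation error.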


Transition Matrix

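Define a matrix $P$ with $(i, j)$th entry

$$p_{ij} = P(X_n = j \mid X_{n-1} = i).$$

$p_{ij}$ is called the transition probability from state $i$ to state $j$. $P$ is called a transition matrix.

All rows of $P$ sum to one. That is,

$$P \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}.$$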


Example (cont’d)

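$$P = \begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix}, \qquad P \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$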


Another Example

Set

$$P = \begin{pmatrix} 0 & 0.5 & 0.5 \\ 0 & 0.5 & 0.5 \\ 1 & 0 & 0 \end{pmatrix}.$$


Example

We can study this example with R. Set up the 3 × 3 matrix P:

> P <- matrix(c(0, 0, 1, .5, .5, 0, .5, .5, 0), nrow=3)
> P
     [,1] [,2] [,3]
[1,]    0  0.5  0.5
[2,]    0  0.5  0.5
[3,]    1  0.0  0.0


Example

Add up the rows:

> P%*%rep(1,3)

     [,1]
[1,]    1
[2,]    1
[3,]    1
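The same check can be written with rowSums() and wrapped in a small validity test. This is a sketch only: the helper name is.transition.matrix and the tolerance tol are hypothetical and do not appear elsewhere in these slides.

```r
# Hypothetical helper: TRUE if P is square, nonnegative, and each row sums to 1.
is.transition.matrix <- function(P, tol = 1e-8) {
  is.matrix(P) && nrow(P) == ncol(P) &&
    all(P >= 0) && all(abs(rowSums(P) - 1) < tol)
}

P <- matrix(c(0, 0, 1, .5, .5, 0, .5, .5, 0), nrow = 3)
ok <- is.transition.matrix(P)        # rows of P sum to 1
bad <- is.transition.matrix(2 * P)   # rows of 2 * P sum to 2
print(c(ok, bad))
```

A tolerance is used rather than an exact comparison because row sums of floating-point probabilities may differ from 1 by rounding error.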


Simulating a Markov Chain

We want to simulate n values of a Markov chain having transition matrix P, starting at x1.

Function name: MC.sim

Input: n, P, x1

Output: a vector of length n


Reproducing Our Output

> options(width=55)

> set.seed(12867)


Example

Generate 20 values from the 3 × 3 transition matrix P with

starting value 3:

> MC.sim(20, P, 3)

[1] 3 1 2 2 3 1 3 1 3 1 3 1 2 3 1 2 2 3 1 3


Another Example

$$P = \begin{pmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{pmatrix}$$

> P.mat <- matrix(c(0.7,0.3,0.4,0.6),ncol=2,byrow=TRUE)

> MC.eg <- MC.sim(100, P.mat)


Output

> MC.eg

[1] 2 2 2 1 1 1 1 1 1 2 2 1 1 2 2 1 1 1 1 1 1 2 1 1 1

[26] 1 1 2 2 2 2 2 1 1 1 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2

[51] 1 1 1 1 2 2 1 1 1 2 1 2 2 1 1 2 1 1 2 2 1 1 2 1 1

[76] 1 1 1 2 1 2 2 2 1 1 1 2 2 1 1 1 1 1 1 2 2 2 2 2 1


A Markov Chain Simulator

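> MC.sim <- function(n,P,x1) {
+   # Simulate n values of a Markov chain with transition matrix P
+   sim <- numeric(n)              # preallocate the output vector
+   m <- ncol(P)                   # number of states
+   if (missing(x1)) {
+     sim[1] <- sample(1:m,1)      # random start
+   } else { sim[1] <- x1 }
+   for (i in seq_len(n)[-1]) {    # i = 2, ..., n; skipped when n = 1
+     # draw the next state from row sim[i-1] of P
+     newstate <- sample(1:m,1,prob=P[sim[i-1],])
+     sim[i] <- newstate
+   }
+   sim
+ }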

Understanding the Code

Use of the sample() function:

> m <- 3

> sample(1:m,1,prob=c(.2, .7, .1))

[1] 2

The above simulates a random variable with distribution

p(1) = .2, p(2) = .7, p(3) = .1.
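To see that sample() respects the prob argument, one can draw many values at once and compare the observed frequencies with the target distribution. This is an illustrative sketch; the seed, the number of draws, and the names probs, draws and freq are assumptions. Drawing more values than there are states requires replace = TRUE.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
probs <- c(.2, .7, .1)               # target distribution on {1, 2, 3}
draws <- sample(1:3, 10000, replace = TRUE, prob = probs)

# Relative frequency of each state among the draws.
freq <- table(factor(draws, levels = 1:3)) / length(draws)
print(round(freq, 3))
```

The printed frequencies should be close to .2, .7 and .1, up to simulation error.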


Understanding the Code

If the current value of the Markov chain, having transition matrix P, is j, then the next value can be simulated as follows:

> j <- 2

> sample(1:m,1,prob=P[j,])

[1] 2

The above simulates a random variable whose distribution is row $j$ of $P$: $(p_{j1}, p_{j2}, p_{j3})$. ($j$ is 2 here.)


Analyzing Cloud Cover Data

> MC2 <- function(x) {

+ # Fit a 2 state MC to data in vector x (S = {1, 2})

+ n <- length(x)

+ N1 <- sum(x[-n]==1)

+ N2 <- sum(x[-n]==2)

+ N11 <- sum(x[-n]==1 & x[-1]==1)

+ N12 <- sum(x[-n]==1 & x[-1]==2)

+ N21 <- sum(x[-n]==2 & x[-1]==1)

+ N22 <- sum(x[-n]==2 & x[-1]==2)

+ P <- matrix(c(N11/N1, N21/N2, N12/N1, N22/N2), nrow=2)

+ return(P)

+ }
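The counting in MC2 can also be done with table(), which extends naturally to more than two states. The following is a sketch under assumptions, not the estimator used in these slides: the function name MC.fit, its argument m, and the toy vector x are hypothetical.

```r
# Hypothetical generalisation of MC2 to m states labelled 1, ..., m.
MC.fit <- function(x, m = max(x)) {
  n <- length(x)
  from <- factor(x[-n], levels = 1:m)   # state at time t
  to   <- factor(x[-1], levels = 1:m)   # state at time t + 1
  counts <- table(from, to)             # counts[i, j] = number of i -> j moves
  P <- counts / rowSums(counts)         # divide each row by its total
  matrix(P, nrow = m)                   # plain matrix, as returned by MC2
}

x <- c(1, 1, 2, 2, 2, 1, 2, 1, 1, 1)    # toy data, for illustration only
P.fit <- MC.fit(x)
print(P.fit)
```

In the toy data, state 1 is left five times (three times to state 1, twice to state 2) and state 2 four times (twice each way), so the rows are (0.6, 0.4) and (0.5, 0.5). As with MC2, a state that is never left yields a row of NaN.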

Analyzing Cloud Cover Data

> source("cloud70.R")

> # this vector contains hourly records of cloud data

> # from Winnipeg, Canada for June 1, 1970 through

> # September 30, 1970. "1" is clear; "2" is cloudy

> # (cloudy means the measured value exceeds 2)

> cloud70[1:100] # observed data (first 100 hours)

[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2

[26] 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

[51] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

[76] 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 2 2 2 2 1 2 2


Analyzing Cloud Cover Data

> P <- MC2(cloud70)

> P # estimated transition matrix for hourly cloud cover

          [,1]      [,2]
[1,] 0.8211998 0.1788002
[2,] 0.2537190 0.7462810


Analyzing Cloud Cover Data

Simulate data from the transition matrix in order to see if the

Markov chain model is realistic:

> cloud70.sim <- MC.sim(length(cloud70), P)

> cloud70.sim[1:100]

[1] 1 1 1 1 1 2 1 1 2 2 2 2 2 1 1 2 2 2 1 1 1 1 1 1 2

[26] 2 2 2 1 1 1 1 1 1 1 2 2 2 1 2 1 1 1 1 1 1 2 2 2 2

[51] 2 2 1 1 2 1 1 1 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 2

[76] 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 2 2


Checking the Markov Chain model

Comparing run lengths in simulated and real data:

> MC2.chk <- function(x) {
+   P <- MC2(x)
+   x.sim <- MC.sim(length(x), P)
+   x.runs <- rle(x)
+   # select runs by state value, so the result does not depend on
+   # which state a sequence happens to start in
+   x.1 <- x.runs$lengths[x.runs$values == 1]
+   x.2 <- x.runs$lengths[x.runs$values == 2]
+   x.sim.runs <- rle(x.sim)
+   x.sim.1 <- x.sim.runs$lengths[x.sim.runs$values == 1]
+   x.sim.2 <- x.sim.runs$lengths[x.sim.runs$values == 2]
+   par(mfrow=c(1,2))
+   qqplot(x.1, x.sim.1, xlab="observed state 1 runlengths",
+          ylab="simulated state 1 runlengths")
+   abline(0,1)
+   qqplot(x.2, x.sim.2, xlab="observed state 2 runlengths",
+          ylab="simulated state 2 runlengths")
+   abline(0,1)
+ }


Checking the Markov Chain model

According to the following plots, clear spells in Winnipeg last longer than the Markov chain predicts, but the cloudy periods are predicted well.

> MC2.chk(cloud70)


[Figure: two Q-Q plots of run lengths. Left panel: simulated state 1 runlengths against observed state 1 runlengths. Right panel: simulated state 2 runlengths against observed state 2 runlengths.]
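One way to see why long clear spells are hard for this model: in a Markov chain, the length of a run in state i is geometrically distributed, with mean 1/(1 - p_ii) and probability p_ii^k of lasting more than k steps. The sketch below applies this to the estimated transition matrix printed earlier; the names mean.run, k and p.long are assumptions introduced for illustration.

```r
# Estimated transition matrix, copied from the printed output for cloud70.
P <- matrix(c(0.8211998, 0.1788002,
              0.2537190, 0.7462810), nrow = 2, byrow = TRUE)

# Mean run length in state i under the model: 1 / (1 - p_ii).
mean.run <- 1 / (1 - diag(P))
print(round(mean.run, 1))   # about 5.6 hours (clear) and 3.9 hours (cloudy)

# Model probability that a run in state i lasts more than k hours: p_ii^k.
k <- 24
p.long <- diag(P)^k
print(p.long)
```

Under the fitted model a clear spell longer than 24 hours has probability below 0.01, yet the first 100 observed hours already contain clear runs of 24 and 52 hours.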