Kalman filtering & smoothing - Sharif

Kalman filtering & smoothing

40-957 Special Topics in Artificial Intelligence: Probabilistic Graphical Models

Sharif University of Technology

Soleymani

Spring 2014

From latent variable models to dynamic models

Two categories of latent variable models that we have seen in previous lectures:

Discrete latent variable

Mixture models

Continuous latent variable

Factor analysis

Dynamical models

Mixture models -> HMM

HMM as dynamical generalization of mixture models

latent variables are discrete but with arbitrary emission probability distributions.

Factor Analysis -> Kalman Filter

Kalman filter as dynamical generalization of factor analysis

State-Space Model (SSM)

The latent variables $\mathbf{z}_1, \dots, \mathbf{z}_T$ form a chain.

Independence relationships (the same as HMM):

Given the state at one moment in time, the states in the future are conditionally independent of those in the past.

Observing the observation nodes does not separate any of the state nodes from one another.

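[Figure: chain-structured graphical model with latent states z_1, z_2, ..., z_{T-1}, z_T, each state z_t emitting one observation x_t]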

Linear Dynamical System (LDS)

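Linear Gaussian model:

$\mathbf{z}_1 = \mathbf{u}, \quad \mathbf{u} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_0)$

$\mathbf{z}_{t+1} = \mathbf{A}\mathbf{z}_t + \mathbf{G}\mathbf{w}_{t+1}, \quad \mathbf{w}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{Q})$

$\mathbf{x}_t = \mathbf{C}\mathbf{z}_t + \mathbf{v}_t, \quad \mathbf{v}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{R})$

Equivalently, as conditional distributions:

$P(\mathbf{z}_1) = \mathcal{N}(\mathbf{z}_1 | \mathbf{0}, \boldsymbol{\Sigma}_0)$

$P(\mathbf{z}_{t+1} | \mathbf{z}_t) = \mathcal{N}(\mathbf{z}_{t+1} | \mathbf{A}\mathbf{z}_t, \mathbf{G}\mathbf{Q}\mathbf{G}^T)$

$P(\mathbf{x}_t | \mathbf{z}_t) = \mathcal{N}(\mathbf{x}_t | \mathbf{C}\mathbf{z}_t, \mathbf{R})$

General state space model:

$\mathbf{z}_{t+1} = \mathbf{f}(\mathbf{z}_t) + \mathbf{G}\mathbf{w}_{t+1}, \quad \mathbf{w}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{Q})$

$\mathbf{x}_t = \mathbf{g}(\mathbf{z}_t) + \mathbf{v}_t, \quad \mathbf{v}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{R})$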
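The linear-Gaussian case of this slide can be simulated directly from its three equations. The following is only a minimal sketch: the function name `simulate_lds` and the constant-velocity example parameters are assumptions made for illustration and do not come from the lecture.

```python
import numpy as np

def simulate_lds(A, C, G, Q, R, Sigma0, T, rng):
    """Draw z_1..z_T and x_1..x_T from the linear-Gaussian state-space model."""
    d_z, d_x = A.shape[0], C.shape[0]
    z = np.zeros((T, d_z))
    x = np.zeros((T, d_x))
    # Initial state: z_1 ~ N(0, Sigma0)
    z[0] = rng.multivariate_normal(np.zeros(d_z), Sigma0)
    for t in range(T):
        if t > 0:
            # Transition: z_{t+1} = A z_t + G w_{t+1}, with w ~ N(0, Q)
            w = rng.multivariate_normal(np.zeros(Q.shape[0]), Q)
            z[t] = A @ z[t - 1] + G @ w
        # Emission: x_t = C z_t + v_t, with v ~ N(0, R)
        v = rng.multivariate_normal(np.zeros(d_x), R)
        x[t] = C @ z[t] + v
    return z, x

# Hypothetical example: state = (position, velocity), only position is observed.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
G = np.eye(2)
Q = 0.01 * np.eye(2)
R = np.array([[0.5]])
Sigma0 = np.eye(2)
z, x = simulate_lds(A, C, G, Q, R, Sigma0, T=50, rng=np.random.default_rng(0))
```

Passing the random generator in explicitly keeps the draw reproducible for a fixed seed.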

Kalman filter applications

The Kalman filter has been widely used in many real-time tracking applications.

Other applications include:

Navigation and guidance systems (Simultaneous Localization And Mapping)

Control systems

Time-series processing

Inference in LDS

Calculation of the posterior probability of the states given an observation sequence

We will see two types of inference problems on LDS:

Filtering (online inference): $P(\mathbf{z}_t | \mathbf{x}_1, \dots, \mathbf{x}_t)$

Smoothing (offline inference): $P(\mathbf{z}_t | \mathbf{x}_1, \dots, \mathbf{x}_T)$

Inference in LDS

The graphical model of LDS is tree-structured and inference can be solved efficiently using the sum-product algorithm:

Filtering: The forward recursions, analogous to the $\alpha$ messages of HMM, are known as the Kalman filter equations

Smoothing: The backward recursions, analogous to the $\beta$ messages, are known as the Kalman smoother equations

Inference algorithms on SSM are similar to the inference algorithms on HMM

Message passing

The joint distribution (on a linear-Gaussian network) is multivariate Gaussian

Thus, marginal and conditional distributions will also be Gaussian.

We can use message-passing in Gaussian networks to solve inference problems of LDS

We will focus only on the mean and covariance computations
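Because every marginal and conditional of a joint Gaussian is again Gaussian, each message reduces to a mean and a covariance. The sketch below illustrates the standard Gaussian conditioning formulas that underlie those computations; the function name `condition_gaussian` and the two-variable example are hypothetical and not taken from the slides.

```python
import numpy as np

def condition_gaussian(mu, Sigma, idx_a, idx_b, x_b):
    """Mean and covariance of x_a given x_b, for a joint Gaussian N(mu, Sigma)."""
    mu_a, mu_b = mu[idx_a], mu[idx_b]
    S_aa = Sigma[np.ix_(idx_a, idx_a)]
    S_ab = Sigma[np.ix_(idx_a, idx_b)]
    S_bb = Sigma[np.ix_(idx_b, idx_b)]
    # gain = S_ab S_bb^{-1}, via a linear solve rather than an explicit inverse
    gain = np.linalg.solve(S_bb, S_ab.T).T
    mu_cond = mu_a + gain @ (x_b - mu_b)
    Sigma_cond = S_aa - gain @ S_ab.T
    return mu_cond, Sigma_cond

# Hypothetical example: two unit-variance variables with correlation 0.8.
mu = np.zeros(2)
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
mu_c, Sigma_c = condition_gaussian(mu, Sigma, [0], [1], np.array([1.0]))
```

The conditional covariance does not depend on the observed value `x_b`, which is the same property that later lets the Kalman covariance recursion run without data.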

Kalman filter: messages

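Filtering is similar to (but not the same as) the forward algorithm in HMM:

$\hat{\alpha}_t(\mathbf{z}_t) = P(\mathbf{z}_t | \mathbf{x}_1, \dots, \mathbf{x}_t) = \frac{P(\mathbf{z}_t, \mathbf{x}_1, \dots, \mathbf{x}_t)}{P(\mathbf{x}_1, \dots, \mathbf{x}_t)} = \frac{\alpha_t(\mathbf{z}_t)}{P(\mathbf{x}_1, \dots, \mathbf{x}_t)}$

The distribution $P(\mathbf{z}_t | \mathbf{x}_1, \dots, \mathbf{x}_t)$ is $\mathcal{N}(\mathbf{z}_t | \boldsymbol{\mu}_{\mathbf{z}_t|\mathbf{x}_{1:t}}, \boldsymbol{\Sigma}_{\mathbf{z}_t|\mathbf{x}_{1:t}})$, abbreviated $\mathcal{N}(\boldsymbol{\mu}_{t|t}, \boldsymbol{\Sigma}_{t|t})$ in the update equations.

Assume that we have calculated $\hat{\alpha}_t(\mathbf{z}_t)$; we then need to calculate $\hat{\alpha}_{t+1}(\mathbf{z}_{t+1}) = P(\mathbf{z}_{t+1} | \mathbf{x}_1, \dots, \mathbf{x}_{t+1})$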



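[Figure: the chain z_1, ..., z_T with observations x_1, ..., x_T; the forward messages $\hat{\alpha}_1(\cdot), \hat{\alpha}_2(\cdot), \dots, \hat{\alpha}_{T-1}(\cdot), \hat{\alpha}_T(\cdot)$ are passed along the chain]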

Kalman filter

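To find $P(\mathbf{z}_{t+1} | \mathbf{x}_1, \dots, \mathbf{x}_{t+1})$, we use two recursive updates:

Predict step (time update): best guess before seeing the measurement

Compute $P(\mathbf{z}_{t+1} | \mathbf{x}_1, \dots, \mathbf{x}_t)$ from $P(\mathbf{z}_t | \mathbf{x}_1, \dots, \mathbf{x}_t)$

Measurement update step: after the measurement, we find the new posterior in which $\mathbf{x}_{t+1}$ is also given as evidence

Compute $P(\mathbf{z}_{t+1} | \mathbf{x}_1, \dots, \mathbf{x}_{t+1})$ from $P(\mathbf{z}_{t+1} | \mathbf{x}_1, \dots, \mathbf{x}_t)$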
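The predict step can be sketched as follows. The closed-form expressions used here are not shown in this transcript; they are the standard consequence of pushing a Gaussian through the transition model z_{t+1} = A z_t + G w_{t+1}, so treat them as an assumption about what the omitted slides derive. The function name `kalman_predict` and the example numbers are hypothetical.

```python
import numpy as np

def kalman_predict(mu_filt, Sigma_filt, A, G, Q):
    """Time update: from P(z_t | x_1..x_t) to P(z_{t+1} | x_1..x_t)."""
    # Predicted mean: the filtered mean propagated through the dynamics
    mu_pred = A @ mu_filt
    # Predicted covariance: propagated uncertainty plus process noise G Q G^T
    Sigma_pred = A @ Sigma_filt @ A.T + G @ Q @ G.T
    return mu_pred, Sigma_pred

# Hypothetical example: constant-velocity dynamics.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
G = np.eye(2)
Q = 0.1 * np.eye(2)
mu_pred, Sigma_pred = kalman_predict(np.array([1.0, 2.0]), np.eye(2), A, G, Q)
```

No observation enters this step, so the predicted covariance is never smaller than the propagated filtered covariance.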


Updates based on Kalman gain matrix:

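$\boldsymbol{\mu}_{t+1|t+1} = \boldsymbol{\mu}_{t+1|t} + \mathbf{K}_{t+1}(\mathbf{x}_{t+1} - \mathbf{C}\boldsymbol{\mu}_{t+1|t})$

$\boldsymbol{\Sigma}_{t+1|t+1} = \boldsymbol{\Sigma}_{t+1|t} - \mathbf{K}_{t+1}\mathbf{C}\boldsymbol{\Sigma}_{t+1|t} = (\mathbf{I} - \mathbf{K}_{t+1}\mathbf{C})\boldsymbol{\Sigma}_{t+1|t}$

The mean update is a linear combination of the predicted mean $\boldsymbol{\mu}_{t+1|t}$ and the observation $\mathbf{x}_{t+1}$, with weights set by the Kalman gain, which depends on the predicted covariance

Kalman filter as a process of making successive predictions $\mathbf{C}\boldsymbol{\mu}_{t+1|t}$ and then correcting these predictions in the light of the new observations $\mathbf{x}_{t+1}$.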

Covariance update is independent of observed measurements

Only depends on LDS parameters and can be computed offline

Kalman gain matrix

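$\mathbf{K}_{t+1} \equiv \boldsymbol{\Sigma}_{t+1|t}\mathbf{C}^T(\mathbf{C}\boldsymbol{\Sigma}_{t+1|t}\mathbf{C}^T + \mathbf{R})^{-1}$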
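The gain and the two update equations on this slide translate into a few lines of NumPy. This is a minimal sketch rather than a reference implementation; the function name `kalman_update` and the scalar example values are assumptions for illustration.

```python
import numpy as np

def kalman_update(mu_pred, Sigma_pred, x, C, R):
    """Measurement update: fold the new observation x into the prediction."""
    # Covariance of the predicted observation: C Sigma C^T + R
    S = C @ Sigma_pred @ C.T + R
    # Kalman gain K = Sigma C^T S^{-1}, computed with a linear solve
    K = np.linalg.solve(S, C @ Sigma_pred).T
    # Correct the predicted mean by the gain-weighted prediction error
    mu_filt = mu_pred + K @ (x - C @ mu_pred)
    # Covariance update: (I - K C) Sigma, with no dependence on x
    Sigma_filt = (np.eye(len(mu_pred)) - K @ C) @ Sigma_pred
    return mu_filt, Sigma_filt, K

# Hypothetical scalar example: prior N(0, 1), unit observation noise.
mu_pred = np.array([0.0])
Sigma_pred = np.array([[1.0]])
C = np.array([[1.0]])
R = np.array([[1.0]])
mu_a, Sigma_a, K_a = kalman_update(mu_pred, Sigma_pred, np.array([2.0]), C, R)
mu_b, Sigma_b, K_b = kalman_update(mu_pred, Sigma_pred, np.array([-7.0]), C, R)
```

The two calls differ only in the observation, and they return the same gain and the same filtered covariance. This is the sense in which the covariance recursion can be run offline.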

Smoothing

Off-line inference in an LDS

Combine forward recursion with a backward recursion

It is the Gaussian analog of the forwards-backwards (alpha-gamma) algorithm on an HMM

RTS smoother: update equations

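$P(\mathbf{z}_t | \mathbf{x}_1, \dots, \mathbf{x}_T) = \mathcal{N}(\boldsymbol{\mu}_{t|T}, \boldsymbol{\Sigma}_{t|T})$

$\boldsymbol{\mu}_{t|T} = \boldsymbol{\mu}_{t|t} + \mathbf{L}_t(\boldsymbol{\mu}_{t+1|T} - \boldsymbol{\mu}_{t+1|t})$

$\boldsymbol{\Sigma}_{t|T} = \boldsymbol{\Sigma}_{t|t} + \mathbf{L}_t(\boldsymbol{\Sigma}_{t+1|T} - \boldsymbol{\Sigma}_{t+1|t})\mathbf{L}_t^T$

Derivation for mean, using $E[X|Z] = E[E[X|Y,Z]|Z]$:

$\boldsymbol{\mu}_{t|T} = E[\mathbf{z}_t | \mathbf{x}_{1:T}] = E[E[\mathbf{z}_t | \mathbf{z}_{t+1}, \mathbf{x}_{1:t}] | \mathbf{x}_{1:T}]$

$= E[\boldsymbol{\mu}_{t|t} + \mathbf{L}_t(\mathbf{z}_{t+1} - \boldsymbol{\mu}_{t+1|t}) | \mathbf{x}_{1:T}]$

$= \boldsymbol{\mu}_{t|t} + \mathbf{L}_t(\boldsymbol{\mu}_{t+1|T} - \boldsymbol{\mu}_{t+1|t})$

$\boldsymbol{\Sigma}_{t|T}$ can be found similarly, using $Var[X|Z] = Var[E[X|Y,Z]|Z] + E[Var[X|Y,Z]|Z]$

$\boldsymbol{\mu}_{T|T}$ and $\boldsymbol{\Sigma}_{T|T}$ are initialized from the filtering pass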
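The covariance recursion follows from the variance identity in the same way as the mean. The worked steps below are a sketch under one assumption that is not spelled out in this transcript: that the smoother gain has the standard form L_t = Σ_{t|t} A^T Σ_{t+1|t}^{-1}, which gives Var[z_t | z_{t+1}, x_{1:t}] = Σ_{t|t} − L_t Σ_{t+1|t} L_t^T.

```latex
\begin{align*}
\boldsymbol{\Sigma}_{t|T}
 &= \mathrm{Var}\big[\,E[\mathbf{z}_t \mid \mathbf{z}_{t+1}, \mathbf{x}_{1:t}] \,\big|\, \mathbf{x}_{1:T}\big]
  + E\big[\,\mathrm{Var}[\mathbf{z}_t \mid \mathbf{z}_{t+1}, \mathbf{x}_{1:t}] \,\big|\, \mathbf{x}_{1:T}\big] \\
 &= \mathrm{Var}\big[\boldsymbol{\mu}_{t|t} + \mathbf{L}_t(\mathbf{z}_{t+1} - \boldsymbol{\mu}_{t+1|t}) \,\big|\, \mathbf{x}_{1:T}\big]
  + \big(\boldsymbol{\Sigma}_{t|t} - \mathbf{L}_t \boldsymbol{\Sigma}_{t+1|t} \mathbf{L}_t^T\big) \\
 &= \mathbf{L}_t \boldsymbol{\Sigma}_{t+1|T} \mathbf{L}_t^T
  + \boldsymbol{\Sigma}_{t|t} - \mathbf{L}_t \boldsymbol{\Sigma}_{t+1|t} \mathbf{L}_t^T \\
 &= \boldsymbol{\Sigma}_{t|t} + \mathbf{L}_t\big(\boldsymbol{\Sigma}_{t+1|T} - \boldsymbol{\Sigma}_{t+1|t}\big)\mathbf{L}_t^T
\end{align*}
```

The second line uses the fact that the conditional covariance of a Gaussian does not depend on the conditioning value, so its expectation is the covariance itself.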


Smoother gain: $\mathbf{L}_t = \boldsymbol{\Sigma}_{t|t}\mathbf{A}^T\boldsymbol{\Sigma}_{t+1|t}^{-1}$
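The forward and backward passes can be combined in a short sketch. It assumes the standard predict equations and the standard smoother gain L_t = Σ_{t|t} A^T Σ_{t+1|t}^{-1}; the function names, the argument `GQGt` (standing for the product G Q G^T), and the scalar example are hypothetical.

```python
import numpy as np

def kalman_filter(x, A, C, GQGt, R, Sigma0):
    """Forward pass. Index t holds the predicted (given x up to t-1) and filtered (given x up to t) moments."""
    T, d = len(x), A.shape[0]
    mu_p, Sig_p = np.zeros((T, d)), np.zeros((T, d, d))
    mu_f, Sig_f = np.zeros((T, d)), np.zeros((T, d, d))
    for t in range(T):
        if t == 0:
            # Prior on z_1 plays the role of the first prediction
            mu_p[t], Sig_p[t] = np.zeros(d), Sigma0
        else:
            mu_p[t] = A @ mu_f[t - 1]
            Sig_p[t] = A @ Sig_f[t - 1] @ A.T + GQGt
        S = C @ Sig_p[t] @ C.T + R
        K = np.linalg.solve(S, C @ Sig_p[t]).T
        mu_f[t] = mu_p[t] + K @ (x[t] - C @ mu_p[t])
        Sig_f[t] = (np.eye(d) - K @ C) @ Sig_p[t]
    return mu_p, Sig_p, mu_f, Sig_f

def rts_smoother(mu_p, Sig_p, mu_f, Sig_f, A):
    """Backward pass: turn filtered moments into smoothed moments."""
    T = len(mu_f)
    # The last time step is initialized from the filtering pass
    mu_s, Sig_s = mu_f.copy(), Sig_f.copy()
    for t in range(T - 2, -1, -1):
        # Smoother gain L_t = Sig_f[t] A^T Sig_p[t+1]^{-1}
        L = np.linalg.solve(Sig_p[t + 1], A @ Sig_f[t]).T
        mu_s[t] = mu_f[t] + L @ (mu_s[t + 1] - mu_p[t + 1])
        Sig_s[t] = Sig_f[t] + L @ (Sig_s[t + 1] - Sig_p[t + 1]) @ L.T
    return mu_s, Sig_s

# Hypothetical scalar example with four observations.
A1, C1 = np.array([[0.9]]), np.array([[1.0]])
q1, r1, s0 = np.array([[0.5]]), np.array([[1.0]]), np.array([[2.0]])
obs = np.array([[0.3], [-1.2], [0.7], [2.0]])
mu_p, Sig_p, mu_f, Sig_f = kalman_filter(obs, A1, C1, q1, r1, s0)
mu_s, Sig_s = rts_smoother(mu_p, Sig_p, mu_f, Sig_f, A1)
```

Smoothing conditions on more data than filtering, so each smoothed variance is no larger than the corresponding filtered variance, and the two coincide at the final time step.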

Example

In LDS, the sequence of individually most probable values of latent variables is the same as the most probable latent sequence.

No need to use the analogue of the Viterbi algorithm for the LDS.

Thus, red crosses show the most probable sequences obtained using filtering (b) and smoothing (c) algorithms

[Murphy]

Learning of LDS

Learning – EM algorithm

E-step: expected sufficient statistics are found

$E[\mathbf{z}_t]$

$E[\mathbf{z}_t\mathbf{z}_t^T]$

$E[\mathbf{z}_t\mathbf{z}_{t-1}^T]$

M-step:

update of 𝑪 and 𝑹 is similar to the M-step of factor analysis
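The slide only states that the update of C and R resembles the factor analysis M-step. As a sketch, the closed-form updates commonly given in textbook treatments of EM for the LDS can be written as below; these formulas, the function name `m_step_emission`, and the noise-free check are assumptions made for illustration, not the lecture's own derivation.

```python
import numpy as np

def m_step_emission(x, Ez, Ezz):
    """Update C and R from expected sufficient statistics.

    x:   (T, d_x) observations
    Ez:  (T, d_z) with rows E[z_t]
    Ezz: (T, d_z, d_z) with entries E[z_t z_t^T]
    """
    T = len(x)
    Sxz = x.T @ Ez              # sum_t x_t E[z_t]^T
    Szz = Ezz.sum(axis=0)       # sum_t E[z_t z_t^T]
    # C_new = Sxz Szz^{-1}, computed with a linear solve
    C_new = np.linalg.solve(Szz, Sxz.T).T
    # R_new = (1/T) sum_t (x_t x_t^T - C_new E[z_t] x_t^T)
    R_new = (x.T @ x - C_new @ Sxz.T) / T
    # Symmetrize to remove floating-point asymmetry
    R_new = 0.5 * (R_new + R_new.T)
    return C_new, R_new

# Hypothetical check: noise-free data with the latent states known exactly.
rng = np.random.default_rng(0)
z_true = rng.normal(size=(20, 2))
C_true = np.array([[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]])
x_obs = z_true @ C_true.T
Ezz_exact = z_true[:, :, None] * z_true[:, None, :]
C_new, R_new = m_step_emission(x_obs, z_true, Ezz_exact)
```

In actual EM the statistics would come from the smoothing pass, for example E[z_t z_t^T] = Σ_{t|T} + μ_{t|T} μ_{t|T}^T, rather than from known states.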

LDS: summary

SSMs are dynamical models that allow continuous states (latent variables)

LDS is a linear-Gaussian SSM

Inference problems in LDS can be solved using message passing:

Kalman filter can be used to solve the filtering problem

RTS smoother can be used to solve the smoothing problem
