Lecture 7 · Sensor Fusion, 2014 · Automatic Control (fredrik/edu/sensorfusion/lecture7.pdf)


Sensor Fusion, 2014 Lecture 7: 1

Lecture 7:

Whiteboard:
- PF tuning and properties

Slides:
- Proposal densities and SIS PF
- Marginalized PF (MPF)
- Filterbanks


Sensor Fusion, 2014 Lecture 7: 2

Lecture 6: Summary

Basic SIR PF algorithm:
Choose the number of particles N.
Initialization: Generate x_0^i ∼ p_{x_0}, i = 1, ..., N, particles.
Iterate for k = 1, 2, ..., t:
1. Measurement update: For i = 1, 2, ..., N,
   w_k^i = w_{k-1}^i p(y_k | x_k^i).
2. Normalize: w_k^i := w_k^i / ∑_i w_k^i.
3. Estimation: MMSE x̂_k ≈ ∑_{i=1}^N w_k^i x_k^i, or MAP.
4. Resampling: Bayesian bootstrap: take N samples with replacement from the set {x_k^i}_{i=1}^N, where the probability to take sample i is w_k^i. Let w_k^i = 1/N.
5. Prediction: Generate random process noise samples v_k^i ∼ p_{v_k} and set x_{k+1}^i = f(x_k^i, v_k^i).
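As a concrete reference, here is a minimal Python/NumPy sketch of the algorithm above; the handles f, likelihood, sample_v, and sample_x0 are user-supplied placeholders, not names from the slides:

```python
import numpy as np

def sir_pf(y, f, likelihood, sample_v, sample_x0, N, rng=None):
    """Minimal bootstrap (SIR) particle filter sketch.

    y          : iterable of measurements y_k
    f          : f(x, v) -> next state, vectorized over the N particles
    likelihood : likelihood(yk, x) -> p(yk | x^i) for each particle
    sample_v   : sample_v(N) -> N process noise samples v^i ~ p_v
    sample_x0  : sample_x0(N) -> N initial particles x0^i ~ p_x0
    """
    rng = rng or np.random.default_rng()
    x = sample_x0(N)                       # initialization
    xhat = []
    for yk in y:
        # 1-2. Measurement update and normalization (previous weights are
        #      uniform, since SIR resamples in every iteration)
        w = likelihood(yk, x)
        w = w / w.sum()
        # 3. Estimation (MMSE)
        xhat.append(w @ x)
        # 4. Resampling: N draws with replacement, particle i with prob. w^i
        x = x[rng.choice(N, size=N, p=w)]
        # 5. Prediction: propagate through the dynamics with fresh noise
        x = f(x, sample_v(N))
    return np.array(xhat)
```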


Sensor Fusion, 2014 Lecture 7: 3

PF design

Design freedoms:
1. Choice of N is a complexity vs performance trade-off. Complexity is linear in N, while the error is in theory bounded as g_k/N, where g_k is polynomial in k but independent of n_x.
2. N_eff = 1/∑_i (w_k^i)^2 controls how often to resample. Resample if N_eff < N_th; N_th = N gives SIR (a small numeric sketch follows this list). Resampling increases the variance in the weights, and thus the variance in the estimate, but it is needed to avoid depletion.
3. The proposal density. An appropriate proposal makes the particles explore the most critical regions, without wasting effort on meaningless state regions.
4. Pretending the process (and measurement) noise is larger than it is (dithering, jittering, roughening) is, as for the EKF and UKF, often a sound ad hoc way to avoid filter divergence.
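A minimal sketch of the N_eff test (item 2) and of roughening (item 4); the threshold and jitter scale are illustrative choices, not values from the slides:

```python
import numpy as np

def effective_sample_size(w):
    """N_eff = 1 / sum_i (w^i)^2 for normalized weights w."""
    return 1.0 / np.sum(w ** 2)

def maybe_resample(x, w, N_th, rng):
    """Resample with replacement only when N_eff drops below N_th."""
    N = len(w)
    if effective_sample_size(w) < N_th:          # e.g. N_th = 2 * N / 3
        idx = rng.choice(N, size=N, p=w)
        x, w = x[idx], np.full(N, 1.0 / N)
    return x, w

def roughen(x, scale, rng):
    """Dithering/jittering: add extra noise to counteract depletion."""
    return x + scale * rng.standard_normal(x.shape)
```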


Sensor Fusion, 2014 Lecture 7: 4

Common PF extensions

Main problem with basic SIR PF: depletion.

After a while, only one or a few particles contribute to x̂.

The effective number of samples, N_eff, is a measure of this: N_eff = N means that all particles contribute equally, and N_eff = 1 means that only one has a non-zero weight.

The basic SIR PF has few design parameters; the following extensions give more degrees of freedom:

- Sequential importance sampling (SIS): resample only when needed, i.e., when N_eff < N_th.
- The theory allows a general proposal distribution q(x_k^i | x_{0:k-1}^i, y_{1:k}) for how to sample a new state in the time update. The "prior" q(x_k^i | x_{0:k-1}^i, y_{1:k}) = p(x_k^i | x_{k-1}^i) is the standard option, but there may be better ones.


Sensor Fusion, 2014 Lecture 7: 5

SIS PF algorithm

Choose the number of particles N, a proposal density q(x_k | x_{0:k-1}^i, y_{1:k}), and a threshold N_th (for instance N_th = 2N/3).
Initialization: Generate x_0^i ∼ p_{x_0}, i = 1, ..., N.
Iterate for k = 1, 2, ...:
1. Measurement update (sketched below): For i = 1, 2, ..., N,
   w_k^i = w_{k-1}^i · p(y_k | x_k^i) p(x_k^i | x_{k-1}^i) / q(x_k^i | x_{k-1}^i, y_k).
2. Normalize: w_k^i := w_k^i / ∑_i w_k^i.
3. Estimation: MMSE x̂_k ≈ ∑_{i=1}^N w_k^i x_k^i.
4. Resampling: Resample with replacement when
   N_eff = 1/∑_i (w_k^i)^2 < N_th.
5. Prediction: Generate samples x_{k+1}^i ∼ q(x_{k+1} | x_k^i, y_{k+1}).
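A sketch of the generalized weight update in step 1; lik_pdf, prior_pdf, and prop_pdf are user-supplied density handles with illustrative names:

```python
import numpy as np

def sis_weight_update(w_prev, x_new, x_prev, yk, lik_pdf, prior_pdf, prop_pdf):
    """SIS measurement update:
    w_k^i = w_{k-1}^i * p(yk | x_k^i) * p(x_k^i | x_{k-1}^i)
                      / q(x_k^i | x_{k-1}^i, yk),
    followed by normalization. All pdf handles are vectorized over particles.
    """
    w = w_prev * lik_pdf(yk, x_new) * prior_pdf(x_new, x_prev) \
        / prop_pdf(x_new, x_prev, yk)
    return w / w.sum()
```

Note that with the prior as proposal, prior_pdf and prop_pdf cancel and the update reduces to the SIR one.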


Sensor Fusion, 2014 Lecture 7: 6

Sampling from proposal density

The SIR PF samples from the prior, x_{k+1}^i ∼ p(x_{k+1} | x_k^i). In general, one can sample from any proposal density,

   x_{k+1}^i ∼ q(x_{k+1} | x_k^i, y_{k+1}).

Note that we are allowed to "cheat" and look at the next measurement y_{k+1} when we sample. Note that the time update can be written

   p(x_{k+1} | y_{1:k}) = ∫_{R^{n_x}} q(x_{k+1} | x_k, y_{k+1}) · [p(x_{k+1} | x_k) / q(x_{k+1} | x_k, y_{k+1})] · p(x_k | y_{1:k}) dx_k.

The new approximation becomes

   p(x_{1:k+1} | y_{1:k}) = ∑_{i=1}^N [p(x_{k+1}^i | x_k^i) / q(x_{k+1}^i | x_k^i, y_{k+1})] w_{k|k}^i · δ(x_{1:k+1} − x_{1:k+1}^i),

where the bracketed factor times w_{k|k}^i is the predicted weight w_{k+1|k}^i.


Sensor Fusion, 2014 Lecture 7: 7

Choice of proposal density

1. Factorized form:

   q(x_{0:k} | y_{1:k}) = q(x_k | x_{0:k-1}, y_{1:k}) q(x_{0:k-1} | y_{1:k}).

   In the original form, we sample trajectories.

2. Approximate filter form:

   q(x_{0:k} | y_{1:k}) ≈ q(x_k | x_{0:k-1}, y_{1:k}).

   In the approximate form, we keep the previous trajectory and just append x_k.


Sensor Fusion, 2014 Lecture 7: 8

1. Optimal form:

   q(x_k | x_{k-1}^i, y_k) = p(x_k | x_{k-1}^i, y_k) = p(y_k | x_k) p(x_k | x_{k-1}^i) / p(y_k | x_{k-1}^i),

   w_k^i = w_{k-1}^i p(y_k | x_{k-1}^i).

Optimal since the sampling of x_k does not influence (that is, increase the variance of) the weights. Drawbacks:

- It is generally hard to sample from this proposal density.
- It is generally hard to compute the weight update needed for this proposal density, since it requires integrating over the whole state space:
  p(y_k | x_{k-1}) = ∫ p(y_k | x_k) p(x_k | x_{k-1}) dx_k.

For a linear (linearized) Gaussian likelihood and additive Gaussian process noise, the integral can be solved, leading to an (extended) KF time update (see the sketch below).
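For that linear Gaussian measurement case, both the proposal and the weight update are available in closed form. A minimal sketch, assuming x_k = f(x_{k-1}) + v_k with v_k ∼ N(0, Q) and y_k = H x_k + e_k with e_k ∼ N(0, R) (all names illustrative):

```python
import numpy as np

def optimal_proposal_step(x_prev, w_prev, yk, f, H, Q, R, rng):
    """One SIS step with the optimal proposal for
    x_k = f(x_{k-1}) + v_k, v ~ N(0,Q);  y_k = H x_k + e_k, e ~ N(0,R).
    Then p(x_k | x_{k-1}, y_k) is Gaussian and p(y_k | x_{k-1}) is
    N(H f(x_{k-1}), H Q H' + R), so sampling and weighting are exact.
    """
    N, nx = x_prev.shape
    xpred = f(x_prev)                               # f(x_{k-1}^i), (N, nx)
    S = H @ Q @ H.T + R                             # innovation covariance
    K = Q @ H.T @ np.linalg.inv(S)                  # gain
    Sigma = Q - K @ S @ K.T                         # proposal covariance
    innov = yk - xpred @ H.T                        # y_k - H f(x_{k-1}^i)
    m = xpred + innov @ K.T                         # proposal means
    L = np.linalg.cholesky(Sigma)
    x_new = m + rng.standard_normal((N, nx)) @ L.T  # x_k^i ~ N(m^i, Sigma)
    # w_k^i ∝ w_{k-1}^i N(y_k; H f(x_{k-1}^i), S); log-domain for stability
    q = -0.5 * np.einsum('ni,ij,nj->n', innov, np.linalg.inv(S), innov)
    w = w_prev * np.exp(q - q.max())
    return x_new, w / w.sum()
```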


Sensor Fusion, 2014 Lecture 7: 9

2. Prior:

   q(x_k | x_{k-1}^i, y_k) = p(x_k | x_{k-1}^i),

   w_k^i = w_{k-1}^i p(y_k | x_k^i).

By far the simplest and the most common choice.


Sensor Fusion, 2014 Lecture 7: 10

3. Likelihood:

   q(x_k | x_{k-1}^i, y_k) ∝ p(y_k | x_k),

   w_k^i = w_{k-1}^i p(x_k^i | x_{k-1}^i).

Good in high-SNR applications, where the likelihood contains more information about x than the prior.
Drawback: the likelihood is not always invertible in x.


Sensor Fusion, 2014 Lecture 7: 11

Marginalized Particle Filter

Objectives: decrease the number of particles for large state spaces (say n_x > 3) by utilizing partially linear Gaussian substructures.
The task of nonlinear filtering can be split into two parts:
1. representation of the filtering probability density function,
2. propagation of this density during the time and measurement update stages.
It is possible to mix a parametric distribution in some dimensions with a grid/particle representation in the other dimensions.

[Figure: the same two-dimensional density over (x1, x2) represented three ways: "True" (parametric surface), "Particle" (point cloud), and "Mixed" (particles in one dimension, Gaussians in the other).]


Sensor Fusion, 2014 Lecture 7: 12

Marginalized particle filter

Model

   x_{k+1}^n = f_k^n(x_k^n) + F_k^n(x_k^n) x_k^l + G_k^n(x_k^n) w_k^n,
   x_{k+1}^l = f_k^l(x_k^n) + F_k^l(x_k^n) x_k^l + G_k^l(x_k^n) w_k^l,
   y_k = h_k(x_k^n) + H_k(x_k^n) x_k^l + e_k.

All of w^n, w^l, e and x_0^l are Gaussian; x_0^n can be general.
The basic factorization holds: conditioned on x_{1:k}^n, the model is linear and Gaussian.
This framework covers many navigation, tracking and SLAM problem formulations! Typically, position is the nonlinear state, while all the other states are (almost) linear, and for them the (extended) KF is used.


Sensor Fusion, 2014 Lecture 7: 13

Key factorization

Split the state vector into two parts ('linear' and 'nonlinear'),

   x_k = (x_k^n; x_k^l).

The key idea in the MPF is the factorization

   p(x_k^l, x_{1:k}^n | y_{1:k}) = p(x_k^l | x_{1:k}^n, y_{1:k}) p(x_{1:k}^n | y_{1:k}).


Sensor Fusion, 2014 Lecture 7: 14

Factorization

The KF factor is provided by the Kalman filter:

   p(x_k^l | x_{1:k}^n, y_{1:k}) = N(x̂_{k|k}^l, P_{k|k}^l).

The PF factor is given by a marginalization procedure:

   p(x_{1:k+1}^n | y_{1:k}) = p(x_{k+1}^n | x_{1:k}^n, y_{1:k}) p(x_{1:k}^n | y_{1:k})
      = p(x_{1:k}^n | y_{1:k}) ∫ p(x_{k+1}^n | x_k^l, x_{1:k}^n, y_{1:k}) p(x_k^l | x_{1:k}^n, y_{1:k}) dx_k^l
      = p(x_{1:k}^n | y_{1:k}) ∫ p(x_{k+1}^n | x_k^l, x_{1:k}^n, y_{1:k}) N(x̂_{k|k}^l, P_{k|k}^l) dx_k^l.


Sensor Fusion, 2014 Lecture 7: 15

Example

Terrain navigation in 1D, with the unknown velocity considered as a state:

   x_{k+1} = x_k + u_k + (T_s^2/2) v_k,
   u_{k+1} = u_k + T_s v_k,
   y_k = h(x_k) + e_k.

Conditioned on the trajectory x_{1:k}, the velocity is given by a linear and Gaussian model:

   u_{k+1} = u_k + T_s v_k,                 (dynamics)
   x_{k+1} − x_k = u_k + (T_s^2/2) v_k.     (measurement)

Given this trajectory, the KF time-updates the linear part:

   x_{k+1} = x_k + û_{k|k} + (T_s^2/2) v_k,   Cov(u_k) = P_{k|k},
   y_k = h(x_k) + e_k.
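A minimal MPF sketch for this example; all function and parameter names are illustrative, and the KF time update and the "dynamics measurement update" from the sampled increment x_{k+1} − x_k are merged into one conditional Gaussian update:

```python
import numpy as np

def mpf_terrain_1d(y, h, Ts, Q, R, x0, P0x, u0, P0u, N, rng=None):
    """Marginalized PF for
        x_{k+1} = x_k + u_k + (Ts^2/2) v_k   (nonlinear: particles)
        u_{k+1} = u_k + Ts v_k               (linear: one KF per particle)
        y_k     = h(x_k) + e_k,  v ~ N(0,Q), e ~ N(0,R).
    Since y_k has no linear term, P is the same for all particles:
    only one Kalman gain for all!
    """
    rng = rng or np.random.default_rng()
    c = Ts**2 / 2
    x = x0 + np.sqrt(P0x) * rng.standard_normal(N)  # position particles
    u = np.full(N, float(u0))                       # per-particle KF means
    P = P0u                                         # shared KF covariance
    est = []
    for yk in y:
        # PF measurement update, estimation, and resampling
        w = np.exp(-0.5 * (yk - h(x))**2 / R) + 1e-300
        w /= w.sum()
        est.append((w @ x, w @ u))
        idx = rng.choice(N, size=N, p=w)
        x, u = x[idx], u[idx]
        # PF time update: x_{k+1}^i ~ N(x^i + u^i, P + c^2 Q)
        x_new = x + u + np.sqrt(P + c**2 * Q) * rng.standard_normal(N)
        # KF update from the observed increment z = u_k + c v_k
        z = x_new - x
        K = (P + Ts * c * Q) / (P + c**2 * Q)       # shared gain
        u = u + K * (z - u)
        P = (P + Ts**2 * Q) - K**2 * (P + c**2 * Q)
        x = x_new
    return np.array(est)                            # (T, 2): x and u estimates
```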


Sensor Fusion, 2014 Lecture 7: 16

Principal MPF Algorithm

1. PF time update, where x_k^l is interpreted as process noise.
2. KF time update for each particle x_{1:k}^{n,i}.
3. KF extra measurement update for each particle x_{1:k}^{n,i}.
4. PF measurement update and resampling, where x_k^l is interpreted as measurement noise.
5. KF measurement update for each particle x_{1:k}^{n,i}.

If there is no linear term in the measurement equation, the KF measurement update in step 5 disappears, and the Riccati equation for P becomes the same for all sequences x_{1:k}^n. That is, only one Kalman gain for all particles!


Sensor Fusion, 2014 Lecture 7: 17

Information flow

There are five indices k on the right-hand side of the factorization of the prior. Each index is stepped up separately. The order is important!

Prior:        p(x_k^l, x_{1:k}^p | y_{1:k}) = p(x_k^l | x_{1:k}^p, y_{1:k}) p(x_{1:k}^p | y_{1:k})
1. PF TU:     p(x_{1:k}^p | y_{1:k}) ⇒ p(x_{1:k+1}^p | y_{1:k})
2. KF TU:     p(x_k^l | x_{1:k}^p, y_{1:k}) ⇒ p(x_{k+1}^l | x_{1:k}^p, y_{1:k})
3. KF dyn MU: p(x_{k+1}^l | x_{1:k}^p, y_{1:k}) ⇒ p(x_{k+1}^l | x_{1:k+1}^p, y_{1:k})
4. PF MU:     p(x_{1:k+1}^p | y_{1:k}) ⇒ p(x_{1:k+1}^p | y_{1:k+1})
5. KF obs MU: p(x_{k+1}^l | x_{1:k+1}^p, y_{1:k}) ⇒ p(x_{k+1}^l | x_{1:k+1}^p, y_{1:k+1})
Posterior:    p(x_{k+1}^l, x_{1:k+1}^p | y_{1:k+1}) = p(x_{k+1}^l | x_{1:k+1}^p, y_{1:k+1}) p(x_{1:k+1}^p | y_{1:k+1})


Sensor Fusion, 2014 Lecture 7: 18

Properties

Compared to the full PF, the MPF gives:
- Fewer particles needed.
- Less variance.
- Less risk of divergence.
- Less tuning of importance density and resampling needed.
This comes at the cost of a more complex algorithm.


Sensor Fusion, 2014 Lecture 7: 19

The variance formula

The law of total variance says that

   Cov(U) = Cov(E(U|V)) + E(Cov(U|V)).

Example: x ∼ 0.5 N(−1, 1) + 0.5 N(1, 1). Let U = x and let V be the mode (±1), so that x | V ∼ N(V, 1). Then

   E(x) = 0,
   Var(x) = 1 + (0.5 · (1 − 0)^2 + 0.5 · (−1 − 0)^2) = 2.
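A quick Monte Carlo check of this decomposition (illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
V = rng.choice([-1.0, 1.0], size=n)     # mode indicator, P(V = ±1) = 0.5
x = V + rng.standard_normal(n)          # x | V ~ N(V, 1)

print(x.mean())  # ≈ 0 = E(x)
print(x.var())   # ≈ 2 = E(Var(x|V)) + Var(E(x|V)) = 1 + 1
```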


Sensor Fusion, 2014 Lecture 7: 20

Property: variance reduction

Letting U = x_k^l and V = x_{1:k}^n gives the following decomposition of the variance of the PF:

   Cov(x_k^l)  [PF]
      = Cov(E(x_k^l | x_{1:k}^n)) + E(Cov(x_k^l | x_{1:k}^n))
      = Cov(x̂_{k|k}^l(x_{1:k}^{n,i}))  [MPF]  +  ∑_{i=1}^N w_k^i P_{k|k}(x_{1:k}^{n,i})  [KF].
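A sketch of how this decomposition can be evaluated from MPF output; the argument names (normalized weights w, per-particle KF means m and covariances P) are illustrative:

```python
import numpy as np

def covariance_decomposition(w, m, P):
    """Law of total variance over the particles:
    total = spread of the KF means (MPF part)
          + weighted mean of the KF covariances (KF part).
    w : (N,) normalized weights, m : (N, nl), P : (N, nl, nl).
    """
    d = m - w @ m                                   # deviations from the mean
    mpf_part = np.einsum('n,ni,nj->ij', w, d, d)    # Cov of the KF means
    kf_part = np.einsum('n,nij->ij', w, P)          # E of the KF covariances
    return mpf_part + kf_part, mpf_part, kf_part
```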


Sensor Fusion, 2014 Lecture 7: 21

Decreasing variance or N

[Figure: "Covariance for linear states": P^l versus the number of particles N (10^2 to 10^5, log scale), with curves for the PF, MPF, and KF.]

Schematic view of how the covariance of the linear part of the state vector depends on the number of particles for the PF and MPF, respectively. The gain of the MPF is given by the Kalman filter covariance.


Sensor Fusion, 2014 Lecture 7: 22

Jump Markov Linear Model

General linear model with a discrete mode parameter ξ, which takes on the values 1, 2, ..., m (a simulation sketch follows the list):

   x_{k+1} = F(ξ_k) x_k + v_k(ξ_k),
   y_k = H(ξ_k) x_k + e_k(ξ_k),
   v_k(ξ_k) ∼ N(µ_v(ξ_k), Q(ξ_k)),
   e_k(ξ_k) ∼ N(µ_e(ξ_k), R(ξ_k)),
   x_0(ξ_0) ∼ N(µ_{x_0}(ξ_0), P_{x_0}(ξ_0)).

- The noises belong to a Gaussian mixture model.
- A Gaussian mixture can approximate any PDF arbitrarily well.
- The model is conditionally linear and Gaussian: given the sequence ξ_{1:k}, the posterior is Gaussian and computed by the Kalman filter,
  (x_k | y_{1:k}, ξ_{1:k}) ∼ N(x̂_{k|k}(ξ_{1:k}), P_{k|k}(ξ_{1:k})).
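A small simulation sketch of such a model, assuming a Markov transition matrix Pi on the mode and zero-mean noises for brevity (all names are illustrative):

```python
import numpy as np

def simulate_jmlm(F, H, Q, R, Pi, x0, xi0, T, rng=None):
    """Simulate a jump Markov linear model with m modes.
    F, H, Q, R : length-m lists of per-mode system matrices
    Pi         : (m, m) matrix with Pi[i, j] = P(xi_{k+1} = j | xi_k = i)
    """
    rng = rng or np.random.default_rng()
    m, nx = len(F), len(x0)
    x, xi = np.asarray(x0, float), xi0
    xs, ys, xis = [], [], []
    for _ in range(T):
        ny = H[xi].shape[0]
        y = H[xi] @ x + rng.multivariate_normal(np.zeros(ny), R[xi])
        xs.append(x); ys.append(y); xis.append(xi)
        x = F[xi] @ x + rng.multivariate_normal(np.zeros(nx), Q[xi])
        xi = rng.choice(m, p=Pi[xi])      # jump to the next mode
    return np.array(xs), np.array(ys), np.array(xis)
```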


Sensor Fusion, 2014 Lecture 7: 23

The Gaussian mixture filter

The posterior is given by a Gaussian mixture (GM):

   p(x_k | y_{1:k}) = ∑_{ξ_{1:k}} p(ξ_{1:k} | y_{1:k}) p(x_k | y_{1:k}, ξ_{1:k}) / ∑_{ξ_{1:k}} p(ξ_{1:k} | y_{1:k})
      = ∑_{ξ_{1:k}} p(ξ_{1:k} | y_{1:k}) N(x̂_{k|k}(ξ_{1:k}), P_{k|k}(ξ_{1:k})) / ∑_{ξ_{1:k}} p(ξ_{1:k} | y_{1:k})
      = ∑_i w_{k|k}^{(i)} N(x̂_{k|k}(ξ_{1:k}^{(i)}), P_{k|k}(ξ_{1:k}^{(i)})),

where the number of mixture components (mode sequences ξ_{1:k}) increases exponentially in time.
The weight update is similar to the PF:

   w_{k|k} ∝ w_{k|k−1} p(y_k | ξ_{1:k}, y_{1:k−1})
      = w_{k|k−1} N(y_k − ŷ_{k|k−1}(ξ_{1:k}), S_{k|k−1}(ξ_{1:k})).


Sensor Fusion, 2014 Lecture 7: 24

MPF as a filter bank

Denoting x_k^n = ξ_k and x_k^l = x_k, we have a formal relationship between the KF bank and the MPF. They both compute the conditional distribution

   (x_k | y_{1:k}, ξ_{1:k}) ∼ N(x̂_{k|k}(ξ_{1:k}), P_{k|k}(ξ_{1:k}))

using a KF.
For a finite state space ξ_k ∈ {1, 2, ..., M}, there are more intelligent algorithms than just sampling. However, the PF can still be used to get a unified and simple filter framework.


Sensor Fusion, 2014 Lecture 7: 25

Filter Summary

- Approximate the model by one for which an optimal algorithm exists.
  - EKF1 approximates the model with a linear one.
  - UKF and EKF2 apply higher-order approximations.
  Gives an approximate Gaussian posterior.
- Approximate the optimal nonlinear filter for the original model.
  - Point-mass filter (PMF), which uses a regular grid of the state space and applies the Bayesian recursion.
  - Particle filter (PF), which uses a random grid of the state space and applies the Bayesian recursion.
  Gives a sample-based numerical approximation of the posterior.
- Restrict the model class to JML models with Gaussian mixture (GM) noises.
  - Mitigate the exponential complexity with merging (IMM).
  - Mitigate the exponential complexity with pruning (link to PF).
  Gives a GM-based numerical approximation of the posterior.
- The MPF is a hybrid method!