State Space Modelling For UK LFS Unemployment

State Space Modelling For UK LFS Unemployment Gary Brown, Ping Zong Time Series Analysis Branch Office for National Statistics Jan Angenendt Knowledge, Analysis and Intelligence HM Revenue and Customs Moshe Feder Southampton Statistical Sciences Research Institute (S3RI) University of Southampton


Page 1: State Space Modelling For UK LFS Unemployment

State Space Modelling For UK LFS Unemployment

Gary Brown, Ping Zong

Time Series Analysis Branch

Office for National Statistics

Jan Angenendt

  Knowledge, Analysis and Intelligence

HM Revenue and Customs

Moshe Feder

Southampton Statistical Sciences Research Institute (S3RI)

University of Southampton

Page 2: State Space Modelling For UK LFS Unemployment

Overview

• Introduction

• LFS Rolling Quarterly Data

• State Space Model • The General State Space Model• The Specific Model Proposed for UK LFS

• Results

• Further Work

Page 3: State Space Modelling For UK LFS Unemployment

Introduction

• A State Space Model (SSM) represents a structural time series approach to capturing the characteristics of a time series.

• As with X-12-ARIMA, an SSM can be used to decompose a time series into trend, seasonal and irregular components.

• A key advantage of using SSM:

• it allows explicit modelling of unobservable components

Page 4: State Space Modelling For UK LFS Unemployment

Aims of the SSM Project

• Currently the UK LFS publishes a single estimate for each rolling quarter, based on a rotating panel design with five waves of interviews

• The aim of the SSM project is to model the complex LFS structure by fitting wave-specific rolling quarter data, to better account for:

• sampling error autocorrelation (between wave-specific estimates)

• rotation group bias (systematic differences between wave-specific estimates)

Page 5: State Space Modelling For UK LFS Unemployment

LFS Sample Design

• The LFS sample, around 120,000 people in 40,000 households, is split into 190 Interviewer Areas (IAs).

• Each IA is split into 13 weekly ‘stints’ – in this way a representative sample is achieved every 13 weeks.

• To weight the sample, the 13 weeks are allocated to ‘months’ in a 4-4-5 pattern.
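The allocation of the 13 weekly stints to ‘months’ can be sketched in a few lines. This is only an illustration of the 4-4-5 pattern named above; the function name `allocate_weeks` and the numbering of months as 1 to 3 within the quarter are assumptions made for the example, not part of the LFS processing system.

```python
def allocate_weeks(pattern=(4, 4, 5)):
    """Map week numbers 1..13 of a quarter to month positions 1..3.

    `pattern` gives the number of weekly stints allocated to each 'month'.
    """
    allocation = {}
    week = 1
    for month, n_weeks in enumerate(pattern, start=1):
        for _ in range(n_weeks):
            allocation[week] = month
            week += 1
    return allocation


weeks_to_month = allocate_weeks()
for week, month in weeks_to_month.items():
    print(f"week {week:2d} -> month {month}")
```

A different pattern, for example 5-4-4, can be passed as `allocate_weeks((5, 4, 4))`.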

Page 6: State Space Modelling For UK LFS Unemployment

LFS Sample Design (cont)

• Interviews are (approximately) split by mode:

• First interview – face-to-face

• Second interview (13 weeks later) – telephone

• Third, fourth and fifth interviews (each 13 weeks after the previous) – telephone.

• After the fifth interview (wave 5) households drop out of the survey and are replaced with a new set of households (wave 1).

Page 7: State Space Modelling For UK LFS Unemployment

Data Structure

Table 1: Rolling quarterly estimates

Rolling quarter   Wave 1    Wave 2    Wave 3    Wave 4    Wave 5
May-Jul           Ym-j,w1   Ym-j,w2   Ym-j,w3   Ym-j,w4   Ym-j,w5
Jun-Aug           Yj-a,w1   Yj-a,w2   Yj-a,w3   Yj-a,w4   Yj-a,w5
Jul-Sep           Yj-s,w1   Yj-s,w2   Yj-s,w3   Yj-s,w4   Yj-s,w5
Aug-Oct           Ya-o,w1   Ya-o,w2   Ya-o,w3   Ya-o,w4   Ya-o,w5
Sep-Nov           Ys-n,w1   Ys-n,w2   Ys-n,w3   Ys-n,w4   Ys-n,w5
Oct-Dec           Yo-d,w1   Yo-d,w2   Yo-d,w3   Yo-d,w4   Yo-d,w5

Page 8: State Space Modelling For UK LFS Unemployment

Rolling quarterly estimate

• Any three consecutive months yield a representative sample, and each month contributes to three such rolling quarters

• For example, survey responses from June are included in three rolling quarterly estimates: (Apr, May, Jun), (May, Jun, Jul), (Jun, Jul, Aug)

• Given this structure, the overall unemployment quarterly estimate at time t is a combination of three months:

$$Y_t = \frac{ILO_t + ILO_{t-1} + ILO_{t-2}}{EA_t + EA_{t-1} + EA_{t-2}}$$

• where ‘Y’ = unemployment rate, ‘ILO’ = ILO unemployed, and ‘EA’ = economically active.
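As a small worked illustration of a rolling quarterly rate, the sketch below pools three consecutive months of ILO unemployed and economically active counts before taking the ratio. It assumes the three months entering the estimate at time t are t-2, t-1 and t; the function name and the figures are hypothetical and are not LFS data.

```python
def rolling_quarter_rates(ilo, ea):
    """Unemployment rate for each rolling quarter ending at month t.

    ilo, ea: monthly counts of ILO unemployed and economically active
    (any consistent unit). Returns one rate per complete three-month window.
    """
    rates = []
    for t in range(2, len(ilo)):
        numerator = ilo[t] + ilo[t - 1] + ilo[t - 2]
        denominator = ea[t] + ea[t - 1] + ea[t - 2]
        rates.append(numerator / denominator)
    return rates


# Hypothetical monthly figures (millions), for illustration only.
ilo_monthly = [2.0, 2.1, 2.2, 2.3]
ea_monthly = [30.0, 30.0, 30.0, 30.0]
print(rolling_quarter_rates(ilo_monthly, ea_monthly))
```

Note that the ratio is taken after summing the months, so it is not the same as averaging three monthly rates unless the economically active counts are equal.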

Page 9: State Space Modelling For UK LFS Unemployment

Table 2: Sample rotation in wave-specific data

Central month   Stint     Wave 1     Wave 2     Wave 3      Wave 4     Wave 5
July            Week 10   Moshe      Nigel      Oscar       Ping       Quentin
                Week 11   Mary       Nuovella   Olivia      Penelope   Queenie
                Week 12   Mark       Nat        Owen        Paul       Quinlan
                Week 13   Maxine     Naomi      Olga        Pam        Quanita
August          Week 1    Andrew     Brian      Craig       David      Eric
                Week 2    Amy        Bella      Catherine   Dominica   Edwina
                Week 3    Albert     Bill       Charles     Danny      Edgar
                Week 4    Anthony    Brenda     Carys       Davina     Emily
                Week 5    Amanda     Ben        Callum      Dominic    Edward
September       Week 6    Frederic   Giovanna   Helen       Iris       Janice
                Week 7    Fred       Gary       Harry       Ian        Jan
                Week 8    Fenella    Gemima     Hannah      Irene      Jacky
                Week 9    Frank      Geoff      Henry       Iqbal      Jeremy
October         Week 10   Lionel     Moshe      Nigel       Oscar      Ping
                Week 11   Lorna      Mary       Nuovella    Olivia     Penelope
                Week 12   Larry      Mark       Nat         Owen       Paul
                Week 13   Lesley     Maxine     Naomi       Olga       Pam

Page 10: State Space Modelling For UK LFS Unemployment

Sample rotation

The sample rotation means:

• The same wave does not include the same households (samples) in each rolling quarter.

• The same households (cohort) appear in different waves after one quarter.

• Data collection methods differ between waves.

These characteristics need to be accounted for.

• Using the SSM approach for UK LFS unemployment makes it possible to account for them

Page 11: State Space Modelling For UK LFS Unemployment

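State space model

The General State Space Model (GSSM) is:

$$y_t = Z_t \alpha_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, H_t) \qquad (1)$$

$$\alpha_t = T \alpha_{t-1} + \eta_t, \qquad \eta_t \sim N(0, Q_t) \qquad (2)$$

• where:

• (1) is the measurement equation for the observations y_t

• (2) is the transition equation for the state vector α_t

• Z, T, H and Q are matrices

• ε_t and η_t are error terms

Compare the General Linear Regression Model (GLRM):

$$y_t = Z_t \alpha + \varepsilon_t \qquad (3)$$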

Page 12: State Space Modelling For UK LFS Unemployment

Comparing SSM and GLRM

• In (1) and (3), y is a function of time, but

• GLRM: the coefficient is α, which is fixed

• SSM: the coefficient is α_t, which will vary over time

• Hence, GLRM is a static regression model and SSM is a dynamic regression model

• In the SSM, each coefficient α_t varies according to a random walk, α_t = α_{t-1} + η_t, which gives a state vector of the form α_t = Tα_{t-1} + η_t, as in Equation (2).

• So, equation (1) expresses the dynamic regression process, and equation (2) expresses the dynamic change condition
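The contrast between a static and a dynamic regression can be illustrated by simulation. The sketch below assumes a single regressor and a scalar coefficient that follows a random walk; the function name, the parameter names and the numerical values are chosen for illustration only. Setting the state disturbance to zero recovers the fixed-coefficient (GLRM) case.

```python
import numpy as np


def simulate_dynamic_regression(z, alpha0, sigma_eta, sigma_eps, seed=0):
    """Simulate y_t = z_t * alpha_t + eps_t with alpha_t = alpha_{t-1} + eta_t.

    z: 1-D array of regressor values; alpha0: starting coefficient value;
    sigma_eta, sigma_eps: standard deviations of the state and measurement errors.
    """
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    n = len(z)
    # Random-walk coefficient: cumulative sum of the state disturbances.
    alpha = alpha0 + np.cumsum(rng.normal(0.0, sigma_eta, n))
    y = z * alpha + rng.normal(0.0, sigma_eps, n)
    return alpha, y


z = np.ones(50)
alpha_static, y_static = simulate_dynamic_regression(z, 2.0, 0.0, 0.1)
alpha_dynamic, y_dynamic = simulate_dynamic_regression(z, 2.0, 0.5, 0.1)
print(alpha_static[:3], alpha_dynamic[:3])
```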

Page 13: State Space Modelling For UK LFS Unemployment

SSM with Signal and Noise

The SSM can be expressed as two parts, signal θ_t and noise e_t:

$$y_t = \theta_t + e_t \qquad (4)$$

where:

• y_t is the design unbiased survey estimate

• θ_t is signal - the unknown population quantity

• e_t is noise - the survey errors

Page 14: State Space Modelling For UK LFS Unemployment

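Signal θ_t - Basic Structural Model (BSM)

$$\theta_t = L_t + S_t \qquad (4a)$$

$$L_t = L_{t-1} + R_{t-1} + \eta_{L,t} \qquad (4b)$$

$$R_t = R_{t-1} + \eta_{R,t} \qquad (4c)$$

$$S_t = -\sum_{j=1}^{11} S_{t-j} + \eta_{S,t} \qquad (4d) \text{ if using dummy seasonality}$$

where:

• L_t is the level

• R_t is the slope

• S_t is the seasonal

• η_{L,t}, η_{R,t}, η_{S,t} are white noise terms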
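To make the level, slope and dummy-seasonal recursions concrete, the following sketch simulates a monthly BSM signal. It is an illustration under stated assumptions, not the project's code: the function name, the starting values and the choice of Gaussian disturbances are all hypothetical. With all disturbance standard deviations set to zero, the level is a straight line and the seasonal pattern repeats every 12 months and sums to zero over a year.

```python
import numpy as np


def simulate_bsm(n, level0, slope0, seasonal0, sigma_L, sigma_R, sigma_S, seed=0):
    """Simulate a signal theta_t = L_t + S_t for n >= len(seasonal0) periods.

    seasonal0: the first 11 seasonal values (monthly data); later values follow
    S_t = -(S_{t-1} + ... + S_{t-11}) + noise, the dummy-seasonal recursion.
    """
    rng = np.random.default_rng(seed)
    m = len(seasonal0)
    L, R, S = np.empty(n), np.empty(n), np.empty(n)
    L[0], R[0] = level0, slope0
    S[:m] = seasonal0
    for t in range(1, n):
        L[t] = L[t - 1] + R[t - 1] + rng.normal(0.0, sigma_L)  # level
        R[t] = R[t - 1] + rng.normal(0.0, sigma_R)             # slope
        if t >= m:
            S[t] = -S[t - m:t].sum() + rng.normal(0.0, sigma_S)  # seasonal
    return L + S, L, R, S


theta, L, R, S = simulate_bsm(36, 5.0, 0.1, np.arange(11.0), 0.0, 0.0, 0.0)
print(theta[:12])
```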

Page 15: State Space Modelling For UK LFS Unemployment

Noise e_t - the Extended SSM model

$$e_t = \sum_{j=1}^{p} \phi_j e_{t-j} + \eta_{e,t} \qquad (5)$$

where:

• φ_j is the coefficient of the AR process

• e_{t-j} is the sampling error

• η_{e,t} is white noise

The standard assumption is independence of errors:

$$E[\eta_{L,t}] = E[\eta_{R,t}] = E[\eta_{S,t}] = E[\eta_{e,t}] = 0$$

$$Var[\eta_{e,t}] = \sigma_e^2, \quad Var[\eta_{L,t}] = \sigma_L^2, \quad Var[\eta_{R,t}] = \sigma_R^2, \quad Var[\eta_{S,t}] = \sigma_S^2$$

Page 16: State Space Modelling For UK LFS Unemployment

BSM + Extended SSM

• Both signal and noise have their own measurement equation (Z_BSM,t and Z_e,t) and state vector (α_BSM,t and α_e,t) in the transition equation, ie

$$\theta_t = Z_{BSM,t}\,\alpha_{BSM,t}$$ - measurement equation for signal

$$\alpha_{BSM,t} = (L_t, R_t, S_t)'$$ - state vector for signal

$$e_t = Z_{e,t}\,\alpha_{e,t}$$ - measurement equation for noise

$$\alpha_{e,t} = (e_t, e_{t-1}, e_{t-2}, e_{t-3})'$$ - state vector for noise if AR(4)

• These two parts, signal and noise, are brought together to form a complete SSM.

Page 17: State Space Modelling For UK LFS Unemployment

The Specific SSM Proposed for UK LFS

State Vector (L_t, R_t, S_t, e_t)

• As survey responses from month t are included in estimates based on three representative samples centred at (t-1, t, t+1), the state vector will not only consider parameters at time t but will take all three time periods (t-1, t, t+1) into account

• Each of the original L_t, R_t, S_t and e_t is therefore expanded to include:

• L*_{t-1}, L*_t and L*_{t+1} for level

• R*_{t-1}, R*_t and R*_{t+1} for slope

• S*_{t-1}, S*_t and S*_{t+1} for seasonal

• e*_{t-1}, e*_t and e*_{t+1} for sample error

Page 18: State Space Modelling For UK LFS Unemployment

State vector (Lt, Rt, St, et) - cont

• The slope (R) is the same at all three time periods, so only the t+1 value is kept.

• Also, because there are five waves, and each wave includes three time periods (t-1, t, t+1) for sample error, there are in total 15 sample error state variables in our model.

• With dummy seasonality the signal has 15 state variables (3 levels, 1 slope and 11 seasonals), so the total is 15 + 15 = 30 state variables for the state vector.

Page 19: State Space Modelling For UK LFS Unemployment

Survey error structure (e_t)

• Survey errors do not overlap within the wave-specific data at a given time, but they are linked across waves between quarters (cohorts), ie

• someone interviewed in wave i at time t will be interviewed in wave (i+1) at time t+3, so the sample errors will be correlated and are defined as follows:

$$e_{t,w2} = \phi\, e_{(t-3),w1} + \eta_{t,w2}$$

$$e_{t,w3} = \phi\, e_{(t-3),w2} + \eta_{t,w3}$$

$$e_{t,w4} = \phi\, e_{(t-3),w3} + \eta_{t,w4}$$

$$e_{t,w5} = \phi\, e_{(t-3),w4} + \eta_{t,w5}$$

$$e_{t,w1} = \phi^{*} e_{(t-3),w5} + \eta_{t,w1} \qquad (6)$$

Page 20: State Space Modelling For UK LFS Unemployment

Building the SSM for UK LFS

• Matrices are used as the basic method for building the SSM

• The main SSM matrices/vectors for UK LFS are:

• observation matrices (Z)

• transition matrices (T)

• covariance matrices (Q)

• state vectors (α_t)

• disturbance vectors (η_t)

Page 21: State Space Modelling For UK LFS Unemployment

Observation matrices (Z)

• Observation matrices:

• Z_BSM (signal) is a 5x15 matrix for SSM with dummy seasonality

$$Z_{BSM} = \frac{1}{3}\left(\begin{array}{*{15}{c}}
1&1&1&0&1&1&1&0&0&0&0&0&0&0&0\\
1&1&1&0&1&1&1&0&0&0&0&0&0&0&0\\
1&1&1&0&1&1&1&0&0&0&0&0&0&0&0\\
1&1&1&0&1&1&1&0&0&0&0&0&0&0&0\\
1&1&1&0&1&1&1&0&0&0&0&0&0&0&0
\end{array}\right)$$

• Z_e (noise) is a 5x15 matrix

$$Z_e = \frac{1}{3}\left(I_{5\times5} \;\; I_{5\times5} \;\; I_{5\times5}\right)$$
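The two observation matrices can be assembled and checked with a few lines of NumPy. This is a sketch under assumptions: the variable names are invented for the example, and the state vector is taken to hold the three levels in positions 1 to 3, the slope in position 4, the three seasonals that enter the estimate in positions 5 to 7, and the 15 wave-specific errors in the last 15 positions. The numbers in the test state are hypothetical.

```python
import numpy as np

# One row of Z_BSM: average the three levels and three seasonals, ignore the slope.
z_row = np.array([1, 1, 1, 0, 1, 1, 1] + [0] * 8) / 3.0
Z_bsm = np.tile(z_row, (5, 1))             # same row for each of the five waves
Z_e = np.hstack([np.eye(5)] * 3) / 3.0     # average each wave's three errors
Z = np.hstack([Z_bsm, Z_e])                # joint 5 x 30 observation matrix

# Hypothetical state: level 6.0 in all three periods, seasonals that cancel, no error.
state = np.zeros(30)
state[0:3] = 6.0
state[4:7] = [0.3, 0.0, -0.3]
print(Z.shape, Z @ state)
```

Each of the five wave-specific observations is then the three-period average of signal plus that wave's three-period average error.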

Page 22: State Space Modelling For UK LFS Unemployment

State vectors (α_t, α_{t-1}) and transition matrix (T): BSM

• State vectors (α_BSM,t), transition matrix (T_BSM) + disturbance:

$$\alpha_{BSM,t} = T_{BSM}\,\alpha_{BSM,t-1} + \eta_{BSM,t}$$

with

$$\alpha_{BSM,t} = (L_{t-1}, L_t, L_{t+1}, R_{t+1}, S_{t-1}, S_t, S_{t+1}, S_{t-2}, \ldots, S_{t-9})'$$

$$\eta_{BSM,t} = (0, 0, \eta_{L,t+1}, \eta_{R,t+1}, 0, 0, \eta_{S,t+1}, 0, \ldots, 0)'$$

$$T_{BSM} = \begin{pmatrix} T_{L,R\,(4\times4)} & 0_{4\times11} \\ 0_{11\times4} & T_{S\,(11\times11)} \end{pmatrix}, \qquad
T_{L,R} = \begin{pmatrix} 0&1&0&0\\ 0&0&1&0\\ 0&0&1&1\\ 0&0&0&1 \end{pmatrix}$$

$$T_S = \left(\begin{array}{*{11}{r}}
0&1&0&0&0&0&0&0&0&0&0\\
0&0&1&0&0&0&0&0&0&0&0\\
-1&-1&-1&-1&-1&-1&-1&-1&-1&-1&-1\\
1&0&0&0&0&0&0&0&0&0&0\\
0&0&0&1&0&0&0&0&0&0&0\\
0&0&0&0&1&0&0&0&0&0&0\\
0&0&0&0&0&1&0&0&0&0&0\\
0&0&0&0&0&0&1&0&0&0&0\\
0&0&0&0&0&0&0&1&0&0&0\\
0&0&0&0&0&0&0&0&1&0&0\\
0&0&0&0&0&0&0&0&0&1&0
\end{array}\right)$$

Page 23: State Space Modelling For UK LFS Unemployment

State vectors (α_t, α_{t-1}) and transition matrix (T): e

• State vectors (α_e,t), transition matrix (T_e) + disturbance:

$$\alpha_{e,t} = T_e\,\alpha_{e,t-1} + \eta_{e,t}$$

with

$$\alpha_{e,t} = (e_{t-1,w1}, \ldots, e_{t-1,w5},\; e_{t,w1}, \ldots, e_{t,w5},\; e_{t+1,w1}, \ldots, e_{t+1,w5})'$$

$$\eta_{e,t} = (0, \ldots, 0,\; \eta_{t+1,w1}, \eta_{t+1,w2}, \eta_{t+1,w3}, \eta_{t+1,w4}, \eta_{t+1,w5})'$$

$$T_e = \begin{pmatrix} 0_{5\times5} & I_{5\times5} & 0_{5\times5} \\ 0_{5\times5} & 0_{5\times5} & I_{5\times5} \\ AR_{5\times5} & 0_{5\times5} & 0_{5\times5} \end{pmatrix}, \qquad
AR_{5\times5} = \begin{pmatrix} 0&0&0&0&\phi^{*}\\ \phi&0&0&0&0\\ 0&\phi&0&0&0\\ 0&0&\phi&0&0\\ 0&0&0&\phi&0 \end{pmatrix}$$

Page 24: State Space Modelling For UK LFS Unemployment

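Join Signal and Noise - block all matrices together

• Observation matrices:

$$Z = (Z_{BSM,t},\; Z_{e,t})$$

• State vectors:

$$\alpha_t = (\alpha_{BSM,t},\; \alpha_{e,t})'$$

• Disturbance vectors:

$$\eta_t = (\eta_{BSM,t},\; \eta_{e,t})'$$

• Transition matrices:

$$T_{(30\times30)} = \begin{pmatrix} T_{BSM\,(15\times15)} & 0_{15\times15} \\ 0_{15\times15} & T_{e\,(15\times15)} \end{pmatrix}$$

where T_BSM is the 15x15 matrix and T_e is the 15x15 matrix

$$T_e = \begin{pmatrix} 0_{5\times5} & I_{5\times5} & 0_{5\times5} \\ 0_{5\times5} & 0_{5\times5} & I_{5\times5} \\ AR_{5\times5} & 0_{5\times5} & 0_{5\times5} \end{pmatrix}$$

with

$$AR_{5\times5} = \begin{pmatrix} 0&0&0&0&\phi^{*}\\ \phi&0&0&0&0\\ 0&\phi&0&0&0\\ 0&0&\phi&0&0\\ 0&0&0&\phi&0 \end{pmatrix}$$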
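The block structure of the joint transition matrix can be sketched with NumPy. The function names and the coefficient values below are hypothetical, and an identity matrix stands in for T_BSM so that the example does not depend on how the signal states are ordered. The error block assumes that the 15 error states are stacked as three groups of five waves, oldest period first.

```python
import numpy as np


def build_T_e(phi, phi_star):
    """Return the 5x5 AR block and the 15x15 sampling-error transition matrix."""
    AR = np.zeros((5, 5))
    AR[0, 4] = phi_star                          # wave 1 depends on wave 5
    AR[np.arange(1, 5), np.arange(0, 4)] = phi   # wave i+1 depends on wave i
    I5, O5 = np.eye(5), np.zeros((5, 5))
    T_e = np.block([[O5, I5, O5],
                    [O5, O5, I5],
                    [AR, O5, O5]])
    return AR, T_e


def join_blocks(T_bsm, T_e):
    """Stack signal and noise transition matrices block-diagonally."""
    m, k = T_bsm.shape[0], T_e.shape[0]
    return np.block([[T_bsm, np.zeros((m, k))],
                     [np.zeros((k, m)), T_e]])


AR, T_e = build_T_e(phi=0.5, phi_star=0.1)   # illustrative values only
T = join_blocks(np.eye(15), T_e)             # identity is a placeholder for T_BSM
print(T.shape)
```

Applying T_e three times gives a block-diagonal matrix with AR in each block, which is the lag-3 dependence between waves in matrix form.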

Page 25: State Space Modelling For UK LFS Unemployment

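Covariance matrices (Q)

$$Q_{(30\times30)} = \begin{pmatrix} Q_{BSM,t\,(15\times15)} & 0_{15\times15} \\ 0_{15\times15} & Q_{e,t\,(15\times15)} \end{pmatrix}$$

where

$$Q_{BSM,t\,(15\times15)} = \begin{pmatrix} Q_{t,(L,R)\,(4\times4)} & 0_{4\times11} \\ 0_{11\times4} & Q_{S,t\,(11\times11)} \end{pmatrix}$$

with

$$Q_{t,(L,R)\,(4\times4)} = \begin{pmatrix} 0_{2\times2} & 0_{2\times2} \\ 0_{2\times2} & Q_{LR} \end{pmatrix}$$

and with

$$Q_{LR} = \begin{pmatrix} \sigma_L^2 & 0 \\ 0 & \sigma_R^2 \end{pmatrix}$$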

Page 26: State Space Modelling For UK LFS Unemployment

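Covariance matrices (Q) – cont.

and

$$Q_{e,t\,(15\times15)} = \begin{pmatrix} 0_{5\times5} & 0_{5\times5} & 0_{5\times5} \\ 0_{5\times5} & 0_{5\times5} & 0_{5\times5} \\ 0_{5\times5} & 0_{5\times5} & VC \end{pmatrix}$$

with

$$VC_{(5\times5)} = \sigma_e^2\,(I_{5\times5})$$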

Page 27: State Space Modelling For UK LFS Unemployment

The Model Estimate Setting

• As long as all SSM matrices are set appropriately and all parameters in the model are known, the state vector can be predicted, filtered and smoothed using the Kalman Filter

• In practice all these parameters are unknown, so all of them need to be initialised:

• α_{t-1} in the state vector

• (σ_L², σ_R², σ_S², σ_e²) in the disturbance matrix

• AR parameters (φ and φ*) in the transition matrix
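The prediction and filtering steps can be sketched generically. The code below is a textbook Kalman filter for time-invariant matrices Z, T, H and Q, written as a minimal illustration rather than the project's implementation; the function name and argument layout are assumptions, and smoothing and likelihood evaluation are left out.

```python
import numpy as np


def kalman_filter(y, Z, T, H, Q, a0, P0):
    """Filtered state means for y_t = Z a_t + eps_t, a_t = T a_{t-1} + eta_t.

    y: array of shape (n, k); a0, P0: initial state mean and covariance.
    Returns an array of shape (n, m) holding the filtered state at each time.
    """
    a, P = np.asarray(a0, dtype=float), np.asarray(P0, dtype=float)
    filtered = []
    for y_t in y:
        # Prediction step: propagate the state mean and covariance.
        a = T @ a
        P = T @ P @ T.T + Q
        # Update step: correct the prediction using the new observation.
        v = y_t - Z @ a                      # one-step-ahead prediction error
        F = Z @ P @ Z.T + H                  # prediction error covariance
        K = P @ Z.T @ np.linalg.inv(F)       # Kalman gain
        a = a + K @ v
        P = P - K @ Z @ P
        filtered.append(a.copy())
    return np.array(filtered)


# Local level example with a large initial variance for the non-stationary state.
y = np.array([[1.0], [2.0], [0.5]])
one = np.array([[1.0]])
states = kalman_filter(y, Z=one, T=one, H=np.array([[0.5]]), Q=one,
                       a0=np.array([0.0]), P0=np.array([[1e4]]))
print(states.ravel())
```

The same function accepts the 5 x 30 observation matrix and 30 x 30 transition and covariance matrices of the joint model, provided they are supplied as NumPy arrays.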

Page 28: State Space Modelling For UK LFS Unemployment

Initialisation for α_{t-1} in the state vector

• For non-stationary components:

• the component means are initialised at zero

• the associated component variances are initialised with a very large value (ie 10000)

• For stationary components:

• the mean of the stationary component (sampling error e_t) is initialised with its unconditional mean

• the variance of the stationary component is initialised with its own pseudo-error variance

Page 29: State Space Modelling For UK LFS Unemployment

Initialisation for η_t in the disturbance vector

• All variances (σ_L², σ_R², σ_S², σ_e²) in the disturbance vector can be estimated using Maximum Likelihood in the model.

• There are two approaches to estimating these parameters:

• The hyper-parameters approach assumes that all parameters are unknown and are estimated simultaneously in the SSM.

• The pseudo-error approach is different ...

Page 30: State Space Modelling For UK LFS Unemployment

Initialisation in the pseudo-error approach

Different approaches for different parameters

• η_{L,t}, η_{R,t}, η_{S,t} are treated as unknown parameters, and η_{e,t} is treated as a known parameter (estimated in a separate process)

• Initialised variance values for the unknown parameter vector (in Q matrices):

• σ²_{L,t} = 0

• σ_R² and σ_S² are set based on a separate estimation process (‘Proc ucm’ in SAS, ‘StructTS’ in DLM/R)

• σ_e² is obtained from the autocorrelations calculated through the pseudo-error process

• AR parameters, φ and φ*, are estimated using Yule-Walker equations and substituted into the SSM
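The Yule-Walker step can be illustrated as follows. This is a generic sketch for an AR(p) process fitted to a single series, assuming the pseudo-errors are available as a one-dimensional array; the function name is hypothetical and the simulated series merely stands in for real pseudo-errors. It is not the wave-by-wave procedure used in the project.

```python
import numpy as np


def yule_walker(x, p):
    """Estimate AR(p) coefficients and innovation variance by Yule-Walker.

    x: 1-D series (for example pseudo-errors); p: autoregressive order.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Sample autocovariances r(0), ..., r(p).
    r = np.array([x[:n - k] @ x[k:] / n for k in range(p + 1)])
    # Toeplitz system R phi = (r(1), ..., r(p))'.
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    phi = np.linalg.solve(R, r[1:])
    sigma2 = r[0] - phi @ r[1:]
    return phi, sigma2


# Simulated AR(1) series standing in for pseudo-errors (illustrative only).
rng = np.random.default_rng(1)
series = np.zeros(5000)
for t in range(1, len(series)):
    series[t] = 0.6 * series[t - 1] + rng.normal()
phi_hat, sigma2_hat = yule_walker(series, p=1)
print(phi_hat, sigma2_hat)
```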

Page 31: State Space Modelling For UK LFS Unemployment

Simulation results

1. Trend prediction:

Page 32: State Space Modelling For UK LFS Unemployment

Seasonality prediction

Page 33: State Space Modelling For UK LFS Unemployment

Sample error prediction

Page 34: State Space Modelling For UK LFS Unemployment

Further work

The project is not complete – work remaining:

• test whether including (t-1, t, t+1) into the model is necessary (through comparison analysis)

• test whether the proposed method for sample error estimation is correct

• test an approach to estimating the AR(1) coefficients of the pseudo-error that is consistent with the one used in the SSM

• consider MA models

• consider including rotation group bias and the claimant count

Page 35: State Space Modelling For UK LFS Unemployment


Any questions?

Page 36: State Space Modelling For UK LFS Unemployment

Appendix

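• The trigonometric seasonality model uses sines and cosines:

$$S_t = \sum_{j=1}^{6} \gamma_{j,t}$$

$$\gamma_{j,t} = \gamma_{j,t-1}\cos\lambda_j + \gamma^{*}_{j,t-1}\sin\lambda_j + \omega_{j,t}$$

$$\gamma^{*}_{j,t} = -\gamma_{j,t-1}\sin\lambda_j + \gamma^{*}_{j,t-1}\cos\lambda_j + \omega^{*}_{j,t}$$

where E[ω_{j,t}] = 0, E[ω*_{j,t}] = 0, Var[ω_{j,t}] = Var[ω*_{j,t}] = σ_ω² and

$$\lambda_j = \frac{\pi j}{6}$$

for j = 1,...,6.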
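A short simulation shows how the sine and cosine recursions generate a seasonal pattern. The sketch assumes monthly data with seasonal frequencies λ_j = πj/6 and switches the disturbances off so that the pattern is deterministic; the function name and the starting values are hypothetical. For simplicity it carries a starred component for all six frequencies, although the one at j = 6 never feeds into the seasonal because sin(λ_6) = 0.

```python
import numpy as np


def trig_seasonal(gamma0, gamma_star0, n):
    """Deterministic trigonometric seasonal S_t = sum_j gamma_{j,t} for n periods."""
    lam = np.pi * np.arange(1, 7) / 6.0          # seasonal frequencies, j = 1..6
    g = np.asarray(gamma0, dtype=float)
    gs = np.asarray(gamma_star0, dtype=float)
    seasonal = []
    for _ in range(n):
        seasonal.append(g.sum())
        # Rotate each (gamma_j, gamma*_j) pair by its frequency lambda_j.
        g, gs = (g * np.cos(lam) + gs * np.sin(lam),
                 -g * np.sin(lam) + gs * np.cos(lam))
    return np.array(seasonal)


S = trig_seasonal([1.0, 0.5, 0.2, 0.1, 0.3, 0.4],
                  [0.2, 0.1, 0.0, 0.0, 0.1, 0.0], 36)
print(np.round(S[:12], 3))
```

With the disturbances switched off, the pattern repeats every 12 months and sums to zero over any 12 consecutive months, as a seasonal component should.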

Page 37: State Space Modelling For UK LFS Unemployment

The observation matrix (Z_BSM) (5 x 17 matrix)

$$Z_{BSM} = \frac{1}{3}\left(\begin{array}{*{17}{c}}
1&1&1&0&1&1&1&1&1&1&1&1&0&0&0&0&0\\
1&1&1&0&1&1&1&1&1&1&1&1&0&0&0&0&0\\
1&1&1&0&1&1&1&1&1&1&1&1&0&0&0&0&0\\
1&1&1&0&1&1&1&1&1&1&1&1&0&0&0&0&0\\
1&1&1&0&1&1&1&1&1&1&1&1&0&0&0&0&0
\end{array}\right)$$

• Total state vector (α_t) - 32 variables:

$$\alpha_t = (L_{t-1}, L_t, L_{t+1}, R_{t+1}, S_{t-1}, S_t, \gamma_{1,t+1}, \ldots, \gamma_{6,t+1}, \gamma^{*}_{1,t+1}, \ldots, \gamma^{*}_{5,t+1},$$

$$\qquad e_{t-1,w1}, \ldots, e_{t-1,w5},\; e_{t,w1}, \ldots, e_{t,w5},\; e_{t+1,w1}, \ldots, e_{t+1,w5})'$$

Page 38: State Space Modelling For UK LFS Unemployment

The transition matrix (T_BSM) (17 x 17 matrix)

$$\alpha_{BSM,t} = T_{BSM}\,\alpha_{BSM,t-1} + \eta_{BSM,t}$$

with

$$\alpha_{BSM,t} = (L_{t-1}, L_t, L_{t+1}, R_{t+1}, S_{t-1}, S_t, \gamma_{1,t+1}, \ldots, \gamma_{6,t+1}, \gamma^{*}_{1,t+1}, \ldots, \gamma^{*}_{5,t+1})'$$

$$\eta_{BSM,t} = (0, 0, \eta_{L,t+1}, \eta_{R,t+1}, 0, 0, \omega_{1,t+1}, \ldots, \omega_{6,t+1}, \omega^{*}_{1,t+1}, \ldots, \omega^{*}_{5,t+1})'$$

$$T_{BSM} = \begin{pmatrix} T_{L,R\,(4\times4)} & 0_{4\times13} \\ 0_{13\times4} & T_{S\,(13\times13)} \end{pmatrix}, \qquad
T_{L,R} = \begin{pmatrix} 0&1&0&0\\ 0&0&1&0\\ 0&0&1&1\\ 0&0&0&1 \end{pmatrix}$$

$$T_S = \left(\begin{array}{ccccc}
0 & 1 & 0' & 0 & 0' \\
0 & 0 & 1' & 1 & 0' \\
0 & 0 & D_{\cos} & 0 & D_{\sin} \\
0 & 0 & 0' & \cos(\lambda_6) & 0' \\
0 & 0 & -D_{\sin} & 0 & D_{\cos}
\end{array}\right)$$

where the column blocks of T_S correspond to S_{t-1}, S_t, (γ_1, ..., γ_5), γ_6 and (γ*_1, ..., γ*_5), 1' is a row of five ones, 0 and 0' are zero blocks of conforming size, and

$$D_{\cos} = \mathrm{diag}(\cos(\lambda_1), \ldots, \cos(\lambda_5)), \qquad D_{\sin} = \mathrm{diag}(\sin(\lambda_1), \ldots, \sin(\lambda_5))$$