Advances in the Construction of Efficient Stated Choice Experimental Designs

John Rose 1, Michiel Bliemer 1,2

1 The University of Sydney, Australia
2 Delft University of Technology, The Netherlands


Contents

• Efficient Designs Defined
• State of Practice in Experimental Design
• Efficient Designs for Stated Choice Experiments
• Bayesian Efficient Designs
• Example
• When is an Orthogonal Design Appropriate?
• How can I do this?


Efficient Designs Defined


What are efficient designs?

• Based on a design, a survey is composed and the outcomes of the survey are used to estimate the model parameters

• The more reliable the parameter estimates are (i.e., having small standard error), the more efficient the design is

[Figure: an experimental design is given to respondents, the resulting choice data are used in estimation, and the results are the parameter estimates $\hat{\beta}$ and their standard errors $se(\hat{\beta})$.]

What are efficient designs?

• The asymptotic variance-covariance (AVC) matrix is an approximation of the true variance-covariance matrix

• “Asymptotic” means
– assuming a very large sample; or
– assuming a large number of repetitions using a small sample

• The square roots of the diagonal elements of the variance-covariance matrix are the standard errors

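$$
\Omega = \begin{pmatrix} se(\hat{\beta}_1)^2 & & \\ & \ddots & \\ & & se(\hat{\beta}_K)^2 \end{pmatrix}
$$

variance-covariance matrix, where $se(\hat{\beta}_k)$ is the standard error of parameter $k$ (the off-diagonal elements are the covariances)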
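As a minimal sketch of the relation between the variance-covariance matrix and the standard errors, the snippet below takes a small matrix and reads off the standard errors. The 2 x 2 matrix `avc` and its values are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import numpy as np

# Hypothetical 2 x 2 variance-covariance matrix (illustrative values only).
avc = np.array([[0.04, 0.01],
                [0.01, 0.09]])

# Standard errors are the square roots of the diagonal elements.
std_errors = np.sqrt(np.diag(avc))
print(std_errors)
```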


Asymptotic variance-covariance matrix

• Efficient Design:
– Generate a design that, when applied in practice, will likely yield standard errors that are as small as possible

[Figure: experimental design → respondents → data → estimation → AVC matrix]

State of Practice in Experimental Design


State of Practice

• In linear regression models:

$$
\text{variance-covariance matrix} = \sigma^2 (X'X)^{-1}
$$

where $X$ is the data or design

• If $X$ is orthogonal, then
– the diagonal elements of $X'X$ will be made as large as possible and the off-diagonals will equal zero
– if we take the inverse of $X'X$, the diagonal elements will be minimised whilst the off-diagonals remain zero

$$
\sigma^2 (X'X)^{-1} = \begin{pmatrix} se(\hat{\beta}_1)^2 & & 0 \\ & \ddots & \\ 0 & & se(\hat{\beta}_K)^2 \end{pmatrix}
$$
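The effect of orthogonality in the linear case can be checked numerically. The sketch below assumes a hypothetical full factorial in two attributes coded as -1/+1 and an error variance of 1; neither is taken from the slides.

```python
import numpy as np

# Hypothetical orthogonal design: full factorial in two attributes, coded -1/+1.
X = np.array([[-1, -1],
              [-1,  1],
              [ 1, -1],
              [ 1,  1]], dtype=float)
sigma2 = 1.0  # assumed error variance

xtx = X.T @ X                      # diagonal, because the columns are orthogonal
vc = sigma2 * np.linalg.inv(xtx)   # variance-covariance matrix of the estimates

print(xtx)
print(vc)
```

Because the columns of `X` are orthogonal, `xtx` has zero off-diagonals, and so does its inverse.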


State of Practice

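• In linear regression models:

$$
\text{variance-covariance matrix} = \sigma^2 (X'X)^{-1}
$$

where $X$ is the data or design

• If the diagonal elements are as small as possible…

$$
\begin{pmatrix} se(\hat{\beta}_1)^2 & & 0 \\ & \ddots & \\ 0 & & se(\hat{\beta}_K)^2 \end{pmatrix}
$$

…then the t-ratios $t_k = \hat{\beta}_k / se(\hat{\beta}_k)$ will be maximised

• And the zero off-diagonals suggest no multicollinearity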


State of Practice

Question: For discrete choice data, what type of econometric model do we typically employ?

Answer: Logit model

Question: Is the variance-covariance matrix of the logit model represented by $\sigma^2 (X'X)^{-1}$?


Efficient Designs for Stated Choice Experiments


Efficient Designs and Logit Models

• The variance-covariance matrix for logit models is related to the log-likelihood of the model:

$$
L_N(\beta \mid X, y) = \sum_{n=1}^{N} \sum_{s=1}^{S} \sum_{j=1}^{J} y_{jsn} \log P_{jsn}(X \mid \beta)
$$

Note that:
• In estimation, given the (design) data $X$ and the observations $y$, one aims to determine estimates $\hat{\beta}$ such that $L_N(\beta \mid X, y)$ is maximised [maximum likelihood estimation]
• When generating an experimental design, these parameter estimates are unknown
• The values of $P_{jsn}(X \mid \beta)$ depend on the model used (MNL, NL, ML)


• The second derivatives of the log-likelihood give the Fisher information matrix:

$$
I_N(\beta \mid X, y) = \frac{\partial^2 L_N(\beta \mid X, y)}{\partial \beta \, \partial \beta'}
$$

[Hessian matrix of second derivatives]

• The negative inverse of the Fisher information matrix yields the model variance-covariance matrix:

$$
\Omega_N(\beta \mid X, y) = -I_N(\beta \mid X, y)^{-1}
$$

[negative inverse matrix]



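• Example: MNL model with generic parameters (McFadden, 1974)

$$
L_N(\beta \mid X, y) = \sum_{n=1}^{N} \sum_{s=1}^{S} \sum_{j=1}^{J} y_{jsn} \log P_{jsn}(X \mid \beta)
$$

with

$$
P_{jsn}(X \mid \beta) = \frac{\exp V_{jsn}(X \mid \beta)}{\sum_{i=1}^{J} \exp V_{isn}(X \mid \beta)}, \qquad V_{jsn}(X \mid \beta) = \sum_{k=1}^{K} \beta_k X_{jksn}
$$

First derivative:

$$
\frac{\partial L_N(\beta \mid X, y)}{\partial \beta_k} = \sum_{n=1}^{N} \sum_{s=1}^{S} \sum_{j=1}^{J} \left( y_{jsn} - P_{jsn}(X \mid \beta) \right) X_{jksn}
$$

Second derivative:

$$
\frac{\partial^2 L_N(\beta \mid X)}{\partial \beta_{k_1} \partial \beta_{k_2}} = -\sum_{n=1}^{N} \sum_{s=1}^{S} \sum_{j=1}^{J} X_{jk_1sn} P_{jsn}(X \mid \beta) \left( X_{jk_2sn} - \sum_{i=1}^{J} X_{ik_2sn} P_{isn}(X \mid \beta) \right)
$$

Note: y drops out!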
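The second-derivative expression can be turned into a short routine that computes the AVC matrix of a design for a single respondent. This is a minimal sketch, not the authors' implementation: the function name `mnl_avc`, the `(S, J, K)` array layout, and the two-situation example design are assumptions made for illustration. The code accumulates the negative of the Hessian directly, so the AVC matrix is its plain inverse.

```python
import numpy as np

def mnl_avc(design, beta):
    """AVC matrix for one respondent under a generic-parameter MNL model.

    design: array of shape (S, J, K) holding the attribute levels X_jks
    beta:   array of shape (K,) holding the prior parameter values
    """
    design = np.asarray(design, dtype=float)
    beta = np.asarray(beta, dtype=float)
    neg_hessian = np.zeros((beta.size, beta.size))
    for x in design:                       # x has shape (J, K): one choice situation
        v = x @ beta                       # utilities V_js
        p = np.exp(v - v.max())
        p /= p.sum()                       # MNL probabilities P_js
        centred = x - p @ x                # X_jks - sum_i X_iks P_is
        neg_hessian += (x * p[:, None]).T @ centred
    return np.linalg.inv(neg_hessian)

# Hypothetical design: 2 choice situations, 2 alternatives, 2 attributes.
design = [[[1, 0], [0, 1]],
          [[1, 1], [0, 0]]]
avc_one = mnl_avc(design, beta=[0.0, 0.0])
print(avc_one)
```

Only the design and the priors enter the computation; no observed choices are needed, which is what makes it usable before any data are collected.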



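• Example: MNL model with generic parameters (cont’d)

The element $(k_1, k_2)$ of the information matrix is

$$
I_N(\beta \mid X) = -\sum_{n=1}^{N} \sum_{s=1}^{S} \sum_{j=1}^{J} X_{jk_1sn} P_{jsn}(X \mid \beta) \left( X_{jk_2sn} - \sum_{i=1}^{J} X_{ik_2sn} P_{isn}(X \mid \beta) \right)
$$

Assuming that all respondents observe the same choice situations,

$$
I_N(\beta \mid X) = -N \sum_{s=1}^{S} \sum_{j=1}^{J} X_{jk_1s} P_{js}(X \mid \beta) \left( X_{jk_2s} - \sum_{i=1}^{J} X_{ik_2s} P_{is}(X \mid \beta) \right) = N \cdot I_1(\beta \mid X)
$$

Therefore, the AVC matrix becomes:

$$
\Omega_N(\beta \mid X) = -I_N(\beta \mid X)^{-1} = -\frac{1}{N} I_1(\beta \mid X)^{-1} = \frac{1}{N} \Omega_1(\beta \mid X)
$$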


• Example: MNL model with generic parameters (cont’d)

$$
\Omega_N(\beta \mid X) = \frac{1}{N} \Omega_1(\beta \mid X), \qquad se_N(\beta \mid X) = \frac{1}{\sqrt{N}} se_1(\beta \mid X)
$$

[Figure: $se_N(\beta \mid X)$ plotted against $N$ (sample size) for $N$ from 0 to 50; the standard error falls in proportion to $1/\sqrt{N}$.]

“A design that yields 50% lower standard errors requires 4 × fewer respondents”
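The square-root relation between sample size and standard error can be illustrated with a few lines of arithmetic. The helper names and the single-respondent standard error of 2.0 below are hypothetical.

```python
import math

def se_for_sample(se_one, n_resp):
    """Standard error with n_resp respondents, given the single-respondent value."""
    return se_one / math.sqrt(n_resp)

def respondents_needed(se_one, target_se):
    """Smallest whole number of respondents that reaches target_se."""
    return math.ceil((se_one / target_se) ** 2)

se_one = 2.0  # hypothetical single-respondent standard error
print(se_for_sample(se_one, 16))         # 0.5
print(respondents_needed(se_one, 0.5))   # 16
print(respondents_needed(se_one, 0.25))  # 64: half the standard error, four times the sample
```

Read the other way round, a design whose single-respondent standard errors are half as large reaches the same precision with a quarter of the respondents.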


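[Figure, two panels, each plotting standard error against $N$ (sample size) from 0 to 50. Left panel, “Investing in more respondents”: moving along the curve $se_N(\beta \mid X^{I})$ of a single design $X^{I}$, starting from $se_1(\beta \mid X^{I})$. Right panel, “Investing in better design”: moving from the curve $se_N(\beta \mid X^{I})$ to the lower curve $se_N(\beta \mid X^{II})$ of a better design $X^{II}$.]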


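• Numerical example

MNL model:

$$
U_A = \beta_1 + \beta_2 A_1 + \beta_3 A_2, \qquad U_B = \beta_2 B_1 + \beta_4 B_2
$$

Priors: $\beta_1 = 0.2$, $\beta_2 = 0.3$, $\beta_3 = 0.4$, $\beta_4 = 0.5$

[Figure: the design $X$, the resulting $4 \times 4$ matrices $I_1(\beta \mid X)$ and $\Omega_1(\beta \mid X)$ with rows and columns $\beta_1, \dots, \beta_4$, and $\Omega_{10}(\beta \mid X)$.]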

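• Sample size and designs

$$
\Omega_N(\beta \mid X) = \frac{1}{N} \Omega_1(\beta \mid X), \qquad se_N(\beta \mid X) = \frac{1}{\sqrt{N}} se_1(\beta \mid X)
$$

$$
t_k = \frac{\beta_k}{\sqrt{se_k^2 / n_k}} \quad \Longrightarrow \quad n_k = \frac{t_k^2 \, se_k^2}{\beta_k^2}
$$

where $se_k$ is the single-respondent standard error of parameter $k$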
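The sample-size relation $n_k = t_k^2 se_k^2 / \beta_k^2$ can be evaluated per parameter, with the largest value determining the sample needed for every parameter to reach the chosen t-ratio. The sketch below assumes a critical t-ratio of 1.96 and uses hypothetical priors and standard errors; the function name is illustrative.

```python
import math

def required_sample_sizes(betas, se_one, t_crit=1.96):
    """Per-parameter sample size n_k = t^2 * se_k^2 / beta_k^2."""
    sizes = []
    for beta, se in zip(betas, se_one):
        if beta == 0:
            # The formula divides by the prior, so it is undefined for a zero prior.
            raise ValueError("sample size is undefined for a zero prior")
        sizes.append((t_crit * se / beta) ** 2)
    return sizes

betas = [0.5, -1.0]   # hypothetical priors
se_one = [2.0, 1.0]   # hypothetical single-respondent standard errors
sizes = required_sample_sizes(betas, se_one)
n_required = math.ceil(max(sizes))  # the hardest-to-estimate parameter sets the sample size
print(sizes, n_required)
```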



[Figure: the AVC matrices $\Omega_1$ of two candidate designs, Design 1 and Design 2.]

Which design is more efficient?

Efficiency measures

• In order to assess the efficiency of different designs, several efficiency measures have been proposed

• The most widely used ones are:

– D-error: $\det(\Omega_1)^{1/K}$ [determinant of AVC matrix]

– A-error: $\operatorname{tr}(\Omega_1) / K$ [trace of AVC matrix]

where $K$ is the number of parameters (the size of the matrix), used as a scaling factor for the efficiency measure

• The lower the D-error or A-error, the more efficient the design
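Both measures are one-liners once the AVC matrix is available. The sketch below uses hypothetical function names and a hypothetical diagonal AVC matrix so that the results are easy to check by hand.

```python
import numpy as np

def d_error(avc):
    """D-error: determinant of the AVC matrix, scaled by the power 1/K."""
    avc = np.asarray(avc, dtype=float)
    return np.linalg.det(avc) ** (1.0 / avc.shape[0])

def a_error(avc):
    """A-error: trace of the AVC matrix divided by K."""
    avc = np.asarray(avc, dtype=float)
    return np.trace(avc) / avc.shape[0]

# Hypothetical AVC matrix with K = 2 parameters.
avc = [[4.0, 0.0],
       [0.0, 9.0]]
print(d_error(avc))  # sqrt(4 * 9), approximately 6
print(a_error(avc))  # (4 + 9) / 2 = 6.5
```

The D-error takes the covariances into account through the determinant, whereas the A-error looks only at the variances on the diagonal.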


Bayesian Efficient Designs


Bayesian efficient designs

• Efficient design
– Example: find a D-efficient design based on fixed priors $\beta_k$

• Bayesian efficient design
– Example: find a Bayesian D-efficient design based on priors $\beta_k \sim N(\mu_k, \sigma_k^2)$

Bayesian efficiency measures

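$$
\text{Bayesian D-error} = \int \det\left( \Omega_1(\beta \mid X) \right)^{1/K} f(\beta \mid \theta) \, d\beta
$$

• The Bayesian D-error is difficult to compute because it requires evaluating a complex (multi-dimensional) integral

• However, it can be approximated by a simple average of D-errors:

$$
\text{Bayesian D-error} \approx \frac{1}{R} \sum_{r=1}^{R} \det\left( \Omega_1(\beta^{(r)} \mid X) \right)^{1/K}
$$

where $\beta^{(r)}$ are random draws from the distribution function $f(\beta \mid \theta)$ (we take $r = 1, \dots, R$ draws)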
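The averaging over draws can be sketched as a small Monte Carlo routine. Everything below is illustrative: the function names, the `(S, J, K)` design layout, the two-situation design, and the choice of independent normal priors parameterised by means and standard deviations are assumptions, and plain pseudo-random draws are used for simplicity.

```python
import numpy as np

def mnl_avc(design, beta):
    """Single-respondent AVC matrix of a generic-parameter MNL model."""
    neg_hessian = np.zeros((len(beta), len(beta)))
    for x in np.asarray(design, dtype=float):   # x has shape (J, K)
        v = x @ beta
        p = np.exp(v - v.max())
        p /= p.sum()                            # MNL choice probabilities
        neg_hessian += (x * p[:, None]).T @ (x - p @ x)
    return np.linalg.inv(neg_hessian)

def bayesian_d_error(design, means, sds, n_draws=200, seed=0):
    """Average D-error over draws beta^(r) ~ N(means, sds^2)."""
    rng = np.random.default_rng(seed)
    k = len(means)
    draws = rng.normal(means, sds, size=(n_draws, k))
    errors = [np.linalg.det(mnl_avc(design, b)) ** (1.0 / k) for b in draws]
    return float(np.mean(errors))

# Hypothetical design: 2 choice situations, 2 alternatives, 2 attributes.
design = [[[1, 0], [0, 1]],
          [[1, 1], [0, 0]]]
fixed = bayesian_d_error(design, means=[0.0, 0.0], sds=[0.0, 0.0], n_draws=5)
bayes = bayesian_d_error(design, means=[0.0, 0.0], sds=[0.5, 0.5])
print(fixed, bayes)
```

With all standard deviations set to zero every draw equals the prior mean, so the Bayesian D-error reduces to the ordinary fixed-prior D-error.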


How to obtain priors?

• Prior parameter estimates can be obtained from:
– the literature
– pilot studies
– focus groups
– expert judgement

• If no prior information is available, what to do?

Either:
1. Create a design using zero priors or use an orthogonal design
2. Give design to 100% of respondents

Or:
1. Create a design using zero priors or use an orthogonal design
2. Give design to 10% of respondents
3. Estimate parameters, use as priors
4. Create efficient design
5. Give design to 90% of respondents

Example


Example

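$$
\begin{aligned}
U_1 &= \beta_1 + \beta_2 X_{11} + \beta_3 X_{21} + \beta_4 X_{31} \\
U_2 &= \beta_5 + \beta_2 X_{12} + \beta_3 X_{22} + \beta_6 X_{32} \\
U_3 &= \beta_2 X_{13} + \beta_7 X_{33}
\end{aligned}
$$

Attribute levels:

$$
X_{11}, X_{12}, X_{13} \in \{6, 8, 10, 12\}, \qquad X_{21}, X_{22} \in \{4, 8\}, \qquad X_{31}, X_{32}, X_{33} \in \{1, 0\}
$$

Priors:

$$
\begin{aligned}
&\beta_1 = 4, \quad \beta_5 = 4.9, \\
&\beta_2 \sim N(-0.6, 0.2), \quad \beta_3 \sim N(-0.8, 0.2), \quad \beta_4 \sim N(0.6, 0.2), \\
&\beta_6 \sim N(0.8, 0.2), \quad \beta_7 \sim N(-1.0, 0.2)
\end{aligned}
$$

Let S = 12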

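Orthogonal Design

Design (attributes A, B, C within each alternative):

| S | Alt A: A | Alt A: B | Alt A: C | Alt B: A | Alt B: B | Alt B: C | Alt C: A | Alt C: C |
|---|---|---|---|---|---|---|---|---|
| 1 | 12 | 8 | 0 | 12 | 8 | 0 | 12 | 0 |
| 2 | 8 | 4 | 1 | 8 | 8 | 1 | 10 | 0 |
| 3 | 8 | 8 | 1 | 10 | 4 | 1 | 8 | 0 |
| 4 | 6 | 4 | 0 | 12 | 4 | 0 | 6 | 0 |
| 5 | 12 | 4 | 0 | 10 | 8 | 1 | 6 | 1 |
| 6 | 10 | 4 | 0 | 6 | 4 | 1 | 12 | 0 |
| 7 | 10 | 8 | 1 | 12 | 4 | 1 | 10 | 1 |
| 8 | 8 | 8 | 0 | 6 | 4 | 0 | 10 | 1 |
| 9 | 10 | 8 | 1 | 6 | 8 | 0 | 6 | 0 |
| 10 | 6 | 4 | 1 | 10 | 8 | 0 | 12 | 1 |
| 11 | 6 | 8 | 0 | 8 | 8 | 1 | 8 | 1 |
| 12 | 12 | 4 | 1 | 8 | 4 | 0 | 8 | 1 |

Attribute correlations (lower triangle):

| | Alt A: A | Alt A: B | Alt A: C | Alt B: A | Alt B: B | Alt B: C | Alt C: A | Alt C: C |
|---|---|---|---|---|---|---|---|---|
| Alt A: A | 1 | . | . | . | . | . | . | . |
| Alt A: B | 0 | 1 | . | . | . | . | . | . |
| Alt A: C | 0 | 0 | 1 | . | . | . | . | . |
| Alt B: A | 0 | 0 | 0 | 1 | . | . | . | . |
| Alt B: B | 0 | 0 | 0 | 0 | 1 | . | . | . |
| Alt B: C | 0 | 0 | 0 | 0 | 0 | 1 | . | . |
| Alt C: A | 0 | 0 | 0 | 0 | 0 | 0 | 1 | . |
| Alt C: C | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |

AVC matrix:

| | Constant1 | A | B | Constant2 | C1 | C2 | C3 |
|---|---|---|---|---|---|---|---|
| Constant1 | 6.580 | -0.238 | -0.796 | -1.504 | 5.832 | 0.092 | 0.805 |
| A | -0.238 | 0.132 | 0.040 | -0.053 | -0.271 | -0.107 | 0.143 |
| B | -0.796 | 0.040 | 0.151 | -0.018 | -1.007 | 0.113 | 0.132 |
| Constant2 | -1.504 | -0.053 | -0.018 | 5.391 | 0.420 | 0.084 | -0.327 |
| C1 | 5.832 | -0.271 | -1.007 | 0.420 | 9.170 | -2.656 | -0.109 |
| C2 | 0.092 | -0.107 | 0.113 | 0.084 | -2.656 | 4.210 | 1.100 |
| C3 | 0.805 | 0.143 | 0.132 | -0.327 | -0.109 | 1.100 | 3.722 |

Db-error = 1.058

N = 316.28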

Efficient Design

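Design (attributes A, B, C within each alternative):

| S | Alt A: A | Alt A: B | Alt A: C | Alt B: A | Alt B: B | Alt B: C | Alt C: A | Alt C: C |
|---|---|---|---|---|---|---|---|---|
| 1 | 10 | 4 | 1 | 6 | 8 | 0 | 8 | 0 |
| 2 | 8 | 4 | 0 | 12 | 4 | 1 | 8 | 0 |
| 3 | 12 | 8 | 0 | 10 | 4 | 0 | 6 | 0 |
| 4 | 12 | 4 | 0 | 10 | 8 | 1 | 8 | 1 |
| 5 | 6 | 8 | 0 | 12 | 4 | 0 | 10 | 1 |
| 6 | 10 | 4 | 1 | 6 | 8 | 0 | 6 | 1 |
| 7 | 6 | 8 | 1 | 12 | 4 | 1 | 12 | 1 |
| 8 | 8 | 8 | 0 | 8 | 8 | 1 | 12 | 0 |
| 9 | 6 | 8 | 1 | 8 | 4 | 0 | 10 | 0 |
| 10 | 12 | 4 | 0 | 6 | 8 | 0 | 10 | 1 |
| 11 | 10 | 4 | 1 | 8 | 8 | 1 | 12 | 0 |
| 12 | 8 | 8 | 1 | 10 | 4 | 1 | 6 | 1 |

Attribute correlations (lower triangle):

| | Alt A: A | Alt A: B | Alt A: C | Alt B: A | Alt B: B | Alt B: C | Alt C: A | Alt C: C |
|---|---|---|---|---|---|---|---|---|
| Alt A: A | 1 | . | . | . | . | . | . | . |
| Alt A: B | -0.60 | 1 | . | . | . | . | . | . |
| Alt A: C | -0.30 | 0 | 1 | . | . | . | . | . |
| Alt B: A | -0.47 | 0.45 | -0.30 | 1 | . | . | . | . |
| Alt B: B | 0.60 | -0.67 | 0 | -0.75 | 1 | . | . | . |
| Alt B: C | -0.15 | 0 | 0 | 0.45 | 0 | 1 | . | . |
| Alt C: A | -0.40 | 0.15 | 0 | 0.07 | 0.15 | 0.30 | 1 | . |
| Alt C: C | 0 | 0 | 0 | 0.15 | 0 | 0 | -0.15 | 1 |

AVC matrix:

| | Constant1 | A | B | Constant2 | C1 | C2 | C3 |
|---|---|---|---|---|---|---|---|
| Constant1 | 6.860 | -0.669 | -0.981 | -0.773 | 6.389 | 0.979 | 0.081 |
| A | -0.669 | 0.119 | 0.136 | -0.069 | -0.745 | -0.163 | 0.197 |
| B | -0.981 | 0.136 | 0.199 | -0.063 | -1.089 | -0.172 | 0.244 |
| Constant2 | -0.773 | -0.069 | -0.063 | 2.093 | 0.260 | 0.250 | -0.185 |
| C1 | 6.389 | -0.745 | -1.089 | 0.260 | 7.530 | 0.232 | -0.242 |
| C2 | 0.979 | -0.163 | -0.172 | 0.250 | 0.232 | 1.985 | -0.019 |
| C3 | 0.081 | 0.197 | 0.244 | -0.185 | -0.242 | -0.019 | 2.699 |

Db-error = 0.6617

N = 158.22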

When is an orthogonal design appropriate?


Example

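$$
\begin{aligned}
U_1 &= \beta_1 + \beta_2 X_{11} + \beta_3 X_{21} + \beta_4 X_{31} \\
U_2 &= \beta_5 + \beta_2 X_{12} + \beta_3 X_{22} + \beta_6 X_{32} \\
U_3 &= \beta_2 X_{13} + \beta_7 X_{33}
\end{aligned}
$$

Attribute levels:

$$
X_{11}, X_{12}, X_{13} \in \{6, 8, 10, 12\}, \qquad X_{21}, X_{22} \in \{4, 8\}, \qquad X_{31}, X_{32}, X_{33} \in \{1, 0\}
$$

Priors:

$$
\beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = \beta_7 = 0
$$

Let S = 12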

Orthogonal Design


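Design (attributes A, B, C within each alternative):

| S | Alt A: A | Alt A: B | Alt A: C | Alt B: A | Alt B: B | Alt B: C | Alt C: A | Alt C: C |
|---|---|---|---|---|---|---|---|---|
| 1 | 12 | 8 | 0 | 12 | 8 | 0 | 12 | 0 |
| 2 | 8 | 4 | 1 | 8 | 8 | 1 | 10 | 0 |
| 3 | 8 | 8 | 1 | 10 | 4 | 1 | 8 | 0 |
| 4 | 6 | 4 | 0 | 12 | 4 | 0 | 6 | 0 |
| 5 | 12 | 4 | 0 | 10 | 8 | 1 | 6 | 1 |
| 6 | 10 | 4 | 0 | 6 | 4 | 1 | 12 | 0 |
| 7 | 10 | 8 | 1 | 12 | 4 | 1 | 10 | 1 |
| 8 | 8 | 8 | 0 | 6 | 4 | 0 | 10 | 1 |
| 9 | 10 | 8 | 1 | 6 | 8 | 0 | 6 | 0 |
| 10 | 6 | 4 | 1 | 10 | 8 | 0 | 12 | 1 |
| 11 | 6 | 8 | 0 | 8 | 8 | 1 | 8 | 1 |
| 12 | 12 | 4 | 1 | 8 | 4 | 0 | 8 | 1 |

Attribute correlations (lower triangle):

| | Alt A: A | Alt A: B | Alt A: C | Alt B: A | Alt B: B | Alt B: C | Alt C: A | Alt C: C |
|---|---|---|---|---|---|---|---|---|
| Alt A: A | 1 | . | . | . | . | . | . | . |
| Alt A: B | 0 | 1 | . | . | . | . | . | . |
| Alt A: C | 0 | 0 | 1 | . | . | . | . | . |
| Alt B: A | 0 | 0 | 0 | 1 | . | . | . | . |
| Alt B: B | 0 | 0 | 0 | 0 | 1 | . | . | . |
| Alt B: C | 0 | 0 | 0 | 0 | 0 | 1 | . | . |
| Alt C: A | 0 | 0 | 0 | 0 | 0 | 0 | 1 | . |
| Alt C: C | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |

AVC matrix:

| | Constant1 | A | B | Constant2 | C1 | C2 | C3 |
|---|---|---|---|---|---|---|---|
| Constant1 | 2.938 | 0.000 | -0.281 | 2.313 | 2.313 | 0.000 | -0.750 |
| A | 0.000 | 0.025 | -0.281 | -0.281 | 0.000 | 0.000 | 0.000 |
| B | -0.281 | 0.000 | 0.047 | 0.000 | 0.000 | 0.000 | 0.000 |
| Constant2 | 2.313 | -0.281 | 0.000 | 1.500 | 0.000 | 0.000 | 0.750 |
| C1 | 2.313 | 0.000 | 0.000 | 0.000 | 2.938 | 0.000 | 0.000 |
| C2 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.500 | 0.000 |
| C3 | -0.750 | 0.000 | 0.000 | 0.750 | 0.000 | 0.000 | 1.500 |

Db-error = 0.3572

N = ?

$$
n_k = \frac{t_k^2 \, se_k^2}{\beta_k^2}, \qquad \beta_1 = \beta_2 = \dots = \beta_7 = 0
$$

With all priors equal to zero the denominator is zero, so the required sample size cannot be computed.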

So how can I do this?


Ngene Software
