1
Advances in the Construction of Efficient Stated Choice
Experimental Designs
John Rose (1), Michiel Bliemer (1,2)
(1) The University of Sydney, Australia
(2) Delft University of Technology, The Netherlands
2
Contents
• Efficient Designs Defined
• State of Practice in Experimental Design
• Efficient Designs for Stated Choice Experiments
• Bayesian Efficient Designs
• Example
• When is an Orthogonal Design Appropriate?
• How can I do this?
3
Efficient Designs Defined
4
What are efficient designs?
• Based on a design, a survey is composed and the outcomes of the survey are used to estimate the model parameters
• The more reliable the parameter estimates are (i.e., the smaller their standard errors), the more efficient the design is
[Figure: experimental design → respondents → data → estimation → results (parameter estimates $\hat{\beta}$ and standard errors $se(\hat{\beta})$)]
5
What are efficient designs?
• The asymptotic variance-covariance (AVC) matrix is an approximation of the true variance-covariance matrix
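• “Asymptotic” means
– assuming a very large sample; or
– assuming a large number of repetitions using a small sample
• The square roots of the diagonal elements of the variance-covariance matrix are the standard errors (the off-diagonal elements are the covariances):
$$\Omega = \begin{pmatrix} se(\hat{\beta}_1)^2 & & \\ & \ddots & \\ & & se(\hat{\beta}_K)^2 \end{pmatrix}$$
where $\Omega$ is the variance-covariance matrix and $se(\hat{\beta}_k)$ is the standard error of parameter $k$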
6
Asymptotic variance-covariance matrix
• Efficient Design:
– Generate a design that, when applied in practice, will likely yield standard errors that are as small as possible
[Figure: experimental design → respondents → data → estimation → AVC matrix]
7
State of Practice in Experimental Design
8
State of Practice
• In linear regression models:
$$\text{variance-covariance matrix} = \sigma^2 (X'X)^{-1}$$
where $X$ is the data or design
• If $X$ is orthogonal, then
– The diagonal elements of $X'X$ will be as large as possible and the off-diagonals equal to zero
– If we take the inverse of $X'X$, the diagonal elements will be minimised whilst the off-diagonals remain zero
$$\sigma^2 (X'X)^{-1} = \begin{pmatrix} se(\hat{\beta}_1)^2 & & 0 \\ & \ddots & \\ 0 & & se(\hat{\beta}_K)^2 \end{pmatrix}$$
9
State of Practice
• In linear regression models:
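$$\text{variance-covariance matrix} = \sigma^2 (X'X)^{-1}$$
where $X$ is the data or design
• If the diagonal elements are as small as possible…
$$\sigma^2 (X'X)^{-1} = \begin{pmatrix} se(\hat{\beta}_1)^2 & & 0 \\ & \ddots & \\ 0 & & se(\hat{\beta}_K)^2 \end{pmatrix}$$
… the t-ratios $t_k = \hat{\beta}_k / se(\hat{\beta}_k)$ will be maximised
• And the zero off-diagonals imply no multicollinearity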
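The linear-regression argument can be checked numerically. The following is a minimal sketch, assuming two small hypothetical four-run designs with attributes coded -1/+1 and $\sigma^2 = 1$; the function name `vc_matrix` and both design matrices are illustrative and do not come from the slides.

```python
import numpy as np

# Hypothetical 4-run designs with two attributes coded -1/+1 (no intercept).
X_orth = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)  # orthogonal columns
X_corr = np.array([[-1, -1], [-1, -1], [1, 1], [1, -1]], dtype=float)  # correlated columns

def vc_matrix(X, sigma2=1.0):
    """Variance-covariance matrix sigma^2 (X'X)^-1 of the linear regression estimator."""
    return sigma2 * np.linalg.inv(X.T @ X)

for name, X in [("orthogonal", X_orth), ("correlated", X_corr)]:
    vc = vc_matrix(X)
    # Standard errors are the square roots of the diagonal elements.
    print(name, "standard errors:", np.sqrt(np.diag(vc)), "covariance:", vc[0, 1])
```

With the same attribute levels, the orthogonal design gives smaller standard errors and a zero covariance, which is the property the linear-model argument relies on.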
10
State of Practice
Question: For discrete choice data, what type of econometric model do we typically employ?
Answer: Logit model
Question: Is the variance-covariance matrix of the logit model represented by $\sigma^2 (X'X)^{-1}$?
11
Efficient Designs for Stated Choice Experiments
12
Efficient Designs and Logit Models
• The variance-covariance matrix for logit models is related to the log-likelihood of the model:
$$L_N(\beta \mid X, y) = \sum_{n=1}^{N} \sum_{s=1}^{S} \sum_{j=1}^{J} y_{jsn} \log P_{jsn}(\beta \mid X)$$
Note that:
• In estimation, given the (design) data $X$ and the observations $y$, one aims to determine estimates $\hat{\beta}$ such that $L_N(\beta \mid X, y)$ is maximised [maximum likelihood estimation]
• When generating an experimental design, these parameter estimates are unknown
• The values of $P_{jsn}(\beta \mid X)$ depend on the model used (MNL, NL, ML)
13
Efficient Designs and Logit Models
• The second derivatives of the log-likelihood give the Fisher information matrix:
$$I_N(\beta \mid X, y) = \frac{\partial^2 L_N(\beta \mid X, y)}{\partial \beta \, \partial \beta'}$$
[Hessian matrix of second derivatives]
• The negative inverse of the Fisher information matrix yields the model variance-covariance matrix:
$$\Omega_N(\beta \mid X, y) = -\left[ I_N(\beta \mid X, y) \right]^{-1}$$
[negative inverse matrix]
14
Efficient Designs and Logit Models
• Example: MNL model with generic parameters (McFadden, 1974)
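$$L_N(\beta \mid X, y) = \sum_{n=1}^{N} \sum_{s=1}^{S} \sum_{j=1}^{J} y_{jsn} \log P_{jsn}(\beta \mid X)$$
with
$$P_{jsn}(\beta \mid X) = \frac{\exp\left(V_{jsn}(\beta \mid X)\right)}{\sum_{i=1}^{J} \exp\left(V_{isn}(\beta \mid X)\right)}, \qquad V_{jsn}(\beta \mid X) = \sum_{k=1}^{K} \beta_k X_{jksn}$$
First derivative:
$$\frac{\partial L_N(\beta \mid X, y)}{\partial \beta_k} = \sum_{n=1}^{N} \sum_{s=1}^{S} \sum_{j=1}^{J} \left( y_{jsn} - P_{jsn}(\beta \mid X) \right) X_{jksn}$$
Second derivative:
$$\frac{\partial^2 L_N(\beta \mid X)}{\partial \beta_{k_1} \partial \beta_{k_2}} = -\sum_{n=1}^{N} \sum_{s=1}^{S} \sum_{j=1}^{J} X_{jk_1sn} P_{jsn}(\beta \mid X) \left( X_{jk_2sn} - \sum_{i=1}^{J} X_{ik_2sn} P_{isn}(\beta \mid X) \right)$$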
Note: y drops out!
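Because the second derivatives depend only on the design $X$ and the priors, the AVC matrix can be evaluated before any data are collected. The following is a minimal sketch of that calculation for an MNL model with generic parameters, assuming every respondent sees the same choice situations. The function name `mnl_avc`, the design `X_demo` and the priors `beta_demo` are hypothetical and not taken from the slides.

```python
import numpy as np

def mnl_avc(X, beta, N=1):
    """AVC matrix of an MNL model with generic parameters.

    X    : array of shape (S, J, K), attribute levels per situation and alternative
    beta : array of shape (K,), prior parameter values
    N    : number of respondents, all facing the same S choice situations
    """
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    V = X @ beta                                      # utilities, shape (S, J)
    expV = np.exp(V - V.max(axis=1, keepdims=True))   # shifted for numerical stability
    P = expV / expV.sum(axis=1, keepdims=True)        # choice probabilities
    K = X.shape[2]
    hessian = np.zeros((K, K))
    for s in range(X.shape[0]):
        xbar = P[s] @ X[s]                            # sum_i X_iks P_is, shape (K,)
        # element (k1, k2): -sum_j X_jk1s P_js (X_jk2s - xbar_k2)
        hessian -= X[s].T @ (P[s][:, None] * (X[s] - xbar))
    hessian *= N                                      # identical situations for all N
    return -np.linalg.inv(hessian)                    # negative inverse of the Hessian

# Hypothetical design: S=4 choice situations, J=2 alternatives, K=2 attributes.
X_demo = np.array([
    [[1, 2], [2, 1]],
    [[2, 2], [1, 1]],
    [[1, 1], [2, 2]],
    [[2, 1], [1, 2]],
], dtype=float)
beta_demo = np.array([-0.5, 0.3])                     # hypothetical priors

avc_1 = mnl_avc(X_demo, beta_demo)                    # one respondent
avc_10 = mnl_avc(X_demo, beta_demo, N=10)             # ten respondents
print("se_1 :", np.sqrt(np.diag(avc_1)))
print("se_10:", np.sqrt(np.diag(avc_10)))
```

No observed choices `y` enter the function, which mirrors the note that `y` drops out of the second derivative.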
15
Efficient Designs and Logit Models
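• Example: MNL model with generic parameters (cont’d)
$$\left[ I_N(\beta \mid X) \right]_{k_1 k_2} = -\sum_{n=1}^{N} \sum_{s=1}^{S} \sum_{j=1}^{J} X_{jk_1sn} P_{jsn}(\beta \mid X) \left( X_{jk_2sn} - \sum_{i=1}^{J} X_{ik_2sn} P_{isn}(\beta \mid X) \right)$$
Assuming that all respondents observe the same choice situations,
$$\left[ I_N(\beta \mid X) \right]_{k_1 k_2} = -N \sum_{s=1}^{S} \sum_{j=1}^{J} X_{jk_1s} P_{js}(\beta \mid X) \left( X_{jk_2s} - \sum_{i=1}^{J} X_{ik_2s} P_{is}(\beta \mid X) \right) = N \cdot \left[ I_1(\beta \mid X) \right]_{k_1 k_2}$$
Therefore, the AVC matrix becomes:
$$\Omega_N(\beta \mid X) = -\left[ I_N(\beta \mid X) \right]^{-1} = -\frac{1}{N} \left[ I_1(\beta \mid X) \right]^{-1}$$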
16
Efficient Designs and Logit Models
• Example: MNL model with generic parameters (cont’d)
$$\Omega_N(\beta \mid X) = \frac{1}{N} \, \Omega_1(\beta \mid X)$$
$$se_N(\beta \mid X) = \frac{1}{\sqrt{N}} \, se_1(\beta \mid X)$$
[Figure: standard error $se_N(\beta \mid X)$ plotted against sample size $N$ (0 to 50), decreasing in proportion to $1/\sqrt{N}$]
“A design that yields 50% lower standard errors requires four times fewer respondents”
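The quoted rule of thumb follows directly from the $1/\sqrt{N}$ scaling of the standard errors. A short worked derivation, where the superscripts $I$ and $II$ label two hypothetical designs and $N^{I}$, $N^{II}$ are the sample sizes at which they reach the same standard error:

```latex
\begin{align*}
se_1(\beta \mid X^{II}) &= \tfrac{1}{2} \, se_1(\beta \mid X^{I})
  && \text{(design II halves the standard errors)} \\
\frac{se_1(\beta \mid X^{II})}{\sqrt{N^{II}}} &= \frac{se_1(\beta \mid X^{I})}{\sqrt{N^{I}}}
  && \text{(require equal precision)} \\
\sqrt{N^{II}} &= \tfrac{1}{2} \sqrt{N^{I}}
  \quad\Longrightarrow\quad N^{II} = \tfrac{1}{4} N^{I}
\end{align*}
```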
17
Efficient Designs and Logit Models
[Figure, left panel "Investing in more respondents": standard error against sample size $N$ (0 to 50) for design $X^{I}$, from $se_1$ down to $se_N$]
[Figure, right panel "Investing in better design": standard error against sample size $N$ (0 to 50) for designs $X^{I}$ and $X^{II}$]
18
Efficient Designs and Logit Models
• Numerical example
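MNL model:
$$U_A = \beta_1 + \beta_2 A_1 + \beta_3 A_2$$
$$U_B = \beta_2 B_1 + \beta_4 B_2$$
Priors: $\beta_1 = 0.2$, $\beta_2 = 0.3$, $\beta_3 = 0.4$, $\beta_4 = 0.5$
Design: [table]
[Matrices: $I_1(\beta \mid X)$, $\Omega_1(\beta \mid X)$ and $\Omega_{10}(\beta \mid X)$, each with rows and columns $\beta_1, \dots, \beta_4$]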
19
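Efficient Designs and Logit Models
• Sample size and designs
$$se_N(\beta \mid X) = \frac{1}{\sqrt{N}} \, se_1(\beta \mid X), \qquad \Omega_N(\beta \mid X) = \frac{1}{N} \, \Omega_1(\beta \mid X)$$
$$t_k = \frac{\beta_k}{se_1(\beta_k) / \sqrt{n_k}} \quad\Longrightarrow\quad n_k = \frac{t_k^2 \, se_1(\beta_k)^2}{\beta_k^2}$$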
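The sample-size relation can be applied parameter by parameter. The following is a minimal sketch, assuming a target t-ratio of 1.96 and hypothetical values for the prior and the single-respondent standard error; the function name `required_sample_size` is illustrative.

```python
import math

def required_sample_size(beta_k, se1_k, t=1.96):
    """Sample size n_k = t^2 * se_1(beta_k)^2 / beta_k^2 needed for a t-ratio of t."""
    if beta_k == 0:
        raise ValueError("n_k is undefined for a zero prior")
    return (t * se1_k / beta_k) ** 2

# Hypothetical prior and single-respondent standard error for one parameter.
n_k = required_sample_size(beta_k=0.5, se1_k=2.0)
print("n_k =", n_k, "-> at least", math.ceil(n_k), "respondents")
```

The design as a whole needs the largest $n_k$ across its parameters, so the parameter that is hardest to estimate determines the required sample size.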
20
Efficient Designs and Logit Models
Design 1: [AVC matrix $\Omega_1$]
Design 2: [AVC matrix $\Omega_1$]
Which design is more efficient?
21
Efficiency measures
• In order to assess the efficiency of different designs, several efficiency measures have been proposed
• The most widely used ones are:
– D-error $= \det(\Omega_1)^{1/K}$ [determinant of AVC matrix]
– A-error $= \mathrm{tr}(\Omega_1) / K$ [trace of AVC matrix]
• The lower the D-error or A-error, the more efficient the design
$K$ = number of parameters (size of the matrix), used as a scaling factor for the efficiency measure
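Both measures are one-line computations once an AVC matrix is available. The following is a minimal sketch, assuming two hypothetical 2×2 AVC matrices; `omega_a`, `omega_b` and the function names are illustrative and not values from the slides.

```python
import numpy as np

def d_error(avc):
    """D-error: determinant of the AVC matrix, scaled by the power 1/K."""
    K = avc.shape[0]
    return np.linalg.det(avc) ** (1.0 / K)

def a_error(avc):
    """A-error: trace of the AVC matrix divided by K."""
    return np.trace(avc) / avc.shape[0]

# Hypothetical AVC matrices of two competing designs.
omega_a = np.array([[0.04, 0.00], [0.00, 0.09]])
omega_b = np.array([[0.05, 0.01], [0.01, 0.05]])

for name, omega in [("design a", omega_a), ("design b", omega_b)]:
    print(name, "D-error:", d_error(omega), "A-error:", a_error(omega))
```

The D-error takes the covariances into account through the determinant, whereas the A-error only uses the variances on the diagonal.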
22
Bayesian Efficient Designs
23
Bayesian efficient designs
• Efficient design
– Example: Find D-efficient design based on fixed priors $\beta_k$
• Bayesian efficient design
– Example: Find Bayesian D-efficient design based on random priors $\beta_k \sim N(\mu_k, \sigma_k^2)$
24
Bayesian efficiency measures
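$$\text{Bayesian D-error} = \int_{\beta} \det\left( \Omega_1(\beta \mid X) \right)^{1/K} f(\beta \mid \theta) \, d\beta$$
• The Bayesian D-error is difficult to compute because it requires evaluating a complex (multi-dimensional) integral
• However, it can be approximated by a simple average of D-errors:
$$\text{Bayesian D-error} \approx \frac{1}{R} \sum_{r=1}^{R} \det\left( \Omega_1(\beta^{(r)} \mid X) \right)^{1/K}$$
where $\beta^{(r)}$ are random draws from the distribution function $f(\beta \mid \theta)$ (we take $r = 1, \dots, R$ draws)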
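The average over draws can be written as a short Monte Carlo routine. The following is a minimal sketch, assuming independent normal priors, an MNL model with generic parameters and plain pseudo-random draws. The design `X_demo`, the prior means and standard deviations, and the function names are hypothetical.

```python
import numpy as np

def mnl_d_error(X, beta):
    """D-error of an MNL design X (shape S x J x K) at parameter values beta."""
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    V = X @ beta
    P = np.exp(V - V.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)
    K = X.shape[2]
    info = np.zeros((K, K))                    # negative Hessian for one respondent
    for s in range(X.shape[0]):
        xbar = P[s] @ X[s]
        info += X[s].T @ (P[s][:, None] * (X[s] - xbar))
    avc = np.linalg.inv(info)                  # same as the negative inverse Hessian
    return np.linalg.det(avc) ** (1.0 / K)

def bayesian_d_error(X, mu, sigma, R=500, seed=0):
    """Average D-error over R draws beta ~ N(mu, sigma^2), drawn independently."""
    rng = np.random.default_rng(seed)
    draws = rng.normal(mu, sigma, size=(R, len(mu)))
    return float(np.mean([mnl_d_error(X, b) for b in draws]))

# Hypothetical design: S=4 choice situations, J=2 alternatives, K=2 attributes.
X_demo = np.array([
    [[1, 2], [2, 1]],
    [[2, 2], [1, 1]],
    [[1, 1], [2, 2]],
    [[2, 1], [1, 2]],
], dtype=float)
mu_demo = [-0.5, 0.3]       # hypothetical prior means
sigma_demo = [0.2, 0.2]     # hypothetical prior standard deviations

print("D-error at the prior means:", mnl_d_error(X_demo, mu_demo))
print("Bayesian D-error          :", bayesian_d_error(X_demo, mu_demo, sigma_demo))
```

With zero prior standard deviations the routine reduces to the ordinary D-error at the prior means, which is a convenient sanity check.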
25
How to obtain priors?
• Prior parameter estimates can be obtained from:
– the literature
– pilot studies
– focus groups
– expert judgement
• If no prior information is available, what to do?
Either:
1. Create a design using zero priors or use an orthogonal design
2. Give design to 100% of respondents
Or:
1. Create a design using zero priors or use an orthogonal design
2. Give design to 10% of respondents
3. Estimate parameters, use as priors
4. Create efficient design
5. Give design to 90% of respondents
26
Example
27
Example
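$$U_1 = \beta_1 + \beta_2 X_{11} + \beta_3 X_{21} + \beta_4 X_{31}$$
$$U_2 = \beta_5 + \beta_2 X_{12} + \beta_3 X_{22} + \beta_6 X_{32}$$
$$U_3 = \beta_2 X_{13} + \beta_7 X_{33}$$
$X_{11}, X_{12}, X_{13} \in \{6, 8, 10, 12\}$
$X_{21}, X_{22} \in \{4, 8\}$
$X_{31}, X_{32}, X_{33} \in \{1, 0\}$
Priors:
$\beta_1 = 4$, $\beta_5 = 4.9$,
$\beta_2 \sim N(-0.6, 0.2)$,
$\beta_3 \sim N(-0.8, 0.2)$,
$\beta_4 \sim N(0.6, 0.2)$,
$\beta_6 \sim N(0.8, 0.2)$,
$\beta_7 \sim N(-1.0, 0.2)$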
Let S = 12
28
Orthogonal Design
Alternative A Alternative B Alternative C
S A B C A B C A C
1 12 8 0 12 8 0 12 0
2 8 4 1 8 8 1 10 0
3 8 8 1 10 4 1 8 0
4 6 4 0 12 4 0 6 0
5 12 4 0 10 8 1 6 1
6 10 4 0 6 4 1 12 0
7 10 8 1 12 4 1 10 1
8 8 8 0 6 4 0 10 1
9 10 8 1 6 8 0 6 0
10 6 4 1 10 8 0 12 1
11 6 8 0 8 8 1 8 1
12 12 4 1 8 4 0 8 1
Alternative A Alternative B Alternative C
A B C A B C A C
Alternative A
A 1 . . . . . . .
B 0 1 . . . . . .
C 0 0 1 . . . . .
Alternative B
A 0 0 0 1 . . . .
B 0 0 0 0 1 . . .
C 0 0 0 0 0 1 . .
Alternative C
A 0 0 0 0 0 0 1 .
C 0 0 0 0 0 0 0 1
Constant1 A B Constant2 C1 C2 C3
Constant1 6.580 -0.238 -0.796 -1.504 5.832 0.092 0.805
A -0.238 0.132 0.040 -0.053 -0.271 -0.107 0.143
B -0.796 0.040 0.151 -0.018 -1.007 0.113 0.132
Constant2 -1.504 -0.053 -0.018 5.391 0.420 0.084 -0.327
C1 5.832 -0.271 -1.007 0.420 9.170 -2.656 -0.109
C2 0.092 -0.107 0.113 0.084 -2.656 4.210 1.100
C3 0.805 0.143 0.132 -0.327 -0.109 1.100 3.722
Db-error = 1.058
N = 316.28
29
Efficient Design
Db-error = 0.6617
N = 158.22
Alternative A Alternative B Alternative C
S A B C A B C A C
1 10 4 1 6 8 0 8 0
2 8 4 0 12 4 1 8 0
3 12 8 0 10 4 0 6 0
4 12 4 0 10 8 1 8 1
5 6 8 0 12 4 0 10 1
6 10 4 1 6 8 0 6 1
7 6 8 1 12 4 1 12 1
8 8 8 0 8 8 1 12 0
9 6 8 1 8 4 0 10 0
10 12 4 0 6 8 0 10 1
11 10 4 1 8 8 1 12 0
12 8 8 1 10 4 1 6 1
Alternative A Alternative B Alternative C
A B C A B C A C
Alternative A
A 1 . . . . . . .
B -0.60 1 . . . . . .
C -0.30 0 1 . . . . .
Alternative B
A -0.47 0.45 -0.30 1 . . . .
B 0.60 -0.67 0 -0.75 1 . . .
C -0.15 0 0 0.45 0 1 . .
Alternative C
A -0.40 0.15 0 0.07 0.15 0.30 1 .
C 0 0 0 0.15 0 0 -0.15 1
Constant1 A B Constant2 C1 C2 C3
Constant1 6.860 -0.669 -0.981 -0.773 6.389 0.979 0.081
A -0.669 0.119 0.136 -0.069 -0.745 -0.163 0.197
B -0.981 0.136 0.199 -0.063 -1.089 -0.172 0.244
Constant2 -0.773 -0.069 -0.063 2.093 0.260 0.250 -0.185
C1 6.389 -0.745 -1.089 0.260 7.530 0.232 -0.242
C2 0.979 -0.163 -0.172 0.250 0.232 1.985 -0.019
C3 0.081 0.197 0.244 -0.185 -0.242 -0.019 2.699
30
When is an orthogonal design appropriate?
31
Example
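$$U_1 = \beta_1 + \beta_2 X_{11} + \beta_3 X_{21} + \beta_4 X_{31}$$
$$U_2 = \beta_5 + \beta_2 X_{12} + \beta_3 X_{22} + \beta_6 X_{32}$$
$$U_3 = \beta_2 X_{13} + \beta_7 X_{33}$$
$X_{11}, X_{12}, X_{13} \in \{6, 8, 10, 12\}$
$X_{21}, X_{22} \in \{4, 8\}$
$X_{31}, X_{32}, X_{33} \in \{1, 0\}$
Priors: $\beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = \beta_7 = 0$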
Let S = 12
32
Orthogonal Design
Alternative A Alternative B Alternative C
S A B C A B C A C
1 12 8 0 12 8 0 12 0
2 8 4 1 8 8 1 10 0
3 8 8 1 10 4 1 8 0
4 6 4 0 12 4 0 6 0
5 12 4 0 10 8 1 6 1
6 10 4 0 6 4 1 12 0
7 10 8 1 12 4 1 10 1
8 8 8 0 6 4 0 10 1
9 10 8 1 6 8 0 6 0
10 6 4 1 10 8 0 12 1
11 6 8 0 8 8 1 8 1
12 12 4 1 8 4 0 8 1
Alternative A Alternative B Alternative C
A B C A B C A C
Alternative A
A 1 . . . . . . .
B 0 1 . . . . . .
C 0 0 1 . . . . .
Alternative B
A 0 0 0 1 . . . .
B 0 0 0 0 1 . . .
C 0 0 0 0 0 1 . .
Alternative C
A 0 0 0 0 0 0 1 .
C 0 0 0 0 0 0 0 1
Constant1 A B Constant2 C1 C2 C3
Constant1 2.938 0.000 -0.281 2.313 2.313 0.000 -0.750
A 0.000 0.025 -0.281 -0.281 0.000 0.000 0.000
B -0.281 0.000 0.047 0.000 0.000 0.000 0.000
Constant2 2.313 -0.281 0.000 1.500 0.000 0.000 0.750
C1 2.313 0.000 0.000 0.000 2.938 0.000 0.000
C2 0.000 0.000 0.000 0.000 0.000 1.500 0.000
C3 -0.750 0.000 0.000 0.750 0.000 0.000 1.500
Db-error = 0.3572
N = ?
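$$n_k = \frac{t_k^2 \, se_1(\beta_k)^2}{\beta_k^2}$$
with $\beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = \beta_7 = 0$, so the denominator is zero and $n_k$ is undefined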
33
So how can I do this?
34
35
Ngene Software
36
Ngene Software