University of Castilla-La Mancha, Spain · 2021. 1. 21. · University of Castilla-La Mancha, Spain...
Experimental Designs for different approaches of Simultaneous Equations
Víctor Casero-Alonso and Jesús López Fidalgo
Department of Mathematics
Institute of Mathematics Applied to Science and Engineering
University of Castilla-La Mancha, Spain
0. Abstract
- Models with simultaneous equations (widely used in economics, sociology, medicine, engineering...) are considered.
- A model with two equations is considered.
- One explanatory (exogenous) variable of the first equation is the response (endogenous) variable of the second equation, which contains the controllable variable being designed.
- Plugging the second equation into the first one, the designable variable appears in both equations.
- This gives two different models, with different maximum likelihood estimators and therefore different information matrices and optimal designs.
- Optimal designs for both approaches are computed and compared, in both a discrete and a continuous design space.
- The cases of a completely known correlation and of an unknown correlation to be estimated are considered and compared.
- A sensitivity analysis is performed to assess the risk of choosing wrong nominal values of the parameters.
1. Motivating examples
- Conlisk (1979): An oil company wants to study a controlled variation in the prices of gas and repairs, Pg and Pr. (Other controlled variables: whether trading stamps are offered with gas and repair sales, respectively.) Two endogenous variables: quantity of gas and quantity of repairs sold. Two equations with all variables, exogenous and endogenous, included.
- Aigner and Balestra (1988): Designing electricity pricing experiments that take place via an intervention in one period or a succession of periods.
- Hahn, Hirano and Karlan (2011): Designing using the propensity score, i.e. the conditional probability of treatment given some observed characteristics of the individual (covariates).
- Surgery of lung carcinoma (2004): Exercise test to predict morbidity after lung resection. Riding a static bicycle during a period of time (controlled variable, exogenous). An uncontrollable variable: % of maximum volume of expired air in the first second. Two endogenous variables: oxygen desaturation during the test (y1, e.g. linear regression) and a binary response, morbidity (y2, logistic model).
2. Different approaches of Simultaneous Equations
Poskitt and Skeels (2008): Two formulations that are probabilistically equivalent but conceptually quite different. They lead to different MLEs, and therefore to different information matrices and optimal designs.
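SES - Structural equation specification
\[
\begin{cases}
y = Y\beta + u, \\
Y = \Pi_{2a} + Z\,\Pi_{2b} + V,
\end{cases}
\]
- y, response variable,
- Y, explanatory/response variable,
- Z, controllable variable, which is being designed,
- β, Π2a and Π2b, unknown parameters,
- u and V, error terms,
\[
\begin{pmatrix} u \\ V \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right)
\]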
RFS - Reduced form specification
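Plugging the second equation into the first one:
\[
\begin{cases}
y = \Pi_{2a}\beta + Z\,\Pi_{2b}\beta + \nu, \\
Y = \Pi_{2a} + Z\,\Pi_{2b} + V.
\end{cases}
\]
The designable variable Z is now in both equations.
\[
\begin{pmatrix} \nu \\ V \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right)
\]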
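The two specifications can be simulated directly from their definitions. The following is a minimal sketch, not the authors' code: the function names `simulate` and `sample_corr`, the parameter values, and the deterministic allocation of a fraction p of the runs to Z = 1 are illustrative assumptions. For RFS the error pair (ν, V) is drawn exactly as stated in the specification, with unit variances and correlation ρ.

```python
import math
import random


def simulate(n, p, beta, pi2a, pi2b, rho, spec="SES", seed=0):
    """Draw n observations (z, y, Y) from the SES or RFS specification.

    A fraction p of the runs gets Z = 1 and the rest Z = 0 (assumed
    allocation rule). The error pair has unit variances and correlation rho.
    """
    rng = random.Random(seed)
    n1 = round(n * p)  # number of runs at Z = 1
    data = []
    for i in range(n):
        z = 1 if i < n1 else 0
        e = rng.gauss(0.0, 1.0)  # u in SES, nu in RFS
        v = rho * e + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        big_y = pi2a + z * pi2b + v  # second equation, same in both forms
        if spec == "SES":
            y = big_y * beta + e  # y = Y beta + u
        elif spec == "RFS":
            y = pi2a * beta + z * pi2b * beta + e  # Z enters the first equation
        else:
            raise ValueError("spec must be 'SES' or 'RFS'")
        data.append((z, y, big_y))
    return data


def sample_corr(xs, ys):
    """Plain sample correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Recover the SES errors from simulated data and check their correlation.
ses = simulate(20000, 0.5, beta=1.0, pi2a=4.0, pi2b=1.0, rho=0.8, spec="SES")
u_hat = [y - big_y * 1.0 for z, y, big_y in ses]
v_hat = [big_y - 4.0 - z * 1.0 for z, y, big_y in ses]
print(round(sample_corr(u_hat, v_hat), 2))
```

The printed sample correlation should be close to the nominal ρ = 0.8; the same residual check works for `spec="RFS"` with ν = y − Π2aβ − ZΠ2bβ.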
3. Optimal designs
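- Design space: Z = {0,1}.
- Information matrices, with the parameters ordered as (β, Π2a, Π2b, ρ). For ρ known the information matrix is the upper-left 3 × 3 block; for ρ unknown (to be estimated) it is the whole matrix:
\[
M_{SES}[\xi(z)] = \frac{1}{1-\rho^2}
\begin{pmatrix}
1 + \Pi_{2a}^2 + 2p\,\Pi_{2a}\Pi_{2b} + p\,\Pi_{2b}^2 & -(\Pi_{2a} + p\,\Pi_{2b})\rho & -p(\Pi_{2a} + \Pi_{2b})\rho & 1 \\
-(\Pi_{2a} + p\,\Pi_{2b})\rho & 1 & p & 0 \\
-p(\Pi_{2a} + \Pi_{2b})\rho & p & p & 0 \\
1 & 0 & 0 & \frac{1+\rho^2}{1-\rho^2}
\end{pmatrix}
\]
\[
M_{RFS}[\xi(z)] = \frac{1}{1-\rho^2}
\begin{pmatrix}
\Pi_{2a}^2 + 2p\,\Pi_{2a}\Pi_{2b} + p\,\Pi_{2b}^2 & (\Pi_{2a} + p\,\Pi_{2b})(\beta - \rho) & p(\Pi_{2a} + \Pi_{2b})(\beta - \rho) & 0 \\
(\Pi_{2a} + p\,\Pi_{2b})(\beta - \rho) & 1 + \beta^2 - 2\beta\rho & p(1 + \beta^2 - 2\beta\rho) & 0 \\
p(\Pi_{2a} + \Pi_{2b})(\beta - \rho) & p(1 + \beta^2 - 2\beta\rho) & p(1 + \beta^2 - 2\beta\rho) & 0 \\
0 & 0 & 0 & \frac{1+\rho^2}{1-\rho^2}
\end{pmatrix}
\]
- Optimal design:
\[
\xi^*(z) = \begin{Bmatrix} 0 & 1 \\ 1-p^* & p^* \end{Bmatrix},
\qquad p^* = \arg\min \Phi\{M[\xi(z)]\} \in [0,1]; \quad \Phi: \text{D–optimal, c–optimal}
\]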
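The D–optimal weight can be found numerically from the information matrices of this section. The sketch below is an illustration under stated assumptions, not the authors' implementation: the helper names (`det`, `info_ses`, `info_rfs`, `d_optimal_weight`) are hypothetical, D–optimality is taken as maximizing the determinant over p, and the value β = 1 used for RFS is a placeholder because the poster's example does not state β.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (3x3 or 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def _scaled_block(rows, rho, rho_known):
    """Apply the 1/(1 - rho^2) factor; keep the 3x3 block when rho is known."""
    k = 3 if rho_known else 4
    s = 1.0 / (1.0 - rho ** 2)
    return [[s * rows[i][j] for j in range(k)] for i in range(k)]


def info_ses(p, pi2a, pi2b, rho, rho_known=True):
    """SES information matrix for the design with weight p at z = 1."""
    m11 = 1 + pi2a ** 2 + 2 * p * pi2a * pi2b + p * pi2b ** 2
    m12 = -(pi2a + p * pi2b) * rho
    m13 = -p * (pi2a + pi2b) * rho
    rows = [[m11, m12, m13, 1.0],
            [m12, 1.0, p, 0.0],
            [m13, p, p, 0.0],
            [1.0, 0.0, 0.0, (1 + rho ** 2) / (1 - rho ** 2)]]
    return _scaled_block(rows, rho, rho_known)


def info_rfs(p, pi2a, pi2b, rho, beta, rho_known=True):
    """RFS information matrix for the design with weight p at z = 1."""
    k = 1 + beta ** 2 - 2 * beta * rho
    d = beta - rho
    m11 = pi2a ** 2 + 2 * p * pi2a * pi2b + p * pi2b ** 2
    m12 = (pi2a + p * pi2b) * d
    m13 = p * (pi2a + pi2b) * d
    rows = [[m11, m12, m13, 0.0],
            [m12, k, p * k, 0.0],
            [m13, p * k, p * k, 0.0],
            [0.0, 0.0, 0.0, (1 + rho ** 2) / (1 - rho ** 2)]]
    return _scaled_block(rows, rho, rho_known)


def d_optimal_weight(info, iters=100, **theta):
    """Ternary search for the p in [0, 1] that maximizes det(M(p)).

    Relies on log det being concave in p, so det is unimodal on (0, 1).
    """
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if det(info(a, **theta)) < det(info(b, **theta)):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2


p_ses = d_optimal_weight(info_ses, pi2a=4.0, pi2b=1.0, rho=0.8)
p_rfs = d_optimal_weight(info_rfs, pi2a=4.0, pi2b=1.0, rho=0.8, beta=1.0)
print(round(p_ses, 3), round(p_rfs, 3))
```

For the nominal values (Π2a, Π2b, ρ) = (4, 1, 0.8) this should agree with the D–optimal weights 0.547 (SES) and 0.553 (RFS) reported in section 4; passing `rho_known=False` switches to the full 4 × 4 matrix.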
4. Optimal weights
The D– and c–optimal weights p∗ are similar for given nominal values. Ex: (Π2a, Π2b, ρ) = (4, 1, 0.8):
            D–opt          cβ–opt       cΠ2a–opt      cΠ2b–opt           cρ–opt
            SES    RFS     SES   RFS    SES   RFS     SES      RFS       SES   RFS
ρ known     0.547  0.553   1     1      0     0.528   0.50092  0.5089    -     -
ρ unknown   0.548                                     0.50097            1     1
The D–, cΠ2a– (only for RFS) and cΠ2b–optimal designs for Z = {0,1} are also optimal for the continuous design space Z = [0,1], by the General Equivalence Theorem (GET).
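The equivalence-theorem check for D–optimality can be sketched numerically: a design ξ∗ is D–optimal on [0,1] if tr(M⁻¹(ξ∗) M(z)) ≤ m for every z in [0,1], with equality at the support points. The code below does this for SES with ρ known (m = 3). It is an illustration only. The one-point matrix in `info_point` is an assumption: it extends the section 3 matrix to a general z by letting the Π2b row and column scale with z, which reproduces the section 3 matrix for z in {0,1}. All function names are hypothetical.

```python
def minor(m, i, j):
    """Matrix m without row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def inverse(m):
    """Inverse via the adjugate; adequate for a 3x3 matrix."""
    d = det(m)
    n = len(m)
    return [[(-1) ** (i + j) * det(minor(m, j, i)) / d for j in range(n)]
            for i in range(n)]


def info_point(z, pi2a, pi2b, rho):
    """Assumed SES information matrix (rho known) of a one-point design at z."""
    mu = pi2a + z * pi2b  # mean of Y at z
    s = 1.0 / (1.0 - rho ** 2)
    return [[s * (1 + mu ** 2), -s * rho * mu, -s * rho * z * mu],
            [-s * rho * mu, s, s * z],
            [-s * rho * z * mu, s * z, s * z * z]]


def info_design(p, pi2a, pi2b, rho):
    """Two-point design on {0, 1} with weight p at z = 1."""
    m0 = info_point(0.0, pi2a, pi2b, rho)
    m1 = info_point(1.0, pi2a, pi2b, rho)
    return [[(1 - p) * a + p * b for a, b in zip(r0, r1)]
            for r0, r1 in zip(m0, m1)]


def d_optimal_weight(pi2a, pi2b, rho, iters=100):
    """Ternary search for the weight maximizing det(M(p)) on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if det(info_design(a, pi2a, pi2b, rho)) < det(info_design(b, pi2a, pi2b, rho)):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2


def sensitivity(z, p, pi2a, pi2b, rho):
    """tr(M^{-1}(xi) M(z)) for the two-point design with weight p."""
    inv = inverse(info_design(p, pi2a, pi2b, rho))
    mz = info_point(z, pi2a, pi2b, rho)
    return sum(inv[i][j] * mz[j][i] for i in range(3) for j in range(3))


theta = dict(pi2a=4.0, pi2b=1.0, rho=0.8)
p_star = d_optimal_weight(**theta)
worst = max(sensitivity(k / 200, p_star, **theta) for k in range(201))
print(round(p_star, 3), worst <= 3 + 1e-6)
```

Under these assumptions the sensitivity function stays at or below 3 on the whole interval and touches 3 at z = 0 and z = 1, which is what the GET requires of a D–optimal design supported on {0,1}.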
Bounds for p∗ of D–optimal designs:
Figure: Values of p∗ (vertical axis, 0.4 to 0.6) for (Π2a, Π2b), each axis running from −20 000 to 20 000, with ρ = 0.8 for the D–optimal design in: a) SES model and b) RFS model.
Theorem (p∗ bounds)
For the SES and RFS models (with ρ known or unknown), the weight of the D–optimal design in Z = {0,1} is bounded for all values of ρ, Π2a, Π2b and β:
p∗ ∈ (1/3, 2/3).
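A sketch of why the bounds arise in the ρ known case, worked out here from the information matrices of section 3 (this derivation is an illustration and is not taken from the poster; the symbols m(p), h(p), A and B are introduced only for this purpose):

```latex
% Second moment of the mean of Y under the two-point design
\[ m(p) = (1-p)\,\Pi_{2a}^2 + p\,(\Pi_{2a}+\Pi_{2b})^2 \]

% Determinants of the 3 x 3 blocks (rho known)
\[ |M_{SES}| = \frac{p(1-p)\,\bigl[\,1 + (1-\rho^2)\,m(p)\,\bigr]}{(1-\rho^2)^3},
   \qquad
   |M_{RFS}| = \frac{p(1-p)\,(1+\beta^2-2\beta\rho)\,m(p)}{(1-\rho^2)^2} \]

% Both are proportional to the same cubic in p
\[ h(p) = p(1-p)\,\bigl[\,A(1-p) + Bp\,\bigr], \qquad A, B \ge 0, \]
% SES: A = 1 + (1-\rho^2)\Pi_{2a}^2,  B = 1 + (1-\rho^2)(\Pi_{2a}+\Pi_{2b})^2
% RFS: A = \Pi_{2a}^2,                B = (\Pi_{2a}+\Pi_{2b})^2

\[ h'(p) = A\,(1-p)(1-3p) + B\,p\,(2-3p) \]
```

For A, B > 0 both terms of h'(p) are non-negative on (0, 1/3] and h'(p) > 0 there, while h'(p) < 0 on [2/3, 1), so the maximizer lies strictly between 1/3 and 2/3. The end values 1/3 and 2/3 are approached as B/A → 0 and A/B → 0, respectively.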
5. Robustness of D–optimal designs
Lower bound for D–efficiency:
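\[
\text{D–eff}_{\theta^*}(p_0)
= \frac{|M_{\theta^*}(p^*)|^{-1}}{|M_{\theta^*}(p_0)|^{-1}}
= \frac{|M_{(\Pi_{2a}^*,\Pi_{2b}^*,\rho)}(p^*)|^{-1}}{|M_{(\Pi_{2a}^*,\Pi_{2b}^*,\rho)}(p_0)|^{-1}}
\]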
Figure: Values of D–effθ∗(p0) (vertical axis, 0.80 to 0.95) against p∗ (horizontal axis, 0.35 to 0.65) for the SES model and different values of (Π2a, Π2b, ρ) with: a) ρ known and b) ρ unknown.
Theorem (D–efficiency lower bound)
The minimum D–efficiencies for the SES and RFS models for all values of ρ, Π2a, Π2b and β are:
- for ρ known: min D-effθ∗(p0) = 1/2^{1/3} (≈ 0.794),
- for ρ unknown: min D-effθ∗(p0) = 1/2^{1/4} (≈ 0.841).
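The lower bound can be probed numerically for the SES model. The sketch below is an illustration with hypothetical helper names. It assumes that the D–efficiency is normalized by the m-th root of the determinant ratio (m = 3 for ρ known, m = 4 for ρ unknown), which is the standard convention and the one consistent with the bounds 1/2^{1/3} and 1/2^{1/4}. The nominal and true values are chosen here only to push p0 towards 1/3 and p∗ towards 2/3.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (3x3 or 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def info_ses(p, pi2a, pi2b, rho, rho_known=True):
    """SES information matrix of section 3 for weight p at z = 1."""
    m11 = 1 + pi2a ** 2 + 2 * p * pi2a * pi2b + p * pi2b ** 2
    m12 = -(pi2a + p * pi2b) * rho
    m13 = -p * (pi2a + pi2b) * rho
    rows = [[m11, m12, m13, 1.0],
            [m12, 1.0, p, 0.0],
            [m13, p, p, 0.0],
            [1.0, 0.0, 0.0, (1 + rho ** 2) / (1 - rho ** 2)]]
    k = 3 if rho_known else 4
    s = 1.0 / (1.0 - rho ** 2)
    return [[s * rows[i][j] for j in range(k)] for i in range(k)]


def d_optimal_weight(iters=100, **theta):
    """Ternary search for the weight maximizing det(M(p)) on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if det(info_ses(a, **theta)) < det(info_ses(b, **theta)):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2


def d_efficiency(p0, **theta_true):
    """Assumed definition: (|M(p0)| / |M(p*)|)^(1/m) at the true parameters."""
    p_star = d_optimal_weight(**theta_true)
    m = len(info_ses(p0, **theta_true))  # number of estimated parameters
    ratio = det(info_ses(p0, **theta_true)) / det(info_ses(p_star, **theta_true))
    return ratio ** (1.0 / m)


# Nominal values with Pi2a = -Pi2b give p0 near 1/3 ...
p0 = d_optimal_weight(pi2a=100.0, pi2b=-100.0, rho=0.8)
# ... while true values with Pi2a = 0 give p* near 2/3.
eff = d_efficiency(p0, pi2a=0.0, pi2b=100.0, rho=0.8)
print(round(p0, 3), eff > 2 ** (-1 / 3))
```

With these extreme values the efficiency sits just above 2^{-1/3}, the bound stated for ρ known; the same calls with `rho_known=False` give the analogous check against 2^{-1/4}.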
Particular cases: if the nominal values are θ0 but the true values are θ∗...
Figure: D–eff (with ρ known) over (Π2a, Π2b), axis ticks from −10 to 10, for a neighborhood of θ0: a) θ0 = (4, 1, 0.8) (p0 = 0.547), D–eff axis from 0.92 to 1.00; b) θ0 = (4, −3, 0.8) (p0 = 0.368), D–eff axis from 0.85 to 1.00.
- D–eff at the point θ∗ = θ0 is 1.
- Three more points have D–eff = 1 (black points in the figures).
- D–eff is high for true values (Π∗2a, Π∗2b):
  a) both greater than (Π2a0, Π2b0),
  b) both less than (Π2a0, Π2b0).
- D–eff decays for true values in the direction:
  a) Π2a = −Π2b,
  b) Π2a = 0.
- The minimum D–eff in the neighborhood is:
  a) 91.46%,
  b) 83.60%.
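The behaviour described for case a) can be explored with a short sketch for the SES model with ρ known. Several things here are assumptions and not stated in the poster: the helper names, the cube-root normalization of the D–efficiency, and the identification of the three additional points with D–eff = 1 as the sign-symmetric points (−4, −1), (4, −9) and (−4, 9). That identification is an inference from the SES matrix, whose determinant depends on Π2a and Π2a + Π2b only through their squares.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def info_ses(p, pi2a, pi2b, rho):
    """SES information matrix (rho known) of section 3 for weight p at z = 1."""
    s = 1.0 / (1.0 - rho ** 2)
    m11 = 1 + pi2a ** 2 + 2 * p * pi2a * pi2b + p * pi2b ** 2
    m12 = -(pi2a + p * pi2b) * rho
    m13 = -p * (pi2a + pi2b) * rho
    rows = [[m11, m12, m13], [m12, 1.0, p], [m13, p, p]]
    return [[s * x for x in row] for row in rows]


def d_optimal_weight(pi2a, pi2b, rho, iters=100):
    """Ternary search for the weight maximizing det(M(p)) on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if det3(info_ses(a, pi2a, pi2b, rho)) < det3(info_ses(b, pi2a, pi2b, rho)):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2


def d_efficiency(p0, pi2a, pi2b, rho):
    """Assumed definition: (|M(p0)| / |M(p*)|)^(1/3) at the true parameters."""
    p_star = d_optimal_weight(pi2a, pi2b, rho)
    ratio = det3(info_ses(p0, pi2a, pi2b, rho)) / det3(info_ses(p_star, pi2a, pi2b, rho))
    return ratio ** (1.0 / 3.0)


rho = 0.8
p0 = d_optimal_weight(4.0, 1.0, rho)  # design built from theta0 = (4, 1, 0.8)

# Candidate points with D-eff = 1 (inferred from sign symmetry, see lead-in).
for true_a, true_b in [(4.0, 1.0), (-4.0, -1.0), (4.0, -9.0), (-4.0, 9.0)]:
    print((true_a, true_b), round(d_efficiency(p0, true_a, true_b, rho), 4))

# A true value on the line Pi2a = -Pi2b, where the efficiency drops.
print(round(d_efficiency(p0, 10.0, -10.0, rho), 3))
```

The exact minimum reported above (91.46%) depends on the extent of the neighborhood used in the poster, which is not recoverable here, so the sketch only illustrates the direction of the decay.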
6. Conclusions
- In both models the optimal designs depend on the nominal values.
- For Z = {0,1}:
  - Similar SES optimal designs for ρ known and for ρ unknown.
  - The same RFS optimal designs for ρ known and for ρ unknown.
  - Similar optimal designs for SES and RFS, except for cΠ2a–optimality.
    Ex: (Π2a, Π2b) = (4, 1) (ρ = 0.8 known):
\[
\xi^{*SES}_{\Pi_{2a}} = \begin{Bmatrix} 0 \\ 1 \end{Bmatrix}
\quad \text{and} \quad
\xi^{*RFS}_{\Pi_{2a}} = \begin{Bmatrix} 0 & 1 \\ 1-0.528 & 0.528 \end{Bmatrix},
\]
    but the relative efficiencies are quite good:
\[
\text{eff}_{SES}(\xi^*_{RFS}) = \frac{c^T M_{SES}^{-1}(\theta, \xi^*_{SES})\, c}{c^T M_{SES}^{-1}(\theta, \xi^*_{RFS})\, c} = 91.1\%,
\qquad
\text{eff}_{RFS}(\xi^*_{SES}) = \frac{c^T M_{RFS}^{-1}(\theta, \xi^*_{RFS})\, c}{c^T M_{RFS}^{-1}(\theta, \xi^*_{SES})\, c} = 95.2\%
\]
- Bounds for p∗ of the D–optimal design: p∗ ∈ (1/3, 2/3).
- Lower bound for D–efficiency: min D-effθ∗(p0) = 1/2^{1/3} (ρ known) or 1/2^{1/4} (ρ unknown).
7. References
Aigner, D. J., and Balestra, P. (1988), “Optimal experimental design for error components models,” ECONOMETRICA, 56 (4), 955-971.
Conlisk, J. (1979), “Design for simultaneous equations,” J ECONOMETRICS, 11 (1), 63-76.
Hahn, J., Hirano, K., and Karlan, D. (2011), “Adaptive Experimental Design using the Propensity Score,” J BUS ECON STAT, 29 (1), 96-108.
López-Fidalgo, J., and Garcet-Rodríguez, S. (2004), “Optimal experimental designs when some independent variables are not subject to control,” J AM STAT ASSOC, 99.
Papakyriazis, P. A. (1986), “Adaptive optimal estimation control strategies for systems of simultaneous equations,” MATH MODELLING, 7, 241-257.
Poskitt, D. S., and Skeels, C. L. (2008), “Conceptual frameworks and experimental design in simultaneous equations,” ECON LETT, 100.