Introduction to Bayesian SEM
8/11/2019 Introduction to Bayesian SEM
1/127
Bayesian Structural Equation Modelling Using Mplus
Overview: Major Steps in the Bayesian Approach to Data Analysis
Research Question
Estimation
Model fit
Hypotheses evaluation - Model selection
The Data to be Collected: Variables and Sample Size
How to enter the data into Mplus
Missing data
The Statistical Model
How to specify a statistical model in Mplus
How to specify an imputation model in Mplus
The Prior Distribution
Default Uninformative and how to specify in Mplus
Informative and how to specify in Mplus
The Posterior Distribution
Estimates and credible intervals
How to check convergence
Model fit
Hypothesis evaluation and Model selection
How to interpret Mplus output
Lecture 1: Bayesian Estimation
Data, Research Question, and Statistical Model
Research Question
?
The Data N=65

ID Stork Urban Babies
... ... ... ...
20 3 7 13
21 1 4 11
22 2 4 9
23 3 6 11
24 4 6 9
25 8 7 16
26 11 9 16
27 5 3 7
28 5 5 8
29 11 6 14
30 6 6 10
31 7 5 11
32 8 8 10
33 9 5 8
34 2 2 1
35 4 4 8
... ... ... ...
title:
Mediation Model for the Stork Data;
data:
file = stork.txt;
variable:
names = ID stork urban babies;
usev = stork urban babies;
The Statistical Model - 1
model:
urban on stork (a);
babies on urban stork (b c);
[urban] (d);
[babies] (e);
urban (f);
babies (g);
[Path diagram: Stork → Urban (path a), Urban → Babies (path b), Stork → Babies (path c); residual variances f and g]
The Statistical Model - 2
Urban = d + a Stork + error with error ~ N(0,f)
The Statistical Model - 3
[Figure: the regression at three levels of urbanization: City, Village, Rural]
Babies = e + c Stork + b Urban + error with error ~ N(0,g)
Lecture 1: Bayesian Estimation
Introducing Prior, Posterior, and Sampling Based Estimation
Using One Variable
The Prior Distribution - 1 - Introduction - Non Informative Prior Distribution
A simple example based on expert elicitation: how many babies are born per
1,000 inhabitants per year in the Netherlands?
ID Babies
... ...
20 13
21 11
21 9
22 11
23 9
24 16
25 16
26 7
27 8
... ...
Data
The mean is 9 and the standard error of the mean
is .5, so the data tell us that
between 8 and 10 babies are born.
Note that I computed a confidence interval for the
mean using 9 +/- 2 x .5; the value 2 approximates
1.96, the more precise value for the computation
of a confidence interval.
No prior information was used, that is, an
uninformative prior distribution was used.
model:
[babies] (a);
babies (b);
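As a numerical sketch (in Python, using the values stated on the slide), the interval computation is:

```python
# The slide's interval for the mean: estimate +/- 2 x standard error,
# where 2 approximates the more precise 1.96 for a 95% interval.
mean, se = 9.0, 0.5
lower, upper = mean - 2 * se, mean + 2 * se
print(lower, upper)  # 8.0 10.0
```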
The Prior Distribution - 2 - Introduction - Informative Prior Distribution
Expert Elicitation:
I assume that in each region containing 1,000 persons, the age distribution is
uniform between 0-100 years of age.
This means that each year 200 persons are between 20 and 40 years of age
(the fertile years), which renders 80 couples and 40 bachelors.
On average I expect each couple to have 2 children, that is, 160 children over
the course of 20 years. This means 8 children per year per region containing
1,000 persons.
In my line of argument I'm most uncertain about the uniform age distribution.
I know the number of elderly is increasing, so maybe there are only 160 persons
between 20 and 40 years of age: 64 couples, 128 children, about 6 children
per year. On the other hand there may still be fewer elderly than young, so maybe
240 persons: 96 couples, 192 children, about 10 per year.
In summary, I expect 8, but my credible interval is between 6 and 10 which means
my personal standard error is 1 (8 +/- 2 x 1 gives my credible interval).
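The same reasoning, written out as a small Python sketch: the elicited ~95% interval spans about four standard deviations, and Mplus normal priors are parameterized by the variance.

```python
# Translate an elicited expectation and ~95% credible interval into a
# normal prior: interval width / 4 gives the prior standard deviation.
expected = 8.0
lower, upper = 6.0, 10.0
prior_sd = (upper - lower) / 4      # personal standard error of 1
prior_variance = prior_sd ** 2      # Mplus: a ~ N(8,1) uses the variance
```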
The Prior Distribution - 3 - Introduction
The Normal Prior Distribution: used for means and regression coefficients.
MODEL PRIORS:
a ~ N(8,1);
[Figure: three normal prior densities for a, increasingly diffuse]

MODEL PRIORS:
a ~ N(8,9);

MODEL PRIORS:
a ~ N(8,100000);
The Prior Distribution - 4 - Introduction
The Inverse Gamma Prior Distribution: used for variances.

MODEL PRIORS:
b ~ IG(.001,.001);

Uninformative and proper.

MODEL PRIORS:
b ~ IG(-1,0);

The default in Mplus is
uninformative and improper.
The Posterior Distribution - 1 - Introduction
Combining Data Knowledge and Prior Knowledge
a - Mean Number of Babies

[Figure: prior, data, and posterior densities for a; the posterior lies between the prior and the data]
The Posterior Distribution - 2 - Introduction
The posterior distribution combines the information with respect to the mean
number of babies in the data with the information in the prior distribution. This
combination is executed by Mplus.
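For intuition (not what Mplus literally does: Mplus samples), the combination can be sketched with the conjugate normal update, assuming a known data variance: the posterior mean is a precision-weighted average of the prior mean and the data mean.

```python
# A minimal conjugate sketch (normal likelihood, known variance):
# precisions (inverse variances) add, and the posterior mean is a
# precision-weighted average of prior mean and data mean.
prior_mean, prior_var = 8.0, 1.0    # the elicited N(8,1) prior
data_mean, data_se = 9.0, 0.5       # mean and standard error from the data
prior_prec = 1.0 / prior_var
data_prec = 1.0 / data_se ** 2
post_var = 1.0 / (prior_prec + data_prec)
post_mean = post_var * (prior_prec * prior_mean + data_prec * data_mean)
print(round(post_mean, 2))  # 8.8
```

The posterior mean lands between the prior's 8 and the data's 9, pulled toward the data because the data are more precise, in line with the estimates reported later on this slide set.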
Using sampling, the information in the posterior distribution with respect to the
mean number of babies is made accessible:
The Posterior Distribution - 3 - Introduction

Sampled values for a (Data + Prior): 9.1, 7.9, 8.3, 9.9, 7.1, ...

Estimate: mean or median
SD
Credible Interval: central or highest

analysis:
estimator = bayes;
process = 2;
fbiter = 100000;
point = median;

output:
tech1 tech8
standardized(stdyx)
cinterval(hpd);

plot:
type = plot1 plot2 plot3;
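These summaries are simple functions of the sampled values; a minimal Python sketch, with synthetic draws standing in for the fbiter = 100000 real ones:

```python
import statistics

# Summarise MCMC draws of a with a point estimate (mean or median) and
# a central 95% credible interval (2.5% and 97.5% quantiles).
# The draws below are synthetic, evenly spread stand-ins.
draws = sorted(7.0 + 0.05 * i for i in range(81))   # 7.0 .. 11.0
est_mean = statistics.mean(draws)
est_median = statistics.median(draws)
lower = draws[int(0.025 * len(draws))]              # ~2.5% quantile
upper = draws[int(0.975 * len(draws)) - 1]          # ~97.5% quantile
```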
The Posterior Distribution - 4 - Introduction

An Informative Prior Distribution for the Mean Number of Babies

model:
[babies] (a);
babies (b);
MODEL PRIORS:
a ~ N(8,1);
b ~ IG(.001,.001);

Estimate S.D. Lower 2.5% Upper 2.5%
Means
BABIES 8.904 0.405 8.098 9.688

A Non Informative Prior Distribution for the Mean Number of Babies

model:
[babies] (a);
babies (b);
MODEL PRIORS:
a ~ N(0,100000);
b ~ IG(.001,.001);

Estimate S.D. Lower 2.5% Upper 2.5%
Means
BABIES 9.078 0.443 8.203 9.945
Lecture 1: Bayesian Estimation
Introducing Prior, Posterior, and Sampling Based Estimation
Using The Stork Data (three variables) and
Uninformative Priors
The Prior Distribution -5 - Uninformative Prior Distributions for the Stork Data
User Specified:

MODEL PRIORS:
a ~ N(0,100000);
b ~ N(0,100000);
c ~ N(0,100000);
d ~ N(0,100000);
e ~ N(0,100000);
f ~ IG(.001,.001);
g ~ IG(.001,.001);

model:
urban on stork (a);
babies on urban stork (b c);
[urban] (d);
[babies] (e);
urban (f);
babies (g);

Mplus Default:

MODEL PRIORS:
a ~ N(0,Infinity);
b ~ N(0,Infinity);
c ~ N(0,Infinity);
d ~ N(0,Infinity);
e ~ N(0,Infinity);
f ~ IG(-1,0);
g ~ IG(-1,0);
The Posterior Distribution - 5 - Bayesian Estimation Using Markov Chain Monte Carlo Methods
model constraint:
new(indirect);
indirect = a*b;
a    b     c     d     e     f     g     indirect
initial values
...  ...   ...   ...   ...   ...   ...   ...
.35  1.14  -.11  2.89  4.00  3.46  7.15  .42
.29  1.69  -.32  1.75  5.10  3.01  7.30  .49
...  ...   ...   ...   ...   ...   ...   ...   (fbiter rows in total)
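In each iteration the indirect effect is formed from that iteration's a and b, matching model constraint: indirect = a*b;. A Python sketch with illustrative draws:

```python
# Per-iteration computation of the indirect effect: a product of two
# (roughly normal) posteriors is generally not normal itself, which is
# why the credible interval of a*b can be asymmetric.
a_draws = [0.35, 0.29, 0.41, 0.38]     # illustrative draws for a
b_draws = [1.14, 1.69, 1.02, 1.25]     # illustrative draws for b
indirect_draws = [a * b for a, b in zip(a_draws, b_draws)]
```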
analysis:
estimator = bayes;
process = 2;
fbiter = 100000;
point = median;
output:
tech1 tech8 standardized(stdyx) cinterval(hpd);
plot:
type = plot1 plot2 plot3;
The Posterior Distribution - 6 - Output Computed Using the MCMC Sample
[Histogram of the posterior distribution of Babies on Stork]
The Posterior Distribution - 7 - Histograms, Estimates and Credible Intervals
[Histogram of the posterior distribution of the Indirect effect]
Note that the credible interval
is not symmetric!
The Posterior Distribution - 8 - Histograms, Estimates and Credible Intervals
MODEL RESULTS
Posterior One-Tailed 95% C.I.
Estimate S.D. P-Value Lower 2.5% Upper 2.5%
URBAN ON
STORK 0.375 0.072 0.000 0.236 0.517
BABIES ON
URBAN 1.143 0.185 0.000 0.781 1.509
STORK -0.111 0.124 0.181 -0.356 0.131
Intercepts
URBAN 2.894 0.460 0.000 1.978 3.787
BABIES 4.007 0.847 0.000 2.320 5.646
Residual Variances
URBAN 3.465 0.653 0.000 2.360 4.828
BABIES 7.159 1.359 0.000 4.937 10.094
New/Additional Parameters
INDIRECT 0.422 0.108 0.000 0.225 0.644
The Posterior Distribution - 9 - Estimates and Credible Intervals
Lecture 1: Bayesian Estimation
INTERMEZZO
P-values
The Posterior Distribution - 10 - The one-tailed p-value
[Figure: posterior of a with its 90% CI and 95% CI relative to 0]
If 90% CI touches 0 the one-tailed p-value is .05.
If 95% CI touches 0 the one-tailed p-value is .025.
For approximately normal posterior distributions, multiplication by 2
renders a two-tailed p-value.
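The one-tailed value itself is just the share of posterior draws on the other side of zero; a Python sketch with illustrative draws:

```python
# One-tailed posterior p-value: the proportion of draws at or below 0.
# Doubling it approximates a two-tailed value for roughly normal posteriors.
draws = [0.42, 0.31, -0.05, 0.55, 0.12, -0.01, 0.38, 0.27]  # illustrative
p_one = sum(d <= 0 for d in draws) / len(draws)
p_two = 2 * p_one
```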
Urban On Stork (a)
The Posterior Distribution - 11 - p-values credible intervals and model selection
p-values:
For example .05
Surely God loves the .06 as much as the .05
Publication bias
Multiple hypotheses testing and capitalization on chance

Credible Intervals and Confidence Intervals:
What is the value of the parameter of interest?
Is the parameter positive, negative, or is zero also in the ballpark?
With multiple parameters there is still capitalization on chance

Model Selection:
Compare a few carefully chosen models
Very powerful in combination with credible intervals and
standardized estimates
Lecture 1: Bayesian Estimation
Introducing Prior, Posterior, and Sampling Based Estimation
Using The Stork Data (three variables) and
Informative Priors
The Prior Distribution - 6- Informative Based on Historical Data
The Current Data:

ID Stork Urban Babies
... ... ... ...
20 3 7 13
21 1 4 11
22 2 4 9
23 3 6 11
24 4 6 9
25 8 7 16
26 11 9 16
27 5 3 7
28 5 5 8
29 11 6 14
30 6 6 10
31 7 5 11
32 8 8 10
33 9 5 8
34 2 2 1
35 4 4 8
... ... ... ...

The Historical Data:

ID Stork Urban
1 5 6
2 11 8
... ... ...
80 0 1
model:
urban on stork (a);
[urban] ;
urban;
MODEL RESULTS    Estimate  S.D.
URBAN ON STORK   0.400     0.050
a ~ N(.400,.0025)
The Prior Distribution - 8- Informative Based on Historical Data
MODEL PRIORS:
a ~ N(.400,.0025);
b ~ N(0,100000);
c ~ N(0,100000);
d ~ N(0,100000);
e ~ N(0,100000);
f ~ IG(.001,.001);
g ~ IG(.001,.001);
model:
urban on stork (a);
babies on urban stork (b c);
[urban] (d);
[babies] (e);
urban (f);
babies (g);
User Specified
Suppose the data are collected by another research group
in the Netherlands in 2010.
The Posterior Distribution - 12- Comparing Results from Uninformative and Informative Priors
MODEL RESULTS Estimate S.D. Lower 2.5% Upper 2.5%
URBAN ON STORK 0.375 0.072 0.236 0.517
INDIRECT 0.422 0.108 0.225 0.644
MODEL RESULTS Estimate S.D. Lower 2.5% Upper 2.5%
URBAN ON STORK 0.391 0.041 0.314 0.473
INDIRECT 0.444 0.086 0.283 0.621
MODEL PRIORS:
a ~ N(.400,.0025);
MODEL PRIORS:
a ~ N(0,100000);
The result of using subjective priors is a gain in information. But, do you trust this?
Would you be willing to use and defend this approach?
The Prior Distribution - 9 - Extra Tools for the Specification of Informative Priors
MODEL PRIORS:
b ~ N (0, 1);
c ~ N (0, 1);
COVARIANCE (b, c) = 0.5;
output:
tech1 tech3 tech8
standardized(stdyx) cinterval(hpd);
Summary
Research Question
Statistical Model
Prior Distribution - Informative Prior Distributions
Posterior Distribution
- Asymmetric Credible Intervals
- Small Sample Inferences, no Asymptotic Approximations
- No Heywood Cases, Like, for Example, Negative Variances
- Sampling will often Work where Maximum Likelihood Fails
References Bayesian Structural Equation Modelling
A relatively accessible introduction to Bayesian structural equation modeling can be found in:
Kaplan, D. and Depaoli, S. (2012). Bayesian Structural Equation Modeling. In R.H. Hoyle (Ed.),
Handbook of Structural Equation Modeling, pp. 650-673. New York: The Guilford Press.
A classic about the elicitation of prior knowledge is:
O'Hagan, A., Buck, C.E., Daneshkhah, A., Eiser, J.R., Garthwaite, P.H., Jenkinson, D.J.,
Oakley, J.E., and Rakow, T. (2006). Uncertain Judgements: Eliciting Experts' Probabilities.
Chichester: Wiley.
A classic introduction to Bayesian data analysis is:
Gelman, A., Carlin, J.B., Stern, H.S., and Rubin, D.B. (2004). Bayesian Data Analysis.
Boca Raton, FL: Chapman & Hall/CRC.
The documentation provided by Mplus is:
Muthen, B. (2010). Bayesian analysis in Mplus: A brief introduction.
Asparouhov, T. and Muthen, B. (2010). Bayesian analysis in Mplus: Technical Implementation.
Lecture 2: Bayesian Estimation in the Presence of Missing Data
Introduction
Missing Data - 1 - Introduction
ID Stork Urban Babies
... ... ... ...
20 3 7 13
21 1 4 11
22 999 4 9
23 3 6 11
24 4 6 9
25 8 7 16
26 11 999 999
27 5 3 7
28 5 5 8
29 999 6 14
30 6 6 10
31 7 5 999
32 8 8 10
33 999 999 8
34 2 2 1
35 4 4 8
... ... ... ...
variable:
names = ID stork urban babies;
usev = stork urban babies;
missing = all (999);
Missing Data - 2 - Introduction
By default Mplus with analysis: estimator = bayes; will use the statistical model
that is specified to impute the missing data.

First I will explain what is meant by imputation of the missing data.

Secondly I will explain why it is usually NOT a good idea to use the statistical model
that is specified to impute the missing data.
One exception occurs if the amount of missing values is very small. A good question is:
what is a small amount of missing values?
Another exception occurs if missings occur in variables that are ONLY a dependent
variable and if the missingness is MAR given the predictors of the dependent variable.

Thirdly I will introduce:
Multiple imputation using a general imputation model
Analysis of each imputed data set using a statistical model that is consistent with the
imputation model
Summarizing the results obtained from the analysis of each imputed data set
Lecture 2: Bayesian Estimation in the Presence of Missing Data
Multiple Imputation
Multiple Imputation Using the Statistical Model - 1
Multiple Imputation Using the Statistical Model - 2
a    b     c     d     e     f     g     22-S 26-U 26-B 29-U 31-B 33-S 33-U
0    0     0     0     0     1     1     0    0    0    0    0    0    0
...  ...   ...   ...   ...   ...   ...   ...  ...  ...  ...  ...  ...  ...
.35  1.14  -.11  2.89  4.00  3.46  7.15  5    5    12   7    9    2    3
.29  1.69  -.32  1.75  5.10  3.01  7.30  7    3    11   5    10   3    4
...  ...   ...   ...   ...   ...   ...   ...  ...  ...  ...  ...  ...  ...  (fbiter rows in total)
MODEL RESULTS
Posterior One-Tailed 95% C.I.
Estimate S.D. P-Value Lower 2.5% Upper 2.5%
BABIES ON
URBAN 1.143 0.185 0.000 0.781 1.509
STORK -0.111 0.124 0.181 -0.356 0.131
New/Additional Parameters
INDIRECT 0.422 0.108 0.000 0.225 0.644
Lecture 2: Bayesian Estimation in the Presence of Missing Data
Data that are not Missing at Random
Multiple Imputation Using the Statistical Model- 3 - Data that are NOT Missing at Random
ID Stork Urban Babies
... ... ... ...
20 3 7 13
21 1 4 11
22 999 4 9
23 3 6 11
24 4 6 9
25 8 7 16
26 11 999 999
27 5 3 7
28 5 5 8
29 7 999 14
30 6 6 10
31 7 5 999
32 8 8 10
33 999 999 8
34 2 2 1
35 4 4 8
... ... ... ...
Multiple Imputation Using the Statistical Model- 4 - Data that are NOT Missing at Random
Urban Babies
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
Urban Babies
1 1
2 2
3 3
4 4
5
6
7
8
Urban Babies
1 1
2 2
3 3
4 4
2.5 5
2.5 6
2.5 7
2.5 8
Urban Babies
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
Urban Babies
Model:
Babies on Urban;
Urban;
Model:
Babies with Urban;
Urban Babies
Lecture 2: Bayesian Estimation in the Presence of Missing Data
Data that are Missing at Random
Multiple Imputation Using the Statistical Model- 5 - Data that are Missing at Random
Urban Babies
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
Urban Babies
1 1
2 2
3 3
4 4
5
6
7
8
Urban Babies
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
Urban Babies
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
Urban Babies
Model:
Babies on Urban;
Urban;
Model:
Babies with Urban;
Urban Babies
Multiple Imputation Using a General Imputation Model - 1 - Data that are Missing at Random
ID Stork Urban Babies
... ... ... ...
20 3 7 13
21 1 4 11
22 999 4 9
23 3 6 11
24 4 6 9
25 8 7 16
26 11 999 999
27 5 3 7
28 5 5 8
29 7 999 14
30 6 6 10
31 7 5 999
32 8 8 10
33 999 999 8
34 2 2 1
35 4 4 8
... ... ... ...
Stork
Urban
Babies
model:
stork with urban;
stork with babies;
urban with babies;
[stork];
[urban];
[babies];
Multiple Imputation Using a General Imputation Model - 2 - How to do it in Mplus
title: this is an example of multiple imputation
for a set of variables with missing values using
a general statistical model;
data: FILE = storkMI.txt;
variable:
names = ID stork urban babies;
auxiliary = ID;
usevariables = stork urban babies;
missing = all (999);
analysis: estimator = bayes;
fbiter = 10000;
process = 2;
data imputation:
impute = stork urban babies;
ndatasets = 10;
thin = 1000;
save = storkimp*.dat;
model: stork with urban babies;
urban with babies;
[stork];
[urban];
[babies];
output: tech8;
plot: type = plot1 plot2 plot3;
Multiple Imputation Using a General Imputation Model - 3 - Multiple Imputations
... ... ... ...
20 3 7 13
21 1 4 11
22 999 4 9
23 3 6 11
24 4 6 9
25 8 7 16
26 11 999 999
27 5 3 7
28 5 5 8
29 999 6 14
30 6 6 10
... ... ... ...
ID Stork Urban Babies
... ... ... ...
3 7 13 20
1 4 11 21
4 4 9 22
3 6 11 23
4 6 9 24
8 7 16 25
11 8 12 26
5 3 7 27
5 5 8 28
9 6 14 29
6 6 10 30
... ... ...
... ... ... ...
3 7 13 20
1 4 11 21
7 4 9 22
3 6 11 23
4 6 9 24
8 7 16 25
11 9 14 26
5 3 7 27
5 5 8 28
8 6 14 29
6 6 10 30
... ... ... ...
... ... ... ...
3 7 13 20
1 4 11 21
5 4 9 22
3 6 11 23
4 6 9 24
8 7 16 25
11 8 13 26
5 3 7 27
5 5 8 28
11 6 14 29
6 6 10 30
... ... ... ...
... ... ... ...
3 7 13 20
1 4 11 21
6 4 9 22
3 6 11 23
4 6 9 24
8 7 16 25
11 5 10 26
5 3 7 27
5 5 8 28
11 6 14 29
6 6 10 30
... ... ... ...
Stork Urban Babies ID    Stork Urban Babies ID    Stork Urban Babies ID    Stork Urban Babies ID
m = 1, ..., M
Multiple Imputation Using a General Imputation Model- 4 - Data that are Missing at Random
It can never be ensured that data are missing at random.

Use enough variables in the imputation model to feel confident that
MAR is a reasonable assumption. There may be variables in the imputation
model that do not appear in the statistical model.

Can we in our example think of variables that could be very good
predictors of missing data and that are not part of the statistical model?

Never use too many variables in the imputation model. A rule of thumb is
1 variable for every 20 cases in the data file. But this is only a rule of thumb!

Creating a good imputation model is partly ART, partly SKILL, and rather
BAYESIAN because it requires careful prior thinking, that is, thinking
without using empirical data.
Multiple Imputation Using a General Imputation Model - 5 - How to do it in Mplus
title:
Mediation Model for the Stork Data;
data:
file = storkimplist.dat;
type = imputation;
variable:
names = stork urban babies ID;
usev = stork urban babies;
missing = all (999);
model:
urban on stork (a);
babies on urban stork (b c);
[urban] (d);
[babies] (e);
urban (f);
babies (g);
model constraint:
new(indirect);
indirect = a*b;
analysis:
estimator = ml;
output:
standardized(stdyx);
Note the difference between the imputation model
and the statistical model!!
It is also quite common that the statistical model
contains only a subset of the variables used in the
imputation model.
Multiple Imputation Using a General Imputation Model - 6 - Analyse Each Imputed Data Set
... ... ... ...
20 3 7 13
21 1 4 11
22 999 4 9
23 3 6 11
24 4 6 9
25 8 7 16
26 11 999 999
27 5 3 7
28 5 5 8
29 999 6 14
30 6 6 10
... ... ... ...
ID Stork Urban Babies
... ... ... ...
3 7 13 20
1 4 11 21
4 4 9 22
3 6 11 23
4 6 9 24
8 7 16 25
11 8 12 26
5 3 7 27
5 5 8 28
9 6 14 29
6 6 10 30
... ... ...
... ... ... ...
3 7 13 20
1 4 11 21
7 4 9 22
3 6 11 23
4 6 9 24
8 7 16 25
11 9 14 26
5 3 7 27
5 5 8 28
8 6 14 29
6 6 10 30
... ... ... ...
... ... ... ...
3 7 13 20
1 4 11 21
5 4 9 22
3 6 11 23
4 6 9 24
8 7 16 25
11 8 13 26
5 3 7 27
5 5 8 28
11 6 14 29
6 6 10 30
... ... ... ...
... ... ... ...
3 7 13 20
1 4 11 21
6 4 9 22
3 6 11 23
4 6 9 24
8 7 16 25
11 5 10 26
5 3 7 27
5 5 8 28
11 6 14 29
6 6 10 30
... ... ... ...
Stork Urban Babies ID    Stork Urban Babies ID    Stork Urban Babies ID    Stork Urban Babies ID
m = 1, ..., M
Intercepts (BABIES), per imputed data set:

Estimate SD      Estimate SD      Estimate SD      Estimate SD
10.109   1.303   9.843    1.221   10.567   1.432   9.992    1.271

Pooled: Estimate 10.002, SD 1.672, Rate of Missing Information .22
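The pooling follows Rubin's rules: average the estimates, and add the between-imputation variance to the average within-imputation variance. A Python sketch using the four pairs shown above (the slide itself pools M = 10 sets, so this will not reproduce the reported 10.002 / 1.672 exactly):

```python
# Rubin's rules for pooling across M imputed data sets.
estimates = [10.109, 9.843, 10.567, 9.992]   # per-imputation estimates
ses = [1.303, 1.221, 1.432, 1.271]           # per-imputation standard errors
M = len(estimates)
qbar = sum(estimates) / M                                  # pooled estimate
w = sum(s ** 2 for s in ses) / M                           # within variance
b = sum((q - qbar) ** 2 for q in estimates) / (M - 1)      # between variance
total_var = w + (1 + 1 / M) * b
pooled_se = total_var ** 0.5
```

Note that the pooled variance is always at least the average within-imputation variance: missing data can only add uncertainty.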
Multiple Imputation Using a General Imputation Model - 7 - Relative Efficiency
Relative efficiency = 1 / (1 + rate/M)
For the example on the previous transparency:
Relative efficiency = 1 / (1 + .22/10) = .98
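In Python, the slide's computation:

```python
# Relative efficiency of M imputations, given the rate of missing
# information: with rate .22, M = 10 is already about 98% efficient.
rate, M = 0.22, 10
rel_eff = 1 / (1 + rate / M)
print(round(rel_eff, 2))  # 0.98
```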
Multiple Imputation Using a General Imputation Model - 8 - Summarize the Multiple Analyses
STDYX Standardization
                  Estimate  S.E.   Est./S.E.  Two-Tailed  Rate of
                                              P-Value     Missing
URBAN ON
STORK             0.536     0.095  5.633      0.000       0.123
BABIES ON
URBAN             0.693     0.110  6.307      0.000       0.234
STORK             -0.123    0.124  -0.986     0.324       0.152
Intercepts
URBAN             1.335     0.299  4.463      0.000       0.059
BABIES            1.286     0.343  3.755      0.000       0.109
Residual Variances
URBAN             0.712     0.101  7.026      0.000       0.120
BABIES            0.593     0.105  5.626      0.000       0.183
R-SQUARE
URBAN             0.288     0.101  2.842      0.004       0.120
BABIES            0.407     0.105  3.867      0.000       0.183
New/Additional Parameters
INDIRECT          0.395     0.114  3.462      0.001       0.184
Lecture 2: Bayesian Estimation in the Presence of Missing Data
A Closer Look at the Imputation Model
Multiple Imputation Using a General Imputation Model - 11 - Consistency
Stork Babies
Stork
Urban
Babies
Multiple Imputation Using a General Imputation Model - 12 - Non Consistency
Stork
Urban
Babies
Stork
Urban
Babies
Stork
*
Stork
Multiple Imputation Using a General Imputation Model - 13- Non Consistency
Stork
Urban
Babies
Stork Urban
Babies
Stork
*
Urban
Summary
Imputation model and statistical model
Does the imputation model render data that are missing at random?
Are the imputation model and the statistical model congenial?
The combination of multiple imputation with estimator = ML is possible
in Mplus. The combination with estimator = Bayes is not possible.
References Missing Data
A non-technical introduction to missing data analysis and multiple imputation can be found in:
Schafer, J.L. and Graham, J.W. (2002). Missing data: Our view of the state of the
art. Psychological Methods, 7, 147-177.
Classic books about missing data analysis and multiple imputation are:
Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: Wiley.
Schafer, J.L. (1997). Analysis of Incomplete Multivariate Data. London: Chapman & Hall.
A contemporary book is:
Van Buuren, S. (2012). Flexible Imputation of Missing Data. Boca Raton: Chapman & Hall/CRC.
An important paper with respect to consistency is:
Meng, X-L. (1994). Multiple-imputation inferences with uncongenial sources of input.
Statistical Science, 9, 538-573.
The documentation provided by Mplus is:
Asparouhov, T. and Muthen, B. (2010). Multiple imputation with Mplus.
MplusAutomation is developed by Michael Hallquist. It can be found at www.statmodel.com under the tab
How-To, choose Using Mplus via R.
Lecture 3: Model Fit
Model Fit 1 The Covariance Matrix
ID Stork Urban Babies
... ... ... ...
20 3 7 13
21 1 4 11
22 2 4 9
23 3 6 11
24 4 6 9
25 8 7 16
26 11 9 16
27 5 3 7
28 5 5 8
29 11 6 14
30 6 6 10
31 7 5 11
32 8 8 10
33 9 5 8
34 2 2 1
35 4 4 8
... ... ... ...
S    U    B
S 10.7
U  4.0  4.8
B  3.4  5.1  12.2
The observed covariance matrix displays the
relation between each pair of variables in the
data matrix.
The model implied covariance matrix is a
reconstruction of the observed covariance
matrix using the statistical model at hand.
Model Fit 2 What is model fit? Why is it important?
9 model parameters
[Path diagrams: three competing models for Stork, Urban, and Babies]

Covariance Matrices:

Observed = Model Implied (9 model parameters):
S    U    B
S 10.7
U  4.0  4.8
B  3.4  5.1  12.2

Model Implied (7 model parameters):
S    U    B
S 10.7
U  4.0  4.8
B  3.4  5.1  12.2

Model Implied (6 model parameters):
S    U    B
S 10.7
U  0    4.8
B  0    0    2.2
Model Fit 3
The chi-square test is computed for each statistical model. It is a function of:
- The observed covariance matrix
- The model implied covariance matrix
- The difference between the number of parameters of the current and the
saturated statistical model.

It is a measure of the size of the difference between the observed and implied
covariance matrices.

The larger the size of the difference, that is, the larger the chi-square value, the
less a statistical model is able to reconstruct the observed covariance matrix.

The hypothesis that is tested using the chi-square test states that
the observed covariance matrix can adequately be reconstructed by
the current statistical model.
Model Fit 4
Using the observed data and the statistical model at hand:

Parameters are sampled: M-V, M-V, ..., M-V

These are used to replicate data and impute observed missings:
Xobs-Xrep, Xobs-Xrep, ..., Xobs-Xrep

These are used to compute the chi-square test using the parameters and the
observed-imputed and replicated data:
CHIobs-CHIrep, CHIobs-CHIrep, ..., CHIobs-CHIrep

The proportion of pairs in which CHIrep is larger than CHIobs is the posterior predictive p-value.
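The final step can be sketched in a few lines of Python; the chi-square values below are illustrative stand-ins, not from the slide's run:

```python
# Posterior predictive p-value: the proportion of MCMC iterations in
# which the replicated chi-square exceeds the observed chi-square.
chi_obs = [51.2, 49.8, 55.1, 60.3, 48.5]   # observed-data chi-squares
chi_rep = [12.4, 15.1, 11.8, 14.9, 13.3]   # replicated-data chi-squares
ppp = sum(rep > obs for obs, rep in zip(chi_obs, chi_rep)) / len(chi_obs)
```

Here every replicated value is smaller than its observed counterpart, so the posterior predictive p-value is 0, signalling misfit, as on the slide two pages ahead.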
Model Fit 5
Model Fit 6
Stork Urban Babies
MODEL FIT INFORMATION
Number of Free Parameters 6
Bayesian Posterior Predictive Checking using Chi-Square
95% Confidence Interval for the Difference Between
the Observed and the Replicated Chi-Square Values
48.046 71.430
Posterior Predictive P-Value 0.000
Posterior predictive p-values around .50 indicate a model that
for all practical purposes is well fitting. Note that this approach
provides a rough model check and not a classical evaluation of
a hypothesis using a p-value.
References Model Fit
This model fit test was proposed by:
Scheines, R., Hoijtink, H., and Boomsma, A. (1999). Bayesian Estimation and Testing
of Structural Equation Models. Psychometrika, 64, 37-52.
Who based it on the work by:
Gelman, A., Meng, X-L, and Stern, H. (1996). Posterior predictive assessment of model
fitness via realized discrepancies. Statistica Sinica, 6, 733-807.
The documentation provided by Mplus is:
Asparouhov, T. and Muthen, B. (2010). Bayesian analysis in Mplus: Technical Implementation.
Model Selection 1 Introduction
What is a model?
Stork
Urban
Babies
Stork
Urban
Babies
Stork
Urban
Babies
Model Selection 2 Introduction
What is a model?
[Path diagram: a factor model with a general factor IQ and arithmetic (A) and language (L) indicators]
Model Selection 3 Introduction
What is a model?
Stork Babies
Stork Babies
Babies = a + b stork + error
MODEL PRIORS:
a ~ N(4,1)
b ~ N(1,1)
MODEL PRIORS:
a ~ N(4,1)
b ~ N(4,1)
Lecture 4: Model Selection Using the Bayes Factor, BIC and DIC
What is the Goal of Model Selection?
Model Selection 4 Introduction
What is the goal of model selection?
To select the best model from the models that are under consideration.
What is the best model?
There are multiple answers to this question. Later in this lecture we will introduce
two options:
The model that has the smallest distance to the true model (DIC)
The model that maximizes the probability of the data (Bayes factor and BIC)
But all answers involve an evaluation of the misfit and complexity of each model.
Model Selection 5 Introduction
What if the models are all wrong?
What if the true model is not in the set of models under consideration?
All models are wrong but some are useful
Should the null-hypothesis be among the models under consideration?
Should the alternative hypothesis be among the models under consideration?
It can serve as a fail-safe for the models under consideration. A model with
restrictions is only a good model if it is better than the corresponding model
without restrictions.
Model Selection 6 Introduction
Why is model selection consistent with the empirical cycle?
Observation (exploratory research!!)
Induction: from observations to a theory
Deduction: deriving testable consequences from
the theory, that is, models or hypotheses
Testing: confrontation of models or hypotheses
with empirical data
Model Selection 7 Introduction
Why is Bayesian inference consistent with the empirical cycle?
Why is Bayesian inference consistent with the empirical cycle?
Observation (exploratory research!!)
Induction: from observations to a theory
Deduction: deriving testable consequences from
the theory, that is, models or hypotheses
Testing: confrontation of models or hypotheses
with empirical data
Prior knowledge and
prior thinking
Plausible models, probably
not the true model
Select the best model =
the current state of knowledge
Remember: the earth is flat; the earth is round; the earth is shaped somewhat like an American football. This too is sequential theory updating, using new data as they become available.
Lecture 4: Model Selection Using the Bayes Factor, BIC and DIC
Information Criteria
Model Selection 1 Information Criteria
IC = misfit + complexity
The smaller the value of IC the better the model at hand. Because:
We like well-fitting models
We like parsimonious, that is, specific and not overly complex models, because we can derive good predictions from them
misfit is determined by the posterior distribution
of the model parameters
complexity is a function of the number of parameters in model
and the amount of information in the prior distribution
To illustrate the main features, a number of examples will be given
Model Selection 2 Information Criteria
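The decomposition IC = misfit + complexity can be made concrete with a small sketch. The helper functions below are illustrative Python, not Mplus code; the DIC form used (misfit at the posterior mean plus twice pD) is one common formulation and matches the numbers on the stork slides that follow.

```python
import math

# IC = misfit + complexity; for both criteria, smaller is better.

def bic(misfit, n_params, n_obs):
    # complexity = number of parameters times the log of the sample size
    return misfit + n_params * math.log(n_obs)

def dic(misfit, p_d):
    # misfit evaluated at the posterior mean of the parameters;
    # pD is the estimated (effective) number of parameters
    return misfit + 2 * p_d

# Example with the slope-free stork model from the correlation = 0 slide:
# dic(268.45, 3.11) reproduces the reported DIC of 274.67.
```

Because the no-slope model has a smaller complexity, it can win even when its misfit is slightly larger, which is exactly what the correlation = 0 slide shows.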
[Scatter plot of y against x: what is the y-value at three marked x-positions?]
Model Selection 3 Information Criteria
[Scatter plot of y against x, with a candidate model for predicting y at three marked x-positions]
What is the fit of this model?
What is the complexity of this model?
Model Selection 4 Information Criteria
[Scatter plot of y against x, with another candidate model for predicting y at three marked x-positions]
What is the fit of this model?
What is the complexity of this model?
Model Selection 5 Information Criteria
Stork cannot Predict Babies
Stork → Babies, population correlation = 0, N = 100
Competing models: Stork → Babies (slope free) versus Stork, Babies (no slope)
Slope free: DIC = 274.67 (misfit = 268.45, par = 3.11); BIC = 282.30 (misfit = 268.38, par = 3.00)
No slope: DIC = 272.23 (misfit = 268.65, par = 1.89); BIC = 277.61 (misfit = 268.39, par = 2.00)
Model Selection 6 Information Criteria
Stork can Predict Babies
Stork → Babies, population correlation = .6, N = 100
Competing models: Stork → Babies (slope free) versus Stork, Babies (no slope)
Slope free: DIC = 229.54 (misfit = 223.32, par = 3.11); BIC = 237.07 (misfit = 223.25, par = 3.00)
No slope: DIC = 273.48 (misfit = 269.70, par = 1.89); BIC = 278.86 (misfit = 269.65, par = 2.00)
Model Selection 7 Information Criteria
DIC and BIC cannot Evaluate Models that Differ in the Prior
TITLE: Illustrate misfit and complexity;
MONTECARLO:            ! specification of the simulation study
NAMES ARE y x;
NOBSERVATIONS = 10000;
NREPS = 1;
SEED = 123;
MODEL POPULATION:      ! specification of the simulation model
y ON x * .6;
[y * 0];
y * .64;
[x * 0];
x * 1;
analysis:
estimator = bayes;
MODEL PRIORS:
a ~ N(.6,.01);
MODEL: y ON x (a);
OUTPUT: TECH9;
The MONTECARLO and MODEL POPULATION commands simulate a data matrix; the remaining commands analyse the simulated data matrix.
Why is b in this setup the correlation?
y = a + b x + error, with error ~ N(0, s2)
var(y) = b**2 var(x) + s2 = .6**2 * 1 + .64 = 1.0
Because var(x) = var(y) = 1, the unstandardized slope b equals the standardized slope, that is, the correlation.
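A quick numerical check of this claim (a plain Python simulation, not part of the Mplus setup; the sample statistics will vary slightly around their population values):

```python
import random

# Simulate y = 0.6*x + e with e ~ N(0, .64) and x ~ N(0, 1), so that
# var(x) = var(y) = 1 and the slope should equal the correlation.
random.seed(123)
n = 100_000
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.6 * xi + random.gauss(0, 0.8) for xi in x]  # sd(error) = sqrt(.64) = .8

mx = sum(x) / n
my = sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
syy = sum((yi - my) ** 2 for yi in y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
slope = sxy / sxx
corr = sxy / (sxx * syy) ** 0.5
# slope and corr are both close to .6, and the sample var(y) is close to 1
```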
Model Selection 8 Information Criteria
DIC and BIC cannot Evaluate Models that Differ in the Prior
Stork → Babies, population correlation = .6; three priors for the slope b:
(1) b ~ N(.6,.01)   (2) b ~ N(0,1000000)   (3) b ~ N(0,.01)

N = 10000
(1): DIC = 24060.54 (par = 2.98); BIC = 24082.21 (par = 3.00)
(2): DIC = 24060.33 (par = 2.99); BIC = 24081.98 (par = 3.00)
(3): DIC = 24060.35 (par = 3.00); BIC = 24081.98 (par = 3.00)

N = 500
(1): DIC = 1198.10 (par = 2.88); BIC = 1210.95 (par = 3.00)
(2): DIC = 1194.66 (par = 2.91); BIC = 1207.48 (par = 3.00)
(3): DIC = 1194.90 (par = 3.03); BIC = 1207.47 (par = 3.00)
Model Selection 9 Information Criteria
Summary:
Complexity and (mis)fit.
Complexity is not adequate for models that differ in the prior, but the Bayes factor can deal with this situation. One example will be given during the last day of this course.
DIC or BIC? That depends on whether missing values are present, and on the error rates obtained using DIC and BIC.
Lecture 4: Model Selection Using the Bayes Factor, BIC and DIC
Error Rates
Model Selection 1 Error Rates
Stork → Babies, with regression coefficient b
M1: b = 0   DIC = 273   BIC = 278
M2: b ≠ 0   DIC = 229   BIC = 237
deltaDIC = 44, deltaBIC = 41
The conclusion is that M2 is a better model than M1.
But how certain are we about this?
What are the probabilities of making an incorrect decision?
Populations: M1: b = 0 and M2: b ≠ 0
Model Selection 2 Error Rates - Frequency Evaluations
[Diagram: many data matrices sampled from the populations under M1 (b = 0) and M2 (b ≠ 0); for each matrix the deltaDIC or deltaBIC is computed]
Model Selection 3 Error Rates Frequency Evaluations
[Histograms of deltaDIC over 1000 replications (recall the observed deltaDIC = 44 in favour of M2): under correlation = 0, N = 100, 18% of the deltaDIC values are > 0; under correlation = .3, N = 100, 5% are < 0]
Model Selection 4 Error Rates Frequency Evaluations
[Histograms of deltaBIC over 1000 replications (recall the observed deltaBIC = 41 in favour of M2): under correlation = 0, N = 100, 3% of the deltaBIC values are > 0; under correlation = .3, N = 100, 19% are < 0]
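The frequency evaluation above can be imitated outside Mplus. The sketch below uses ordinary maximum-likelihood regression fits as a stand-in for the Bayesian estimation used in the course, so the percentages only roughly match the slides; all names and seeds are illustrative.

```python
import math
import random

def bic_normal(y, fitted, n_params):
    # ML misfit (-2 log-likelihood) of a normal model plus the BIC penalty
    n = len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    misfit = n * (math.log(2 * math.pi * rss / n) + 1)
    return misfit + n_params * math.log(n)

def delta_bic(rho, n=100):
    # One replication: BIC(M1: b = 0) - BIC(M2: b free); > 0 favours M2
    x = [random.gauss(0, 1) for _ in range(n)]
    y = [rho * xi + random.gauss(0, math.sqrt(1 - rho * rho)) for xi in x]
    ybar = sum(y) / n
    bic_m1 = bic_normal(y, [ybar] * n, 2)              # mean and variance
    xbar = sum(x) / n
    b = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
         / sum((xi - xbar) ** 2 for xi in x))
    a = ybar - b * xbar
    bic_m2 = bic_normal(y, [a + b * xi for xi in x], 3)  # plus the slope
    return bic_m1 - bic_m2

random.seed(123)
reps = 1000
# Error rate when M1 is true: deltaBIC wrongly favours M2
err_m1 = sum(delta_bic(0.0) > 0 for _ in range(reps)) / reps
# Error rate when M2 is true (correlation .3): deltaBIC wrongly favours M1
err_m2 = sum(delta_bic(0.3) < 0 for _ in range(reps)) / reps
```

As on the slides, the error rate is small when M1 is true (the BIC penalty protects the restricted model) and considerably larger when M2 is true with a modest correlation.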
Model Selection 5 Error Rates
A Simple Alternative for Frequency Evaluations
TITLE: Error Rates;
MONTECARLO:
NAMES ARE y x;
NOBSERVATIONS = 100;
NREPS = 1000;
SEED = 123;
RESULTS = PopH0AnH1.txt;
MODEL POPULATION:
y ON x * .3;   !! y ON x * 0;
[y * 0];
y * .91;       !! y * 1;
[x * 0];
x * 1;
analysis:
estimator = bayes;
fbiter = 10000;
MODEL: y ON x; !! y ON x @ 0;
OUTPUT: TECH9;
[Histograms of deltaDIC and deltaBIC over 1000 replications, for correlation = 0 and correlation = .3, N = 100]
Example values from single replications:
correlation = .3, N = 100: deltaDIC = 285.38 - 277.08 = 8.30; deltaBIC = 290.66 - 284.97 = 5.69
correlation = 0, N = 100: deltaDIC = 285.48 - 286.51 = -1.03; deltaBIC = 290.75 - 294.40 = -3.65
Model Selection 5 Error Rates
Summary:
How to determine the populations from which to simulate data? Keep power analysis in the back of your mind; it is closely related.
Mplus does not give the error rates. However, in combination with SPSS, error rates can be computed. In Exercise 7 from the lab meeting you have the opportunity to compute error rates in the context of multiple regression. Mplus gives a very rough alternative for error rates.
The error rates discussed here are unconditional: what is the probability of erroneous decisions if data matrices come from M1 or M2?
Very interesting and very Bayesian are conditional error rates: what is the probability that M1 and M2 are true if deltaBIC is equal to 2.45 for the observed data? However, these probabilities are beyond the scope of this workshop.
References Model Selection
An introduction to model selection can be found in
Burnham, K.P. and Anderson, D.R. (2002). Model Selection and Multi-Model Inference. New York: Springer.
The DIC was introduced by
Spiegelhalter, D.J., Best, N.G., Carlin, B.P., and van der Linde, A. (2002). Bayesian Measures of Model Complexity and Fit. Journal of the Royal Statistical Society, Series B, 64, 583-639.
The BIC is elaborated on in
Kass, R.E. and Raftery, A.E. (1995). Bayes Factors. Journal of the American Statistical Association, 90, 773-795.
A comparison and overview can be found in
Hamaker, E.L., van Hattum, P., Kuiper, R., and Hoijtink, H. (2010). Model selection based on information criteria in multilevel modelling. In J. Hox and K. Roberts (Eds.), Handbook of Advanced Multilevel Modelling. London: Taylor and Francis.
Lecture 5: An Application of Model Selection
An Application of Model Selection 1
Introduction of the Twin data
and
Analysis of the first model
An Application of Model Selection 2
title: The Twin Data File;
data: file = twins.txt;
variable:
names = ID sex zygosity mothed fathed income eng1 eng2
math1 math2 socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
usev = mothed fathed eng1 eng2
math1 math2 socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
missing = all(999);
model: fac by eng1 eng2 math1 math2 socsci1 socsci2 natsci1
natsci2 vocab1 vocab2;
fac on mothed fathed;
analysis:
estimator = bayes;
processors = 2;
fbiter = 10000;
point = median;
output: standardized(stdyx) tech1 tech3 tech8 cinterval(hpd);
plot: type = plot1 plot2 plot3;
An Application of Model Selection 3
Model: 1 Factor and Education
[Path diagram: one factor F, measured by M1 E1 S1 N1 V1 M2 E2 S2 N2 V2, regressed on M-ED and F-ED]
An Application of Model Selection 4
*** WARNING
Data set contains cases with missing on x-variables. These cases were not included in the analysis.
Number of cases with missing on x-variables: 26
1 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS
For model comparison all analyses must be based on the same number of persons. Therefore you have to deal with the missing data if Mplus excludes persons from the analysis, as it does in this example.
If there are relatively few missing values, as here, a quick solution is a single imputation using a sensible imputation model.
If there are many missing values you have to resort to multiple imputation and DIC4. However, that is beyond the context of this course, and it is also an area of statistical science that is under development.
An Application of Model Selection 5
title: Single Imputation of the Twin Data File;
data: FILE = twins.txt;
variable:
names = ID sex zygosity mothed fathed income eng1 eng2
math1 math2 socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
usev = mothed fathed income eng1 eng2
math1 math2 socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
auxiliary = ID sex zygosity;
missing = all(999);
data imputation:
impute = mothed fathed income eng1 eng2
math1 math2 socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
ndatasets = 1;
thin = 1000;
save = twinimp*.dat;
analysis: estimator = bayes;
fbiter = 10000;
processors = 2;
An Application of Model Selection 6
model: mothed with fathed income eng1 eng2
math1 math2 socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
fathed with income eng1 eng2
math1 math2 socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
income with eng1 eng2
math1 math2 socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
eng1 with eng2 math1 math2 socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
eng2 with math1 math2 socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
math1 with math2 socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
math2 with socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
socsci1 with socsci2 natsci1 natsci2 vocab1 vocab2;
socsci2 with natsci1 natsci2 vocab1 vocab2;
natsci1 with natsci2 vocab1 vocab2;
natsci2 with vocab1 vocab2;
vocab1 with vocab2;
output: tech8;
An Application of Model Selection 7
Analyse the first model using the single imputed data set
An Application of Model Selection 8
title: The Twin Data File;
data: file = twinimp1.dat;
variable:
names = mothed fathed income eng1 eng2 math1 math2
socsci1 socsci2 natsci1 natsci2 vocab1 vocab2 ID sex zygosity;
usev = mothed fathed eng1 eng2
math1 math2 socsci1 socsci2 natsci1 natsci2 vocab1 vocab2;
missing = all(999);
model: fac by eng1 eng2 math1 math2 socsci1 socsci2 natsci1
natsci2 vocab1 vocab2;
fac on mothed fathed;
analysis:
estimator = bayes;
processors = 2;
fbiter = 10000;
point = median;
output: standardized(stdyx) tech1 tech3 tech8 cinterval(hpd);
plot: type = plot1 plot2 plot3;
An Application of Model Selection 9
In themselves these numbers have no meaning; they can only be compared to the same numbers computed for one or more competing models.
Model: 1 Factor and Education
Information Criterion
Deviance (DIC) 46237.298
Estimated Number of Parameters (pD) 31.861
Bayesian (BIC) 46388.873
An Application of Model Selection 10
Model: 2 Factor and Education
[Path diagram: factor F1 measured by M1 E1 S1 N1 V1, factor F2 by M2 E2 S2 N2 V2, both regressed on M-ED and F-ED]
An Application of Model Selection 11
Model: 1 Factor and Income
[Path diagram: one factor F, measured by M1 E1 S1 N1 V1 M2 E2 S2 N2 V2, regressed on Income]
An Application of Model Selection 12
Model: 2 Factor and Income
[Path diagram: factor F1 measured by M1 E1 S1 N1 V1, factor F2 by M2 E2 S2 N2 V2, both regressed on Income]
An Application of Model Selection 13
Model                     DIC         pD        BIC
1 Factor and Education    46237.298   31.861    46388.873
2 Factor and Education    46008.581   34.841    46174.343
1 Factor and Income       46263.315   30.940    46410.004
2 Factor and Income       46031.495   32.818    46187.846
(DIC = Deviance, pD = Estimated Number of Parameters, BIC = Bayesian Information Criterion)
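Choosing among the four candidates then amounts to taking the minimum of each criterion. A small Python sketch with the values reported above (assuming the reading of the Mplus output given here):

```python
# DIC and BIC for the four candidate models, as read from the Mplus
# output reproduced above; for both criteria, smaller is better.
results = {
    "1 Factor and Education": {"DIC": 46237.298, "BIC": 46388.873},
    "2 Factor and Education": {"DIC": 46008.581, "BIC": 46174.343},
    "1 Factor and Income": {"DIC": 46263.315, "BIC": 46410.004},
    "2 Factor and Income": {"DIC": 46031.495, "BIC": 46187.846},
}
best_by_dic = min(results, key=lambda m: results[m]["DIC"])
best_by_bic = min(results, key=lambda m: results[m]["BIC"])
# Both criteria point to the 2-factor model with education
```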
An Application of Model Selection 14
Are the differences in BIC and DIC convincing?
Should we determine the error rates?
Should we determine the conditional error rates?
An Application of Model Selection 15
Estimate S.D. P-Value Lower 2.5% Upper 2.5%
FAC1 BY
ENG1 0.765 0.016 0.000 0.732 0.796
MATH1 0.691 0.020 0.000 0.651 0.728
SOCSCI1 0.862 0.011 0.000 0.840 0.883
NATSCI1 0.770 0.016 0.000 0.738 0.801
VOCAB1 0.850 0.012 0.000 0.827 0.873
FAC2 BY
ENG2 0.748 0.017 0.000 0.713 0.780
MATH2 0.739 0.018 0.000 0.703 0.772
SOCSCI2 0.868 0.011 0.000 0.847 0.888
NATSCI2 0.762 0.016 0.000 0.729 0.793
VOCAB2 0.862 0.011 0.000 0.839 0.883
FAC1 ON
MOTHED 0.098 0.042 0.010 0.016 0.180
FATHED 0.236 0.041 0.000 0.154 0.316
FAC2 ON
MOTHED 0.088 0.042 0.018 0.006 0.170
FATHED 0.256 0.041 0.000 0.177 0.336
FAC2 WITH
FAC1 0.870 0.013 0.000 0.843 0.895
An Application of Model Selection 16
Posterior One-Tailed 95% C.I.
Estimate S.D. P-Value Lower 2.5% Upper 2.5%
Intercepts
ENG1 3.423 0.140 0.000 3.149 3.698
ENG2 3.733 0.145 0.000 3.445 4.015
MATH1 2.731 0.121 0.000 2.484 2.958
MATH2 2.785 0.126 0.000 2.537 3.033
SOCSCI1 3.450 0.148 0.000 3.151 3.732
SOCSCI2 3.502 0.149 0.000 3.201 3.786
Residual Variances
ENG1 0.415 0.025 0.000 0.367 0.464
ENG2 0.441 0.026 0.000 0.392 0.492
MATH1 0.523 0.027 0.000 0.470 0.577
MATH2 0.455 0.026 0.000 0.405 0.507
SOCSCI1 0.256 0.019 0.000 0.220 0.295
SOCSCI2 0.246 0.018 0.000 0.211 0.283
FAC1 0.907 0.020 0.000 0.868 0.944
FAC2 0.900 0.020 0.000 0.858 0.937
An Application of Model Selection 17
R-SQUARE
Posterior One-Tailed 95% C.I.
Variable Estimate S.D. P-Value Lower 2.5% Upper 2.5%
ENG1 0.585 0.025 0.000 0.536 0.633
ENG2 0.559 0.026 0.000 0.508 0.608
MATH1 0.477 0.027 0.000 0.423 0.530
MATH2 0.545 0.026 0.000 0.493 0.595
SOCSCI1 0.744 0.019 0.000 0.705 0.780
SOCSCI2 0.754 0.018 0.000 0.717 0.789
NATSCI1 0.592 0.025 0.000 0.544 0.640
NATSCI2 0.580 0.025 0.000 0.531 0.629
VOCAB1 0.723 0.020 0.000 0.682 0.761
VOCAB2 0.742 0.019 0.000 0.703 0.778
Posterior One-Tailed 95% C.I.
Variable Estimate S.D. P-Value Lower 2.5% Upper 2.5%
FAC1 0.093 0.020 0.000 0.056 0.132
FAC2 0.100 0.020 0.000 0.063 0.142
An Application of Model Selection 18
And now the empirical cycle has to be restarted!
References An Application of Model Selection
Loehlin, J.C. and Nichols, R.C. (1976). Genes, Environment and Personality.
Austin TX: University of Texas Press.
Lecture 6: Model Selection in the Presence of Missing Data
Model Selection and Missing Data 1
ID Stork Urban Babies
... ... ... ...
20 3 7 13
21 1 4 11
22 999 4 9
23 3 6 11
24 4 6 9
25 8 7 16
26 11 999 999
27 5 3 7
28 5 5 8
29 999 6 14
30 6 6 10
31 7 5 999
32 8 8 10
33 999 999 8
34 2 2 1
35 4 4 8
... ... ... ...
Model Selection and Missing Data 2
Situation 1: The data are MAR when the statistical model is equal to the imputation model
In Mplus, both the misfit and the complexity of the DIC are computed using only the observed data, with parameter values sampled and estimated using the statistical model to impute the missing values.
This is a valid procedure that can be used without hesitation.
DIC = misfit + complexity = misfit + estimated number of parameters
Model Selection and Missing Data 3
BIC = misfit + complexity = misfit + P x log(N), with P the number of parameters
In Mplus the misfit of the BIC is computed using only the observed data, with parameter values sampled and estimated using the statistical model to impute the missing values.
The complexity is the number of parameters in the statistical model multiplied by the log of the number of persons. As yet it is unknown how N should be determined in the presence of missing data. Mplus uses the sample size, but this is an ad hoc and unmotivated choice.
Currently it is not advised to use the BIC in the presence of missing data.
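How much the choice of N matters can be illustrated with hypothetical numbers: the misfit, parameter count, and sample sizes below are made up for illustration, and only the penalty term changes between the two choices.

```python
import math

# BIC = misfit + P * log(N). With, say, 26 of 1300 persons having missing
# values, should N be the full sample (what Mplus uses) or the complete
# cases? All numbers here are hypothetical.
misfit, n_params = 46200.0, 32
bic_total = misfit + n_params * math.log(1300)     # N = full sample
bic_complete = misfit + n_params * math.log(1274)  # N = complete cases
diff = bic_total - bic_complete
# The two choices of N shift the BIC, so close model comparisons can shift
```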
Model Selection and Missing Data 4
Situation 2: The statistical model is consistent with the imputation model, and, given the imputation model, the missing values are MAR
Using a three step procedure Mplus can be used to compute the DIC accounting
for the fact that some of the data are missing:
1. Multiply impute the data using the imputation model.
2. For each imputed data matrix compute the DIC using Mplus
3. Average the DICs obtained for the M imputed data matrices
The result is DIC4 as discussed by Celeux et al. (2006). This is not the definitive answer to the computation of the DIC in the presence of missing data, but at least there is some support for this approach in the scientific literature. One is well advised to use the MONTECARLO approach from Mplus to evaluate, in each new situation, how well DIC4 performs. It is beyond the scope of this course to show how this can be done.
Note that using MplusAutomation this can be implemented relatively easily (as opposed to doing it manually). However, this is also beyond the scope of this course.
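The three-step DIC4 computation reduces to a simple average once the per-data-set DICs are available. A sketch with hypothetical DIC values standing in for the Mplus output of step 2:

```python
# Hypothetical DIC values, one per imputed data set, standing in for the
# DICs Mplus would report in step 2 of the procedure above
dics = [46210.4, 46198.7, 46205.1, 46221.9, 46190.3]

# Step 3: average over the M imputed data sets to obtain DIC4
dic4 = sum(dics) / len(dics)
```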
References Model Selection and Missing Data
A paper about the computation of DIC in the presence of missing data
Celeux, G., Forbes, F., Robert, C.P., and Titterington, D.M. (2006). Deviance Information
Criteria for Missing Data Models. Bayesian Analysis, 1, 651-674.
A paper about the difference between the imputation and analysis model in the
context of missing data
Kuiper, R.M. and Hoijtink, H. (2011). How to Handle Missing Data for Predictor Selection in Regression Models Using the AIC. Statistica Neerlandica, 65, 489-506.
MplusAutomation is developed by Michael Hallquist. If you google for CRAN MPLUSAUTOMATION you will find the website from which the R package and documentation can be downloaded.