Mplus for Windows: An Introduction and Overview

Alan C. Acock

Department of HDFS
Oregon State University

7/2009

Intro to Mplus—Alan C. Acock 1

Contents

Section 1: Using Mplus
  1.1 Launching Mplus
  1.2 Input and Output Windows
  1.3 Mplus Command Structure
  1.4 Selected Defaults
  1.5 Commands
Section 2: Exploratory Factor Analysis
  2.1 EFA with Continuous Variables
  2.2 Comparing Two Solutions
  2.3 EFA with Categorical Outcomes
  2.4 Selected Results
  2.5 Comparing Two Solutions
  2.6 Comparison: Categorical & Continuous
Section 3: Confirmatory Factor Analysis
  3.1 CFA with Continuous Variables
  3.2 Output and Interpretation
    3.2.1 Missing value summary
    3.2.2 Covariances and correlations
    3.2.3 Model Fit
    3.2.4 Model result
    3.2.5 Residuals
    3.2.6 Modification indices
Section 4: EFA as an Alternative to CFA
Section 5: Equality Constraints (Longitudinal CFA)
  5.1 Programs for testing equality constraints
  5.2 Selected output
Section 6: Path Analysis
  6.1 Model and programs
  6.2 Indirect effects
Section 7: Putting it Together: CFA & SEM
  7.1 Program
  7.2 Output
  7.3 Interpretation and modification indices
Section 8: Putting it Together: EFA & SEM
  8.1 Program & model
  8.2 Output
Section 9: Summary & Resource


Section 1: Using Mplus

1.1 Launching Mplus

1.2 Input and Output Windows

The window shown above is the input window. You write Mplus programs in this window to read the data to be analyzed and to specify your model of interest. You then save your Mplus program and select Run Mplus from the Mplus menu to submit your program to the Mplus engine for processing.

► File

► Open

Open ex1.inp. This is located at c:\Mplus Examples\ex1.inp.


We will use files from the Mplus manual for many of our examples. These typically involve simulated data. Sometimes we will assign hypothetical variable names to make these somewhat realistic. The manual itself does not provide a substantial amount of explanation of the examples and the specific output, so we hope the discussion of them here will be useful at a later time when you are trying to read the manuals.

Here is one screen of the data in ex1.dat (The ANALYSIS: command here is not needed in the current version of Mplus)

We have labeled missing values with -9. It is easiest to pick one value that will work for all variables; it can be any number or a dot.

Notice we have one observation, case 13, that has a missing value on all variables.

The data happens to be in a fixed format. It could instead be comma delimited (a csv file from Excel) or in free format. Other formats are possible, but they are more complicated.

Here is the confirmatory factor analysis model we are estimating


We will explain the program in a moment, but for now we will just run it to see how the interface works.

► Mplus

► Run Mplus

Or, you can click the Run icon.

Once Mplus has finished processing your command program, it opens an output window.

The output window first displays your Mplus program. Below the Mplus program are the Mplus model results. If there is an error in your Mplus program, or you want to modify the program in any way (e.g., to fit a different model to the data), you must return to the input window. There you can modify the previous commands, save the modified command file, and run Mplus once again to obtain new output.

1.3 Mplus Command Structure

After you have launched Mplus, you may build a command file. There are nine sets of Mplus commands (usually only a few of these are used, but some have numerous subcommands):

1. TITLE: (optional unless you want to know what the file is intended to do)
2. DATA: (required)
3. VARIABLE: (required)
4. DEFINE: (some data transformations are available)
5. SAVEDATA: (used for specialized applications)
6. ANALYSIS: (for special analyses such as EFA)
7. MODEL: (a series of equations)
8. OUTPUT: (many options are available)
9. MONTECARLO: (used for simulations, power analysis)

Rules:

1. All commands (Title, Data, etc.) must begin on a new line.
2. All command names must be followed by a colon.
3. For example, Title: Once you enter the colon, the key word becomes blue.
4. Semicolons separate command options, similar to SAS.
5. The records in the input setup must be no longer than 80 columns.
6. They can contain upper and/or lower case letters and tabs.
7. Only variable names are case sensitive. (Y1 and y1 are different variables)

1.4 Selected Defaults

The current version of Mplus assumes that you either have no missing values or are using full information maximum likelihood estimation and assuming missing values are missing at random (MAR)

Parameters such as loadings can be fixed.
o Many loadings are fixed at 0.0 in the CFA models because the item should not load on the factor.
o There is no path from F1 to y4 in our figure.

Fixed parameters can be “freed,” meaning you will estimate them.
o We could add a path from F1 to y4, or
o Let E1 be correlated with E4.

Fixed parameters are required to stay at a specified value, such as 0.0 or 1.0.

All free parameters are put into a vector, and iterations change the values of these free parameters until the model’s fit is optimal.

Unless we tell it otherwise, Mplus will fix the first indicator’s loading at 1.0 as the reference indicator (except for EFA).

o For example, F1 → y1 and F2 → y4 have fixed loadings of 1.0 by default.
o One way to change the reference indicator is to reorder the variables, e.g., F1 by y2 y1 y3 makes y2 the reference indicator.
o It is good to pick a strong indicator as the reference indicator, because you do not get a significance test for the reference indicator.

1.5 Commands

The TITLE command allows you to specify a title that Mplus will print on each page of the output file.

This can go on and on for many lines and usually should.

Everything is a Title until a command name appears at the start of a new line.

I like to put the file name as the first line of a title.

The DATA command specifies where Mplus will locate the data, the format of the data, and the names of variables. At present, Mplus will read the following file formats:

tab-delimited text, space-delimited text, and comma-delimited text. The input data file may contain records in free field format or fixed format.

If you are using data stored in another form (e.g., Stata, SAS, SPSS, or Excel), you will need to convert it to one of the formats with which Mplus can work before you read it into Mplus.

SAS and SPSS require you to write out an ASCII (plain text) file.

If you have the data in Stata you can use stata2mplus to set things up for you. You can obtain it using findit stata2mplus and install the program.

Here is the Stata session:

stata2mplus using "I:\flash\HDFS630\mplus\classnsfh", replace

This creates two files: classnsfh.inp, which will run a basic analysis in Mplus, and classnsfh.dat, a comma-delimited ASCII file that Mplus can read, with all missing values coded/recoded as -9999.

1,1,1,1,2,1,1,1,1,1,1,1,2,1,1
3,2,2,3,2,3,2,2,2,2,2,2,2,2,2
1,1,1,-9999,1,2,2,1,2,2,2,2,2,1,1
3,3,3,3,3,3,3,3,3,3,3,3,3,3,3
2,2,1,2,1,2,2,1,1,2,1,-9999,1,1,1
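If your data are not in Stata, the same kind of file can be produced by hand. The following is a minimal Python sketch, not part of stata2mplus: it assumes missing responses are stored as None and writes them with the -9999 code used above. The function names and the three-variable records are hypothetical.

```python
import csv
import io

MISSING_CODE = -9999  # same missing-value code that stata2mplus writes


def to_mplus_rows(records, missing=MISSING_CODE):
    """Replace None with the numeric missing-value code in every record."""
    return [[missing if value is None else value for value in row] for row in records]


def write_mplus_data(records, handle, missing=MISSING_CODE):
    """Write records as comma-delimited text with no header row."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerows(to_mplus_rows(records, missing))


# Hypothetical three-variable example; None marks a missing response.
records = [[1, 1, 2], [1, None, 1], [3, 3, 3]]
buffer = io.StringIO()  # stands in for an open .dat file
write_mplus_data(records, buffer)
data_text = buffer.getvalue()
print(data_text)
```

The file has no header row because the variable names are supplied in the Mplus VARIABLE command, not in the data file.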

The DATA command tells Mplus where the data is stored.

If you store the program and the data in the same folder, you don’t need to include the path.

It is recommended to make a separate folder for each project. A long file reference can exceed the character limit per line in Mplus.

Mplus uses all available data by default. If you want to use listwise deletion you must specify this under the DATA command:

Listwise = on;

The VARIABLE command names variables.

These must be in the identical order to the way Stata/SAS/SPSS wrote the data file (a common mistake).

Mplus variable names may not have more than 8 characters. Change variable names to be 8 characters or fewer or you will get error messages.

Variable names are case sensitive and must be used consistently (a common mistake).
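Two of the limits mentioned so far (8 characters per variable name, 80 columns per input record) are easy to check before running a program. This is a rough Python sketch of such a check; the helper names and the example names are hypothetical, and the limits are simply the ones stated in this handout.

```python
MAX_NAME_LENGTH = 8   # variable-name limit stated above
MAX_LINE_LENGTH = 80  # input-record limit stated in Section 1.3


def names_too_long(names, limit=MAX_NAME_LENGTH):
    """Return the variable names that exceed the length limit."""
    return [name for name in names if len(name) > limit]


def lines_too_long(program_text, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) for every input line over the limit."""
    return [
        (number, len(line))
        for number, line in enumerate(program_text.splitlines(), start=1)
        if len(line) > limit
    ]


# Hypothetical variable list with one name that is too long.
bad_names = names_too_long(["y1", "y2", "satisfaction", "income"])

# Hypothetical program whose second line lists 30 names on one record.
program = (
    'DATA: FILE IS "ex1.dat";\n'
    "VARIABLE: NAMES ARE " + " ".join(f"y{i}" for i in range(1, 31)) + ";"
)
long_lines = lines_too_long(program)

print(bad_names)
print(long_lines)
```

A long NAMES ARE list can be shortened with the y1-y30 range notation or split across several lines.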

The ANALYSIS command tells Mplus what type of analysis to perform. Many analysis options are available. Some of these, such as Type = EFA, make additional commands unnecessary.

SECTION 2: Exploratory Factor Analysis

2.1 EFA with Continuous Variables

TITLE: efa1.inp
    This is an example of an exploratory factor analysis
    with continuous factor indicators
DATA: FILE IS "c:\Mplus Examples\efa1.dat";
VARIABLE: NAMES ARE y1-y12;
ANALYSIS: TYPE = EFA 1 4;
    ! ESTIMATOR = ml;
    ! ROTATION = Geomin;
OUTPUT: sampstat;

The Type = EFA 1 4 tells Mplus to perform an exploratory factor analysis.


The 1 and 4 following the EFA specification tell Mplus to generate all possible factor solutions between and including 1 and 4.

The ESTIMATOR = ml option has Mplus use the maximum likelihood estimator to perform the factor analysis.

o This provides a chi-square goodness of fit test that the number of hypothesized factors is sufficient to account for the correlations among the twelve variables in the analysis.

o This has an exclamation mark in front of it which makes it green. Anything green is a comment and is ignored by the program. This subcommand is not necessary because maximum likelihood estimation is the default.

Mplus uses the geomin rotation which is oblique as its default. More traditional rotations such as varimax are available. See help for a listing of options.

We do not need a MODEL: command because the EFA 1 4 takes care of this.

One useful feature of Mplus is its ability to handle non-normal input data. Recall that the default ml estimator assumes that the input data are distributed joint multivariate normal. If you have reason to believe that this assumption has not been met and your sample is reasonably large (e.g., n ≥ 200), you may substitute mlm or mlmv in place of ml on the ESTIMATOR = line.

o The mlm option provides a mean-adjusted chi-square model test statistic, whereas the mlmv option produces a mean and variance adjusted chi-square test of model fit.

o SEM users who are familiar with Bentler's EQS software program should also note that the mlm chi-square test and standard errors are equivalent to those produced by EQS in its ML,ROBUST method.

You may also add the OUTPUT command following the ANALYSIS and MODEL commands.

The OUTPUT command is used to specify optional output. For this example the keyword sampstat tells Mplus to include sample statistics as part of its printed output.

OUTPUT: sampstat ;

You can use Mplus’ Help menu to get a listing of all the options available for each command. You might try this to see what OUTPUT options are available.

Mplus produces the sample correlations, the Root Mean Square Error of Approximation (RMSEA), and the chi-square test of the one, two, three, and four factor models. It also reports standard errors and z-tests for loadings and correlations of factors.

As you can see from the results, shown below, the chi-square test for a one factor solution is statistically significant, so the null hypothesis that a single factor fits the data is rejected; more factors are required to obtain a non-significant chi-square.

The chi-square test is sensitive to sample size (such that large samples often return statistically significant chi-square values) and to non-normality in the input variables.

Mplus also provides the Root Mean Square Error of Approximation (RMSEA) statistic. The RMSEA is not as sensitive to large sample sizes. According to Hu and Bentler (1999), RMSEA values below .06 indicate satisfactory model fit. Kline indicates that a value of .08 is acceptable.

Run the program and interpret the results.

2.2 Comparing two Solutions

You can test whether adding additional factors significantly improves the fit to the data.

Model 1 chi-square (54 degrees of freedom) = 1052.089; p < .001
Model 2 chi-square (43 degrees of freedom) = 723.022; p < .001
Model 3 chi-square (33 degrees of freedom) = 341.268; p < .001
Model 4 chi-square (24 degrees of freedom) = 25.799; p not significant
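The degrees of freedom listed above can be checked by hand. For an exploratory factor model with p observed variables and m factors, the usual formula is df = ((p - m)^2 - (p + m)) / 2. The short Python sketch below applies it to the 12 indicators in this example; the function name is hypothetical.

```python
def efa_degrees_of_freedom(n_variables, n_factors):
    """Degrees of freedom for an EFA with p variables and m factors."""
    p, m = n_variables, n_factors
    # (p - m)^2 - (p + m) is always even, so integer division is exact.
    return ((p - m) ** 2 - (p + m)) // 2


# One- to four-factor solutions for the 12 indicators y1-y12.
dfs = {m: efa_degrees_of_freedom(12, m) for m in range(1, 5)}
print(dfs)
```

The result reproduces the 54, 43, 33, and 24 degrees of freedom reported for Models 1 through 4.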

Is model 4 better than model 3?

Model 3 chi-square (33 degrees of freedom) = 341.268
Model 4 chi-square (24 degrees of freedom) = 25.799
Difference chi-square (9 degrees of freedom) = 315.469; p < .001

This is significant at the .05 level.

With Stata, you can get the probability when this is not in a table, using display 1-chi2(df, chi-square):

. display 1-chi2(9,315.469)
0

This is obviously less than .05.

Often you can’t use tables for chi-square because you have lots of degrees of freedom and tables only show significance levels for relatively few degrees of freedom.
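If Stata is not at hand, the same difference test can be sketched in Python using only the standard library. The chi2_sf helper below is my own illustration (it is not an Mplus or Stata function); it uses the closed-form upper-tail probability for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: a finite sum of Poisson-type terms.
        term = math.exp(-half)
        total = term
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return total
    # Odd df: start from the df = 1 case and add one term per extra 2 df.
    total = math.erfc(math.sqrt(half))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (j - 0.5) / math.gamma(j + 0.5)
    return total


# Model 3 versus Model 4 from the EFA above.
chi2_diff = 341.268 - 25.799
df_diff = 33 - 24
p_value = chi2_sf(chi2_diff, df_diff)
print(round(chi2_diff, 3), df_diff, p_value < 0.001)
```

The p-value is far below .001, which agrees with the 0 that Stata displays.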

Estimate the model and interpret the results.

2.3 EFA with Categorical Outcomes

For the purposes of illustration, suppose that you recode each variable into a replacement variable where, for all twelve variables, values at the median or below are assigned a categorical value of 1.00 and all values above the median are assigned a value of 2.00.

For categorical variables, Mplus automatically recodes the lowest value to zero with subsequent values increasing in units of 1.00.

While the four underlying latent factors remain continuous, the twelve categorical observed variables' response values are now ordered dichotomous categories.

You may use the program that appeared in the initial exploratory factor analysis example, with the following modifications, and the new data file that contains the categorical variables ex4.2.dat, as shown below.

There are two estimators.

WLSM (Weighted Least Squares) is very fast and reasonably good.
o You should use this for initial runs.
o Running this on a server used by many students, it ran in 1 second.
o This is the default.

MLR (Robust Maximum Likelihood) is painfully slow, even for a simple and well behaved example like the one we will estimate.
o Save this until you are almost done.
o Use this when you need to test for the number of factors.
o This took 18 minutes to run.
o Under the ANALYSIS section you need to specify this estimator as shown below.

TITLE: ex4.2.inp
    This is an example of an exploratory
    factor analysis with categorical factor indicators
    It uses weighted least squares estimation
    It computes tetrachoric correlations and does the
    Factor analysis on them. The RMSEA and chi-square
    Values are reported.
DATA: FILE IS ex4.2.dat;
VARIABLE: NAMES ARE u1-u12;
    CATEGORICAL ARE u1-u12;
ANALYSIS: TYPE = EFA 1 4;
    ESTIMATOR = MLR;
    PROCESSORS = 4 ;

You tell Mplus which variables are categorical with the CATEGORICAL subcommand of the VARIABLE command, like this:

CATEGORICAL ARE u1-u12;

You should also change the ESTIMATOR option for the ANALYSIS command.

The default estimator for categorical variables is weighted least squares.
o With wls it took 2 seconds.
o You could use this for preliminary analysis.

I have used MLR, Maximum Likelihood Robust.
o This uses a default 7 integration points and is extremely slow to converge.
o This program took almost 20 minutes for a fairly simple model.
o It makes it possible to compare models using a likelihood ratio test.

2.4 Selected results

Mplus begins with a summary of the distribution of the categorical indicators:

Next we get fit statistics for the 1 factor solution


2.5 Comparing Two Solutions

If you use Weighted Least Squares (WLSM) with categorical data you get an RMSEA to help compare the models, and you can do a chi-square difference test as described at

http://www.statmodel.com/chidiff.shtml

2.6 Comparison of Categorical and Continuous Solutions

One way of evaluating the efficacy of a categorical factor analysis is its ability to reproduce the factors obtained when the data is continuous. The program ex4.1.inp estimates the same factors for the 12 items when they are continuous, and we can compare the results. The loadings that are low on each factor remain low whether we use the continuous variables or the dichotomized variables. The high loadings are all fairly close matches. First, here is the result when the variables are continuous:


GEOMIN ROTATED LOADINGS
                1         2         3         4
            ________  ________  ________  ________
 Y1           0.637     0.008     0.074    -0.021
 Y2           0.808     0.022    -0.005     0.041
 Y3           0.631    -0.042    -0.058    -0.028
 Y4           0.027     0.646    -0.002    -0.018
 Y5          -0.029     0.760    -0.023     0.017
 Y6           0.010     0.674     0.030    -0.012
 Y7          -0.006     0.003     0.734     0.018
 Y8          -0.040     0.002     0.727    -0.016
 Y9           0.049    -0.007     0.707    -0.001
 Y10         -0.037     0.006    -0.010     0.692
 Y11          0.004     0.013     0.001     0.791
 Y12          0.035    -0.036     0.008     0.658

GEOMIN FACTOR CORRELATIONS
                1         2         3         4
            ________  ________  ________  ________
 1            1.000
 2           -0.039     1.000
 3            0.007     0.029     1.000
 4           -0.002    -0.121    -0.028     1.000

And here are the results for the categorical solution:

GEOMIN ROTATED LOADINGS
                1         2         3         4
            ________  ________  ________  ________
 U1           0.628    -0.004     0.098    -0.061
 U2           0.938     0.042    -0.012     0.070
 U3           0.690    -0.073    -0.019    -0.036
 U4           0.072     0.705    -0.120    -0.005
 U5          -0.130     0.805     0.018     0.016
 U6           0.015     0.602     0.089    -0.045
 U7           0.026    -0.014     0.805     0.009
 U8          -0.036    -0.003     0.720    -0.029
 U9           0.017     0.034     0.669     0.025
 U10         -0.066    -0.017     0.057     0.654
 U11         -0.005     0.040    -0.056     0.872
 U12          0.110    -0.069     0.022     0.624

GEOMIN FACTOR CORRELATIONS
                1         2         3         4
            ________  ________  ________  ________
 1            1.000
 2           -0.026     1.000
 3            0.025     0.032     1.000
 4           -0.029    -0.150    -0.059     1.000

Item   F1(con)  F1(cat)  F2(con)  F2(cat)  F3(con)  F3(cat)  F4(con)  F4(cat)
  1     .637     .628
  2     .808     .938
  3     .631     .690
  4                       .646     .705
  5                       .760     .805
  6                       .674     .602
  7                                         .734     .805
  8                                         .727     .720
  9                                         .707     .669
 10                                                           .692     .654
 11                                                           .791     .872
 12                                                           .658     .624

There are several notes worth keeping in mind when you perform exploratory factor analysis with categorical outcome variables.

Although one or more of the observed variables may be categorical, any latent variables in the model are assumed to be continuous.

The analysis specification and interpretation of the output, e.g., loadings & factor correlations, is the same whether one, a subset, or all observed variables are categorical.

Categorical observed variables may be dichotomous or ordered categorical outcomes of more than two levels, but nominal level observed variables with more than two categories may not be used in the analysis as outcome variables using this strategy.

Sample size requirements are somewhat more stringent than for continuous variables; typically you want a minimum of 200 cases (preferably more) to perform any analysis with categorical outcome variables.

Mplus provides standard errors and z-tests for all loadings and correlations.


SECTION 3: Confirmatory Factor Analysis

What if you had an a priori hypothesis that the visual perception (Y1), cubes (Y2), and lozenges (Y3) variables belonged to a single factor whereas the paragraph (Y4), sentence (Y5), and word meaning (Y6) variables belonged to a second factor? The diagram shown below illustrates the model visually.

You can test this hypothesized factor structure using confirmatory factor analysis, as shown in the next section.

The first thing you want to do is look at the correlation matrix:

          Y1       Y2       Y3       Y4       Y5       Y6
 Y1     1.000
 Y2     0.524    1.000
 Y3     0.475    0.533    1.000
 Y4    -0.004   -0.032   -0.007    1.000
 Y5    -0.029   -0.040   -0.048    0.431    1.000
 Y6     0.023   -0.012    0.037    0.369    0.419    1.000


Y1-Y3 are highly correlated with each other, so they might form a factor.
Y4-Y6 are highly correlated with each other, so they might form a factor.
Y1-Y3 are weakly correlated with Y4-Y6, so there is factor separation.
Y2 is slightly more negatively correlated with Y4 than is Y1. Y2 is slightly more negatively correlated with Y5 than is Y1. Y2 is slightly negatively correlated with Y6 and Y1 is slightly positively correlated with it.

Compare this to the following. Y2 and Y1 have a different pattern with Y4-Y6. The single correlation between F1 and F2 could not handle this.

The fit will not be very good.

          Y1       Y2       Y3       Y4       Y5       Y6
 Y1     1.000
 Y2     0.524    1.000
 Y3     0.475    0.533    1.000
 Y4     0.200   -0.100   -0.007    1.000
 Y5    -0.100    0.200   -0.048    0.431    1.000
 Y6     0.300    0.100    0.037    0.369    0.419    1.000

Consider the following correlation matrix:

          Y1       Y2       Y3       Y4       Y5       Y6
 Y1     1.000
 Y2     0.524    1.000
 Y3     0.475    0.533    1.000
 Y4    -0.004   -0.032    0.400    1.000
 Y5    -0.029   -0.040   -0.048    0.431    1.000
 Y6     0.023   -0.012    0.037    0.369    0.419    1.000

Y3 and Y4 are too correlated to be on separate factors.
Factorial confounding will mean that Y1-Y4 load on F1 and Y3-Y6 load on F2. Therefore, Y3 and Y4 are factorially confounded.

3.1 CFA with Continuous Variables

TITLE: ex1.inp
    CFA with continuous factor indicators
    There are Missing values
DATA: FILE IS "ex1.dat" ;
VARIABLE: NAMES ARE y1-y6;
    MISSING ARE all (-9) ;
MODEL: f1 BY y1-y3;
    f2 BY y4-y6;
    ! f1 WITH f2;
OUTPUT: sampstat standardized residual patterns mod(3.84);

When Mplus sees EFA it sets up the relationships in a certain way, but in a CFA, Mplus needs you to provide a MODEL: command to tell it how to set up the relationships that you wish to confirm.

The model is general in the sense that
o You must define what parameters are estimated;
o All other parameters are assumed to be fixed.
o Fixed parameters are either zero or some value you set.

Under VARIABLE we have defined what code is used to represent missing values.

You do not need an ANALYSIS section, since we use the MODEL section to specify the model and are going with the default analysis.

o This assumes full information maximum likelihood.
o To do listwise deletion we would specify this in the DATA command. Open Help; under Data you see that you would enter Listwise = on. Make sure you put it under DATA.

The MODEL command allows you to specify the parameters of your model.

o The first line of the MODEL command shown above defines the first latent factor, F1.

o The BY keyword (an abbreviation for "measured by") is used to define the latent variables;

o The latent variable name appears on the left-hand side of the BY keyword whereas the measured variables appear on the right-hand side of the BY keyword.

o Mplus will fix the loading for the first indicator at 1.0 unless you tell it otherwise. Put the “best” indicator first.

Similarly, the second line of the MODEL: command defines a second latent factor, F2, with three indicators: Y4, Y5, and Y6. The third line of the MODEL: command, which is commented out here, uses the WITH keyword to correlate the F1 latent factor with the F2 latent factor.

BY → Measured by
WITH → Correlated with

We do not need F1 with F2 because that is the default. If we wanted to see how the model did with this correlation fixed at zero we would add the line F1 with F2@0 ;

Finally, the OUTPUT command contains an added keyword, standardized. This option instructs Mplus to output standardized parameter estimate values in addition to the default unstandardized values. Selected output from the analysis appears below.

Why is one loading fixed at 1.0?

The default fixes the unstandardized loading of the first item after BY at 1.0

This has to do with model identification. In exploratory factor analysis the variance of the factor (latent variable) is fixed at 1.0 by the program. Given this, the program estimates the loadings.

With CFA, you need to set a scale for the latent variable because the size of the loadings depends on the size of its variance.

Setting the variance of the latent variable (factor) at 1.0 solves this problem with EFA and is an option with CFA, and you get standardized loadings. But Mplus suggests a more general approach in which you fix one of the loadings of each latent variable (factor) at 1.0.

Why is this more general?

One group might be more variable than another.

We might find that girls not only have higher verbal skills than boys, but that they are either more homogeneous or more heterogeneous in these skills.

An intervention that not only improves the mean outcome, but does so in a way that makes the distribution more homogeneous is preferred.

In some cases we are interested in the variances of the latent variables as an important topic and we could not study that if we fixed the variance at 1.0.

Regardless of which item you pick to fix the loading at 1, the standardized solution will always be the same because that solution rescales the variance of the latent variable to be 1 and the fully standardized solution also rescales the variance of each indicator to be 1.

We should pick the strongest indicator to fix at 1.0.

This makes the results less confusing to readers because the other loadings will generally be less than 1.0.

If you fixed a weak indicator at 1.0, an indicator that was twice as strong would have a loading of 2.0 and that would be confusing to readers.

You do not need to fix the loading at 1; any nonzero number will identify the model equally well.
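The claim that the standardized solution does not depend on the reference indicator can be illustrated with a small Python sketch. All numbers here are hypothetical (a weak reference indicator with a second indicator twice as strong), and the helper names are my own; the model is a single factor with uncorrelated residuals.

```python
import math


def standardized_loadings(loadings, factor_variance, residual_variances):
    """Standardized loadings for one factor with uncorrelated residuals."""
    result = []
    for lam, theta in zip(loadings, residual_variances):
        indicator_variance = lam ** 2 * factor_variance + theta
        result.append(lam * math.sqrt(factor_variance / indicator_variance))
    return result


def change_reference(loadings, factor_variance, new_reference):
    """Rescale so the chosen indicator has a loading of 1.0."""
    scale = loadings[new_reference]
    return [lam / scale for lam in loadings], factor_variance * scale ** 2


# Hypothetical solution: the first (weak) indicator is fixed at 1.0.
loadings = [1.0, 2.0, 0.5]
factor_variance = 0.5
residuals = [1.5, 1.0, 2.0]

# Make the second (strong) indicator the reference instead.
new_loadings, new_variance = change_reference(loadings, factor_variance, 1)

before = standardized_loadings(loadings, factor_variance, residuals)
after = standardized_loadings(new_loadings, new_variance, residuals)
print(new_loadings, new_variance)
print([round(v, 3) for v in before], [round(v, 3) for v in after])
```

The unstandardized loadings and the factor variance change with the reference indicator, but the two sets of standardized loadings are identical.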

3.2 Output and Interpretation

3.2.1 Missing value summary

SUMMARY OF DATA
     Number of patterns             8

SUMMARY OF MISSING DATA PATTERNS

     MISSING DATA PATTERNS

           1  2  3  4  5  6  7  8
 Y1        x  x  x  x  x  x  x
 Y2        x  x  x  x  x  x     x
 Y3        x  x  x  x  x     x  x
 Y4        x  x           x  x  x
 Y5        x  x  x  x     x  x  x
 Y6        x     x        x  x  x

     MISSING DATA PATTERN FREQUENCIES

    Pattern   Frequency     Pattern   Frequency     Pattern   Frequency
          1         473           4           1           7           2
          2          15           5           1           8           3
          3           3           6           1

COVARIANCE COVERAGE OF DATA

Minimum covariance coverage value   0.100

     PROPORTION OF DATA PRESENT

           Covariance Coverage
              Y1        Y2        Y3        Y4        Y5
           ________  ________  ________  ________  ________
 Y1          0.994
 Y2          0.990     0.996
 Y3          0.992     0.994     0.998
 Y4          0.984     0.986     0.988     0.990
 Y5          0.992     0.994     0.996     0.990     0.998
 Y6          0.960     0.962     0.964     0.960     0.966

           Covariance Coverage
              Y6
           ________
 Y6          0.966

3.2.2 Covariances and correlations

           Covariances
              Y1        Y2        Y3        Y4        Y5
           ________  ________  ________  ________  ________
 Y1          1.948
 Y2          1.020     1.968
 Y3          0.930     1.043     1.968
 Y4         -0.020    -0.064    -0.037     2.070
 Y5         -0.076    -0.091    -0.133     0.810     1.664
 Y6          0.021    -0.035     0.053     0.710     0.716

           Covariances
              Y6
           ________
 Y6          1.684

           Correlations
              Y1        Y2        Y3        Y4        Y5
           ________  ________  ________  ________  ________
 Y1          1.000
 Y2          0.521     1.000
 Y3          0.475     0.530     1.000
 Y4         -0.010    -0.032    -0.019     1.000
 Y5         -0.042    -0.051    -0.073     0.437     1.000
 Y6          0.012    -0.019     0.029     0.380     0.428

           Correlations
              Y6
           ________
 Y6          1.000

3.2.3 Model Fit

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Chi-Square Test of Model Fit
          Value                              3.895
          Degrees of Freedom                     8
          P-Value                           0.8665

Chi-Square Test of Model Fit for the Baseline Model
          Value                            589.067
          Degrees of Freedom                    15
          P-Value                           0.0000

CFI/TLI
          CFI                                1.000
          TLI                                1.013

Loglikelihood
          H0 Value                       -4850.279
          H1 Value                       -4848.331

Information Criteria
          Number of Free Parameters             19
          Akaike (AIC)                    9738.558
          Bayesian (BIC)                  9818.597
          Sample-Size Adjusted BIC        9758.290
            (n* = (n + 2) / 24)

RMSEA (Root Mean Square Error Of Approximation)
          Estimate                           0.000
          90 Percent C.I.                    0.000  0.027
          Probability RMSEA <= .05           0.995

SRMR (Standardized Root Mean Square Residual)
          Value                              0.015
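Several of these fit statistics can be recomputed from the other numbers in the output, which helps show what each one measures. The Python sketch below uses the standard textbook formulas; it is not Mplus's own code. The sample size of 499 is an assumption taken from the sum of the missing data pattern frequencies (case 13, which is missing on all variables, is not counted), and RMSEA conventions differ across programs in using n or n - 1.

```python
import math

# Values copied from the output above.
chi2_model, df_model = 3.895, 8
chi2_base, df_base = 589.067, 15
loglik_h0, loglik_h1 = -4850.279, -4848.331
n_free = 19
n_obs = 499  # assumption: sum of the missing data pattern frequencies

# Degrees of freedom: 6 means + 21 variances/covariances, minus free parameters.
n_indicators = 6
n_moments = n_indicators + n_indicators * (n_indicators + 1) // 2
df_check = n_moments - n_free

# The model chi-square is twice the difference between the H1 and H0 loglikelihoods.
chi2_from_loglik = 2 * (loglik_h1 - loglik_h0)

# CFI and TLI compare the model with the baseline (independence) model.
excess_model = max(chi2_model - df_model, 0.0)
excess_base = max(chi2_base - df_base, excess_model)
cfi = 1 - excess_model / excess_base
tli = (chi2_base / df_base - chi2_model / df_model) / (chi2_base / df_base - 1)

# Information criteria penalize the H0 loglikelihood for the free parameters.
aic = -2 * loglik_h0 + 2 * n_free
bic = -2 * loglik_h0 + n_free * math.log(n_obs)

# RMSEA is zero whenever the chi-square is smaller than its degrees of freedom.
rmsea = math.sqrt(excess_model / (df_model * n_obs))

print(df_check, round(chi2_from_loglik, 3), round(cfi, 3), round(tli, 3))
print(round(aic, 3), round(bic, 2), rmsea)
```

The TLI can exceed 1.0, as it does here, when the model chi-square is smaller than its degrees of freedom.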

3.2.4 Model result

We are usually interested in the fully standardized results but the unstandardized results appear first.

MODEL RESULTS--Unstandardized

                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value

 F1       BY
    Y1                 1.000      0.000    999.000    999.000
    Y2                 1.123      0.098     11.430      0.000
    Y3                 1.019      0.088     11.532      0.000

 F2       BY
    Y4                 1.000      0.000    999.000    999.000
    Y5                 1.032      0.129      7.972      0.000
    Y6                 0.869      0.105      8.316      0.000

 F2       WITH
    F1                -0.033      0.053     -0.621      0.534

 Intercepts
    Y1                -0.017      0.063     -0.267      0.790
    Y2                 0.030      0.063      0.478      0.633
    Y3                 0.037      0.063      0.590      0.555
    Y4                -0.022      0.065     -0.336      0.737
    Y5                -0.012      0.058     -0.209      0.835
    Y6                 0.066      0.059      1.120      0.263

 Variances
    F1                 0.912      0.125      7.308      0.000
    F2                 0.786      0.138      5.677      0.000

 Residual Variances
    Y1                 1.041      0.095     10.977      0.000
    Y2                 0.803      0.100      8.044      0.000
    Y3                 1.012      0.095     10.612      0.000
    Y4                 1.287      0.123     10.449      0.000
    Y5                 0.861      0.112      7.664      0.000
    Y6                 1.077      0.098     10.992      0.000

STANDARDIZED MODEL RESULTS

STDYX Standardization

 F1       BY
    Y1                 0.683      0.035     19.573      0.000
    Y2                 0.767      0.034     22.537      0.000
    Y3                 0.695      0.035     20.011      0.000

 F2       BY
    Y4                 0.616      0.046     13.498      0.000
    Y5                 0.702      0.047     14.916      0.000
    Y6                 0.596      0.045     13.112      0.000

 F2       WITH
    F1                -0.039      0.062     -0.622      0.534

 Intercepts
    Y1                -0.012      0.045     -0.267      0.790
    Y2                 0.021      0.045      0.478      0.633
    Y3                 0.026      0.045      0.590      0.555
    Y4                -0.015      0.045     -0.336      0.737
    Y5                -0.009      0.045     -0.209      0.835
    Y6                 0.051      0.045      1.120      0.263

 Variances
    F1                 1.000      0.000    999.000    999.000
    F2                 1.000      0.000    999.000    999.000

 Residual Variances
    Y1                 0.533      0.048     11.174      0.000
    Y2                 0.411      0.052      7.868      0.000
    Y3                 0.517      0.048     10.699      0.000
    Y4                 0.621      0.056     11.057      0.000
    Y5                 0.507      0.066      7.677      0.000
    Y6                 0.645      0.054     11.887      0.000

STDY Standardization (omitted)

R-SQUARE

    Observed                                        Two-Tailed
    Variable        Estimate       S.E.  Est./S.E.    P-Value

    Y1                 0.466      0.049      9.483      0.000
    Y2                 0.582      0.054     10.849      0.000
    Y3                 0.483      0.050      9.700      0.000
    Y4                 0.387      0.056      6.883      0.000
    Y5                 0.496      0.065      7.605      0.000
    Y6                 0.369      0.055      6.747      0.000

Each unstandardized estimate represents the amount of change in the outcome variable as a function of a single unit change in the variable causing it.

Different measures often have different scales, so you will often find it useful to examine the standardized coefficients when you want to compare the relative strength of associations across observed variables that are measured on different scales.

Mplus provides two standardized coefficients. The first, labeled StdYX, standardizes based on latent and observed variables' variances. This standardized coefficient represents the amount of standardized change in an outcome variable per standard deviation unit of a predictor variable.

Finally, the r-square output illustrates the amount of variance accounted for in the indicators.
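To see how the StdYX values relate to the unstandardized estimates, the F1 results can be rescaled by hand. This Python sketch assumes the usual one-factor decomposition, var(y) = loading^2 x factor variance + residual variance; the function name is my own, and the inputs are copied from the unstandardized output above.

```python
import math


def stdyx(loading, factor_variance, residual_variance):
    """Return (standardized loading, standardized residual variance)."""
    indicator_variance = loading ** 2 * factor_variance + residual_variance
    std_loading = loading * math.sqrt(factor_variance / indicator_variance)
    std_residual = residual_variance / indicator_variance
    return std_loading, std_residual


f1_variance, f2_variance = 0.912, 0.786
f1_items = {"Y1": (1.000, 1.041), "Y2": (1.123, 0.803), "Y3": (1.019, 1.012)}

results = {
    name: stdyx(loading, f1_variance, residual)
    for name, (loading, residual) in f1_items.items()
}

# The standardized F2 WITH F1 value is the covariance rescaled to a correlation.
factor_correlation = -0.033 / math.sqrt(f1_variance * f2_variance)

for name, (std_loading, std_residual) in results.items():
    print(name, round(std_loading, 3), round(std_residual, 3))
print(round(factor_correlation, 3))
```

Each standardized residual variance equals one minus the squared standardized loading, which is why a weaker indicator has a larger standardized residual variance.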

3.2.5 Residuals

Mplus output reports a residual for each variance and covariance. To simplify interpretation, it also reports a z-test (normalized residual) for each variance and covariance.

           Normalized Residuals for Covariances/Correlations/Residual Correlations
              Y1        Y2        Y3        Y4        Y5
           ________  ________  ________  ________  ________
 Y1         -0.001
 Y2          0.000     0.000
 Y3          0.007    -0.005     0.000
 Y4          0.333    -0.071     0.159     0.000
 Y5         -0.290    -0.400    -0.955    -0.027    -0.001
 Y6          0.790     0.182     1.180     0.046    -0.006

           Normalized Residuals for Covariances/Correlations/Residual Correlations
              Y6
           ________
 Y6          0.000

Because there are many tests, it would not make sense to use the 1.96 value as a significant failure. Still, we should look for a large z-score as an indicator that our model does not explain some relationship.
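One way to make the "many tests" caution concrete is a Bonferroni-style threshold. This is an assumption added for illustration; it is not something Mplus reports or this handout prescribes. The sketch splits alpha = .05 across the 21 normalized residuals listed above and compares the largest absolute residual with the resulting critical z.

```python
from statistics import NormalDist

# Normalized residuals copied from the output above (lower triangle, Y1..Y6).
normalized_residuals = [
    -0.001,
    0.000, 0.000,
    0.007, -0.005, 0.000,
    0.333, -0.071, 0.159, 0.000,
    -0.290, -0.400, -0.955, -0.027, -0.001,
    0.790, 0.182, 1.180, 0.046, -0.006, 0.000,
]


def bonferroni_critical_z(n_tests: int, alpha: float = 0.05) -> float:
    """Two-tailed critical z when alpha is split evenly over n_tests."""
    return NormalDist().inv_cdf(1 - alpha / (2 * n_tests))


n_tests = len(normalized_residuals)  # 21 variances and covariances
critical_z = bonferroni_critical_z(n_tests)
largest = max(abs(z) for z in normalized_residuals)

print(f"{n_tests} tests, adjusted critical |z| = {critical_z:.2f}, "
      f"largest |z| = {largest:.3f}")
```

With 21 tests the adjusted critical value is a little above 3, and the largest residual here (1.180, for Y6 with Y3) does not come close to it or even to 1.96.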

3.2.6 Modification indices

Finally, Mplus reports modification indices because we specified mod(3.84).
The 3.84 corresponds to the .05 level (the chi-square critical value with one degree of freedom).
Use this with caution, especially on a large sample.
These are parameters we fixed that could improve the fit if they were free. We have no path from F1 to y6, for example.
The M.I. is an estimate of how much chi-square for the model would be reduced if a single parameter is made free, one at a time.
No fixed parameter has a modification index above 3.84, so freeing any one of them would not significantly improve the fit of our model.

MODEL MODIFICATION INDICES

Minimum M.I. value for printing the modification index 3.840

M.I. E.P.C. Std E.P.C. StdYX E.P.C.

No modification indices above the minimum value.
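The link between the 3.84 cutoff and the .05 level can be checked directly. The sketch below is an illustration only and uses the fact that a chi-square variate with one degree of freedom is a squared standard normal, so its upper-tail probability can be written with the complementary error function. The function name is hypothetical.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail p-value of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


p_at_cutoff = chi2_sf_df1(3.84)
print(f"p-value for a modification index of 3.84: {p_at_cutoff:.4f}")
```

A stricter cutoff works the same way: a minimum value of 6.63 in place of 3.84 corresponds to roughly the .01 level.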

As with exploratory factor analysis of continuous outcome variables, you may want to use the mlm or mlmv estimator in lieu of the default ml estimator if your input data are not distributed joint multivariate normal. You request these with the ESTIMATOR = option on the ANALYSIS command. The mlm option provides a mean-adjusted chi-square model test statistic, whereas the mlmv option produces a mean- and variance-adjusted chi-square test of model fit. Both options also lead Mplus to produce robust standard errors, displayed in the model results table, that are used to compute z tests of significance for individual parameter estimates.


SECTION 4: Exploratory Factor Analysis as an Alternative to CFA

Most often, when doing a CFA, a researcher uses modification indexes to modify the model by allowing some fixed parameters to be free.

We may allow an item to load on more than one factor, or
we may allow two items to have correlated errors.

When we change a model this way it is no longer confirmatory, but exploratory. We are combining the modification indexes with our own judgment to change the model.

In Mplus 5.1 an EFA alternative was introduced that can challenge CFA. The rotation will find the optimal solution, and this will be a better fit than we can achieve by looking at a few indexes and using our own judgment. However, if the optimal solution makes no sense, then we have a different problem.

With CFA we fix several paths at a value of 0.0. This results in very clean factors that have a clear meaning. However, the best guess may be a loading that is not exactly zero.

Suppose you had two latent variables measured for both the husband and wife.

Alternatively, you might think of these as two latent variables measured for the same person, but at two times, say one year apart.

You believe that the first three measures are indicators of the first factor and the second three are indicators of the second factor.

However, it is usually unreasonable to assume that all the cross loadings are 0.000. You expect them to be small, but there is no necessity to say they must be exactly zero.

Consider adolescents who have three beliefs about the certainty that they will be caught and three beliefs about the severity of punishment if they are caught. You measure them at two time points, at age 15 and again at age 17.


Discuss how this is different from a CFA model

What results would support your thinking?

1. The three beliefs about the certainty of being caught should load strongly on factor 1 at the first time point (Y1 – Y3) and on factor 3 at the second time point (Y7 – Y9), but have weak loadings on factors 2 (first time point) and 4 (second time point).

2. Conversely, the beliefs about the severity of punishment (Y4 – Y6; Y10 – Y12) should load strongly on factors 2 and 4 but should have relatively weak loadings on factors 1 and 3.

3. The loadings of Y1 – Y6 should be identical to the corresponding loadings of Y7 – Y12.

4. The errors E1 – E6 should be correlated with the corresponding errors E7 – E12.

Mplus calls this exploratory factor analysis because we are not fixing parameters at particular values, but clearly we are putting enormous constraints on the model.

Mplus VERSION 5.1
MUTHEN & MUTHEN
06/30/2008   6:14 PM

INPUT INSTRUCTIONS

TITLE:    example3.inp
          this is an example of an EFA at two timepoints
          with factor loading invariance and correlated
          residuals across time
DATA:     FILE IS example3.dat;
VARIABLE: NAMES ARE y1-y12;
MODEL:    f1-f2 BY y1-y6 (*t1 1);
          f3-f4 BY y7-y12 (*t2 1);
          f3-f4 WITH f1-f2;
          y1-y6 PWITH y7-y12;
OUTPUT:   TECH1 STANDARDIZED;

Estimator                                       ML
Rotation                                    GEOMIN
Row standardization                     COVARIANCE
Type of rotation                           OBLIQUE

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Chi-Square Test of Model Fit

          Value                             43.990
          Degrees of Freedom                    42
          P-Value                           0.3873

Chi-Square Test of Model Fit for the Baseline Model

          Value                           1265.442
          Degrees of Freedom                    66
          P-Value                           0.0000

CFI/TLI

          CFI                                0.998
          TLI                                0.997

Loglikelihood

          H0 Value                       -9396.403
          H1 Value                       -9374.409

Information Criteria

          Number of Free Parameters             48
          Akaike (AIC)                   18888.807
          Bayesian (BIC)                 19091.108
          Sample-Size Adjusted BIC       18938.753
            (n* = (n + 2) / 24)

RMSEA (Root Mean Square Error Of Approximation)

          Estimate                           0.010
          90 Percent C.I.                    0.000  0.032
          Probability RMSEA <= .05           1.000

SRMR (Standardized Root Mean Square Residual)

          Value                              0.027

MODEL RESULTS


 F1       BY
    Y1                 0.744     0.062    11.959     0.000
    Y2                 0.896     0.072    12.523     0.000
    Y3                 0.726     0.055    13.103     0.000
    Y4                 0.014     0.039     0.370     0.712
    Y5                -0.098     0.060    -1.619     0.106
    Y6                 0.013     0.034     0.398     0.690

 F2       BY
    Y1                 0.052     0.058     0.898     0.369
    Y2                -0.016     0.053    -0.304     0.761
    Y3                 0.005     0.018     0.256     0.798
    Y4                 0.734     0.061    12.082     0.000
    Y5                 0.908     0.072    12.626     0.000
    Y6                 0.749     0.064    11.760     0.000

 F3       BY
    Y7                 0.744     0.062    11.959     0.000
    Y8                 0.896     0.072    12.523     0.000
    Y9                 0.726     0.055    13.103     0.000
    Y10                0.014     0.039     0.370     0.712
    Y11               -0.098     0.060    -1.619     0.106
    Y12                0.013     0.034     0.398     0.690

 F4       BY
    Y7                 0.052     0.058     0.898     0.369
    Y8                -0.016     0.053    -0.304     0.761
    Y9                 0.005     0.018     0.256     0.798
    Y10                0.734     0.061    12.082     0.000
    Y11                0.908     0.072    12.626     0.000
    Y12                0.749     0.064    11.760     0.000

 F3       WITH
    F1                 0.414     0.066     6.241     0.000
    F2                 0.310     0.067     4.592     0.000

 F4       WITH
    F1                 0.299     0.069     4.364     0.000
    F2                 0.289     0.070     4.116     0.000
    F3                 0.494     0.085     5.823     0.000

 F2       WITH
    F1                 0.451     0.069     6.571     0.000

 Y1       WITH
    Y7                 0.397     0.061     6.463     0.000

 Y2       WITH
    Y8                 0.128     0.058     2.220     0.026




 F1       BY
    Y1                 0.584     0.043    13.475     0.000
    Y2                 0.707     0.049    14.449     0.000
    Y3                 0.569     0.038    15.086     0.000
    Y4                 0.011     0.030     0.370     0.712
    Y5                -0.077     0.048    -1.614     0.107
    Y6                 0.010     0.026     0.398     0.690

 F2       BY
    Y1                 0.041     0.046     0.898     0.369
    Y2                -0.013     0.042    -0.304     0.761
    Y3                 0.004     0.014     0.256     0.798
    Y4                 0.568     0.042    13.617     0.000
    Y5                 0.720     0.051    14.120     0.000
    Y6                 0.570     0.043    13.177     0.000

 F3       BY
    Y7                 0.586     0.046    12.690     0.000
    Y8                 0.728     0.045    16.053     0.000
    Y9                 0.605     0.040    15.166     0.000
    Y10                0.011     0.030     0.370     0.711
    Y11               -0.079     0.049    -1.621     0.105
    Y12                0.011     0.027     0.398     0.691

 F4       BY
    Y7                 0.041     0.046     0.897     0.369
    Y8                -0.013     0.043    -0.304     0.762
    Y9                 0.004     0.015     0.256     0.798
    Y10                0.565     0.041    13.651     0.000
    Y11                0.730     0.054    13.443     0.000
    Y12                0.586     0.040    14.556     0.000

 F3       WITH
    F1                 0.403     0.059     6.820     0.000
    F2                 0.301     0.063     4.758     0.000

Beginning Time:  18:14:23
Ending Time:     18:14:24
Elapsed Time:    00:00:01

MUTHEN & MUTHEN
3463 Stoner Ave.
Los Angeles, CA  90066

Tel: (310) 391-9971
Fax: (310) 391-8971
Web: www.StatModel.com
Support: [email protected]

Copyright (c) 1998-2008 Muthen & Muthen
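The degrees of freedom in this output can be checked by counting. The sketch below is an illustration only; the helper name is hypothetical. It counts the sample statistics for 12 indicators (variances, covariances, and means) and subtracts the 48 free parameters reported under Information Criteria. The baseline model is assumed to estimate only the 12 means and 12 variances.

```python
def sample_statistics(n_vars: int, with_means: bool = True) -> int:
    """Number of unique variances and covariances, plus means if modeled."""
    n_moments = n_vars * (n_vars + 1) // 2
    return n_moments + n_vars if with_means else n_moments


known = sample_statistics(12)           # 78 variances/covariances + 12 means
free_parameters = 48                    # from the Information Criteria block
model_df = known - free_parameters

baseline_free = 12 + 12                 # assumed: means and variances only
baseline_df = known - baseline_free

print(f"known = {known}, model df = {model_df}, baseline df = {baseline_df}")
```

Both results agree with the degrees of freedom printed for the model and the baseline model.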

Section 5: Equality Constraints—Longitudinal CFA

We often need to test for equality constraints:

1. Are items truly interchangeable? Alpha assumes that all items are equally salient to the concept being measured. That is, you weight each item equally with a 1.0 weight. CFA can extend this and test it:

Tau equivalence. All loadings are constrained to be equal.
   o Compare the fit of this model to a model in which they are unconstrained.
Parallel equivalence. Tau equivalence plus all error terms are equal.
   o Very hard to achieve, and often we can proceed without this condition.

2. Compare the marital satisfaction of women and men.

Tau equivalence
   o Women may weigh emotional support more than men.
   o Men may weigh sexual satisfaction more than women.
If tau equivalence holds, the latent variable has the same meaning in both groups.
   o Without this equivalence we are comparing apples and oranges. Why compare means if the concept has a different meaning for each group?
   o Men may be more satisfied than women.

3. We will focus on longitudinal CFA equivalence, as this is most salient to growth models. This section summarizes the example in Brown's book on CFA.

A little algebra. In regression we wrote:

$Y = a + bX + e$

We solved for the intercept, a, using

$a = M_Y - bM_X$

Rearranging this, we can say

$M_Y = a + bM_X$

If we examine the figure we see that each observed variable, we will call it X, has a similar set of equations where the latent variable is the predictor. For each X

$X = \tau_X + \lambda_X \xi + \theta_e$

where tau is the intercept, lambda is the loading (an element of the matrix of loadings), ksi is the score on the latent variable, and theta is the error. The mean of each X will be

$M_X = \tau_X + \lambda_X \kappa$

where kappa is the mean of the latent variable. This adds two sets of parameters that are not shown in the figure.

Each indicator has an intercept, tau, and each latent variable has a mean, kappa.

This adds 10 parameters we need to estimate: 8 intercepts and 2 latent variable means. To estimate these 10 new parameters we need to

Include the means along with the covariance matrix.
Make some additional restrictions, because we just added 8 known means but need to estimate 10 new parameters.

There are two ways of identifying these parameters.

1. We could fix the latent variable mean at one time at zero and estimate the latent mean at the second, third, etc. time.
2. We could fix one intercept at each wave at zero.

We will use the second approach and fix the intercept of the first indicator at each wave at zero (A1 and A2 in the programs that follow).
This scales satisfaction at time 1 to the mean of the first indicator at time 1, A1.
This scales satisfaction at time 2 to the mean of the first indicator at time 2, A2.
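The counting argument can be sketched in a few lines. This is an illustration only: it assumes the two-factor model with four indicators per wave and correlated errors for the same indicator across waves that is estimated in the programs of Section 5.1, with no equality constraints yet imposed.

```python
n_indicators = 8

# Known sample statistics: 36 variances/covariances plus 8 means.
known = n_indicators * (n_indicators + 1) // 2 + n_indicators

# Free parameters of the equal form model under the second approach.
free = {
    "factor loadings": 6,        # 8 loadings minus 2 marker loadings fixed at 1
    "factor variances": 2,
    "factor covariance": 1,
    "residual variances": 8,
    "correlated errors": 4,      # A1 WITH A2, B1 WITH B2, C1 WITH C2, D1 WITH D2
    "indicator intercepts": 6,   # 8 intercepts minus 2 fixed at 0
    "factor means": 2,
}

n_free = sum(free.values())
df = known - n_free
print(f"known = {known}, free = {n_free}, df = {df}")
```

Fixing two intercepts at zero is what brings the 10 new mean-structure parameters down to 8, matching the 8 observed means that were added.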


Instead of entering raw data (that would be fine), we will enter a row of means, a row of standard deviations, and a correlation matrix.

1.500 1.320 1.450 1.410 6.600 6.420 6.560 6.310
1.940 2.030 2.050 1.990 2.610 2.660 2.590 2.550
1.000
0.736 1.000
0.731 0.648 1.000
0.771 0.694 0.700 1.000
0.685 0.512 0.496 0.508 1.000
0.481 0.638 0.431 0.449 0.726 1.000
0.485 0.442 0.635 0.456 0.743 0.672 1.000
0.508 0.469 0.453 0.627 0.759 0.689 0.695 1.000

We need the means because we are estimating means of latent variables.
We need the standard deviations so Mplus can convert the correlation matrix to a covariance matrix.
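The conversion Mplus performs here is simple: each covariance is the correlation multiplied by the two standard deviations. The sketch below reproduces it for the data listed above. It is an illustration of the arithmetic, not Mplus's own code, and the function name is hypothetical.

```python
# Standard deviations and lower-triangular correlations from the data listing
# (order: A1 B1 C1 D1 A2 B2 C2 D2).
sds = [1.940, 2.030, 2.050, 1.990, 2.610, 2.660, 2.590, 2.550]
lower_corr = [
    [1.000],
    [0.736, 1.000],
    [0.731, 0.648, 1.000],
    [0.771, 0.694, 0.700, 1.000],
    [0.685, 0.512, 0.496, 0.508, 1.000],
    [0.481, 0.638, 0.431, 0.449, 0.726, 1.000],
    [0.485, 0.442, 0.635, 0.456, 0.743, 0.672, 1.000],
    [0.508, 0.469, 0.453, 0.627, 0.759, 0.689, 0.695, 1.000],
]


def corr_to_cov(lower, sd):
    """Build the full covariance matrix: cov[i][j] = r[i][j] * sd[i] * sd[j]."""
    n = len(sd)
    cov = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(lower):
        for j, r in enumerate(row):
            cov[i][j] = cov[j][i] = r * sd[i] * sd[j]
    return cov


cov = corr_to_cov(lower_corr, sds)
print(f"var(A1) = {cov[0][0]:.4f}, cov(A1, B1) = {cov[0][1]:.4f}")
```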

We estimate four models, each of which includes estimating means. The first model estimates the means while imposing only the same form for the model at both waves. We are not restricting loadings to be equal, intercepts to be equal, or errors to be equal. This model doesn't make a lot of sense because, if at least the loadings aren't equal, we are back to comparing apples to oranges. With unequal loadings, the very meaning of satisfaction changes over time, with some indicators becoming more salient and others less salient. This can be interesting in its own right; for example, sexual satisfaction may become less central and emotional support may become more satisfying in more mature marriages. (A number of the comments in the program apply to later models.)

5.1 Programs for testing equality constraints

Model 1

TITLE:    MPLUS PROGRAM FOR TIME1-TIME2 MSMT MODEL OF JOB SATISFACTION
          This has equality constraints on everything.
          This is from Brown's CFA book
          Equal form
          ! Equal factor loadings
          ! Equal indicator intercepts
          ! Equal indicator errors
DATA:     FILE IS FIG7.2.DAT;
          TYPE IS MEANS STD CORR;   ! INDICATOR MEANS ALSO INPUTTED
                                    ! Raw data would work equally well
          NOBS ARE 250;
VARIABLE: NAMES ARE A1 B1 C1 D1 A2 B2 C2 D2;
ANALYSIS: ESTIMATOR=ML;
          ! TYPE=MEANSTRUCTURE;     ! ANALYSIS OF MEAN STRUCTURE this is the default
MODEL:    SATIS1 BY A1 B1 C1 D1;
          SATIS2 BY A2 B2 C2 D2;
          A1 WITH A2; B1 WITH B2; C1 WITH C2; D1 WITH D2;   ! Correlated errors
          [A1@0]; [A2@0];           ! FIXES THE A INDICATOR INTERCEPTS TO ZERO
          [SATIS1*]; [SATIS2*];     ! FREELY ESTIMATES FACTOR MEANS
          ! [B1 B2] (4); [C1 C2] (5); [D1 D2] (6);          ! Equal intercepts
          ! A1 A2 (7); B1 B2 (8); C1 C2 (9); D1 D2 (10);    ! Equal errors
OUTPUT:   SAMPSTAT MODINDICES(4.00) STAND RESIDUAL;
! Notes: The Model command uses numbers to create equality constraints.
!   parameters followed by (1) are equal; (2) are equal, etc. Thus,
!   C1 equals C2 because both share a (2) and D1 = D2 because both have (3)
!   The first line is confusing. The first variable is fixed at 1.0 by
!   Default. Hence A1 = A2 = 1, but the (1) does not apply to either A1 or A2
!   Things in [] are either means or intercepts depending on context
!   A1 with A2 means correlate the errors
!   A1 A2 (7) means equal error terms for A1 and A2

Next, we estimate the model imposing equal factor loadings. I consider this the minimum equality constraint for a meaningful comparison of means. Others would disagree: many want more constraints, while Muthén is okay with what he calls “partial” invariance.

Model 2

TITLE:    MPLUS PROGRAM FOR TIME1-TIME2 MSMT MODEL OF JOB SATISFACTION
          This has equality constraints on everything.
          This is from Brown's CFA book
          Equal form
          Equal factor loadings
          ! Equal indicator intercepts
          ! Equal indicator errors
DATA:     FILE IS FIG7.2.DAT;
          TYPE IS MEANS STD CORR;   ! INDICATOR MEANS ALSO INPUTTED
                                    ! Raw data would work equally well
          NOBS ARE 250;
VARIABLE: NAMES ARE A1 B1 C1 D1 A2 B2 C2 D2;
ANALYSIS: ESTIMATOR=ML;
          ! TYPE=MEANSTRUCTURE;     ! ANALYSIS OF MEAN STRUCTURE this is the default
MODEL:    SATIS1 BY A1 B1 (1)
                    C1 (2)
                    D1 (3);
          SATIS2 BY A2 B2 (1)
                    C2 (2)
                    D2 (3);
          A1 WITH A2; B1 WITH B2; C1 WITH C2; D1 WITH D2;   ! Correlated errors
          [A1@0]; [A2@0];           ! FIXES THE A INDICATOR INTERCEPTS TO ZERO
          [SATIS1*]; [SATIS2*];     ! FREELY ESTIMATES FACTOR MEANS
          ! [B1 B2] (4); [C1 C2] (5); [D1 D2] (6);          ! Equal intercepts
          ! A1 A2 (7); B1 B2 (8); C1 C2 (9); D1 D2 (10);    ! Equal errors
OUTPUT:   SAMPSTAT MODINDICES(4.00) STAND RESIDUAL;
! Notes: The Model command uses numbers to create equality constraints.
!   parameters followed by (1) are equal; (2) are equal, etc. Thus,
!   C1 equals C2 because both share a (2) and D1 = D2 because both have (3)
!   The first line is confusing. The first variable is fixed at 1.0 by
!   Default. Hence A1 = A2 = 1, but the (1) does not apply to either A1 or A2
!   Things in [] are either means or intercepts depending on context
!   A1 with A2 means correlate the errors
!   A1 A2 (7) means equal error terms for A1 and A2

The third model imposes equal intercepts:


TITLE:    Equal form
          Equal factor loadings
          Equal indicator intercepts
          ! Equal indicator errors
DATA:     FILE IS FIG7.2.DAT;
          TYPE IS MEANS STD CORR;   ! INDICATOR MEANS ALSO INPUTTED
                                    ! Raw data would work equally well
          NOBS ARE 250;
VARIABLE: NAMES ARE A1 B1 C1 D1 A2 B2 C2 D2;
ANALYSIS: ESTIMATOR=ML;
          ! TYPE=MEANSTRUCTURE;     ! ANALYSIS OF MEAN STRUCTURE
MODEL:    SATIS1 BY A1 B1 (1)
                    C1 (2)
                    D1 (3);
          SATIS2 BY A2 B2 (1)
                    C2 (2)
                    D2 (3);
          A1 WITH A2; B1 WITH B2; C1 WITH C2; D1 WITH D2;   ! Correlated errors
          [A1@0]; [A2@0];           ! FIXES THE A INDICATOR INTERCEPTS TO ZERO
          [SATIS1*]; [SATIS2*];     ! FREELY ESTIMATES FACTOR MEANS
          [B1 B2] (4);              ! Equal intercepts
          [C1 C2] (5);
          [D1 D2] (6);
          ! A1 A2 (7); B1 B2 (8); C1 C2 (9); D1 D2 (10);    ! Equal errors
OUTPUT:   SAMPSTAT MODINDICES(4.00) STAND RESIDUAL;
! Notes: The Model command uses numbers to create equality constraints.
!   parameters followed by (1) are equal; (2) are equal, etc. Thus,
!   C1 equals C2 because both share a (2) and D1 = D2 because both have (3)
!   The first line is confusing. The first variable is fixed at 1.0 by
!   Default. Hence A1 = A2 = 1, but the (1) does not apply to either A1 or A2
!   Things in [] are either means or intercepts depending on context
!   A1 with A2 means correlate the errors
!   A1 A2 (7) means equal error terms for A1 and A2

The fourth model adds the final constraint of equal errors on the indicator variables. This is an extreme level of invariance.


TITLE:    Equal form
          Equal factor loadings
          Equal indicator intercepts
          Equal indicator errors
DATA:     FILE IS FIG7.2.DAT;
          TYPE IS MEANS STD CORR;   ! INDICATOR MEANS ALSO INPUTTED
                                    ! Raw data would work equally well
          NOBS ARE 250;
VARIABLE: NAMES ARE A1 B1 C1 D1 A2 B2 C2 D2;
ANALYSIS: ESTIMATOR=ML;
          ! TYPE=MEANSTRUCTURE;     ! ANALYSIS OF MEAN STRUCTURE
MODEL:    SATIS1 BY A1 B1 (1)
                    C1 (2)
                    D1 (3);
          SATIS2 BY A2 B2 (1)
                    C2 (2)
                    D2 (3);
          A1 WITH A2; B1 WITH B2; C1 WITH C2; D1 WITH D2;   ! Correlated errors
          [A1@0]; [A2@0];           ! FIXES THE A INDICATOR INTERCEPTS TO ZERO
          [SATIS1*]; [SATIS2*];     ! FREELY ESTIMATES FACTOR MEANS
          [B1 B2] (4);              ! Equal intercepts
          [C1 C2] (5);
          [D1 D2] (6);
          A1 A2 (7);                ! Equal errors
          B1 B2 (8);
          C1 C2 (9);
          D1 D2 (10);
OUTPUT:   SAMPSTAT MODINDICES(4.00) STAND RESIDUAL;
! Notes: The Model command uses numbers to create equality constraints.
!   parameters followed by (1) are equal; (2) are equal, etc. Thus,
!   C1 equals C2 because both share a (2) and D1 = D2 because both have (3)
!   The first line is confusing. The first variable is fixed at 1.0 by
!   Default. Hence A1 = A2 = 1, but the (1) does not apply to either A1 or A2
!   Things in [] are either means or intercepts depending on context
!   A1 with A2 means correlate the errors
!   A1 A2 (7) means equal error terms for A1 and A2

We can summarize these as follows:

Model                    Chi-square   df   Chi-square diff   diff df   RMSEA   CFI    TLI    SRMR
Equal form                   2.09     15                                .000   1.00   1.01   .010
Equal factor loadings        3.88     18         1.79           3       .000   1.00   1.01   .014
Equal intercepts             7.25     21         3.37           3       .000   1.00   1.01   .026
Equal error variances       90.73***  25        83.48***        4       .103    .96    .96   .037

We cannot go all the way to equal indicator error variances, but we can go all the way to equal indicator intercepts before chi-square increases significantly. Here are selected results for the equal indicator intercepts model:
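The chi-square difference tests behind this conclusion can be reproduced from the fit statistics of the four models. The sketch below is an illustration; it compares each difference with the standard .05 critical values of chi-square for 3 and 4 degrees of freedom (7.815 and 9.488).

```python
# (model, chi-square, df) in order of increasing restriction.
models = [
    ("Equal form", 2.09, 15),
    ("Equal factor loadings", 3.88, 18),
    ("Equal intercepts", 7.25, 21),
    ("Equal error variances", 90.73, 25),
]

# Chi-square critical values at the .05 level, keyed by degrees of freedom.
CRITICAL_05 = {3: 7.815, 4: 9.488}

results = []
for (_, prev_chi2, prev_df), (name, chi2, df) in zip(models, models[1:]):
    diff_chi2 = round(chi2 - prev_chi2, 2)
    diff_df = df - prev_df
    significant = diff_chi2 > CRITICAL_05[diff_df]
    results.append((name, diff_chi2, diff_df, significant))
    verdict = "significantly worse" if significant else "not significantly worse"
    print(f"{name}: diff = {diff_chi2:.2f} on {diff_df} df, {verdict}")
```

Only the last step, adding equal error variances, worsens the fit by more than its critical value.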

MODEL RESULTS


 SATIS1   BY
    A1                 1.000     0.000   999.000   999.000
    B1                 0.989     0.017    56.699     0.000
    C1                 0.993     0.017    60.078     0.000
    D1                 0.962     0.016    60.427     0.000

 SATIS2   BY
    A2                 1.000     0.000   999.000   999.000
    B2                 0.989     0.017    56.699     0.000
    C2                 0.993     0.017    60.078     0.000
    D2                 0.962     0.016    60.427     0.000

 SATIS2   WITH
    SATIS1             2.547     0.321     7.923     0.000

 A1       WITH
    A2                 0.723     0.117     6.187     0.000

 B1       WITH
    B2                 1.023     0.162     6.298     0.000

 C1       WITH
    C2                 1.031     0.158     6.515     0.000

 D1       WITH
    D2                 0.785     0.133     5.918     0.000

 Means
    SATIS1             1.500     0.121    12.421     0.000
    SATIS2             6.617     0.160    41.422     0.000

Note: these factor means closely reproduce the means of A1 and A2, the indicators whose intercepts are fixed at zero.

 Intercepts
    A1                 0.000     0.000   999.000   999.000
    B1                -0.156     0.098    -1.583     0.113
    C1                -0.032     0.099    -0.327     0.743
    D1                -0.039     0.089    -0.436     0.663
    A2                 0.000     0.000   999.000   999.000
    B2                -0.156     0.098    -1.583     0.113
    C2                -0.032     0.099    -0.327     0.743
    D2                -0.039     0.089    -0.436     0.663

 Variances
    SATIS1             2.936     0.291    10.099     0.000
    SATIS2             5.013     0.499    10.050     0.000

 Residual Variances
    A1                 0.711     0.101     7.060     0.000
    B1                 1.391     0.152     9.149     0.000
    C1                 1.427     0.155     9.185     0.000
    D1                 1.070     0.124     8.655     0.000
    A2                 1.428     0.188     7.609     0.000
    B2                 2.363     0.260     9.082     0.000
    C2                 2.066     0.236     8.753     0.000
    D2                 1.839     0.214     8.610     0.000

Why would wave 2 have bigger error variances? This should be explored.
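The mean equation M_X = tau_X + lambda_X * kappa can be checked against these estimates. The sketch below is an illustration: it plugs the unstandardized intercepts, loadings, and factor means from the equal-intercepts output into that equation and compares the implied indicator means with the sample means in the data file.

```python
# Unstandardized estimates copied from the equal-intercepts output above.
factor_mean = {1: 1.500, 2: 6.617}                               # kappa, by wave
intercept = {"A": 0.000, "B": -0.156, "C": -0.032, "D": -0.039}  # tau
loading = {"A": 1.000, "B": 0.989, "C": 0.993, "D": 0.962}       # lambda

# Sample means from the data file listing.
observed_mean = {
    "A1": 1.500, "B1": 1.320, "C1": 1.450, "D1": 1.410,
    "A2": 6.600, "B2": 6.420, "C2": 6.560, "D2": 6.310,
}

# Model-implied mean of each indicator: tau + lambda * kappa.
implied_mean = {
    f"{item}{wave}": intercept[item] + loading[item] * factor_mean[wave]
    for wave in (1, 2)
    for item in "ABCD"
}

for name, implied in implied_mean.items():
    print(f"{name}: implied {implied:.3f}, observed {observed_mean[name]:.3f}")
```

The implied and observed means differ only in the second decimal place, which is consistent with the good fit of the equal-intercepts model.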




Note: Equality is only meaningful for the unstandardized solution. The standardized loadings below will also be equal only in the case that all the item variances are equal.

 SATIS1   BY
    A1                 0.897     0.016    57.727     0.000
    B1                 0.821     0.020    40.566     0.000
    C1                 0.818     0.020    40.304     0.000
    D1                 0.847     0.019    45.498     0.000

 SATIS2   BY
    A2                 0.882     0.016    53.469     0.000
    B2                 0.822     0.020    40.444     0.000
    C2                 0.840     0.019    43.777     0.000
    D2                 0.846     0.019    45.040     0.000

 SATIS2   WITH
    SATIS1             0.664     0.039    17.084     0.000

 A1       WITH
    A2                 0.717     0.050    14.314     0.000

 B1       WITH
    B2                 0.564     0.052    10.827     0.000

 C1       WITH
    C2                 0.601     0.050    12.023     0.000

 D1       WITH
    D2                 0.560     0.055    10.149     0.000

 Means
    SATIS1             0.876     0.083    10.584     0.000
    SATIS2             2.955     0.161    18.373     0.000

 Intercepts
    A1                 0.000     0.000   999.000   999.000
    B1                -0.075     0.048    -1.588     0.112
    C1                -0.016     0.048    -0.328     0.743
    D1                -0.020     0.046    -0.436     0.663
    A2                 0.000     0.000   999.000   999.000
    B2                -0.058     0.036    -1.587     0.112
    C2                -0.012     0.037    -0.328     0.743
    D2                -0.015     0.035    -0.436     0.663

 Variances
    SATIS1             1.000     0.000   999.000   999.000
    SATIS2             1.000     0.000   999.000   999.000

 Residual Variances
    A1                 0.195     0.028     6.995     0.000
    B1                 0.326     0.033     9.822     0.000
    C1                 0.330     0.033     9.929     0.000
    D1                 0.282     0.032     8.955     0.000
    A2                 0.222     0.029     7.615     0.000
    B2                 0.325     0.033     9.741     0.000
    C2                 0.295     0.032     9.145     0.000
    D2                 0.284     0.032     8.918     0.000
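Why equal unstandardized loadings produce unequal standardized loadings can be seen by standardizing them by hand. The sketch below is an illustration that assumes the usual formula for a standardized loading, lambda * SD(factor) / SD(indicator), with the indicator variance taken as lambda^2 * factor variance + residual variance. The inputs are the unstandardized estimates from the equal-intercepts model.

```python
import math

factor_variance = {1: 2.936, 2: 5.013}
loading = {"A": 1.000, "B": 0.989, "C": 0.993, "D": 0.962}  # equal across waves
residual_variance = {
    "A1": 0.711, "B1": 1.391, "C1": 1.427, "D1": 1.070,
    "A2": 1.428, "B2": 2.363, "C2": 2.066, "D2": 1.839,
}
# Standardized loadings as printed in the output above.
reported = {
    "A1": 0.897, "B1": 0.821, "C1": 0.818, "D1": 0.847,
    "A2": 0.882, "B2": 0.822, "C2": 0.840, "D2": 0.846,
}


def standardized_loading(lam: float, psi: float, theta: float) -> float:
    """lam * SD(factor) / SD(indicator), indicator variance = lam^2*psi + theta."""
    return lam * math.sqrt(psi) / math.sqrt(lam ** 2 * psi + theta)


computed = {
    f"{item}{wave}": standardized_loading(
        loading[item], factor_variance[wave], residual_variance[f"{item}{wave}"]
    )
    for wave in (1, 2)
    for item in "ABCD"
}

for name, value in computed.items():
    print(f"{name}: computed {value:.3f}, reported {reported[name]:.3f}")
```

The computed values match the printed ones to within rounding, and they differ across waves because the factor variances and residual variances differ across waves.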


SECTION 6: Path Analysis

6.1 Model and Program

TITLE:    ex3.11 This is an example of a path analysis
          with continuous dependent variables
DATA:     FILE IS ex3.11.dat;
VARIABLE: NAMES ARE y1-y3 x1-x3;
MODEL:    y1 y2 ON x1 x2 x3;
          y3 ON y1 y2 x2;
MODEL INDIRECT:
          y2 ind x1;
          y2 ind x2;
          y2 ind x3;
          y3 ind x1;
          y3 ind x2;
          y3 ind x3;
OUTPUT:   standardized mod(3.84);

6.2. Indirect Effects

The MODEL INDIRECT: subcommand estimates indirect effects for you.
You get the total indirect effect, which combines as many specific indirect effects as there are in the model.
Specific indirect effects of x1 on y3 include
   o x1 -> y1 -> y3
   o x1 -> y2 -> y3
You also get tests of significance for both specific and total indirect effects.
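The relationship between specific and total indirect effects can be sketched numerically. The coefficients below are hypothetical values chosen for illustration; they are not estimates from ex3.11.dat. Each specific indirect effect is the product of the coefficients along its path, and the total indirect effect is their sum.

```python
# Hypothetical path coefficients (NOT estimates from ex3.11.dat).
x1_to_mediator = {"y1": 0.50, "y2": 0.30}   # y1 ON x1, y2 ON x1
mediator_to_y3 = {"y1": 0.40, "y2": 0.20}   # y3 ON y1, y3 ON y2

# Specific indirect effect through each mediator: product of the two paths.
specific = {
    mediator: x1_to_mediator[mediator] * mediator_to_y3[mediator]
    for mediator in x1_to_mediator
}
total_indirect = sum(specific.values())

for mediator, effect in specific.items():
    print(f"x1 -> {mediator} -> y3: {effect:.3f}")
print(f"total indirect effect of x1 on y3: {total_indirect:.3f}")
```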

Estimate and interpret the output:

SECTION 7: Putting it Together—CFA & SEM

Interpret the figure.

Notice indirect effects.


7.1 Program

TITLE:    example2cfa This is an example of a SEM with
          CFA factors with continuous factor indicators
          And Indirect Effects
DATA:     FILE IS example2.dat;
VARIABLE: NAMES ARE y1-y12;
MODEL:    f1 BY y1-y3;
          f2 BY y4-y6;
          f3 BY y7-y9;
          f4 BY y10-y12;
          f3 ON f1-f2;
          f4 ON f3;
MODEL INDIRECT:
          f4 ind f1;
          f4 ind f2;
OUTPUT:   standardized mod(3.84);

7.2 Output

Mplus VERSION 5.1
MUTHEN & MUTHEN
06/30/2008   8:12 PM

SUMMARY OF ANALYSIS

Number of groups                                     1
Number of observations                             500
Number of dependent variables                       12
Number of independent variables                      0
Number of continuous latent variables                4

Observed dependent variables
  Continuous
   Y1   Y2   Y3   Y4   Y5   Y6   Y7   Y8   Y9   Y10  Y11  Y12

Continuous latent variables
   F1   F2   F3   F4

Estimator                                           ML
Information matrix                            OBSERVED
Maximum number of iterations                      1000
Convergence criterion                        0.500D-04
Maximum number of steepest descent iterations       20

TESTS OF MODEL FIT

Chi-Square Test of Model Fit
          Value                             53.492
          Degrees of Freedom                    50
          P-Value                           0.3417
Chi-Square Test of Model Fit for the Baseline Model
          Value                           4600.240
          Degrees of Freedom                    66
          P-Value                           0.0000
CFI/TLI
          CFI                                0.999
          TLI                                0.999
Loglikelihood
          H0 Value                       -6483.831
          H1 Value                       -6457.085
Information Criteria
          Number of Free Parameters             40
          Akaike (AIC)                   13047.662
          Bayesian (BIC)                 13216.247
          Sample-Size Adjusted BIC       13089.284
            (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
          Estimate                           0.012
          90 Percent C.I.                    0.000  0.032
          Probability RMSEA <= .05           1.000
SRMR (Standardized Root Mean Square Residual)
          Value                              0.019

MODEL RESULTS
                                                  Two-Tailed
                    Estimate      S.E. Est./S.E.    P-Value
 F1       BY
    Y1                 1.000     0.000   999.000   999.000
    Y2                 1.103     0.062    17.881     0.000
    Y3                 0.942     0.058    16.346     0.000
 F2       BY
    Y4                 1.000     0.000   999.000   999.000
    Y5                 1.006     0.057    17.691     0.000
    Y6                 1.023     0.060    17.064     0.000
 F3       BY
    Y7                 1.000     0.000   999.000   999.000
    Y8                 0.894     0.021    41.937     0.000
    Y9                 0.902     0.021    42.479     0.000
 F4       BY
    Y10                1.000     0.000   999.000   999.000
    Y11                0.734     0.028    26.424     0.000
    Y12                0.684     0.028    24.405     0.000
 F3       ON
    F1                 0.640     0.069     9.271     0.000
    F2                 0.912     0.074    12.399     0.000
 F4       ON
    F3                 0.546     0.032    16.975     0.000
 F2       WITH
    F1                 0.297     0.038     7.767     0.000

 Variances
    F1                 0.599     0.061     9.766     0.000
    F2                 0.618     0.064     9.717     0.000

 Residual Variances
    Y1                 0.367     0.033    11.044     0.000
    Y2                 0.296     0.033     8.946     0.000
    Y3                 0.412     0.033    12.309     0.000
    Y4                 0.400     0.034    11.640     0.000
    Y5                 0.340     0.031    10.888     0.000
    Y6                 0.392     0.034    11.370     0.000
    Y7                 0.183     0.019     9.799     0.000
    Y8                 0.191     0.017    11.268     0.000
    Y9                 0.181     0.017    10.812     0.000
    Y10                0.240     0.027     8.746     0.000
    Y11                0.183     0.017    10.791     0.000
    Y12                0.213     0.018    11.998     0.000
    F3                 0.525     0.049    10.636     0.000
    F4                 0.565     0.049    11.488     0.000

STANDARDIZED MODEL RESULTS
STDYX Standardization
                                                  Two-Tailed
                    Estimate      S.E. Est./S.E.    P-Value

 F1       BY
    Y1                 0.787     0.023    34.084     0.000
    Y2                 0.843     0.020    41.362     0.000
    Y3                 0.751     0.025    30.614     0.000
 F2       BY
    Y4                 0.779     0.023    34.055     0.000
    Y5                 0.805     0.021    37.480     0.000
    Y6                 0.789     0.022    35.305     0.000
 F3       BY
    Y7                 0.948     0.006   153.038     0.000
    Y8                 0.934     0.007   131.246     0.000
    Y9                 0.938     0.007   136.242     0.000
 F4       BY
    Y10                0.902     0.013    70.200     0.000
    Y11                0.869     0.014    59.982     0.000
    Y12                0.835     0.017    50.003     0.000
 F3       ON
    F1                 0.388     0.039    10.057     0.000
    F2                 0.561     0.036    15.610     0.000
 F4       ON
    F3                 0.680     0.027    24.795     0.000
 F2       WITH
    F1                 0.488     0.043    11.343     0.000
 Variances
    F1                 1.000     0.000   999.000   999.000
    F2                 1.000     0.000   999.000   999.000
 Residual Variances
    Y1                 0.380     0.036    10.450     0.000
    Y2                 0.289     0.034     8.397     0.000
    Y3                 0.437     0.037    11.862     0.000
    Y4                 0.393     0.036    11.018     0.000
    Y5                 0.352     0.035    10.197     0.000
    Y6                 0.378     0.035    10.713     0.000
    Y7                 0.101     0.012     8.572     0.000
    Y8                 0.128     0.013     9.598     0.000
    Y9                 0.120     0.013     9.287     0.000
    Y10                0.186     0.023     8.010     0.000
    Y11                0.244     0.025     9.698     0.000
    Y12                0.302     0.028    10.828     0.000
    F3                 0.322     0.031    10.421     0.000
    F4                 0.538     0.037    14.433     0.000

TOTAL, TOTAL INDIRECT, SPECIFIC INDIRECT, AND DIRECT EFFECTS

STANDARDIZED TOTAL, TOTAL INDIRECT, SPECIFIC INDIRECT, AND DIRECT EFFECTS
STDYX Standardization
                                                  Two-Tailed
                    Estimate      S.E. Est./S.E.    P-Value

Effects from F1 to F4
  Total                0.263     0.028     9.269     0.000
  Total indirect       0.263     0.028     9.269     0.000
  Specific indirect
    F4
    F3
    F1                 0.263     0.028     9.269     0.000

Effects from F2 to F4
  Total                0.382     0.029    12.999     0.000
  Total indirect       0.382     0.029    12.999     0.000
  Specific indirect
    F4
    F3
    F2                 0.382     0.029    12.999     0.000

MODEL MODIFICATION INDICES

Minimum M.I. value for printing the modification index     3.840

                            M.I.     E.P.C.  Std E.P.C.  StdYX E.P.C.
BY Statements
F3       BY Y1             5.980     0.103      0.131        0.134
WITH Statements
Y3       WITH Y2           6.126     0.091      0.091        0.260
Y4       WITH Y3           5.405     0.053      0.053        0.130
Y5       WITH Y3           5.265    -0.049     -0.049       -0.132
Y8       WITH Y2           5.695    -0.037     -0.037       -0.155
Y9       WITH Y6           4.801     0.035      0.035        0.133
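The standardized indirect effects in this output can be reproduced as products of the standardized paths. The sketch below is an illustration using the STDYX estimates printed above: each effect on F4 runs through F3, so it is the F3 ON coefficient multiplied by the F4 ON F3 coefficient.

```python
# STDYX estimates copied from the output above.
f3_on = {"F1": 0.388, "F2": 0.561}
f4_on_f3 = 0.680
reported_indirect = {"F1": 0.263, "F2": 0.382}

# Indirect effect of each exogenous factor on F4 through F3.
computed_indirect = {name: coef * f4_on_f3 for name, coef in f3_on.items()}

for name, value in computed_indirect.items():
    print(f"{name} -> F3 -> F4: computed {value:.3f}, "
          f"reported {reported_indirect[name]:.3f}")
```

The small discrepancies in the third decimal come from multiplying coefficients that were already rounded to three places.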


7.3 Interpretation of modification indices

We could reduce chi-square, which is now chi-square(50) = 53.492, by about 5.265 if we allowed the error term for Y5 to be correlated with the error term for Y3.

The correlation of the two errors would be about -.132. Does this make sense?

We would do these one at a time.
We would only do it if it made sense. Say Y5 and Y3 are pen-and-pencil tests and all the others are face-to-face interviews. There might be a method effect that we could incorporate as a correlated error term.
We might not have much to gain, even if there is a big modification index, if the fit is already good.

The new chi-square would be approximately chi-square(49) = 53.492 - 5.265 = 48.227. A reduction in chi-square of 5.265 with one degree of freedom exceeds the 3.84 critical value and so would be significant at the .05 level. Still, there is not much need to improve on CFI = .999 and RMSEA = .012.
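The fit indices quoted here follow from the chi-square values, the loglikelihood, and the sample size in the output. The sketch below is an illustration using the textbook formulas. One assumption to flag: RMSEA is computed with n in the denominator, while some sources use n - 1; for this sample the two agree to three decimals.

```python
import math

# Values copied from the Section 7 output.
chi2, df = 53.492, 50
chi2_base, df_base = 4600.240, 66
loglik_h0, n_free, n_obs = -6483.831, 40, 500

# Noncentrality estimates (chi-square minus df, floored at zero).
d_model = max(chi2 - df, 0.0)
d_base = max(chi2_base - df_base, d_model)

cfi = 1.0 - d_model / d_base if d_base > 0 else 1.0
tli = (chi2_base / df_base - chi2 / df) / (chi2_base / df_base - 1.0)
rmsea = math.sqrt(d_model / (df * n_obs))   # assumption: n, not n - 1
aic = -2.0 * loglik_h0 + 2.0 * n_free
bic = -2.0 * loglik_h0 + n_free * math.log(n_obs)

print(f"CFI = {cfi:.3f}, TLI = {tli:.3f}, RMSEA = {rmsea:.3f}")
print(f"AIC = {aic:.3f}, BIC = {bic:.3f}")
```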


SECTION 8: Putting it Together—EFA & SEM

We may have a situation where we are sufficiently confident to have F3 and F4 represented by a CFA model, but not that confident about F1 and F2 for which we want to do an EFA.

8.1 Program & model

Here are the program and results:

The (*1) in the Model line f1-f2 BY y1-y6 (*1); is included so Mplus knows this is an EFA set. We expect y1-y3 to have strong loadings on f1 and weak loadings on f2. We expect y4-y6 to have weak loadings on f1 and strong loadings on f2. Still, we are not sufficiently confident of this to impose the restriction that these loadings are exactly 0.000.

8.2 Output

TITLE:    example2.inp This is an example of a SEM with
          EFA and CFA factors with continuous
          factor indicators
DATA:     FILE IS example2.dat;
VARIABLE: NAMES ARE y1-y12;
MODEL:    f1-f2 BY y1-y6 (*1);
          f3 BY y7-y9;
          f4 BY y10-y12;
          f3 ON f1-f2;
          f4 ON f3;
MODEL INDIRECT:
          f4 ind f1;
          f4 ind f2;
OUTPUT:   Standardized mod(3.84);

Mplus VERSION 5.1
MUTHEN & MUTHEN
06/30/2008   8:32 PM

TESTS OF MODEL FIT
Chi-Square Test of Model Fit
          Value                             51.353
          Degrees of Freedom                    46
          P-Value                           0.2720
Chi-Square Test of Model Fit for the Baseline Model
          Value                           4600.240
          Degrees of Freedom                    66
          P-Value                           0.0000
CFI/TLI
          CFI                                0.999
          TLI                                0.998
Loglikelihood
          H0 Value                       -6482.762
          H1 Value                       -6457.085
Information Criteria
          Number of Free Parameters             44
          Akaike (AIC)                   13053.524
          Bayesian (BIC)                 13238.966
          Sample-Size Adjusted BIC       13099.308
            (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
          Estimate                           0.015
          90 Percent C.I.                    0.000  0.034
          Probability RMSEA <= .05           1.000
SRMR (Standardized Root Mean Square Residual)
          Value                              0.018

MODEL RESULTS
                                                  Two-Tailed
                    Estimate      S.E. Est./S.E.    P-Value
 F1       BY
    Y1                 0.751     0.048    15.608     0.000
    Y2                 0.858     0.042    20.467     0.000
    Y3                 0.736     0.045    16.353     0.000
    Y4                 0.036     0.051     0.711     0.477
    Y5                -0.028     0.049    -0.568     0.570
    Y6                 0.002     0.004     0.627     0.530
 F2       BY
    Y1                 0.034     0.045     0.755     0.450
    Y2                -0.002     0.016    -0.150     0.881
    Y3                -0.008     0.035    -0.220     0.826
    Y4                 0.763     0.050    15.367     0.000
    Y5                 0.810     0.048    16.837     0.000
    Y6                 0.802     0.041    19.461     0.000
 F3       BY
    Y7                 1.000     0.000   999.000   999.000
    Y8                 0.894     0.021    41.937     0.000
    Y9                 0.902     0.021    42.479     0.000
 F4       BY
    Y10                1.000     0.000   999.000   999.000
    Y11                0.734     0.028    26.424     0.000
    Y12                0.684     0.028    24.405     0.000
 F3       ON
    F1                 0.493     0.058     8.461     0.000
    F2                 0.721     0.057    12.752     0.000
 F4       ON
    F3                 0.546     0.032    16.975     0.000
 F2       WITH
    F1                 0.479     0.053     9.094     0.000
 Variances
    F1                 1.000     0.000   999.000   999.000
    F2                 1.000     0.000   999.000   999.000
 Residual Variances
    Y1                 0.376     0.034    11.064     0.000
    Y2                 0.290     0.035     8.239     0.000
    Y3                 0.406     0.034    11.817     0.000
    Y4                 0.408     0.035    11.742     0.000
    Y5                 0.329     0.033    10.046     0.000
    Y6                 0.393     0.035    11.073     0.000
    Y7                 0.183     0.019     9.796     0.000
    Y8                 0.191     0.017    11.269     0.000
    Y9                 0.181     0.017    10.812     0.000
    Y10                0.240     0.027     8.746     0.000
    Y11                0.183     0.017    10.791     0.000
    Y12                0.213     0.018    11.998     0.000
    F3                 0.527     0.049    10.644     0.000
    F4                 0.565     0.049    11.488     0.000

STANDARDIZED MODEL RESULTS
STDYX Standardization
                                                  Two-Tailed
                    Estimate      S.E. Est./S.E.    P-Value

 F1       BY
    Y1                 0.764     0.037    20.741     0.000
    Y2                 0.848     0.024    34.915     0.000
    Y3                 0.758     0.033    23.068     0.000
    Y4                 0.036     0.051     0.711     0.477
    Y5                -0.028     0.050    -0.568     0.570
    Y6                 0.002     0.003     0.627     0.530
 F2       BY
    Y1                 0.034     0.046     0.755     0.450
    Y2                -0.002     0.015    -0.150     0.881
    Y3                -0.008     0.036    -0.220     0.826
    Y4                 0.756     0.037    20.282     0.000
    Y5                 0.825     0.035    23.257     0.000
    Y6                 0.787     0.023    33.668     0.000
 F3       BY
    Y7                 0.948     0.006   153.043     0.000
    Y8                 0.934     0.007   131.230     0.000
    Y9                 0.938     0.007   136.226     0.000
 F4       BY
    Y10                0.902     0.013    70.200     0.000
    Y11                0.869     0.014    59.982     0.000
    Y12                0.835     0.017    50.002     0.000
 F3       ON
    F1                 0.386     0.043     8.914     0.000
    F2                 0.565     0.038    14.919     0.000
 F4       ON
    F3                 0.680     0.027    24.796     0.000
 F2       WITH
    F1                 0.479     0.053     9.094     0.000
 Intercepts
    Y1                 0.008     0.045     0.183     0.855
    Y2                 0.031     0.045     0.688     0.491
    Y3                 0.007     0.045     0.146     0.884
    Y4                 0.074     0.045     1.657     0.098
    Y5                 0.071     0.045     1.590     0.112
    Y6                 0.068     0.045     1.528     0.126
    Y7                 0.044     0.045     0.983     0.326
    Y8                 0.050     0.045     1.115     0.265
    Y9                 0.056     0.045     1.252     0.211
    Y10                0.008     0.045     0.170     0.865
    Y11                0.028     0.045     0.616     0.538
    Y12                0.025     0.045     0.554     0.580
 Variances
    F1                 1.000     0.000   999.000   999.000
    F2                 1.000     0.000   999.000   999.000
 Residual Variances
    Y1                 0.390     0.037    10.510     0.000
    Y2                 0.283     0.036     7.793     0.000
    Y3                 0.431     0.038    11.385     0.000
    Y4                 0.401     0.036    11.149     0.000
    Y5                 0.341     0.036     9.459     0.000
    Y6                 0.378     0.036    10.467     0.000
    Y7                 0.101     0.012     8.570     0.000
    Y8                 0.128     0.013     9.599     0.000
    Y9                 0.120     0.013     9.287     0.000
    Y10                0.186     0.023     8.009     0.000
    Y11                0.244     0.025     9.698     0.000
    Y12                0.302     0.028    10.828     0.000
    F3                 0.323     0.031    10.432     0.000
    F4                 0.538     0.037    14.433     0.000

R-SQUARE

    Latent                                        Two-Tailed
    Variable        Estimate      S.E. Est./S.E.    P-Value

    F3                 0.677     0.031    21.887     0.000
    F4                 0.462     0.037    12.398     0.000

STANDARDIZED TOTAL, TOTAL INDIRECT, SPECIFIC INDIRECT, AND DIRECT EFFECTS
STDYX Standardization
                                                  Two-Tailed
                    Estimate      S.E. Est./S.E.    P-Value

Effects from F1 to F4
  Total                0.263     0.031     8.353     0.000
  Total indirect       0.263     0.031     8.353     0.000
  Specific indirect
    F4
    F3
    F1                 0.263     0.031     8.353     0.000

Effects from F2 to F4
  Total                0.384     0.030    12.590     0.000
  Total indirect       0.384     0.030    12.590     0.000
  Specific indirect
    F4
    F3
    F2                 0.384     0.030    12.590     0.000

MODEL MODIFICATION INDICES

Minimum M.I. value for printing the modification index     3.840

                            M.I.     E.P.C.  Std E.P.C.  StdYX E.P.C.
BY Statements
F3       BY Y1             6.537     0.144      0.184        0.187
WITH Statements
Y3       WITH Y2           5.262     0.115      0.115        0.336
Y4       WITH Y1           4.954    -0.052     -0.052       -0.133
Y4       WITH Y3           5.288     0.055      0.055        0.135
Y5       WITH Y3           4.367    -0.049     -0.049       -0.133
Y8       WITH Y2           5.716    -0.037     -0.037       -0.157
Y9       WITH Y6           4.853     0.036      0.036        0.133
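It can be useful to set this output beside the CFA version from Section 7. The sketch below is an illustration that the handout itself does not carry out. It assumes the CFA model, with its cross-loadings fixed at zero, is nested within the EFA version, so that a chi-square difference test applies, and it also compares the information criteria reported in the two outputs.

```python
# Fit statistics copied from the Section 7 (CFA) and Section 8 (EFA) outputs.
cfa = {"chi2": 53.492, "df": 50, "aic": 13047.662, "bic": 13216.247}
efa = {"chi2": 51.353, "df": 46, "aic": 13053.524, "bic": 13238.966}

CRITICAL_05_DF4 = 9.488   # chi-square critical value, 4 df, .05 level

diff_chi2 = cfa["chi2"] - efa["chi2"]
diff_df = cfa["df"] - efa["df"]
efa_fits_better = diff_chi2 > CRITICAL_05_DF4

print(f"chi-square difference = {diff_chi2:.3f} on {diff_df} df; "
      f"significant at .05: {efa_fits_better}")
print(f"lower AIC: {'CFA' if cfa['aic'] < efa['aic'] else 'EFA'}; "
      f"lower BIC: {'CFA' if cfa['bic'] < efa['bic'] else 'EFA'}")
```

Under these assumptions the four extra parameters of the EFA version do not buy a significant improvement in fit, which is consistent with the near-zero cross-loadings in the results.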


Section 9: Summary & Resources

This provides a brief introduction to Mplus. We have not covered any of the statistical theory underlying Mplus, but this should be enough for you to read the Manual and follow more complex explications of Mplus and SEM.

Key things to remember:

1. BY means "measured by" and is the path (loading) between a latent variable and its indicators.

2. ON is the structural path between variables. In the last example, F4 depends ON F3, and F3 depends ON both F1 and F2.

3. WITH means "correlated with." Two uses include:
   a. For exogenous variables, WITH means the exogenous variables are correlated. In the last example, F1 is correlated WITH F2.
   b. For indicators, WITH means the errors/residuals are correlated. In the last examples, the modification indices suggest we might correlate the error for Y3 WITH the error for Y5.

Additional introductory content is available at:

http://www.ats.ucla.edu/stat/mplus/

The Mplus webpage has a wealth of support, including articles applying Mplus that serve as models and extensive online videos:

http://www.statmodel.com

A very thorough but gentle discussion of CFA (including sample programs) is

Brown, Timothy A. (2006). Confirmatory Factor Analysis for Applied Research. N.Y.: Guilford Press.

A gentle introduction to SEM is

Kline, Rex B. (2005). Principles and Practice of Structural Equation Modeling (2nd ed.). N.Y.: Guilford Press.

