Brm statwiki


Data screening

Data screening (sometimes referred to as "data screaming") is the process of ensuring your data is clean and ready to go before you conduct further statistical analyses. Data must be screened in order to ensure the data is useable, reliable, and valid for testing causal theory. In this section I will focus on six specific issues that need to be addressed when cleaning (not cooking) your data.

Missing Data

If you are missing much of your data, this can cause several problems. The most apparent problem is that there simply won't be enough data points to run your analyses. The EFA, CFA, and path models require a certain number of data points in order to compute estimates. This number increases with the complexity of your model. If you are missing several values in your data, the analysis just won't run.

Additionally, missing data might represent bias issues. Some people may not have answered particular questions in your survey because of some common issue. For example, if you asked about gender, and females are less likely to report their gender than males, then you will have male-biased data. Perhaps only 50% of the females reported their gender, but 95% of the males reported gender. If you use gender in your causal models, then you will be heavily biased toward males, because you will not end up using the unreported responses.

To find out how many missing values each variable has, in SPSS go to Analyze, then Descriptive Statistics, then Frequencies. Enter the variables in the variables list. Then click OK. The table in the output will show the number of missing values for each variable. See the screenshots below.
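If you would rather run this check outside SPSS, the same count is a one-liner in Python with pandas. This is only a sketch; the file name and columns are hypothetical.

```python
import pandas as pd

df = pd.read_csv("survey.csv")  # hypothetical file of survey responses

# Number of missing values per variable (mirrors the SPSS Frequencies output)
print(df.isna().sum())

# Number of missing values per respondent (row)
print(df.isna().sum(axis=1).sort_values(ascending=False).head())
```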


The threshold for missing data is flexible, but generally, if you are missing more than 10% of the responses on a particular variable, or from a particular respondent, that variable or respondent may be problematic. There are several ways to deal with problematic variables.

Just don't use that variable.

If it makes sense, impute the missing values. This should only be done for continuous or interval data (like age or Likert-scale responses), not for categorical data (like gender).

If your dataset is large enough, just don't use the responses that had missing values for that variable. This may create a bias, however, if the number of missing responses is greater than 10%.

To impute values in SPSS, go to Transform, Replace Missing Values; then select the variables that need imputing, and hit OK. See the screenshots below. In this screenshot, I use the Mean replacement method, but there are other options, including Median replacement. Typically with Likert-type data, you want to use median replacement, because means are less meaningful in these scenarios.
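For those working outside SPSS, here is a rough pandas equivalent of mean and median replacement (not the SPSS procedure itself; the file and column names are made up):

```python
import pandas as pd

df = pd.read_csv("survey.csv")  # hypothetical file name

# Median replacement for Likert-type items (hypothetical item names)
likert_items = ["trust1", "trust2", "trust3"]
df[likert_items] = df[likert_items].fillna(df[likert_items].median())

# Mean replacement for a continuous variable such as age
df["age"] = df["age"].fillna(df["age"].mean())
```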

Handling problematic respondents is somewhat more difficult. If a respondent did not answer a large portion of the questions, their other responses may be useless when it comes to testing causal models. For example, if they answered questions about diet, but not about weight loss, for this individual we cannot test a causal model that argues that diet has a positive effect on weight loss. We simply do not have the data for that person. My recommendation is to first determine which variables will actually be used in your model (often we collect data on more variables than we actually end up using in our model), then determine if the respondent is problematic. If so, then remove that respondent from the analysis.

Outliers

Outliers can influence your results, pulling the mean away from the median. Two types of outliers exist: outliers for individual variables, and outliers for the model.


Univariate

To detect outliers on each variable, just produce a boxplot in SPSS (as demonstrated in the video). Outliers will appear at the extremes and will be labeled, as in the figure below. If you have a really high sample size, then you may want to remove the outliers. If you are working with a smaller dataset, you may want to be less liberal about deleting records. However, this is a trade-off, because outliers will influence small datasets more than large ones. Lastly, outliers do not really exist in Likert scales: answering at the extreme (1 or 5) is not really representative of outlier behavior.

Another type of outlier is an unengaged respondent. Sometimes respondents will enter '3, 3, 3, 3,...' for every single survey item. This participant was clearly not engaged, and their responses will throw off your results. Other patterns indicative of unengaged respondents are '1, 2, 3, 4, 5, 1, 2, ...' or '1, 1, 1, 1, 5, 5, 5, 5, 1, 1, ...'. There are multiple ways to identify and eliminate these unengaged respondents:

Include attention traps that request the respondent to "answer somewhat agree for this item if you are paying attention". I usually include two of these in opposite directions (i.e., one says somewhat agree and one says somewhat disagree) at about a third and two-thirds of the way through my surveys. I am always astounded at how many I catch this way...


See if the participant answered reverse-coded questions in the same direction as normal questions. For example, if they responded strongly agree to both of these items, then they were not paying attention: "I am very hungry", "I don't have much appetite right now".

Examine the standard deviation of their responses (if all items are on the same scale, such as 1-5). If they exhibit a very low standard deviation (less than 0.500 on a 5-point scale, or 0.700 on a 7-point scale), then they were probably not paying attention, and their responses are of little use anyway since they don't exhibit any variance.
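For the standard-deviation check above, a small pandas sketch (the item prefixes are hypothetical; the 0.500 threshold is the one from the text for a 5-point scale):

```python
import pandas as pd

df = pd.read_csv("survey.csv")  # hypothetical file name
likert_items = [c for c in df.columns if c.startswith(("trust", "loyalty"))]  # hypothetical item prefixes

# Standard deviation of each respondent's answers across the 5-point items
row_sd = df[likert_items].std(axis=1)

# Flag respondents with almost no variance in their answers
unengaged = df[row_sd < 0.5]
print(f"{len(unengaged)} potentially unengaged respondents")
```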

Multivariate

Multivariate outliers refer to records that do not fit the standard sets of correlations exhibited by the other records in the dataset, with regard to your causal model. So, if all but one person in the dataset reports that diet has a positive effect on weight loss, but this one guy reports that he gains weight when he diets, then his record would be considered a multivariate outlier. To detect these influential multivariate outliers, you need to calculate the Mahalanobis d-squared. This is a simple matter in AMOS. See the video tutorial for the particulars. As a warning, however, I almost never address multivariate outliers, as it is very difficult to justify removing them just because they don't match your theory. Additionally, you will nearly always find multivariate outliers; even if you remove some, more will show up. It is a slippery slope.
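AMOS reports the Mahalanobis d-squared for you; purely for reference, the same distance can be computed by hand in Python. This sketch assumes a small set of hypothetical model variables and uses a common (not universal) p < 0.001 cutoff.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2

df = pd.read_csv("survey.csv")                         # hypothetical file name
X = df[["diet", "exercise", "weight_loss"]].dropna()   # hypothetical model variables

# Mahalanobis d-squared for each record: distance from the multivariate centroid
diff = X.values - X.values.mean(axis=0)
inv_cov = np.linalg.inv(np.cov(X.values, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

# Under multivariate normality, d-squared follows a chi-square with df = number of variables
p = 1 - chi2.cdf(d2, df=X.shape[1])
print(X.index[p < 0.001])  # candidate multivariate outliers (common cutoff)
```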

Normality

Normality refers to the distribution of the data for a particular variable. We usually assume that the data is normally distributed, even though it usually is not! Normality is assessed in many different ways: shape, skewness, and kurtosis (flat/peaked).

Shape: To discover the shape of the distribution in SPSS, build a histogram (as shown in the video tutorial) and plot the normal curve. If the histogram does not match the normal curve, then you likely have normality issues. You can also look at the boxplot to determine normality.

Skewness: Skewness means that the responses did not fall into a normal distribution, but were heavily weighted toward one end of the scale. Income is an example of a commonly right-skewed variable; most people make between 20 and 70 thousand dollars in the USA, but there is a smaller group that makes between 70 and 100, an even smaller group that makes between 100 and 150, and a much smaller group that makes between 150 and 250, etc., all the way up to Bill Gates and Mark Zuckerberg. Skewness is much less meaningful on short-interval ordinal measures (like 5-point Likert scales), as any severity in skewness is better captured through kurtosis. Addressing skewness may require transformations of your data, or removing influential outliers. There are two rules of thumb for skewness:

(1) If your skewness value is greater than +1, you are positively (right) skewed; if it is less than -1, you are negatively (left) skewed; if it is in between, you are fine.

(2) If the absolute value of the skewness is less than three times its standard error, then you are fine; otherwise you are skewed.

Using these rules, we can see from the table below that all three variables are fine using the first rule, but using the second rule, they are all negatively (left) skewed.


Skewness looks like this:

Kurtosis: Kurtosis refers to the peakedness or flatness of the distribution of data (i.e., is there sufficient and normal variance?). Data that is distributed tightly around the mean (has a very small standard deviation) has kurtosis issues. Data that is distantly distributed (has a very large standard deviation) also has kurtosis issues. The rule for evaluating whether or not your kurtosis is problematic is the same as rule two above:

If the absolute value of the kurtosis is less than three times its standard error, then you are fine; otherwise you have kurtosis issues. A looser rule, however, is an overall kurtosis value of 2.200 or less, rather than 1.00 (Sposito et al., 1983).
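If you want to check these rules outside SPSS, here is a rough Python sketch. The sqrt(6/n) and sqrt(24/n) standard errors are the usual large-sample approximations, not the exact values SPSS prints, and the variable name is hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.stats import skew, kurtosis

df = pd.read_csv("survey.csv")     # hypothetical file name
x = df["income"].dropna()          # hypothetical continuous variable
n = len(x)

skewness = skew(x)                 # 0 for a perfectly symmetric distribution
excess_kurtosis = kurtosis(x)      # Fisher definition: 0 for a normal distribution

se_skew = np.sqrt(6.0 / n)         # approximate standard error of skewness
se_kurt = np.sqrt(24.0 / n)        # approximate standard error of kurtosis

print("Rule 1:", "skewed" if abs(skewness) > 1 else "fine")
print("Rule 2:", "skewed" if abs(skewness) > 3 * se_skew else "fine")
print("Kurtosis:", "problematic" if abs(excess_kurtosis) > 3 * se_kurt else "fine")
```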

Kurtosis looks like this:

Bimodal: One other issue you may run into with the distribution of your data is a bimodal distribution. This means that the data has multiple (two) peaks, rather than peaking at the mean. This may indicate that there are moderating variables affecting this data. A bimodal distribution looks like this:


Transformations: When you have extremely non-normal data, it will influence your regressions in SPSS and AMOS. In such cases, if you have non-Likert-scale variables (so, variables like age, income, revenue, etc.), you can transform them prior to including them in your model. Gary Templeton has published an excellent article on this and created a YouTube video showing how to conduct the transformation. He also references his article in the video.
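As one simple illustration (not Templeton's specific procedure, which is described in his article and video), a log transformation of a right-skewed, non-negative variable such as income might look like this:

```python
import numpy as np
import pandas as pd

df = pd.read_csv("survey.csv")              # hypothetical file name
# log1p handles zeros; only sensible for non-negative, right-skewed variables like income
df["income_log"] = np.log1p(df["income"])
```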

Linearity

Linearity refers to the consistent slope of change that represents the relationship between an IV and a DV. If the relationship between the IV and the DV is radically inconsistent, then it will throw off your SEM analyses. There are dozens of ways to test for linearity. Perhaps the most elegant (easy and clear-cut, yet rigorous) is the deviation from linearity test available in the ANOVA test in SPSS. In SPSS go to Analyze, Compare Means, Means. Put the composite IVs and DVs in the lists, then click on Options, and select "Test for Linearity". Then, in the ANOVA table in the output window, if the Sig value for Deviation from Linearity is less than 0.05, the relationship between the IV and the DV is not linear, and thus is problematic (see the screenshots below). Issues of linearity can sometimes be fixed by removing outliers (if the significance is borderline), or through transforming the data. In the screenshot below, we can see that the first relationship is linear (Sig = .268), but the second relationship is nonlinear (Sig = .003).


If this test turns up odd results, then simply perform an OLS linear regression between each IV->DV pair. If the sig value is less than 0.05, then the relationship can be considered "sufficiently" linear. While this approach is somewhat less rigorous, it has the benefit of working every time! You can also do a curvilinear regression ("curve estimation") to see if the relationship is more linear than nonlinear.
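A rough Python equivalent of that fallback OLS check (and a crude curve-estimation check) using statsmodels is sketched below; this is not the SPSS deviation-from-linearity F test, and the variable names are hypothetical composites.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("survey.csv")  # hypothetical file; diet and weight_loss are hypothetical composites

# Simple OLS: a significant slope suggests the relationship is "sufficiently" linear
linear = smf.ols("weight_loss ~ diet", data=df).fit()
print("linear term p-value:", linear.pvalues["diet"])

# Crude curve check: a significant squared term suggests the relationship bends
curved = smf.ols("weight_loss ~ diet + I(diet ** 2)", data=df).fit()
print("squared term p-value:", curved.pvalues["I(diet ** 2)"])
```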

Homoscedasticity

Homoscedasticity is a nasty word that means that a variable's residual (error) exhibits consistent variance across different levels of that variable. There are good reasons for desiring this. For more information, see Hair et al. 2010, chapter 2. :) A simple way to determine if a relationship is homoscedastic is to do a simple scatter plot with the variable on the y-axis and the variable's residual on the x-axis. To see a step-by-step guide on how to do this, watch the video tutorial. If the plot shows a consistent, random spread of points, as in the figure below, then we are good - we have homoscedasticity! If the spread is not consistent (e.g., the residuals fan out), then the relationship is considered heteroscedastic. This can be fixed by transforming the data or by splitting the data into subgroups (such as two groups for gender). You can read more about transformations in Hair et al. 2010, ch. 4.
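Here is a hedged Python sketch of the same residual plot, plus an optional Breusch-Pagan test as a numeric cross-check (the test is my addition, not something the text requires; the variable names are hypothetical):

```python
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
from statsmodels.stats.diagnostic import het_breuschpagan

df = pd.read_csv("survey.csv")  # hypothetical file and variable names
model = smf.ols("weight_loss ~ diet", data=df).fit()

# Residuals vs. fitted values: a consistent band suggests homoscedasticity,
# a funnel or fan shape suggests heteroscedasticity
plt.scatter(model.fittedvalues, model.resid, alpha=0.5)
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()

# Optional numeric check: a small p-value suggests heteroscedasticity
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(model.resid, model.model.exog)
print("Breusch-Pagan p-value:", lm_pvalue)
```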


The jury is still out on homoscedasticity. Some suggest that evidence of heteroscedasticity is not a problem (and is actually desirable and expected in moderated models), and so we shouldn't worry about testing for homoscedasticity. I never conduct this test unless specifically requested to by a reviewer.

Multicollinearity

Multicollinearity is not desirable. It means that the independent variables overlap in the variance they explain in the dependent variable, and thus are not each explaining unique variance in the dependent variable. The way to check this is to calculate a Variance Inflation Factor (VIF) for each independent variable after running a multivariate regression. The rules of thumb for the VIF are as follows:

VIF < 3: not a problem
VIF > 3: potential problem
VIF > 5: very likely a problem
VIF > 10: definitely a problem


The tolerance value in SPSS is directly related to the VIF (tolerance = 1/VIF), and values less than 0.10 are strong indications of multicollinearity issues. For particulars on how to calculate the VIF in SPSS, watch the step-by-step video tutorial. The easiest method for fixing multicollinearity issues is to drop one of the problematic variables. This won't hurt your R-square much, because that variable doesn't add much unique explanation of variance anyway.
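If you want the VIFs without SPSS, statsmodels can compute them directly; a sketch with hypothetical IV names:

```python
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

df = pd.read_csv("survey.csv")                                   # hypothetical file name
X = add_constant(df[["diet", "exercise", "tv_time"]].dropna())   # hypothetical IVs

# VIF per IV (skipping the constant); tolerance is simply 1 / VIF
for i, name in enumerate(X.columns):
    if name == "const":
        continue
    vif = variance_inflation_factor(X.values, i)
    print(f"{name}: VIF = {vif:.2f}, tolerance = {1 / vif:.2f}")
```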

Exploratory Factor Analysis

Exploratory Factor Analysis (EFA) is a statistical approach for determining the correlation among the variables in a dataset. This type of analysis provides a factor structure (a grouping of variables based on strong correlations). In general, an EFA prepares the variables to be used for cleaner structural equation modeling. An EFA should always be conducted for new datasets. The beauty of an EFA over a CFA (confirmatory) is that no a priori theory about which items belong to which constructs is applied. This means the EFA will be able to spot problematic variables much more easily than the CFA.

Rotation types

Rotation causes factor loadings to be more clearly differentiated, which is often necessary to facilitate interpretation. Several types of rotation are available for your use.

Orthogonal

Varimax (most common)

minimizes the number of variables with extreme loadings (high or low) on a factor
makes it possible to identify a variable with a factor

Quartimax

minimizes the number of factors needed to explain each variable
tends to generate a general factor on which most variables load with medium to high values
not very helpful for research

Equimax

combination of Varimax and Quartimax

Oblique

The variables are assessed for the unique relationship between each factor and the variables (removing relationships that are shared by multiple factors).


Direct oblimin (DO)

factors are allowed to be correlated
diminished interpretability

Promax (Use this one if you're not sure)

computationally faster than DO
used for large datasets

Factoring methods

There are three main methods for factor extraction.

Principal Component Analysis (PCA)

Use for a softer solution

Considers all of the available variance (common + unique) (places 1’s on diagonal of correlation matrix).

Seeks a linear combination of variables such that maximum variance is extracted—repeats this step.

Use when there is concern with prediction, parsimony and you know the specific and error variance are small.

Results in orthogonal (uncorrelated) factors.

Principal Axis Factoring (PAF)

Considers only common variance (places communality estimates on diagonal of correlation matrix).

Seeks least number of factors that can account for the common variance (correlation) of a set of variables.

PAF is only analyzing common factor variability; removing the uniqueness or unexplained variability from the model.

PAF is preferred because it accounts for co-variation, whereas PCA accounts for total variance.

Maximum Likelihood (ML)

Use this method if you are unsure

Maximizes differences between factors. Provides Model Fit estimate.


This is the approach used in AMOS, so if you are going to use AMOS for CFA and structural modeling, you should use this one during the EFA.
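If you ever need to reproduce the EFA outside SPSS, the third-party factor_analyzer package supports ML extraction with a Promax rotation. The sketch below is only an approximation of the SPSS output; the item prefixes and the number of factors are assumptions.

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.read_csv("survey.csv")                         # hypothetical file name
items = df.filter(regex="^(trust|loyalty)").dropna()   # hypothetical item prefixes

# Maximum likelihood extraction with an oblique (Promax) rotation; four factors assumed
fa = FactorAnalyzer(n_factors=4, method="ml", rotation="promax")
fa.fit(items)

print(pd.DataFrame(fa.loadings_, index=items.columns).round(3))  # pattern matrix
print(fa.get_communalities().round(3))
```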

Appropriateness of data (adequacy)

KMO Statistics

Marvelous: .90s
Meritorious: .80s
Middling: .70s
Mediocre: .60s
Miserable: .50s
Unacceptable: <.50

Bartlett’s Test of Sphericity

Tests hypothesis that correlation matrix is an identity matrix.

Diagonals are ones
Off-diagonals are zeros

A significant result (Sig. < 0.05) indicates matrix is not an identity matrix; i.e., the variables do relate to one another enough to run a meaningful EFA.
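The same two adequacy checks can be run in Python with the factor_analyzer package (a sketch; the file and item names are hypothetical):

```python
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

df = pd.read_csv("survey.csv")                         # hypothetical file name
items = df.filter(regex="^(trust|loyalty)").dropna()   # hypothetical item prefixes

chi_square, p_value = calculate_bartlett_sphericity(items)
kmo_per_item, kmo_overall = calculate_kmo(items)

print(f"Bartlett's test: chi-square = {chi_square:.1f}, p = {p_value:.4f}")  # want p < 0.05
print(f"Overall KMO = {kmo_overall:.3f}")                                    # interpret with the scale above
```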

Communalities

A communality is the extent to which an item correlates with all other items. Higher communalities are better. If the communality for a particular variable is low (between 0.0 and 0.4), then that variable may struggle to load significantly on any factor. In the table below, you should identify low values in the "Extraction" column. Low values indicate candidates for removal after you examine the pattern matrix.


Factor Structure

Factor structure refers to the intercorrelations among the variables being tested in the EFA. Using the pattern matrix below as an illustration, we can see that variables group into factors - more precisely, they "load" onto factors. The example below illustrates a very clean factor structure in which convergent and discriminant validity are evident from the high loadings within factors and the absence of cross-loadings between factors.


Convergent validity

Convergent validity means that the variables within a single factor are highly correlated. This is evident by the factor loadings. Sufficient/significant loadings depend on the sample size of your dataset. The table below outlines the thresholds for sufficient/significant factor loadings. Generally, the smaller the sample size, the higher the required loading. We can see that in the pattern matrix above, we would need a sample size of 60-70 at a minimum to achieve significant loadings for variables loyalty1 and loyalty7. Regardless of sample size, it is best to have loadings greater than 0.500 and averaging out to greater than 0.700 for each factor.


Discriminant validity

Discriminant validity refers to the extent to which factors are distinct and uncorrelated. The rule is that variables should relate more strongly to their own factor than to another factor. Two primary methods exist for determining discriminant validity during an EFA. The first method is to examine the pattern matrix. Variables should load significantly only on one factor. If "cross-loadings" do exist (variable loads on multiple factors), then the cross-loadings should differ by more than 0.2. The second method is to examine the factor correlation matrix, as shown below. Correlations between factors should not exceed 0.7. A correlation greater than 0.7 indicates a majority of shared variance (0.7 * 0.7 = 49% shared variance). As we can see from the factor correlation matrix below, factor 2 is too highly correlated with factors 1, 3, and 4.

Face validity


Face validity is very simple. Do the factors make sense? For example, are variables that are similar in nature loading together on the same factor? If there are exceptions, are they explainable? Factors that demonstrate sufficient face validity should be easy to label. For example, in the pattern matrix above, we could easily label factor 1 "Trust in the Agent" (assuming the variable names are representative of the measure used to collect data for this variable). If all the "Trust" variables in the pattern matrix above loaded onto a single factor, we may have to abstract a bit and call this factor "Trust" rather than "Trust in Agent" and "Trust in Company".

Reliability

Reliability refers to the consistency of the item-level errors within a single factor. Reliability means just what it sounds like: a "reliable" set of variables will consistently load on the same factor. The way to test reliability in an EFA is to compute Cronbach's alpha for each factor. Cronbach's alpha should be above 0.7; although, ceteris paribus, the value will generally increase for factors with more variables, and decrease for factors with fewer variables. Each factor should aim to have at least 3 variables, although 2 variables is sometimes permissible.
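SPSS computes Cronbach's alpha for you (Analyze, Scale, Reliability Analysis), but the formula is simple enough to sketch by hand; the item names below are hypothetical:

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for one factor's items (the columns of `items`)."""
    items = items.dropna()
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

df = pd.read_csv("survey.csv")                                        # hypothetical file name
print(cronbach_alpha(df[["trust1", "trust2", "trust3", "trust4"]]))   # hypothetical items; want > 0.7
```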

Formative vs. Reflective

Specifying formative versus reflective constructs is a critical preliminary step prior to further statistical analysis. Specification follows these guidelines:

Formative

Direction of causality is from measure to construct
No reason to expect the measures are correlated
Indicators are not interchangeable

Reflective

Direction of causality is from construct to measure
Measures are expected to be correlated
Indicators are interchangeable

An example of formative versus reflective constructs is given in the figure below.


Common EFA Problems

1. EFA that results in too many or too few factors (contrary to expected number of factors).

This happens all the time when you extract based on eigenvalues. I encourage students to use eigenvalues first, but then also to try constraining to the exact number of expected factors. Concerns arise when the eigenvalues extract fewer than expected, so constraining ends up extracting factors with very low eigenvalues (and therefore not very useful factors).

2. EFA with low communalities for some items.

This is a sign of low correlation and is usually corroborated by a low pattern matrix loading. I tell students not to remove an item just because of a low communality, but to watch it carefully throughout the rest of the EFA.

3. EFA with a 2nd order construct involved, as well as several first order constructs.

Often when there is a 2nd order factor in an EFA, the subdimensions of that factor will all load together, instead of in separate factors. In such cases, I recommend doing a separate EFA for the items of that 2nd order factor. Then, if that EFA results in removing some items to achieve discriminant validity, you can try putting the EFA back together with the remaining items (although it still might not work).

4. EFA with Heywood cases

Sometimes loadings are greater than 1.00. I don’t address these until I’ve addressed all other problems. Once I have a good EFA solution, if the Heywood case is still there (usually it resolves itself), I try a different rotation method (Varimax will fix it every time).


Some Thoughts on Messy EFAs

Let us say that you are doing an EFA and your pattern matrix ends up a mess. Let’s say that the items from one or two constructs do not load as expected no matter how you manipulate the EFA. What can you do about it? There is no right answer (this is statistics after all), but you do have a few options:

1. You can remove those constructs from the model and move forward without them.

This option is not recommended as it is usually the last course of action to take. You should always do everything in your power to retain constructs that are key to your theory.

2. You can run the EFA using a more exploratory approach without regard to expected loadings. For example, if you expected item foo3 to load with items foo1 and foo2, but instead it loaded with items moo1-3, then you should just let it. Then rename your factors according to what loaded on them.

This option is acceptable, but will lead you to produce a model that is probably somewhat different from the one you had expected to end up with.

3. You can say to yourself, “Why am I doing an EFA? These are established scales and I already know which items belong to which constructs. I do not need to explore the relationships between the items because I already know the relationships. So shouldn’t I be doing a CFA instead – to confirm these expectations?” And then you would simply report that you conducted an EFA out of due diligence, and these are the Cronbach's alpha scores, etc., but since you were using established scales, and things were a bit of a mess in the EFA, you decided to jump to the CFA to refine your measurement model (but that you will return to your EFA after your CFA).

Surveys are built with a priori constructs and theory in mind – or surveys are built from existing scales that have been validated in previous literature. Thus, we are less inclined to “explore” and more inclined to “confirm” when doing factor analysis. If you take this course of action, then you need to justify it as I have done above. The point of a factor analysis is to show that you have distinct constructs (discriminant validity) that each measures a single thing (convergent validity), and that are reliable (reliability). This can all be achieved in the CFA. However, you should then go back to the EFA and "confirm" the CFA in the EFA by setting up the EFA as your CFA turned out.

Why do I bring this up? Mainly because your EFAs are nearly always going to run messy, and because you can endlessly mess around with an EFA; if you believe everything your EFA is telling you, you will end up throwing away items and constructs unnecessarily, and thus you will end up letting statistics drive your theory, instead of the other way around. EFAs are exploratory and they can be treated as such. We want to retain as much as possible and still be producing valid results. I don’t know if this is emphasized enough in our quant courses. I also bring this up because I ran an EFA recently and got something I could not salvage without hacking a couple of constructs. However, after running the CFA with the full model (ignoring the EFA), I was able to retain all constructs by only removing a few items (and not the ones I expected based on the EFA!). I now have excellent reliability, convergent validity, and only a minor issue with discriminant validity that I’m willing to justify for the greater good of the model. I can now go back and reconcile my CFA with an EFA.

For a very rocky but successful demonstration of handling a troublesome EFA, watch my SEM Boot Camp 2014 Day 3 Afternoon Video towards the end. The link below will start you at the right time position. In this video, I take one of the seminar participant's data, which I had never seen before, and with which he had been unable to arrive at a clean EFA, and I struggle through it until we arrive at something valid and usable.

Confirmatory Factor Analysis

Confirmatory Factor Analysis (CFA) is the next step after exploratory factor analysis to determine the factor structure of your dataset. In the EFA we explore the factor structure (how the variables relate and group based on inter-variable correlations); in the CFA we confirm the factor structure we extracted in the EFA.

Model Fit

Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regards to the variables in our model), then we will have good fit; if not, then there is a significant "discrepancy" between the correlations proposed and the correlations observed, and thus we have poor model fit. Our proposed model does not "fit" the observed or "estimated" model (i.e., the correlations in the dataset). Refer to the CFA video tutorial for specifics on how to go about performing a model fit analysis during the CFA.

Metrics

There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model. Thus, the thresholds below are simply a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. 2010 on page 654. The thresholds listed in the table below are from Hu and Bentler (1999).


Modification indices

Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. Generally, we should not covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the most appropriate modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this rule. In general, you want to address the largest modification indices before addressing more minor ones. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter.


Standardized Residual Covariances

Standardized Residual Covariances (SRCs) are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether or not those discrepancies are significant. A significant standardized residual covariance is one with an absolute value greater than 2.58. Significant residual covariances significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices. The same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the CFA video tutorial. It should be noted, however, that in practice I never address SRCs unless I cannot achieve adequate fit via modification indices, because addressing the SRCs requires the removal of items.

Validity and Reliability

It is absolutely necessary to establish convergent and discriminant validity, as well as reliability, when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Variance (MSV), and Average Shared Variance (ASV). The video tutorial will show you how to calculate these values. The thresholds for these values are as follows:

Reliability

CR > 0.7

Convergent Validity

AVE > 0.5

Discriminant Validity

MSV < AVE
ASV < AVE
Square root of AVE greater than inter-construct correlations

If you have convergent validity issues, then your variables do not correlate well with each other within their parent factor; i.e., the latent factor is not well explained by its observed variables. If you have discriminant validity issues, then your variables correlate more highly with variables outside their parent factor than with the variables within their parent factor; i.e., the latent factor is better explained by some other variables (from a different factor) than by its own observed variables.

If you need to cite these suggested thresholds, please use the following:

Hair, J., Black, W., Babin, B., and Anderson, R. (2010). Multivariate data analysis (7th ed.): Prentice-Hall, Inc. Upper Saddle River, NJ, USA.

AVE is a strict measure of convergent validity. Malhotra and Dash (2011) note that "AVE is a more conservative measure than CR. On the basis of CR alone, the researcher may conclude that the convergent validity of the construct is adequate, even though more than 50% of the variance is due to error.” (Malhotra and Dash, 2011, p.702).
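Given the standardized loadings from your CFA output, CR and AVE follow the standard formulas; here is a hand-rolled sketch with made-up loadings (the video tutorial shows how to get the same quantities from the AMOS output):

```python
import numpy as np

def composite_reliability(loadings):
    """CR from the standardized loadings of one factor's indicators."""
    loadings = np.asarray(loadings)
    error_variance = (1 - loadings ** 2).sum()
    return loadings.sum() ** 2 / (loadings.sum() ** 2 + error_variance)

def average_variance_extracted(loadings):
    """AVE from the standardized loadings of one factor's indicators."""
    loadings = np.asarray(loadings)
    return (loadings ** 2).mean()

loadings = [0.82, 0.79, 0.74, 0.68]  # made-up standardized loadings for one latent factor
print(f"CR  = {composite_reliability(loadings):.3f}")                      # want > 0.7
print(f"AVE = {average_variance_extracted(loadings):.3f}")                 # want > 0.5
print(f"sqrt(AVE) = {np.sqrt(average_variance_extracted(loadings)):.3f}")  # compare to inter-construct correlations
```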

Common Method Bias (CMB)

Common method bias refers to a bias in your dataset due to something external to the measures. Something external to the question may have influenced the response given. For example, collecting data using a single (common) method, such as an online survey, may introduce systematic response bias that will either inflate or deflate responses. A study that has significant common method bias is one in which a majority of the variance can be explained by a single factor. To test for a common method bias you can do a few different tests. Each will be described below. For a step by step guide, refer to the video tutorials.

Harman’s single factor test

Harman's single factor test checks whether the majority of the variance can be explained by a single factor. To do this, constrain the number of factors extracted in your EFA to be just one (rather than extracting via eigenvalues). Then examine the unrotated solution. If CMB is an issue, a single factor will account for the majority of the variance in the model (as in the figure below).
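Outside SPSS, the same single-factor check can be approximated with the factor_analyzer package (a sketch; the item prefixes are hypothetical):

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.read_csv("survey.csv")                         # hypothetical file name
items = df.filter(regex="^(trust|loyalty)").dropna()   # hypothetical item prefixes

# One unrotated factor; check how much of the total variance it accounts for
fa = FactorAnalyzer(n_factors=1, rotation=None)
fa.fit(items)

ss_loadings, proportion, cumulative = fa.get_factor_variance()
print(f"Single factor explains {proportion[0]:.1%} of the variance")  # a majority suggests CMB
```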


Common Latent Factor

This method uses a common latent factor (CLF) to capture the common variance among all observed variables in the model. To do this, simply add a latent factor to your AMOS CFA model (as in the figure below), and then connect it to all observed items in the model. Then compare the standardized regression weights from this model to the standardized regression weights of a model without the CLF. If there are large differences (like greater than 0.200), then you will want to retain the CLF as you either impute composites from factor scores, or as you move on to the structural model. The CLF video tutorial demonstrates how to do this.

Marker Variable

This method is simply an extended and more accurate way to do the common latent factor method. For this method, just add another latent factor to the model (as in the figure below), but make sure it is something that you would not expect to correlate with the other latent factors in the model (i.e., the observed variables for this new factor should have low, or no, correlation with the observed variables from the other factors). Then add the common latent factor. This method teases out truer common variance than the basic common latent factor method because it is finding the common variance between unrelated latent factors. Thus, any common variance is likely due to a common method bias, rather than natural correlations. This method is demonstrated in the common method bias video tutorial.


Measurement Model Invariance

Before creating composite variables for a path analysis, configural and metric invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups, otherwise your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups).

Configural

Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual (here’s a reminder: Model Fit). If the resultant model achieves good fit, then you have configural invariance. If you don’t pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.

Metric

If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if you have a significant p-value for the chi-square difference test, then you have evidence of differences between groups; otherwise, they are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, instead of using the split dataset).
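The chi-square difference itself is trivial to evaluate once you have read the chi-square and degrees of freedom of the unconstrained and constrained models from the AMOS output; a sketch with made-up numbers:

```python
from scipy.stats import chi2

# Made-up values read off the AMOS output
chisq_unconstrained, df_unconstrained = 312.4, 140
chisq_constrained, df_constrained = 325.9, 150  # loadings constrained equal across groups

delta_chisq = chisq_constrained - chisq_unconstrained
delta_df = df_constrained - df_unconstrained
p_value = chi2.sf(delta_chisq, delta_df)

# p < 0.05 -> the groups differ (metric invariance not supported); otherwise invariant
print(f"delta chi-square = {delta_chisq:.1f}, delta df = {delta_df}, p = {p_value:.3f}")
```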

An even simpler and less time-consuming approach to metric invariance is to conduct a multigroup moderation test using critical ratios for differences in AMOS. Below is a video to explain how to do this. The video is about a lot of things in the CFA, but the link below will start you at the time point for testing metric invariance with critical ratios.

Contingency Plans

If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.

1. Modification indices: Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. So, for example, for males there might be a high MI for the covariance between e1 and e2, but for females this might not be the case. Go ahead and add those covariances appropriately for both groups. When you add a covariance, AMOS adds it for both groups, even if you only needed it for one of them. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights.

2. Regression weights: You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if the invariance issues are solved. If not, try addressing the second issue (explained next).

3. Standardized Residual Covariances: To address the second issue, you need to analyze the standardized residual covariances (check the residual moments box in the output tab). I talk about this a little bit in my video called “Model fit during a Confirmatory Factor Analysis (CFA) in AMOS” around the 8:35 mark. This matrix can also be toggled between groups. Here is a small example for CSRs and BCRs. We observe that for the BCR group rd3 and q5 have high standardized residual covariances with sw1. So, we could remove sw1 and see if that fixes things, but SW only has three items right now, so another option is to remove rd3 or q5 and see if that fixes things, and if not, then return to this matrix after rerunning things, and see if there are any other issues. Remove items sparingly, and only one at a time, trying your best to leave at least three items with each factor, although two items will also sometimes work if necessary (two just becomes unstable). If you still have issues, then your groups are exceptionally different… This may be due to small sample size for one of the groups. If such is the case, then you may have to list that as a limitation and just move on.

2nd Order Factors

Handling 2nd order factors in AMOS is not difficult, but it is tricky. And, if you don't get it right, it won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model. The YouTube video tutorial above demonstrates how to handle 2nd order factors, and explains how to report them.


Common CFA Problems

1. CFA that reaches iteration limit.

2. CFA that shows CMB = 0 (sometimes happens when paths from CLF are constrained to be equal)

The best approach to CMB is just to not constrain them to be equal. Instead, it is best to do a chi-square difference test between the unconstrained model (with CLF and marker if available) and the same model but with all paths from the CLF constrained to zero. This tells us whether the common variance is different from zero.

3. CFA with negative error variances

This shouldn’t happen if all data screening and EFA worked out well, but it still happens… In such cases, it is permitted to constrain the error variance to a small positive number (e.g., 0.001).

4. CFA with negative error covariances (sometimes shows up as “not positive definite”)

In such cases, there is usually a measurement issue deeper down (like skewness or kurtosis, or too much missing data, or a variable that is nominal). If it cannot be fixed by addressing these deeper down issues, then you might be able to correct it by moving the latent variable path constraint (usually 1) to another path. Usually this issue accompanies the negative error variance, so we can usually fix it by fixing the negative error variance first.

5. CFA with Heywood cases

This often happens when we have only two items for a latent variable, and one of them is very dominant. First try moving the latent variable path constraint to a different path. If this doesn’t work, then move the path constraint up to the latent variable variance constraint AND constrain the paths to be equal (by naming them both the same thing, like “aaa”).

6. CFA with discriminant validity issues

This shouldn’t happen if the EFA solution was satisfactory. However, it still happens sometimes when two latent factors are strongly correlated. This strong correlation is a sign of overlapping traits. For example, confidence and self-efficacy. These two traits are too similar. Either one could be dropped, or you could create a 2nd order factor out of them.

7. CFA with “missing constraint” error

Sometimes the CFA will say you need to impose 1 additional constraint (sometimes it says more than this). This is usually caused by drawing the model incorrectly. Check to see if all latent variables have a single path constrained to 1 (or the latent variable variance constrained to 1).

Structural Equation Modeling


“Structural equation modeling (SEM) grows out of and serves purposes similar to multiple regression, but in a more powerful way which takes into account the modeling of interactions, nonlinearities, correlated independents, measurement error, correlated error terms, multiple latent independents each measured by multiple indicators, and one or more latent dependents also each with multiple indicators. SEM may be used as a more powerful alternative to multiple regression, path analysis, factor analysis, time series analysis, and analysis of covariance. That is, these procedures may be seen as special cases of SEM, or, to put it another way, SEM is an extension of the general linear model (GLM) of which multiple regression is a part.“ http://www.pire.org/

SEM is an umbrella concept for analyses such as mediation and moderation. This wiki page provides general instruction and guidance regarding how to write hypotheses for different types of SEMs, what to do with control variables, mediation, interaction, multi-group analyses, and model fit for structural models. Videos and slide presentations are provided in the subsections.

Hypotheses

Hypotheses are a keystone to causal theory. However, wording hypotheses is clearly a struggle for many researchers (just select at random any article from a good academic journal, and count the wording issues!). In this section I offer examples of how you might word different types of hypotheses. These examples are not exhaustive, but they are safe.

Direct effects"Diet has a positive effect on weight loss"

"An increase in hours spent watching television will negatively effect weight loss"

Mediated effects

For mediated effects, be sure to indicate the direction of the mediation (positive or negative), the degree of the mediation (partial, full, or simply indirect), and the direction of the mediated relationship (positive or negative).

"Exercise positively and partially mediates the positive relationship between diet and weight loss"

"Television time positively and fully mediates the positive relationship between diet and weight loss"

"Diet affects weight loss positively and indirectly through exercise"

Interaction effects"Exercise positively moderates the positive relationship between diet and weight loss"

"Exercise amplifies the positive relationship between diet and weight loss"

"TV time negatively moderates (dampens) the positive relationship between diet and weight loss"

Multi-group effects"Body Mass Index (BMI) moderates the relationship between exercise and weight loss, such that for those with a low BMI, the effect is negative (i.e., you gain weight - muscle mass), and for those with a high BMI, the effect is positive (i.e., exercising leads to weight loss)"


"Age moderates the relationship between exercise and weight loss, such that for age < 40, the positive effect is stronger than for age > 40"

"Diet moderates the relationship between exercise and weight loss, such that for western diets the effect is positive and weak, for eastern (asia) diets, the effect is positive and strong"

Mediated Moderation

An example of a mediated moderation hypothesis would be something like:

“Ethical concerns strengthen the negative indirect effect (through burnout) between customer rejection and job satisfaction.”

In this case, the IV is customer rejection, the DV is job satisfaction, burnout is the mediator, and the moderator is ethical concerns. The moderation is conducted through an interaction. However, if you have a categorical moderator, it would be something more like this (using gender as the moderator):

“The negative indirect effect between customer rejection and job satisfaction (through burnout) is stronger for men than for women.”

Handling controls

When including controls in hypotheses (yes, you should include them), simply add at the end of any hypothesis "when controlling for... [list control variables here]". For example:

"Exercise positively moderates the positive relationship between diet and weight loss when controlling for TV time and diet"

"Diet has a positive effect on weight loss when controlling for TV time and diet"

Another approach is to state somewhere above your hypotheses (while you're setting up your theory) that all your hypotheses take into account the effects of the following controls: A, B, and C. And then make sure to explain why.

Supporting Hypotheses

Getting the wording right is only part of the battle, and it is mostly useless if you cannot support your reasoning for WHY you think the relationships proposed in the hypotheses should exist. Simply saying X has a positive effect on Y is not sufficient to make a causal statement. You must then go on to explain the various reasons behind your hypothesized relationship. Take diet and weight loss for example. The hypothesis is, "Diet has a positive effect on weight loss". The supporting logic would then be something like:

Weight is gained as we consume calories. Diet reduces the number of calories consumed. Therefore, the more we diet, the more weight we should lose (or the less weight we should gain).

Controls

Controls are potentially confounding variables that we need to account for, but that don’t drive our theory. For example, in Dietz and Gortmaker 1985, their theory was that TV time had a negative effect on school performance. But there are many things that could affect school performance, possibly even more than the amount of time spent in front of the TV. So, in order to account for these other potentially confounding variables, the authors control for them. They are basically saying that regardless of IQ, time spent reading for pleasure, hours spent doing homework, or the amount of time parents spend reading to their child, an increase in TV time still significantly decreases school performance. These relationships are shown in the figure below.

As a cautionary note, you should nearly always include some controls; however, these control variables still count against your sample size calculations. So, the more controls you have, the higher your sample size needs to be. You also get a higher R-square, but with increasingly smaller gains for each added control. Sometimes you may even find that adding a control “drowns out” all the effects of the IVs; in such a case you may need to run your tests without that control variable (but then you can only say that your IVs, though significant, only account for a small amount of the variance in the DV). With that in mind, you can’t and shouldn't control for everything, and as always, your decision to include or exclude controls should be based on theory.

Handling controls in AMOS is easy, but messy (see the figure below). You simply treat them like the other exogenous variables (the ones that don’t have arrows going into them), and have them regress on whichever endogenous variables they may logically affect. In this case, I have valShort, a potentially confounding variable, as a control, with regards to valLong. And I have LoyRepeat as a control on LoyLong. I’ve also covaried the Controls with each other and with the other exogenous variables. When using controls in a moderated mediation analysis, go ahead and put the controls in at the very beginning. Covarying control variables with the other exogenous variables can be done based on theory, rather than as default.


When reporting the model, you do need to include the controls in all your tests and output, but you should consolidate them at the bottom where they can be out of the way. Also, just so you don’t get any crazy ideas, you would not test for any mediation between a control and a dependent variable. However, you may report how the control affects a dependent variable differently based on a moderating variable. For example, valShort may have a stronger effect on valLong for males than for females. This is something that should be reported, but not necessarily focused on, as it is not likely a key part of your theory. Lastly, even if effects from controls are not significant, you do not need to trim them from your model (although there are also other schools of thought on this issue).

Mediation

Concept

Mediation models are used to describe chains of causation. Mediation is often used to provide a more accurate explanation for the causal effect the antecedent has on the dependent variable. The mediator is usually the variable that is the missing link in a chain of causation. For example, intelligence leads to increased performance - but not in all cases, as not all intelligent people are high performers. Thus, some other variable is needed to explain the reason for the inconsistent relationship between IV and DV. This other variable is called a mediator. In this example, work effectiveness may be a good mediator. We would say that work effectiveness fully and positively mediates the relationship between intelligence and performance. Thus, the direct relationship between intelligence and performance is better explained through the mediator of work effectiveness. The logic is, even if you are intelligent, if you don't work smarter, then you won't perform well. However, intelligent people tend to work smarter (but not always). Thus, when intelligence leads to working smarter, then we observe greater performance.


Types

There are three main types of simple mediation: 1) partial, 2) full, and 3) indirect. Partial mediation means that both the direct and indirect effects from the IV to the DV are significant. Full means that the direct effect drops out of significance when the mediator is present, and that the indirect effect is significant. Indirect means that the direct effect never was significant, but that the indirect effect is. The figure below illustrates these types of mediation. Please refer to the step by step guide listed above for determining significance of the mediation.

There is one less common form of mediation called "competitive" mediation. In this case, the direct effect between IV and DV is “neutralized” when the mediator is absent. When the mediator is added, the direct effect becomes significant (often to the researchers’ surprise) and is usually in the opposite direction theorized, while the indirect path is observed to be significant and in the theorized direction. In such cases, the IV has dual effects on the DV, which can only be separated when the mediator, acting somewhat like a prism, bifurcates the competing (and neutralizing) effects. Zhao et al. (2010, "Reconsidering Baron and Kenny") discuss this.
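For determining the significance of an indirect effect, a percentile bootstrap of a*b is one common approach (AMOS offers bootstrapping as well). The sketch below uses the intelligence/effectiveness/performance example with hypothetical variable names; it is an illustration, not the step-by-step guide referenced above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("survey.csv").dropna(
    subset=["intelligence", "effectiveness", "performance"]  # hypothetical composites
)

def indirect_effect(data):
    # a path: IV -> mediator; b path: mediator -> DV, controlling for the IV
    a = smf.ols("effectiveness ~ intelligence", data=data).fit().params["intelligence"]
    b = smf.ols("performance ~ effectiveness + intelligence", data=data).fit().params["effectiveness"]
    return a * b

rng = np.random.default_rng(42)
boot = np.array([
    indirect_effect(df.iloc[rng.integers(0, len(df), len(df))])  # resample rows with replacement
    for _ in range(1000)
])

lower, upper = np.percentile(boot, [2.5, 97.5])
print(f"Indirect effect 95% CI: [{lower:.3f}, {upper:.3f}]")  # a CI excluding zero -> significant mediation
```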


Interaction

Concept

In factorial designs, interaction effects are the joint effects of two predictor variables in addition to the individual main effects. This is another form of moderation (along with multi-grouping) - i.e., the X to Y relationship changes form (gets stronger, weaker, changes signs) depending on the value of another explanatory variable (the moderator). So, for example:

you lose 1 pound of weight for every hour you exercise
you lose 1 pound of weight for every 500 calories you cut back from your regular diet
but when you exercise while dieting, you lose 2 pounds for every 500 calories you cut back from your regular diet, in addition to the 1 pound you lose for exercising for one hour; thus, in total, you lose three pounds

So, the multiplicative effect of exercising while dieting is greater than the additive effects of doing one or the other. Here is another simple example:

Chocolate is yummy
Cheese is yummy
but combining chocolate and cheese is yucky!

The following figure is an example of a simple interaction model.

Types

Interactions enable more precise explanation of causal effects by providing a method for explaining not only how X affects Y, but also under what circumstances the effect of X changes depending on the moderating variable of Z. Interpreting interactions is somewhat tricky. Interactions should be plotted (as demonstrated in the tutorial video). Once plotted, the interpretation can be made using the following four examples (in the figures below) as a guide. My most recent Stats Tools Package provides these interpretations automatically.
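As a rough illustration of how an interaction is estimated and plotted outside AMOS (not the Stats Tools Package procedure itself), here is a statsmodels sketch with hypothetical variable names:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf

df = pd.read_csv("survey.csv")  # hypothetical file and variable names

# Mean-center the predictors, then estimate the moderation (interaction) term
for col in ["diet", "exercise"]:
    df[col + "_c"] = df[col] - df[col].mean()
model = smf.ols("weight_loss ~ diet_c * exercise_c", data=df).fit()
print(model.params["diet_c:exercise_c"], model.pvalues["diet_c:exercise_c"])

# Simple slopes of diet at low (-1 SD) and high (+1 SD) levels of exercise
diet_range = np.linspace(df["diet_c"].min(), df["diet_c"].max(), 50)
for label, z in [("low exercise (-1 SD)", -df["exercise_c"].std()),
                 ("high exercise (+1 SD)", df["exercise_c"].std())]:
    pred = model.predict(pd.DataFrame({"diet_c": diet_range, "exercise_c": z}))
    plt.plot(diet_range, pred, label=label)
plt.xlabel("Diet (centered)")
plt.ylabel("Predicted weight loss")
plt.legend()
plt.show()
```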


Model fit again

You already did model fit in your CFA, but you need to do it again with your structural model in order to demonstrate sufficient exploration of alternative models. The method is the same: look at modification indices, residuals, and standard fit measures like CFI, RMSEA, etc. The one thing that should be noted here in particular, however, is the logic that should determine how you apply the modification indices to error terms.

If the correlated variables are not logically causally related, but merely statistically correlated, then you may covary the error terms in order to account for the systematic statistical correlation without implying a causal relationship.

- e.g., burnout from customers is highly correlated with burnout from management. We expect these to have similar values (residuals) because they are logically similar and have similar wording in our survey, but they do not necessarily have any causal ties.

If the correlated variables are logically causally related, then simply add a regression line instead.

- e.g., burnout from customers is highly correlated with satisfaction with customers. We expect burnC to predict satC, so not accounting for it is negligent.

Lastly, remember, you don't need to create the BEST fit, just good fit. If a BEST fit model (i.e., one in which all modification indices are addressed) isn't logical, or does not fit with your theory, you may need to simply settle for a model that has worse (yet sufficient) fit, and then explain why you did not choose the better-fitting model. For more information on when it is okay to covary error terms (because there are other appropriate reasons), refer to David Kenny's thoughts on the matter: David's website

Multi-group

Multi-group moderation is a special form of moderation in which a dataset is split along the values of a categorical variable (such as gender), and then a given model is tested with each set of data. Using the gender example, the model is tested for males and females separately. Multi-group moderation is used to determine whether the relationships hypothesized in a model differ based on the value of the moderator (e.g., gender). Take the diet and weight-loss hypothesis, for example: a multi-group moderation model would answer the question, does dieting affect weight loss differently for males than for females? In the videos above, you will learn how to set up a multi-group model in AMOS and test it using chi-square differences and using critical ratios. Using critical ratios takes about one minute after the model is set up and leaves little room for human error, whereas the chi-square method can take upwards of 30 minutes and leaves a lot of room for human error. So, I recommend the easy method!
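For reference, the chi-square difference computation behind the slower method boils down to the following Python sketch (an illustration, not AMOS output; the fit statistics below are made up). A significant p-value indicates that constraining the paths to be equal across groups worsens fit, i.e., moderation is present.

from scipy.stats import chi2

# Made-up fit statistics from the two multi-group models (illustration only).
chisq_constrained, df_constrained = 312.4, 150      # paths constrained equal across groups
chisq_unconstrained, df_unconstrained = 301.1, 146  # paths estimated freely per group

delta_chisq = chisq_constrained - chisq_unconstrained
delta_df = df_constrained - df_unconstrained
p_value = chi2.sf(delta_chisq, delta_df)

print(f"delta chi-square = {delta_chisq:.2f}, delta df = {delta_df}, p = {p_value:.4f}")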

From Measurement Model to Structural Model

Many of the examples in the videos so far have taught concepts using a set of composite variables (instead of latent factors with observed items). Many will want to utilize the full power of SEM by building true structural models (with latent factors). This is not a difficult thing. Simply remove the covariance arrows from your measurement model (after CFA), then draw single-headed arrows from IVs to DVs. Make sure you put error terms on the DVs, then run it. It's that easy. Refer to the video for a demonstration.

Creating Composites from Latent Factors

If you would like to create composite variables (as used in many of the videos) from latent factors, it is an easy thing to do. However, you must remember two very important caveats:

You are not allowed to have any missing values in the data used. These will need to be imputed beforehand in SPSS or Excel (I have two tools for this in my Stats Tools Package - one for imputing, and one for simply removing the entire row that has missing data).

Latent factor names must not have any spaces or hard returns in them. They must be single continuous strings ("FactorOne" or "Factor_One" instead of "Factor One").

After those two caveats are addressed, then you can simply go to the Analyze menu, and select Data Imputation. Select Regression Imputation, and then click on the Impute button. This will create a new SPSS dataset with the same name as the current dataset except it will be followed by an "_C". This can be found in the same folder as your current dataset.


PLS

Partial Least Squares (PLS) is another method for testing causal models (in addition to the Covariance Based Methods used in AMOS and Lisrel). This section on PLS is intended to demystify the process of conducting an analysis, start to finish, using PLS-graph. Trust me, it needs demystification! I am not going to get into the deep logic and math behind the methods I outline here. This wiki is simply intended to be used as a "How To" for PLS-graph. For more references and technical explanations, please refer to Wynne Chin's website: http://www.plsgraph.com/. I have also listed several videos for SmartPLS 2.0 at the bottom of this page, and an article about when to choose PLS and how to use it.

Installing PLS-graph

Simply installing PLS-graph is a complex process. Instructions for getting the full version can be found here:  for the demo version. The demo version is rather limiting and only allows you to test models with ten variables or fewer. In order to obtain a more useful license, you will have to contact Wynne Chin directly: [email protected]. Sadly, I am not allowed to distribute the full license freely, and requests directed toward me for it must be rejected.

IMPORTANT: Once you have PLS-graph installed (by clicking on that link and running the file), make sure you have a "license.dat" or "license" file in your plsgraph folder, typically found at C:\Program Files\plsgraph, but sometimes at C:\Program Files (x86)\plsgraph if you are running a 64-bit machine.

Troubleshooting

Opening PLS-graph

If you get one of the following errors, then you either don't have a valid license from Wynne Chin, or you have not placed the license in the proper directory. See the installation section above for more details.


Linking Data

To "open" or "link" a dataset in PLS-graph, you need to click on File --> Links, NOT File --> Open. From File --> Links, browse for your dataset.

Your data must be in .raw format or else it will not show up in the browse window. To get your data into .raw format, you need to follow the guidelines in this quick tutorial: Creating .raw files

Basically, you need to:

1. Save your dataset as "tab delimited" (.dat) from SPSS or Excel, or whatever program you are using to view your data.

2. Change the file extension from .dat to .raw (say yes if an error pops up).
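If you prefer to script the conversion, here is a minimal Python/pandas sketch of the same two steps; the file names are placeholders.

import os
import pandas as pd

# Placeholder file names (illustration only).
df = pd.read_spss("mydata.sav")  # requires the pyreadstat package; or pd.read_excel("mydata.xlsx")
df.to_csv("mydata.dat", sep="\t", index=False)  # save as tab-delimited
os.rename("mydata.dat", "mydata.raw")           # PLS-graph expects the .raw extension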


If you get the following error when linking data, then there is a problem in the dataset:

The problem is most likely one of the following:

[1] You have blank or missing values that have not been recoded.
[2] You have non-numeric values (other than variable names in the first row).
[3] You have excessively long numbers (too many decimal places, e.g., 0.978687677664826355281).
[4] You have scientific notation (e.g., 3.23E-08 instead of 0.0000000323).

Fixes for these issues:

[1] Replace all missing values in your dataset with a constant that is otherwise unused in the dataset (something like -1). You can do this in Excel or SPSS by doing a quick Find and Replace (Control+H). Or, you can impute those missing values (if appropriate) in SPSS using the Replace Missing Values function in the Transform menu.

[2] If you have non-numeric data, you need to convert it to numbers (if appropriate). For example, if you have values like "Low", "Medium", "High", you instead need to use something like "1", "2", "3", where 1 = Low, etc. This can also be done with a find and replace. You may also need to simply remove some columns from your dataset because they cannot be used in PLS. For example, if you have email addresses or usernames in your dataset, those can simply be removed because they cannot be meaningfully converted into numeric data.

[3] If your numbers have too many decimal places, then in Excel simply decrease the number of displayed decimals using the Decrease Decimal button. If you are using SPSS, then you need to do some fancy copy-and-paste work: copy the offending columns into Excel, reduce the number of decimals, then copy and paste the new values into those same columns in SPSS.

[4] If you have scientific notation, this is probably because you were using Excel at some point and the numbers were either formatted explicitly as "Scientific" or were inferred to be scientific but formatted by default as "General". To fix this, simply change the formatting to "Number". See the picture below for how to access these formats in Excel 2010.
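For those working in Python rather than Excel or SPSS, here is a hedged pandas sketch of the same four fixes; the file and column names are invented for illustration.

import pandas as pd

df = pd.read_csv("mydata.dat", sep="\t")  # placeholder file name

# [1] Recode blank/missing values to an otherwise unused constant.
df = df.fillna(-1)

# [2] Convert non-numeric codes to numbers (hypothetical 'Effort' column).
df["Effort"] = df["Effort"].map({"Low": 1, "Medium": 2, "High": 3})

# [3] Trim excessive decimal places.
df = df.round(4)

# [4] Write plain decimals rather than scientific notation.
df.to_csv("mydata.raw", sep="\t", index=False, float_format="%.4f")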


Crashes

PLS-graph tends to crash quite frequently if you are testing a complex model (over 30 variables) and/or have a large sample size (over 500). To fix this, you need to manually increase the amount of memory allocated for running the PLS algorithms. Go to Options --> Memory and then add a couple of zeros to each row. In the picture below, I've added two zeros to each row.

You may also want to just wait for a few seconds after the program runs before hitting the Okay button. This will give it time to settle, and will result in fewer crashes.

Above all, SAVE OFTEN!

Sample Size Rule

PLS has a great advantage over Covariance Based Methods (CBM) because it requires fewer data points to accurately estimate loadings. The rule of thumb for CBM is 10 times the number of parameters or variables in the model. So if you have 20 variables, then you need 200 usable rows in your dataset. In PLS, the rule is much looser: you need 10 times the number of indicators for the most predicted construct. So, for example, if you have a latent construct that is predicted by 6 indicators, another predicted by 3, and another predicted by 4, then you would only need 6 times 10, or 60, usable rows. If a construct is also being predicted in a causal model by other latent constructs, then those incoming paths need to be considered as well. So, for example, in the model below, the required sample size would be 90: 70 for the measured indicators and 20 for the latent predictors.


Factor Analysis

By factor analysis I mean the measurement and estimation of latent constructs, excluding any causal relationships between latent constructs. In PLS, latent constructs can be estimated formatively or reflectively, whereas in CBM all constructs are measured reflectively. The difference between these two types of models is important and should not be disregarded. For more information on reflective versus formative measures and models, please refer to the section on Formative vs. Reflective models. As for how to conduct a factor analysis in PLS-graph, it would be much simpler just to show you, so please see the video above.

Testing Causal Models

The first video listed above demonstrates the entire process of testing a causal model. The second video explains how to fix your bootstrapping options so that you can increase your t-statistic (and as a result, decrease your p-value).

The basic steps for testing causal models are as follows:

1. After doing a factor analysis in PLS-graph, connect the constructs using the connector tool.

2. Change the inner weightings to Path (instead of Factor) - this is in the Options --> Run menu.

3. Run the model.

4. Trim weak paths in the model.

5. Run a bootstrap in order to obtain t-statistics, composite reliabilities, and AVEs.

6. Trim indicators based on the t-statistics.

7. Compute p-values from the t-statistics (using Excel's function: =T.DIST.2T(x, deg_freedom)).

The basic steps for increasing your t-statistic are as follows:


1. Go to Options --> Resampling.

2. Change the Number of Samples to a number greater than your sample size.

3. Change the Cases per Sample to a number that is a majority of your sample size. Or, just put a zero there and the bootstrap will use the entire sample size, but include replacements (or estimated values) for removed cases. The latter option (using zero) will usually give you the highest t-statistic.

4. Run the bootstrap as usual.

Effect Strength (f-squared)

The t-statistic produced in PLS-graph and used to calculate p-values is easily inflated when using a large sample size (greater than about 300). So you can run a model with a path coefficient of 0.048 and the t-statistic will still be significant. But a path coefficient of 0.048 is not practically significant, only statistically significant. In cases like these, the best thing to do is to calculate an f-squared to demonstrate the actual strength of the effect. The f-squared relies on the change in the r-squared, rather than on the size or significance of the path coefficient. The f-squared is calculated as follows:
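Reconstructed from that description, the standard formula (presumably what the Excel tool computes) is

$$f^2 = \frac{R^2_{\text{included}} - R^2_{\text{excluded}}}{1 - R^2_{\text{included}}}$$

where R-squared "included" is the r-squared of the DV with the predictor in the model and R-squared "excluded" is the r-squared with it removed.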

I have made a quick tool for calculating this in Excel. It is in the EffectSize tab of the Stats Tools Package. Looks like this:
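If you would rather script it, a minimal Python sketch of both calculations (the two-tailed p-value, equivalent to Excel's =T.DIST.2T, and the f-squared) looks like this; the numbers are placeholders.

from scipy import stats

# Placeholder values (illustration only).
t_stat, df = 2.45, 380                  # bootstrap t-statistic and degrees of freedom
r2_included, r2_excluded = 0.42, 0.39   # r-squared with and without the predictor

p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-tailed p-value
f_squared = (r2_included - r2_excluded) / (1 - r2_included)

print(f"p = {p_value:.4f}, f-squared = {f_squared:.3f}")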

Testing Group Differences

Testing for differences between groups for a given causal model is a big pain in PLS... just as it is in CBM software like AMOS. Hopefully the video tutorial I've made will demystify the process. You will also need the Stats Tools Package Excel workbook referenced on the StatWiki homepage. The basic steps are as follows:

1. Use case selection to run the model for one group at a time. This can be found in the Options --> Run menu. For example, you might use gender as the grouping variable. In your dataset, gender should be indicated by a 1 or 2, where 1 = male and 2 = female. Then in PLS-graph you can select gender as the selection variable and specify a value (such as 1) to test for just one gender at a time (see the picture below).

2. Obtain the regression weight for each group from running the model.

3. To obtain the standard errors, you need to run a bootstrap using a separate dataset for each group. Bootstrapping in PLS-graph does not take case selection into account (as in the picture below).

4. Plug these values, along with the sample size for each group, into the Stats Tools Package X2 Threshold tab.

5. This will calculate a t-statistic and p-value for you. The larger the sample sizes, the stronger the p-value.
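For orientation only, here is a simplified Python sketch of one common way to compare a path coefficient across two groups from the regression weights, standard errors, and sample sizes; the Stats Tools Package may use a different pooled-variance formula, and all numbers below are placeholders.

from scipy import stats

# Placeholder values from the two group runs (illustration only).
b1, se1, n1 = 0.45, 0.08, 180   # e.g., males
b2, se2, n2 = 0.28, 0.09, 165   # e.g., females

# Simple (unpooled) comparison of the two path coefficients.
t_stat = (b1 - b2) / ((se1**2 + se2**2) ** 0.5)
df = n1 + n2 - 2
p_value = 2 * stats.t.sf(abs(t_stat), df)  # two-tailed

print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.4f}")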

Handling Missing Data

PLS-graph cannot handle missing values that are left blank. If there are blank portions of your dataset, the data simply will not load. To avoid having to remove or impute all these missing values, you can just recode them using some constant number that is never used elsewhere in the dataset. For example, if your data come from surveys that used 7-point Likert scales, then you could use the number 8 as the proxy for missing values, or the number -1, or 1,111,111,001, or whatever you want, as long as it isn't a number 1 through 7. Common practice is to use -1. In SPSS or Excel, just hit Control+H and replace all blanks with -1. WARNING: in SPSS, this will only replace missing values within a specified column, whereas in Excel it will replace missing values for the entire dataset. In SPSS it looks like this, with the Find value blank and the Replace value set to -1:


Then, in PLS-graph, you need to specify the value for missing data. This is done in the Options --> Run menu, as shown below.

Reliability and Validity

So how do you test for reliability and validity in PLS-graph? And what do you do with formative measures? There are different schools of thought, and different approaches. In the video above, I will show you one of these that is usually acceptable (depending on reviewers). The basic guidelines are as follows:

- Reliability: demonstrated by Composite Reliability greater than 0.700.
- Convergent Validity: demonstrated by loadings greater than 0.700, AVE greater than 0.500, and communalities greater than 0.500.
- Discriminant Validity: demonstrated by the square root of the AVE being greater than any of the inter-construct correlations.
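As a rough scripted check of the same guidelines (an illustration only; the loadings, reliability, and correlations below are invented):

import numpy as np

# Invented output for one construct and its correlations with the other constructs.
composite_reliability = 0.86
loadings = np.array([0.78, 0.81, 0.74, 0.83])
ave = float(np.mean(loadings ** 2))              # average variance extracted
correlations_with_others = np.array([0.41, 0.55, 0.38])

print("Reliability OK:", composite_reliability > 0.70)
print("Convergent validity OK:", bool((loadings > 0.70).all()) and ave > 0.50)
print("Discriminant validity OK:", np.sqrt(ave) > correlations_with_others.max())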


Formative Measures: As I said, there are different schools of thought. Some say that reliability and convergent validity are actually flawed metrics when evaluating formative measures, because formative measures do not necessarily have highly correlated indicators. However, the formative measure should have some common theme. Thus, I argue that for formative measures, high loadings and communalities should still be present in order to have a strong construct. Nevertheless, if you don't achieve the recommended thresholds, you can probably argue your case.

In the end, you want a table that looks something like this:

Common Method Bias

There are several different methods for testing whether the use of a common method introduced bias into your data. My preferred method, probably the most accurate but also the most stringent, is to use a marker variable to draw out the common variance with theoretically unrelated constructs, which would point to some systematic variance explained by an external factor (such as a common method of data collection). To employ a marker variable in PLS-graph, you need to create a latent construct that is theoretically dissimilar to the other constructs in the model. For example, if I am doing a factor analysis with the following variables: Satisfaction, Burnout, Rejection, and Ethical Concerns, I can choose a marker variable like Apathy, and then look at the correlations between the other constructs and this construct. The correlations should be low - like less than 0.300. Squaring the highest correlation between the Marker and another construct will give you the maximum percentage of shared variance. Additionally, you can look at the correlations between the other factors. None of those correlations should be greater than 0.700 (for discriminant validity) and definitely not greater than 0.900 for common method bias.

So, given the correlation matrix below (from the .lst output), we can say that the maximum shared variance with the Marker variable is less than 1% (.075 squared), and none of the other correlations begin to approach the 0.900 threshold. Thus, there is no evidence that a common method bias exists.


Interaction

To perform an interaction in PLS-graph, you need to create an Interaction Construct that is composed of the products of the indicators for the IV and the moderating variable. The picture below on the left is the conceptual model we are testing. The picture below on the right is the way we measure it in PLS-graph.

Standardizing variables before multiplying them for interactions is no longer considered necessary, as the assumed benefit of reducing multicollinearity has been debunked in several recent articles.

To test the significance of the effect, just do a bootstrap like you would for any other effect, then calculate the p-value from the t-statistic as discussed in the Testing Causal Models section.
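A hedged pandas sketch of building those product indicators by hand (the file and column names are hypothetical):

import pandas as pd

df = pd.read_csv("mydata.raw", sep="\t")  # placeholder file name

iv_items = ["IV1", "IV2", "IV3"]      # hypothetical indicators of the IV
mod_items = ["Mod1", "Mod2", "Mod3"]  # hypothetical indicators of the moderator

# Each product of an IV indicator and a moderator indicator becomes an
# indicator of the interaction construct.
for iv in iv_items:
    for mod in mod_items:
        df[f"{iv}x{mod}"] = df[iv] * df[mod]

df.to_csv("mydata_with_interaction.raw", sep="\t", index=False)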

Guidelines

On this wiki page I share my 10 Steps to building a good quantitative model, as well as some general guidelines for structuring a quantitative model building/testing paper. These are just off the top of my head and do not come from any sort of published work. However, I have found them useful and hope you do as well.


Contents


1 Example Analysis
2 Ten Steps
   2.1 Ten Steps for Formulating a Decent Quantitative Model
3 Order of Operations
   3.1 Some general guidelines for the order to conduct each procedure
4 Structuring a Quantitative Paper
   4.1 Standard outline for quantitative model building/testing paper

Example Analysis

I've created an example of some quantitative analyses. The most useful part of this example is probably the wording. It is often difficult to figure out how to word your findings, or to figure out how much space to use on findings, or which measures to report and how to report them. This offers just one example of how you might do it.

Click here to access the example analysis.

Ten Steps

Ten Steps for Formulating a Decent Quantitative Model

1. Identify and define your dependent variables. These should be the outcome(s) of the phenomenon you are interested in better understanding. They should be the affected thing(s) in your research questions.

2. Figure out why explaining and predicting these DVs is important.
   1. Why should we care?
   2. For whom will it make a difference?
   3. What can we possibly contribute to knowledge that is not already known?
   4. If these are all answerable and suggest continuing the study, then go to #3; otherwise, go to #1 and try different DVs.

3. Form one or two research questions around explaining and predicting these DVs.
   1. Scoping your research questions may also require you to identify your population.

4. Is there some existing theory that would help explore these research questions?
   1. If so, then how can we adopt it for specifically exploring these research questions?
   2. Does that theory also suggest other variables we are not considering?

5. What do you think (and what has research said) impacts the DVs we have chosen?
   1. These become IVs.

6. What is it about these IVs that is causing the effect on the DVs?
   1. These become Mediators.

7. Do these relationships depend on other factors, such as age, gender, race, religion, industry, organization size and performance, etc.?
   1. These become Moderators.

8. What variables could potentially explain and predict the DVs, but are not directly related to our interests?
   1. These become control variables. These are often some of those moderators like age and gender, or variables established in extant literature.

9. Identify your population.
   1. Do you have access to this population?
   2. Why is this population appropriate to sample in order to answer the research questions?

10. Based on all of the above, but particularly #4, develop an initial conceptual model involving the IVs, DVs, Mediators, Moderators, and Controls.
   1. If tested, how will this model contribute to research (make us think differently) and practice (make us act differently)?

Order of Operations

Some general guidelines for the order to conduct each procedure

1. Develop a good theoretical model
   1. See the Ten Steps above
   2. Develop hypotheses to represent your model

2. Case Screening
   1. Missing data in rows
   2. Unengaged responses
   3. Outliers

3. Variable Screening
   1. Missing data in columns
   2. Skewness (for continuous variables like age, income) & Kurtosis (for ordinal variables like Likert scales)

4. Exploratory Factor Analysis
   1. Iterate until you arrive at a clean pattern matrix
   2. Adequacy
   3. Convergent validity
   4. Discriminant validity
   5. Reliability

5. Confirmatory Factor Analysis
   1. Obtain a roughly decent model quickly (cursory model fit, validity)
   2. Do configural and metric invariance tests (if using a multigroup moderator in the causal model)
   3. Validity and reliability check
   4. Common method bias (marker if possible, CLF otherwise)
   5. Final measurement model fit
   6. Optionally, impute composites

6. Structural Models
   1. Multivariate assumptions
      1. Linearity
      2. Multicollinearity
      3. Homoscedasticity
   2. Include control variables in all of the following analyses
   3. Mediation
      1. Check direct effects without the mediator
      2. Add the mediator and bootstrap it
      3. If you have multiple mediators, then use a Sobel test
   4. Interactions
      1. Optionally standardize constituent variables
      2. Compute new product terms
      3. Plot significant interactions
   5. Multigroup Moderation
      1. Create multiple models
      2. Assign them the proper group data
      3. Test significance of moderation via critical ratios (or a chi-square difference test)

7. Report findings in a concise table

8. Write paper
   1. See guidelines below

Structuring a Quantitative Paper

Standard outline for quantitative model building/testing paper

Title (something catchy and accurate)

Abstract (concise – 150-250 words – to explain the paper): roughly one sentence each:
- What is the problem?
- Why does it matter?
- How do you address it?
- What did you find?
- How does this change practice (what people in business do), and how does it change research (existing or future)?

Keywords (4-10 keywords that capture the contents of the study)

Introduction (2-4 pages)
- What is the problem and why does it matter? And what have others done to try to address this problem? (1-2 paragraphs)
- What is your DV(s) and what is the context you are studying it in? Also briefly define the DV(s). (1-2 paragraphs)
- One sentence about the sample (e.g., "377 undergraduate university students using Excel").
- How does studying this DV(s) in this context adequately address the problem? (1-2 paragraphs)
- What existing theory/theories do you leverage, if any, to pursue this study, and why are these appropriate? (1-2 paragraphs)
- Briefly discuss the primary contributions of this study without discussing exact findings.
- How is the rest of the paper organized? (1 paragraph)

Lit review (1-3 pages)
- Fully define your dependent variable(s) and summarize how it has been studied in existing literature within your broader context (like Information Systems, or Organizations, etc.).
- If you are basing your model on an existing theory/model, use this next space to explain that theory (1 page) and then explain how you have adapted that theory to your study.
- If you are not basing your model on an existing theory/model, then use this next space to explain how existing literature in your field has tried to predict your DV(s).
- Explain what other constructs you suspect will help predict your DV(s) and why. Inclusion of a construct should have good logical/theoretical and/or literature support. For example, "we are including construct xyz because the theory we are basing our model on includes xyz." Or, "we are including construct xyz because the following logic (abc) constrains us to include this variable lest we be careless."
- Briefly discuss control variables and why they are being included.

Theory section (take what space you need, but try to be parsimonious)
- Briefly summarize your conceptual model and show it with the hypotheses labeled (if possible).
- Begin supporting H1, then state H1 formally. Support should include strong logic and literature.
- H2, H3, etc. If you have sub-hypotheses, list them as H1a, H1b, H2a, H2b, etc.

Method (keep it as brief as possible)
- Explanation of study design (e.g., pretest, pilot, and online survey about software usage).
- Explanation of sample (some descriptive statistics, like demographics, sample size, computer experience, etc.); don't forget to discuss response rate (the number of responses as a percentage of the number of people invited to do the study).
- Mention that IRB exempt status was granted and protocols were followed.
- Method for testing hypotheses (e.g., structural equation modeling in AMOS). If you conducted multi-group moderation, mediation, and/or interaction, explain how you kept them all straight and how you went about analyzing them. For example, if you did mediation, what approach did you take (Baron and Kenny, Sobel test, bootstrapping, or all three)? Were there multiple models tested, or did you keep all the variables in for all analyses? If you did interaction, did you add that in afterward, or was it in from the beginning?

Analysis (1-3 pages)
- Data screening.
- EFA (report pattern matrix and Cronbach's alphas in an appendix) – mention if items were dropped.
- CFA (just mention that you did it and bring up any issues you found) – mention any items dropped during CFA. Report model fit for the final measurement model.
- Mention CMB approach and results and actions taken, if any.
- Report the correlation matrix, CR, and AVE (you can include MSV and ASV if you want), and briefly discuss any issues with validity and reliability – if any.
- Report whether you used the full SEM, or if you imputed composites for a path model.
- Report the final structural model(s) (include R-squares and betas) and the model fit for the model(s).

Findings (1-2 pages)
- Report the results for each hypothesis (supported or not, with evidence).
- Point out any unsupported or counter-supported (opposite direction) hypotheses.
- Provide a table that concisely summarizes your findings.

Discussion (2-5 pages)
- Summarize briefly the study and its intent and findings (one paragraph).
- What insights did we gain from the study that we could not have gained without doing the study?
- How do these insights change the way practitioners do their work?
- How do these insights shed light on existing literature and shape future research in this area?
- What limitations is our study subject to (e.g., surveying students, just a survey rather than an experiment, statistical limitations like CMB, etc.)?
- What are some opportunities for future research based on the insights of this study?

Conclusion (1-2 paragraphs)
- Summarize the insights gained from this study and how they address existing gaps or problems.
- Explain the primary contribution of the study.
- Express your vision for moving forward or how you hope this work will affect the world.

References (please use a reference manager like EndNote)

Appendices (any additional information, like the instrument and measurement model material)
