Partial Least Squares Methodology for Analysis: A Primer
Research Methods Seminar
Michael Curry, DBA [email protected]
Agenda
• What is PLS and why would one use it?
• My research framework (as an example for demonstrating PLS)
• Using PLS to conduct analysis
What is PLS?
• A type of Structural Equation Modeling (SEM)
• SEM is a multivariate analysis method similar to principal component analysis and linear regression (Hair et al., 2012).
• Two choices for SEM analysis:
  • Covariance approach (e.g. LISREL, EQS, COSAN, AMOS, & SEPATH): has two problems, factor indeterminacy and inadmissible solutions (Chin et al., 2003)
  • Variance approach: partial least squares (PLS) assumes all measured variance is useful for estimating interaction and main effects (Chin et al., 2003)
• PLS uses an iterative estimation technique based on algorithms by Wold (1985) and Lohmöller (1984)
• Encompasses correlation, redundancy analysis, multiple regression, multivariate analysis of variance, and principal components
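Conceptually, the Wold/Lohmöller scheme alternates between "outer" estimates (latent variable scores as weighted sums of their indicators) and "inner" estimates (each latent variable proxied by its neighbours) until the weights stabilize. A minimal NumPy sketch for the simplest possible case, assuming two reflective (mode-A) constructs and synthetic data; real implementations such as SmartPLS handle arbitrary path models, weighting schemes, and scaling:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two latent variables, three reflective indicators each
n = 200
lv1 = rng.normal(size=n)
lv2 = 0.6 * lv1 + rng.normal(scale=0.8, size=n)
X1 = np.column_stack([lv1 + rng.normal(scale=0.5, size=n) for _ in range(3)])
X2 = np.column_stack([lv2 + rng.normal(scale=0.5, size=n) for _ in range(3)])

def std(a):
    return (a - a.mean(axis=0)) / a.std(axis=0)

X1, X2 = std(X1), std(X2)
w1, w2 = np.ones(3), np.ones(3)
for _ in range(300):
    # Outer estimation: LV scores as weighted sums of their own indicators
    y1, y2 = std(X1 @ w1), std(X2 @ w2)
    # Inner estimation: proxy each LV by its (sign-corrected) neighbour
    sign = np.sign(np.corrcoef(y1, y2)[0, 1])
    z1, z2 = sign * y2, sign * y1
    # Mode-A weight update: covariances of indicators with the inner proxy
    w1_new, w2_new = X1.T @ z1 / n, X2.T @ z2 / n
    if max(np.abs(w1_new - w1).max(), np.abs(w2_new - w2).max()) < 1e-8:
        w1, w2 = w1_new, w2_new
        break
    w1, w2 = w1_new, w2_new

# Path coefficient: here simply the correlation of the two LV score vectors
beta = np.corrcoef(std(X1 @ w1), std(X2 @ w2))[0, 1]
print(round(beta, 3))
```

Because the true latent relationship was built in at 0.6, the recovered path coefficient lands near that value despite the indicator noise.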
Reasons for using PLS (Henseler et al., 2009)
1. Non-normal data (68.8%). PLS has minimal restrictions in terms of distributional assumptions and sample size (Chin et al., 2003).
2. Small sample size (53.1%). Rule of thumb: ten times the largest number of paths directed at a construct in the structural model (Chin et al., 2003).
3. Formative measures (31.3%). Unlike covariance-based SEM, PLS avoids improper solutions and removes factor indeterminacy, simplifying formative model development.
4. Focus on prediction (31.3%). PLS is especially powerful for large, complex models (Fornell, 1990; Wold, 1985).
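The rule of thumb in point 2 is easy to compute from the model's structure. A minimal sketch, where the path counts are hypothetical rather than taken from any particular model:

```python
def pls_min_sample(paths_into_constructs):
    """Chin et al.'s rule of thumb: ten times the largest number of
    structural paths directed at any single construct."""
    return 10 * max(paths_into_constructs)

# e.g. a model whose constructs receive 1, 2, and 3 incoming paths
print(pls_min_sample([1, 2, 3]))  # → 30
```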
PLS Software
• SmartPLS (www.smartpls.de): the most feature-rich(?), with strong community support
• PLS Graph (www.plsgraph.com): developed by Wynne Chin, a widely cited author on PLS analysis
• PLS package for R (http://mevik.net/work/software/pls.html)
• SIMCA-P (http://www.softlookup.com/download.asp?id=16012)
• XL-STAT (www.xlstat.com)
• Stata (http://econpapers.repec.org/software/bocbocode/s456810.htm)
Some Terminology
[Model diagram: Constructs 1, 2 and 4, each measured by manifest variables m1–m10]
• Manifest variables (MV)
• Latent variable (LV)
• Reflective construct (MVs are indicators of the LV)
• Formative construct (MVs are assessments of the LV)
• Model dependent variable
Some Terminology
[Model diagram: as before, with Construct 3 added as a 2nd order construct]
• 2nd order construct: more broadly generalizes the interaction
Higher order factors let PLS measure alternative patterns of covariance (Wetzels et al., 2009; Chin et al., 2003).
Some Terminology
[Model diagram: as before, annotated to distinguish the two model layers]
• Outer (measurement) model: the relationships between each LV and its MVs
• Inner (structural) model: the relationships among the LVs
Some Terminology
[Model diagram: the full model, recapping all annotations: manifest variables (MV), latent variables (LV), the reflective and formative constructs, the 2nd order construct, the model dependent variable, and the outer (measurement) and inner (structural) models. Higher order factors let PLS measure alternative patterns of covariance (Wetzels et al., 2009; Chin et al., 2003).]
Research Example: A Norms-Based Approach to IT Effectiveness
• The Gap:
  • Many approaches to IT effectiveness* are based on formal IT best practices (e.g. the COBIT, ITIL and ISO/IEC frameworks).
  • These approaches emphasize:
    • Mechanics of IT operations
    • Metrics of IT performance
    • Conformance with best practices
  • These are hard to measure and interpret (Brown, 1998; Albayrak et al., 2009).
  • They are especially difficult for smaller, resource-constrained businesses with limited (or no) IT staff (Huang, Zmud, & Price, 2010; Devos, 2007; Tagliavini, Ravarini, & Antonelli, 2001; DeLone, 1988).
*IT effectiveness: approaches to assess IT and improve its contribution to business goals
Our Approach (Curry, Marshall & Kawalek, 2013)
• We distilled COBIT's IT best practice collection into a set of informal norms.
• We call these "IT effectiveness efforts."
• Three efforts operationalized for study:
  1. Identify IT risks and establish offsetting controls (RC).
  2. Continuously improve IT processes (CI).
  3. Align business and IT strategy (BIS).
Motivating Individual Action
• When employees go beyond complying with IT best practices and internalize their spirit, we expect this to improve IT quality.
• We call this "IT effectiveness subscription."
• Operationalized to assess intent as well as behaviour consistent with IT effectiveness efforts.
Evidence that IT effectiveness norms improve IT quality would support our claim that norms can deliver many of the benefits that frameworks like COBIT, ITIL and ISO/IEC offer.
Defining IT Quality
• The DeLone and McLean (1992, 2003) model offers a widely used dependent variable for IS research
  • 16 confirmatory studies (DeLone and McLean, 2003); also extended to SMEs (DeLone, 1988; Gengatharen and Standing, 2003)
• Two suitable proxies for IT quality:
  1. Satisfying users of IT systems (IS)
  2. Effective organisational impact of IT (OS)
Research Model
Figure 1: Research Model
[Diagram: IT Effectiveness Efforts → IT Quality (H1); IT Effectiveness Efforts → IT Effectiveness Subscription (H2); IT Effectiveness Subscription → IT Quality (H3)]
• H1: IT effectiveness efforts make a difference in IT quality.
• H2: IT effectiveness efforts influence IT effectiveness subscription.
• H3: IT effectiveness subscription can also improve IT quality.
• IT Effectiveness Efforts: IT best practice 'compliance' norms (adapted from COBIT)
• IT Effectiveness Subscription: 'spirit' norms which internalize IT best practice
• IT Quality: adapted from DeLone and McLean (1992, 2003)
Methodology
• All constructs operationalized in a survey instrument
  • Many items tested in previous studies (Curry, Marshall, & Kawalek, 2013; Marshall, Curry, & Reitsma, 2011)
• Administered to students in business IS classes
  • Students completed projects requiring application of IS concepts using IT artifacts (writing programs, creating databases, collaborating with SharePoint, etc.)
  • Instructed to think of the university as "their organization"
• 65 valid responses
Statistical Analysis
• Goal: show that the framework is a reasonable explanation of the observations and that all constructs contributed significantly, as theorized
• How I did this in past studies:
  • Data exploration and validation
  • Principal component analysis (PCA)
    • Identify items which pull together cohesively; discard items which do not
    • Combine items into normalized synthetic variables for each construct
  • Conduct linear regression with the PCA synthetic variables to estimate explained variance, effect size and statistical significance
  • Use SEM to verify the full interaction model
The SEM often resulted in inadmissible solutions; only by (randomly) manipulating the error and latent terms was a solution reached.
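The PCA-plus-regression workflow from my past studies can be sketched with NumPy alone; the data below are synthetic stand-ins for survey items, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic survey items: four items driven by one underlying factor
factor = rng.normal(size=100)
items = np.column_stack([factor + rng.normal(scale=0.6, size=100) for _ in range(4)])
outcome = 0.7 * factor + rng.normal(scale=0.5, size=100)

# PCA via SVD on standardized items; the first component becomes the
# normalized synthetic variable for the construct
Z = (items - items.mean(0)) / items.std(0)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
synthetic = Z @ Vt[0]
synthetic = (synthetic - synthetic.mean()) / synthetic.std()

# Simple linear regression of the outcome on the synthetic variable
X = np.column_stack([np.ones(100), synthetic])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
resid = outcome - X @ beta
r2 = 1 - (resid ** 2).sum() / ((outcome - outcome.mean()) ** 2).sum()
```

Because the items share one factor, the first principal component recovers it well and the regression explains most of the outcome's variance.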
PLS Methodology
• Data exploration and validation (using SPSS, or…)
• Use survey items as MVs to model the theoretical interaction of LVs
• Evaluate the model's quality using internal consistency tests:
  • Cronbach's alpha
  • Redundancy
  • Cross-loading
  • R²
  • Statistical significance (p)
• Make (theoretically supported) refinements to the model and re-evaluate
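Cronbach's alpha, the first test in the list above, is straightforward to compute directly; a sketch, where the item matrix is synthetic:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_observations, k_items) array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Items that share a common latent driver should yield a high alpha
rng = np.random.default_rng(42)
f = rng.normal(size=500)
items = np.column_stack([f + rng.normal(scale=0.5, size=500) for _ in range(4)])
alpha = cronbach_alpha(items)
```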
• Create a new project
• Import data (as CSV)
• Validate
• Missing data is the primary reason for an invalid data file; code missing data (e.g. -1) in Excel
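The missing-data coding step can be done in Excel as suggested, or programmatically; a sketch with NumPy, where the survey values are made up:

```python
import numpy as np

# Hypothetical survey matrix with missing responses stored as NaN;
# the data file needs an explicit missing-value code, e.g. -1
data = np.array([[5.0, 4.0, np.nan],
                 [3.0, np.nan, 2.0],
                 [4.0, 5.0, 5.0]])
coded = np.where(np.isnan(data), -1.0, data)
```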
• Open the model (initially blank)
• Insert LVs using the LV tool:
  • IT effectiveness effort: Risk and Control
  • IT effectiveness effort: Business and IT Strategy
  • IT effectiveness effort: Commitment to Improvement
  • IT effectiveness subscription
  • IT quality
• Use the connector tool to create relationships
• MVs turn from red to blue, indicating the model is now valid
• Run the PLS algorithm
• In most cases the default settings are fine; press Finish to run
• Regression weights and R² are displayed on the model
• Click to view the full report
• While the values displayed on the model are helpful, a full assessment requires reviewing the report
• AVE is the average communality, which measures the quality of the measurement model. For reflective LVs, the AVE should be greater than .5 (Wetzels et al., 2009; Hair et al., 2012), which implies that at least 50% of the indicators' variance is explained by the modeled LV.
• Composite reliability is an internal consistency check similar to AVE; for reflectively modeled LVs it should be greater than .6 for exploratory research (or .7 for confirmatory research) (Chin et al., 2003; Hair et al., 2012).
• Cronbach's alpha is a measure of scale reliability; for reflectively modeled LVs it should be greater than or equal to .7 (Wetzels et al., 2009).
• Unlike covariance-based SEM, there are no goodness-of-fit (GoF) measures for PLS; instead we review internal consistency values.
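Both AVE and composite reliability can be checked directly from a construct's standardized outer loadings; a sketch, where the loadings are illustrative:

```python
import numpy as np

def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    l = np.asarray(loadings, dtype=float)
    return float(np.mean(l ** 2))

def composite_reliability(loadings):
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    l = np.asarray(loadings, dtype=float)
    return float(l.sum() ** 2 / (l.sum() ** 2 + (1 - l ** 2).sum()))

loadings = [0.8, 0.7, 0.9]                    # illustrative reflective LV loadings
print(round(ave(loadings), 3))                # → 0.647 (above the .5 threshold)
print(round(composite_reliability(loadings), 3))  # → 0.845
```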
Assessing significance: there are several methods, namely jackknife, blindfolding, and bootstrap. Bootstrap is the most commonly cited in IS research.
The bootstrap method draws repeated resamples from the sample to create replacement estimates, which are then compared to the original estimates using t-statistics (Tenenhaus et al., 2005).
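A simplified sketch of the bootstrap idea for a single path, using a plain correlation as the path estimate; SmartPLS re-estimates the whole model on each resample, so everything here is illustrative:

```python
import numpy as np

def bootstrap_t(x, y, n_boot=500, seed=0):
    """t-statistic for a path estimated as corr(x, y): the original
    estimate divided by the bootstrap standard error."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimate = np.corrcoef(x, y)[0, 1]
    resamples = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        resamples[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    return estimate / resamples.std(ddof=1)

# Synthetic LV score vectors with a genuine relationship
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 0.5 * x + rng.normal(scale=0.9, size=100)
t = bootstrap_t(x, y)
```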
• t values are shown on the model after running the bootstrap
• Click to run the bootstrap
• t values are significant if t > 1.96 at p < 0.05, t > 2.576 at p < 0.01, and t > 3.29 at p < 0.001 for two-tailed tests
• Click to view the full report
• While the values displayed on the model are helpful, a full assessment requires reviewing the report
• I copy the values (PLS and bootstrap) into Excel for review, instead of using the built-in report
• Colored areas indicate issues to investigate
• I use a formula to flag significance
• Regression weights
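The significance-flagging formula can be mirrored in code; the thresholds are the standard two-tailed t cut-offs, and the star convention is my own shorthand:

```python
def significance_flag(t):
    """Flag a bootstrap t value using two-tailed critical values."""
    t = abs(t)
    if t > 3.29:
        return "***"   # p < 0.001
    if t > 2.576:
        return "**"    # p < 0.01
    if t > 1.96:
        return "*"     # p < 0.05
    return "n.s."      # not significant

print(significance_flag(2.1))   # → *
```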
• The cross-loading report, similar to PCA output, is helpful for assessing the quality of the MVs in each LV
• Better viewed in Excel
• Values below .5 may be considered for removal.
• BUT… unlike regression, PLS does better with more information, so be careful not to remove too many MVs
• While there is no set range for cross-loadings, a narrower range and a higher lowest loading lead to greater convergent reliability
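Screening the cross-loading report can be automated; a sketch that flags candidate items below the .5 threshold while also reporting the loading range, with item names and values made up:

```python
def screen_loadings(loadings, threshold=0.5):
    """Return items loading below threshold on their own LV,
    plus the (min, max) loading range as a convergence hint."""
    low = [name for name, l in loadings.items() if abs(l) < threshold]
    values = [abs(l) for l in loadings.values()]
    return low, (min(values), max(values))

lv_loadings = {"m1": 0.82, "m2": 0.74, "m3": 0.41, "m4": 0.68}
low_items, loading_range = screen_loadings(lv_loadings)
print(low_items)        # → ['m3']
```

Remember the caveat above: an item flagged here is a candidate for removal, not an automatic delete.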
• In the revised model, all internal consistency measures are at or above minimum acceptable levels
• But the low regression weights and lack of significance indicate that not all modeled paths are valid
• The model is revised by adding a 2nd order formative IT effectiveness factor. This channels the interaction through one LV.
• This was a very simple change to make. Doing regression analysis of this complexity might take considerably longer; doing this in SEM requires estimating latent scores and would likely result in one or more inadmissible solutions.
• And also adding a 2nd order reflective IT quality factor
• Higher order factors let PLS measure alternative patterns of covariance (Wetzels et al., 2009; Chin et al., 2003)
• Solid R² values; balanced regression weights along the theorized paths; all paths statistically significant
• AVE here can be ignored because this is a formative construct
"I am hopeful, though I don't believe in happy endings because I don't believe in endings." ― Edward Abbey
Contribution
• A norms-based IT quality predictive model was developed and validated
  • Assessments of IT effectiveness norms have a direct and indirect influence on assessments of IT quality
• An instrument with items to assess the constructs was developed, shown statistically reliable, and verified with qualitative interviews
• An alternative (less technical) approach to better IT quality was introduced
  • Suitable for easily assessing IT operations
  • Provides actionable recommendations
  • Helps bridge the communication gap between IT and non-IT functions
  • Smaller, resource-constrained organizations may find this an easier approach than formal IT best practice adoption
Take Away
• PLS is rapidly gaining favor as an analysis tool, thanks in part to new, easy-to-use software
  • Described as a "silver bullet for estimating causal models in many empirical data situations" (Hair et al., 2011, p. 148)
  • Without equal for creating large, complex models (Henseler et al., 2009)
• PLS made my job of analyzing the data and developing an explanatory model much easier than it was with linear regression or SEM
• I like the SmartPLS program (but your mileage may vary)
• Give it a try; review the references; I'm happy to help if you have questions
References
• Chin, W. W., Marcolin, B. L., & Newsted, P. R. (2003). A partial least squares latent variable modeling approach for measuring interaction effects: Results from a Monte Carlo simulation study and an electronic-mail emotion/adoption study. Information Systems Research, 14(2), 189-217.
• Curry, M., Marshall, B., & Kawalek, P. (2013). IT artifact bias: How affordance perceptions influence IT assessments (currently under review).
• Fornell, C., Lorange, P., & Roos, J. (1990). The cooperative venture formation process: A latent variable structural modeling approach. Management Science, 36(10), 1246-1255.
• Hair, J. F., Ringle, C. M., & Sarstedt, M. (2011). PLS-SEM: Indeed a silver bullet. The Journal of Marketing Theory and Practice, 19(2), 139-152.
• Henseler, J., Ringle, C., & Sinkovics, R. (2009). The use of partial least squares path modeling in international marketing. Advances in International Marketing (AIM), 20, 277-320.
• Lohmöller, J.-B. (1984). LVPLS Program Manual: Latent Variables Path Analysis with Partial Least-Squares Estimation. Köln: Zentralarchiv für empirische Sozialforschung.
• Marshall, B., Curry, M., & Reitsma, R. (2011). Organizational information technology norms and IT quality. Communications of the IIMA, 11(4).
• Tenenhaus, M., et al. (2005). PLS path modeling. Computational Statistics & Data Analysis, 48(1), 159-205.
• Wold, H. (1985). Partial least squares. In S. Kotz & N. L. Johnson (Eds.), Encyclopedia of Statistical Sciences (Vol. 6, pp. 581-591). New York: Wiley.