Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External
![Page 1: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/1.jpg)
The Art and Science of Test Development—Part G
Psychometric/technical statistical analysis: External
The basic structure and content of this presentation are grounded extensively in the test development procedures developed by Dr. Richard Woodcock
Kevin S. McGrew, PhD.
Educational Psychologist
Research Director, Woodcock-Muñoz Foundation
![Page 2: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/2.jpg)
“In God we trust... all others must show data” (unknown source)
Test authors and publishers have a standards-based responsibility to provide supporting psychometric/technical information about the tests and battery
Typically in the form of a series of technical chapters in the manual or a separate technical manual
![Page 3: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/3.jpg)
Calculate psychometric/measurement statistics for technical manual/chapters
Use Joint Test Standards as a guide
With external measures
![Page 4: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/4.jpg)
[Figure: theoretical domain (CHC: g; Gf, Gv, Glr, Gs, Gc, Gsm, Ga) and measurement or empirical domain]
External evidence is focused on relations between test battery variables (measures or latent constructs) and other external (outside of battery) constructs, measures, or criteria
![Page 5: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/5.jpg)
External Stage of Test Development
Purpose: Examine the external relations among the focal construct (i.e., intelligence or cognitive abilities) and other constructs and/or subject characteristics
Questions asked: Do the focal constructs and observed measures “fit” within a network of expected construct relations (i.e., the nomological network)?
Method and concepts:
• Group differentiation
• Structural equation modeling
• Correlation of observed measures with other measures
• Multitrait-Multimethod matrix
Characteristics of strong test validity program:
• Focal constructs vary in theorized ways with other constructs
• Measures of the constructs differentiate existing groups that are known to differ on the constructs
• Measures of focal constructs correlate with other validated measures of the same constructs
• Theory-based hypotheses are supported, particularly when compared to rival hypotheses
![Page 6: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/6.jpg)
External Stage of Test Development
Method and concepts • Correlation of observed measures with other measures
Characteristics of strong test validity program
• Measures of focal constructs correlate with other validated measures of the same constructs
![Page 7: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/7.jpg)
Concurrent external validity example: WJ III GIA cluster correlations with other IQ battery full scale scores
Provide evidence at select key age groups (related to intended age range and purpose of battery) in normal samples
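As a minimal sketch of what sits behind a single concurrent validity coefficient, the block below computes a Pearson correlation between scores from two batteries given to the same examinees. The function name `pearson_r` and the score lists are hypothetical illustrations, not data or code from the WJ III studies.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    cross_products = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    return cross_products / math.sqrt(ss_x * ss_y)


# Hypothetical standard scores for the same eight examinees (illustration only).
focal_battery = [85, 92, 100, 104, 110, 118, 96, 101]
external_battery = [88, 90, 97, 108, 105, 121, 99, 95]

r = pearson_r(focal_battery, external_battery)
print(f"concurrent validity r = {r:.2f}")
```

In practice the same calculation would be repeated within each key age group rather than pooled across ages.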
![Page 8: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/8.jpg)
Concurrent external validity example: WJ III Achievement (reading, math, writing) cluster correlations with measures from other (external) achievement batteries
Provide evidence at select key age groups (related to intended age range and purpose of battery) in normal samples
![Page 9: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/9.jpg)
Other
Battery
Total (Full Scale) Score
WJ III
Pred.
Ach.
WJ III
GIA-
Extended
WJ III
GIA-
Standard
DAS .41 -- .52 .47
WPPSI-R .37 -- .52 .47
WISC-III .50 .68 .67 .63
WAIS-III .39 .56 -- .56
KAIT .53 .56 -- .56
Concurrent external validity example: Comparative predictive validity (of achievement)
Comparison of average correlations with achievement (across reading, math, written language, and total achievement domains) for the WJ III GIA and Predicted Achievement score options and for full scale scores from other (external) major intelligence batteries
Provide evidence at select key age groups (related to intended age range and purpose of battery) in normal samples
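The slide does not say how the correlations were averaged across achievement domains. One common option, shown here only as an assumption, is to average through Fisher's r-to-z transform rather than averaging the raw coefficients. The function name `average_r` and the domain values are hypothetical.

```python
import math


def average_r(correlations):
    """Unweighted average of correlations via Fisher's r-to-z transform."""
    z_values = [math.atanh(r) for r in correlations]  # r -> z
    mean_z = sum(z_values) / len(z_values)
    return math.tanh(mean_z)                          # z -> r


# Hypothetical correlations of one full scale score with four achievement domains.
domain_rs = {"reading": 0.65, "math": 0.70, "written language": 0.60, "total": 0.72}

avg = average_r(list(domain_rs.values()))
print(f"average r across domains = {avg:.2f}")
```

A sample-size-weighted version would be a natural extension when the domain correlations come from samples of different sizes.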
![Page 10: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/10.jpg)
External Stage of Test Development
Method and concepts • Correlation of observed measures with other measures
Characteristics of strong test validity program
• Focal constructs vary in theorized ways with other constructs
• Measures of focal constructs correlate with other validated measures of the same constructs
![Page 11: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/11.jpg)
• Focal constructs vary in theorized ways with other constructs
• Measures correlate with other validated measures of the same constructs
(select illustrative examples: concurrent external validity correlations)
![Page 12: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/12.jpg)
• Focal constructs vary in theorized ways with other constructs
• Measures correlate with other validated measures of the same constructs
(select illustrative example: exploratory factor analysis of select WJ III and WISC-III tests)
![Page 13: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/13.jpg)
| WISC-III test | Correlation with WJ III Block Rotation (BLKROT) |
| --- | --- |
| Information | 0.27 |
| Coding | 0.08 |
| Similarities | 0.29 |
| Picture Arrangement | 0.14 |
| Arithmetic | 0.09 |
| Block Design | 0.38 |
| Vocabulary | 0.23 |
| Object Assembly | 0.31 |
| Comprehension | 0.15 |
| Symbol Search | 0.23 |
| Digit Symbol | 0.08 |

Note: The absolute magnitude of the correlations is artificially low due to sample range restriction. The important observation is the relative magnitude of the correlations.
• Focal constructs vary in theorized ways with other constructs
• Measures correlate with other validated measures of the same constructs
(select illustrative example—WJ III Block Rotation [Gv-Vz] correlation with WISC-III tests in grade 3-5 sample)
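The slide reports the range-restricted correlations as observed. For readers who want a sense of how much restriction can attenuate a coefficient, here is a minimal sketch of the standard correction for direct range restriction (Thorndike Case 2). It is not something the presentation applies, and the standard deviations used are hypothetical.

```python
import math


def correct_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case 2 correction for direct range restriction.

    r_c = r * u / sqrt(1 - r^2 + r^2 * u^2), where u = SD_unrestricted / SD_restricted.
    """
    u = sd_unrestricted / sd_restricted
    r = r_restricted
    return (r * u) / math.sqrt(1 - r * r + r * r * u * u)


# Observed Block Design correlation from the table; both SDs are hypothetical.
observed_r = 0.38
corrected_r = correct_range_restriction(observed_r, sd_unrestricted=15.0, sd_restricted=10.0)
print(f"observed r = {observed_r:.2f}, corrected r = {corrected_r:.2f}")
```

The correction raises every coefficient in the column, so the relative ordering that the slide emphasizes is preserved.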
![Page 14: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/14.jpg)
Phelps et al. (2005) WISC-III/WJ III cross-battery (joint) CFA
• Focal constructs vary in theorized ways with other constructs
• Measures correlate with other validated measures of the same constructs
(select illustrative example: confirmatory factor analysis of select WJ III and WISC-III tests)
![Page 15: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/15.jpg)
Phelps et al. (2005) WISC-III/WJ III cross-battery (joint) CFA
![Page 16: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/16.jpg)
[Figure: CFA path diagram with a general factor (g), broad factors Gc, Gsm, Grw, Gf, Gv, Gs, and Gq, and their indicator tests; individual loadings and residuals are not recoverable from this transcript]
Joint WJ III/WAIS-III/WMS-III/KAIT CFA. Gregg/Hoy college LD/NLD (n = 200) sample; analysis by K. McGrew
(This is NOT the complete model; only the portion that includes Gv factor information is shown)
![Page 17: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/17.jpg)
External Stage of Test Development
Method and concepts • Structural equation modeling
Characteristics of strong test validity program
• Theory-based hypotheses are supported, particularly when compared to rival hypotheses
![Page 18: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/18.jpg)
Structural equation modeling external validity evidence example
![Page 19: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/19.jpg)
Structural equation modeling external validity evidence example
![Page 20: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/20.jpg)
[Figure: structural equation model path diagram, ages 6-8. Factor labels: g, Gf, Gv, Gs, Glr, Ga, Gc, MemSpan, WorkMem, WA. Test labels: Picture Recognition, Visual Matching, Decision Speed, Sound Blending, General Information, Incomplete Words, Sound Patterns, Vis-Aud Learning, Block Rotation, Spatial Relations, DR: Vis-Aud Lrng, Retrieval Fluency, Word Attack, Verbal Comp, Cross Out, Memory for Names, Analysis-Synthesis, Concept Formation, Numerical Reas, Mem for Sentences, Mem for Words, Numbers Reversed, Aud Working Mem, Oral Comp. Path coefficients are not recoverable from this transcript]
Structural equation modeling external validity evidence example
![Page 21: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/21.jpg)
External Stage of Test Development
Method and concepts • Group differentiation
Characteristics of strong test validity program
• Measures of the constructs differentiate existing groups that are known to differ on the constructs
![Page 22: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/22.jpg)
Group differentiation external validity evidence example: LD vs Non-LD university samples
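Group differentiation results are often summarized as standardized mean differences. The sketch below computes Cohen's d with a pooled standard deviation for two groups. The scores are hypothetical and are not taken from the LD/Non-LD university samples, and the slide itself does not state which effect size, if any, was reported.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference (group_a minus group_b) using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical cluster standard scores (illustration only).
non_ld_scores = [108, 112, 101, 115, 99, 110, 105, 118]
ld_scores = [95, 102, 88, 100, 91, 97, 104, 93]

d = cohens_d(non_ld_scores, ld_scores)
print(f"Cohen's d (Non-LD minus LD) = {d:.2f}")
```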
![Page 23: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/23.jpg)
Group differentiation external validity evidence example: Normal/Gifted/LD/MR samples
![Page 24: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/24.jpg)
Group differentiation external validity evidence example—discriminant function analysis
(Normal/Gifted/LD/MR samples)
![Page 25: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/25.jpg)
Group differentiation external validity evidence example—discriminant function analysis classification accuracy
(Normal/Gifted/LD/MR samples—grade 3-4)
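Classification accuracy from a discriminant function analysis is usually read off a table of actual versus predicted group membership. The sketch below summarizes such a table as an overall hit rate, per-group hit rates, and the proportional chance criterion (the sum of squared group proportions) as a baseline. All counts are hypothetical; they are not the values from the grade 3-4 samples.

```python
def classification_summary(table):
    """Summarize a classification table where table[actual][predicted] = count."""
    groups = list(table)
    group_sizes = {g: sum(table[g].values()) for g in groups}
    total = sum(group_sizes.values())
    hits = sum(table[g][g] for g in groups)
    hit_rate = hits / total
    per_group = {g: table[g][g] / group_sizes[g] for g in groups}
    # Proportional chance criterion: expected hit rate when cases are assigned
    # at random in proportion to the actual group sizes.
    chance = sum((group_sizes[g] / total) ** 2 for g in groups)
    return hit_rate, per_group, chance


# Hypothetical counts (rows = actual group, columns = predicted group).
table = {
    "Normal": {"Normal": 40, "Gifted": 5, "LD": 5, "MR": 0},
    "Gifted": {"Normal": 6, "Gifted": 14, "LD": 0, "MR": 0},
    "LD": {"Normal": 8, "Gifted": 0, "LD": 11, "MR": 1},
    "MR": {"Normal": 0, "Gifted": 0, "LD": 2, "MR": 8},
}

hit_rate, per_group, chance = classification_summary(table)
print(f"overall hit rate = {hit_rate:.2f}, chance baseline = {chance:.2f}")
```

A hit rate is only informative relative to such a baseline, since unequal group sizes make some level of correct classification likely by chance alone.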
![Page 26: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/26.jpg)
Group differentiation external validity evidence example (variety of “clinical disorder groups”)
(continued on next slide)
![Page 27: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/27.jpg)
Group differentiation external validity evidence example (cont.) (variety of “clinical disorder groups”)
![Page 28: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/28.jpg)
Lack of rigor and quality control in all prior/earlier stages will “rattle through the data” and rear its ugly head when performing the final statistical analysis, especially multivariate validity analyses (SEM, DF, multiple regression, EFA, CFA)
Shortcuts in prior stages will “bite you in the ____” as you attempt to perform the final statistical analysis
Data screening, data screening, data screening!!!! Do it prior to performing the final statistical analysis
• Compute extensive descriptive statistics and plots for all variables (e.g., histograms, scatterplots, box-whisker plots)
• More than means and SDs. Also calculate median, skew, kurtosis, n-tiles, etc.
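The screening statistics listed above can be sketched in a few lines. This is a minimal illustration using only the Python standard library. The function name `screen`, the moment-based skew and excess kurtosis definitions, and the scores (which include one deliberate outlier) are assumptions made for the example.

```python
import statistics


def screen(values):
    """Descriptive screening statistics for one variable."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD, used for the moment ratios
    z = [(v - mean) / sd for v in values]
    return {
        "n": n,
        "mean": mean,
        "sd": statistics.stdev(values),  # sample SD for reporting
        "median": statistics.median(values),
        "skew": sum(t ** 3 for t in z) / n,                 # 0 for symmetric data
        "excess_kurtosis": sum(t ** 4 for t in z) / n - 3,  # 0 for a normal curve
        "quartiles": statistics.quantiles(values, n=4),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical raw scores; 61 is a likely data-entry error worth flagging.
scores = [12, 14, 15, 15, 16, 17, 17, 18, 19, 61]
report = screen(scores)
for name, value in report.items():
    print(name, value)
```

A large gap between mean and median, or a skew value far from zero, is the kind of signal that should send you back to the raw records before any multivariate analysis.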
Deliberately planned and sophisticated “front end” data collection short-cuts (e.g., matrix sampling) introduce an extreme level of “back end” complexity to routine statistical/psychometric analysis
Know your limits, level of expertise, and skills. Even those with extensive test development experience often need access to trusted measurement/statistical consultants
(cont. next slide)
(Note: The following information is almost identical to that presented in Part F—Internal psychometric/statistical analysis)
![Page 29: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/29.jpg)
Published statistics/psychometric information needs to be based on final publication length tests
• Often need to use test-length correction formulas (e.g., Spearman-Brown) for test reliabilities
• Correlations between short and/or long norming versions of a test and other tests, where the norming version differs in length (number of items) from the publication-length test, may need special adjustments/corrections.
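One widely used test-length correction is the Spearman-Brown prophecy formula, which estimates the reliability of a test lengthened or shortened by a given factor, assuming the added or dropped items are parallel to the existing ones. The sketch below is illustrative only: the item counts and the norming reliability are hypothetical, and it does not cover the correlation adjustments mentioned in the second bullet.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor.

    r_new = k * r / (1 + (k - 1) * r), where k = new length / old length.
    """
    k = length_factor
    return (k * reliability) / (1 + (k - 1) * reliability)


# Hypothetical case: a 30-item norming form with reliability .80,
# published as a 45-item test (k = 45 / 30 = 1.5).
norming_items, publication_items = 30, 45
norming_reliability = 0.80

k = publication_items / norming_items
publication_reliability = spearman_brown(norming_reliability, k)
print(f"estimated publication-length reliability = {publication_reliability:.3f}")
```

The same function handles shortened forms: a length factor below 1 yields a lower predicted reliability.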
Back up, back up, back up!!!!!!!!!! Don’t let a dead hard drive or computer destroy your work and progress. Do it constantly. Build redundancy into your files and people skill sets
Sad fact: Majority of test users do NOT pay attention to the fancy and special psychometric/statistical analysis you report in technical chapters or manuals. Be prepared for post-publication education via other methods.
Post-manual publication technical reports of special/sophisticated analyses are good when publication time-line pressures dictate making difficult decisions.
![Page 30: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/30.jpg)
Most test developers are stuck in a methodological rut. There is much that can be learned about the internal and external validity of a test battery using lesser-used statistical methods.
• Multidimensional scaling (MDS), cluster analysis, CART (classification and regression trees), MARS (multivariate adaptive regression splines)
Use of curve smoothing procedures to better estimate population parameters from statistical analyses across age groups.
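The slide does not name a specific smoothing procedure. As one simple possibility, offered only as an assumption, the sketch below applies a weighted moving average (weights 1-2-1) to a statistic computed separately in adjacent age groups, leaving the two end points unchanged. The age-group reliabilities are hypothetical.

```python
def smooth_121(values):
    """Weighted (1-2-1) moving average; the first and last values are kept as is."""
    if len(values) < 3:
        return list(values)
    smoothed = [values[0]]
    for i in range(1, len(values) - 1):
        smoothed.append((values[i - 1] + 2 * values[i] + values[i + 1]) / 4)
    smoothed.append(values[-1])
    return smoothed


# Hypothetical reliabilities for five consecutive age groups.
age_groups = [6, 7, 8, 9, 10]
raw_reliabilities = [0.78, 0.85, 0.80, 0.86, 0.88]

smoothed_reliabilities = smooth_121(raw_reliabilities)
for age, raw, smooth in zip(age_groups, raw_reliabilities, smoothed_reliabilities):
    print(f"age {age}: raw = {raw:.2f}, smoothed = {smooth:.3f}")
```

The idea is that sample-to-sample bounce between adjacent age groups is mostly sampling error, so borrowing information from neighbors gives a steadier estimate of the underlying age trend. Polynomial or spline fits are common alternatives.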
Multiple group CFA (planned incomplete data) reference variable validity designs and methods (Jack McArdle).
![Page 31: Applied Psych Test Design: Part G: Psychometric/technical statistical analysis: External](https://reader033.fdocuments.us/reader033/viewer/2022051313/547eb777b4af9f4c738b45a6/html5/thumbnails/31.jpg)
End of Part G
Additional steps in test development process will be presented in subsequent modules as they are developed