6. Validity
-
8/2/2019 6. Validity
1/31
PSY 6535 Psychometric Theory
Validity Part 1
-
Overview
Content validity
Criterion-related validity
-
Issues of Validity
Does the test actually measure what it is
purported to measure?
Do differences in test scores reflect true differences in the
underlying construct?
Are inferences based on the test scores
justified?
-
Example:
Validity of a Measure
The use of the polygraph (lie detector test) is
not nearly as valid as some say and can easily
be beaten and should never be admitted into
evidence in courts of law, say psychologists
from two scientific communities who were
surveyed on the validity of polygraphs.
(APA News Release)
-
Validity is About Inferences.
Cronbach (1971): Validation is the process of
collecting evidence to support the types of
inferences that are drawn from test scores.
Validity is the degree to which all of the
accumulated evidence supports the intended
interpretation of test scores for the intended
purpose. (AERA, APA, NCME, 1999, p. 11).
-
Validity for what?
Inferences and decisions based on test scores
A person with this score is likely to:
Be a better parent
Do well in law school
Be most satisfied as an engineer
Steal from his/her employer
-
Types of Validity
Content
Criterion-based
Construct
Construct (general evidence gathering)
Content (more theory-based)
Criterion-related (more data-based)
-
Content Validity of a Measure
Collectively, do the items adequately
represent all of the domains of the construct
of interest?
Starting Point: A Well-Defined Construct.
Often have a panel of experts judge whether
items adequately sample the domain of
interest.
-
Example: 1st Grade Math Objectives
What 1st Graders in School District X Should:
A. Be able to add any two positive numbers
whose sum is 20 or less.
B. Subtract any two numbers (each less than
15) whose difference is a positive number.
-
Item Pool: Which Are Content Valid?
1. 13 + 2 = ____
2. 12 - 5 = ____
3. 10 - 13 = ____
4. 26 - 15 = ____
5. 13 + 4 - 7 = ____
6. Sammy has 10 pennies. He lost 2. How many pennies does
Sammy have now?
A. 2 pennies; B. 8 pennies; C. 10 pennies; D. 12 pennies
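One way to make the judgment concrete is to check each two-operand item against objectives A and B. The sketch below is only an illustration: it assumes whole-number operands, reads the operator in items 2 to 4 as subtraction, and uses a hypothetical helper named `matches_objective`. Items 5 and 6 (a three-operand problem and a word problem) are left to human judgment.

```python
def matches_objective(a, op, b):
    """Return the objective ('A' or 'B') a two-operand item falls under, or None."""
    if op == "+" and a > 0 and b > 0 and a + b <= 20:
        return "A"  # add two positive numbers whose sum is 20 or less
    if op == "-" and a < 15 and b < 15 and a - b > 0:
        return "B"  # subtract two numbers under 15 with a positive difference
    return None

# Items 1-4 from the item pool (operators in 2-4 assumed to be subtraction).
items = {1: (13, "+", 2), 2: (12, "-", 5), 3: (10, "-", 13), 4: (26, "-", 15)}
results = {num: matches_objective(*item) for num, item in items.items()}

for num, objective in results.items():
    print(num, objective or "outside the stated objectives")
```

Under these assumptions, items 1 and 2 fall within the objectives, while item 3 (negative difference) and item 4 (an operand of 15 or more) do not.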
-
Example: Depression (Modified from the DSM-IV)
A complex of symptoms marked by:
Disruptions in appetite and weight
Insomnia or hypersomnia
Loss of interest or pleasure in activities
Loss of energy
Feelings of worthlessness
Feels sad or empty nearly every day
Frequent death-related thoughts
-
Item Pool: Which Are Content Valid?
I feel blue or sad.
I feel nervous when speaking to someone in
authority.
I have crying spells.
I'm always willing to admit it when I make a
mistake.
I felt that everything I did was an effort.
I never resent being asked to return a favor.
I experience spells of terror or panic.
-
Assessing Content Validity
Steps for assessing content validity:
1. Describe the content domain
2. Determine the areas of the content domain that are measured
by each item
3. Compare the structure of the test with the structure of the
content domain
Challenges:
Defining the domain
Categorizing the content domain and mapping items to the categories
Ensuring representativeness
-
Contamination & Deficiency
[Diagram labels: Construct; Measure; Relevance (Content Validity); Measure Contamination; Measure Deficiency]
-
What do we want?
A measure that samples from all important
domains or aspects (Low Deficiency)
A measure that does not include anything
irrelevant (Low Contamination)
That is, a measure that adequately captures
all of the domains of the construct that it is
intended to measure. (High Content Validity)
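Once each item has been mapped to a domain, deficiency and contamination can be tallied mechanically. The sketch below is a minimal illustration using the depression example. The domain labels are abbreviated, and the item-to-domain assignments are hypothetical expert judgments invented for the example, not ratings from the source.

```python
# Domains of the construct (abbreviated from the depression example).
domains = {
    "appetite/weight", "sleep", "loss of interest", "loss of energy",
    "worthlessness", "sad mood", "death-related thoughts",
}

# Hypothetical expert judgments: the domain each item taps, or None if irrelevant.
item_domains = {
    "I feel blue or sad.": "sad mood",
    "I have crying spells.": "sad mood",
    "I felt that everything I did was an effort.": "loss of energy",
    "I feel nervous when speaking to someone in authority.": None,
    "I experience spells of terror or panic.": None,
}

covered = {d for d in item_domains.values() if d is not None}
deficiency = domains - covered  # domains that no item samples
contaminating = [item for item, d in item_domains.items() if d is None]

print("Uncovered domains (deficiency):", sorted(deficiency))
print("Irrelevant items (contamination):", contaminating)
print(f"Domain coverage: {len(covered)}/{len(domains)}")
```

With these made-up ratings the pool covers only two of the seven domains and contains two irrelevant items, so it would show both deficiency and contamination.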
-
Criterion-related Evidence for a Measure
What should this test predict? What inferences are we
going to use this test to make?
Criterion-related validation is data based.
Does the test actually predict behavior that it is
supposed to predict?
Correlate an honesty test with employee theft
Correlate a pencil and paper measure of delinquency
with arrest records
Correlate a measure of study habits with actual grades
-
Two Main Types of
Criterion-Related Validity
Predictive validity: future criteria
Concurrent validity: current criteria
-
Criterion-related validity:
Concurrent validity
Students who have been admitted to Wayne
State take the SAT. Their GPA is recorded at the
same time. The correlation between the test scores and
performance is computed. This correlation is
sometimes called a validity coefficient.
-
Criterion-related validity:
Predictive validity
Students take the SAT (or ACT) during High
School and then some are selected into Wayne
State. Later, their SAT scores are correlated with
their college GPA.
This correlation is also sometimes called a
validity coefficient.
If SAT scores and college GPA are correlated,
then the SAT has some degree of predictive
validity for predicting college GPA.
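A validity coefficient of this kind is simply the Pearson correlation between test scores and the criterion. The sketch below shows the computation on made-up scores; the data and the helper name `pearson_r` are illustrative assumptions, not values from the course.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up scores for eight students (illustration only, not real data).
sat = [1010, 1100, 1180, 1250, 1300, 1360, 1420, 1500]
gpa = [2.6, 3.1, 2.8, 3.0, 3.4, 3.2, 3.7, 3.6]

validity_coefficient = pearson_r(sat, gpa)
print(f"validity coefficient r = {validity_coefficient:.2f}")
```

The same computation applies to concurrent and predictive designs; what differs is when the criterion is measured relative to the test.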
-
Problem:
Small Samples = Imprecise Estimates
Sample Size   Observed Correlation   Lower Bound of 95% CI   Upper Bound of 95% CI
10            .50                    -.33                    .89
20            .50                    .04                     .79
50            .50                    .25                     .69
100           .50                    .33                     .64
200           .50                    .39                     .60
400           .50                    .42                     .57
1000          .50                    .45                     .55
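The narrowing of the interval with sample size can be illustrated with the Fisher z transformation, a common large-sample approximation. This is a sketch under that assumption: the slide does not say which method produced its bounds, so this approximation may not reproduce them exactly, particularly at the smallest sample sizes. The function name `fisher_ci` is hypothetical.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation via the Fisher z transformation."""
    z = math.atanh(r)             # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)   # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

for n in (10, 20, 50, 100, 200, 400, 1000):
    lower, upper = fisher_ci(0.50, n)
    print(f"n={n:5d}  r=.50  95% CI [{lower:.2f}, {upper:.2f}]")
```

Whatever the exact method, the pattern is the same: with the observed correlation held at .50, the interval tightens steadily as the sample grows.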
-
Problem: Range Restriction
Range Restriction: The variance in scores in the
sample at hand is smaller than the variance in
scores in the population of interest.
Range restriction is thought to reduce the
observed correlation between test scores and
criterion measures. (Exceptions are possible.)
In the previous examples, where was the
restriction, and why did it occur?
-
Example: range restriction
[Scatterplot: Job Performance vs. General cognitive ability]
-
When/where might we find
range restriction?
Sample of employees chosen based on high test
scores and interview scores (high scores on
predictor)
Sample of current employees promoted due to
high performance (high scores on criterion
measure)
In both cases variability is being reduced (either
in the predictor variable or in the criterion
variable)
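The effect of selecting on the predictor can be seen in a small simulation. The sketch below rests on stated assumptions: a population correlation of .50, standard-normal scores, and selection of the top 20% on the predictor. All of these values are chosen for illustration and do not come from the course.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(42)
rho = 0.50   # assumed population correlation
n = 5000     # assumed applicant pool size

ability = [random.gauss(0, 1) for _ in range(n)]
# performance = rho * ability + noise, so corr(ability, performance) is about rho
performance = [rho * a + math.sqrt(1 - rho ** 2) * random.gauss(0, 1)
               for a in ability]

r_full = pearson_r(ability, performance)

# Keep only the top 20% on the predictor, as if hiring on test scores.
cutoff = sorted(ability)[int(0.8 * n)]
selected = [(a, p) for a, p in zip(ability, performance) if a >= cutoff]
r_restricted = pearson_r([a for a, _ in selected], [p for _, p in selected])

print(f"full sample r = {r_full:.2f}, selected sample r = {r_restricted:.2f}")
```

In this setup the correlation in the selected group comes out noticeably smaller than in the full pool, even though the underlying relationship is unchanged.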
-
Measurement Error
Reliability: Index of the presence of
measurement error (1.0 reliability = No error)
Unreliability in the predictor and criterion serves
to reduce (attenuate) their observed correlation
Researchers are often concerned about
attenuation in predictor-criterion associations
-
When/where might we find unreliability?
Everywhere!
Tests used as predictors (e.g., measures of
depression)
Criterion measures (e.g., ratings of client
well-being)
Unreliability is a concern for both
predictors and criteria.
Unreliability in both can reduce correlations.
-
Assume that measures of X and Y have
alphas of .60 and .70, respectively. The
observed r between X and Y is .40. However,
we might want to know how much this
correlation is depressed by
measurement error.
-
Correction for Attenuation
r_c = \frac{r_{xy}}{\sqrt{r_{xx}\, r_{yy}}}
Where:
r_c = corrected correlation between x and y
r_{xy} = observed correlation between x and y
r_{xx} and r_{yy} = reliability coefficients for x and y
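As a worked example of the standard correction for attenuation, take the case posed earlier: alphas of .60 and .70 are used as the reliabilities of X and Y, and the observed correlation is .40.

```latex
r_c = \frac{r_{xy}}{\sqrt{r_{xx}\, r_{yy}}}
    = \frac{.40}{\sqrt{(.60)(.70)}}
    = \frac{.40}{\sqrt{.42}}
    \approx \frac{.40}{.648}
    \approx .62
```

So an observed correlation of .40 corresponds to an estimated correlation of about .62 between the two constructs if both were measured without error.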
-
Correcting for Measurement Error
Reliability of Measure x   Reliability of Measure y   Observed Correlation   Corrected Correlation
.50                        .60                        .40                    .73
.60                        .70                        .40                    .62
.70                        .80                        .40                    .53
.80                        .90                        .40                    .47
.90                        .90                        .40                    .44
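The corrected values in the table can be recomputed by dividing the observed correlation by the square root of the product of the two reliabilities. The sketch below does this for each row; the function name `correct_for_attenuation` is a hypothetical helper.

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Observed correlation divided by the square root of the reliability product."""
    return r_xy / math.sqrt(r_xx * r_yy)

# (reliability of x, reliability of y) pairs from the table; observed r is .40 throughout.
rows = [(0.50, 0.60), (0.60, 0.70), (0.70, 0.80), (0.80, 0.90), (0.90, 0.90)]
corrected = [round(correct_for_attenuation(0.40, r_xx, r_yy), 2)
             for r_xx, r_yy in rows]

for (r_xx, r_yy), r_c in zip(rows, corrected):
    print(f"r_xx={r_xx:.2f}  r_yy={r_yy:.2f}  observed=.40  corrected={r_c:.2f}")
```

The lower the reliabilities, the larger the gap between the observed and corrected correlation.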
-
Summary Issues: Criterion-related Validity
What sample will we use?
Small samples = more imprecision in the correlation estimate
Issues of generalization
What is our criterion? How do we measure it?
Variability is needed for both predictor and criterion variables
Attenuation due to measurement error
Predictor-criterion overlap
Same items on both measures = bad!