Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear
![Page 1: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/1.jpg)
Item Response Theory and Longitudinal Modeling:
The Real World is
Less Complicated than We Fear
Marty McCall
Northwest Evaluation Association
Presented to the MSDE/MARCES Conference
ASSESSING & MODELING COGNITIVE DEVELOPMENT IN SCHOOL:
INTELLECTUAL GROWTH AND STANDARD SETTING
October 19, 2006
![Page 2: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/2.jpg)
2
Examining constructs through vertical scales
• What are vertical scales?
• Who uses them and why?
• Who doesn’t use them and why not?
![Page 3: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/3.jpg)
3
What are vertical scales?
• In the IRT context, they are:
• scales measuring a construct from the easiest through the most difficult tasks
• equal interval scales spanning ages or grades
• also called developmental scales
• a common framework for measurement of a construct over time
![Page 4: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/4.jpg)
4
Why use vertical scales?
• To model growth:
Tests that are vertically scaled are intended to support valid inferences regarding growth over time.
--Patz, Yao, Chia, Lewis, & Hoskins (CTB/McGraw)
![Page 5: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/5.jpg)
5
Why use vertical scales?
• To study cognitive changes:
When people acquire new skills, they are changing in fundamental interesting ways.
By being able to measure change over time it is possible to map phenomena at the heart of the educational enterprise.
--John Willett
![Page 6: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/6.jpg)
6
Who uses vertical scales?
• CTB McGraw
  • TerraNova Series
  • Comprehensive Test of Basic Skills (CTBS)
  • California Achievement Test
• Harcourt
  • Stanford Achievement Test
  • Metropolitan Achievement Test
• Statewide NCLB tests
  • All states using CTB or Harcourt’s tests
  • Mississippi, North Carolina, Oregon, Idaho
• Woodcock cognitive batteries
![Page 7: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/7.jpg)
7
Development and use
Note that many of these scales were developed prior to NCLB and before cognitive psychology had gained currency.
Achievement tests began in an era of normative interpretation.
Policymakers are now catching up to content and standards-based interpretations.
![Page 8: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/8.jpg)
8
Assumptions-implicit and explicit
The construct is a unified continuum of learning culminating in mature expertise
Domain coverage areas are not necessarily statistical dimensions
Scale building models the sequence of skills and the relationship between them
The construct embodies a complex unidimensional ability
![Page 9: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/9.jpg)
9
What?
The construct embodies a complex unidimensional ability
A mature ability such as reading or doing algebra problems involves many component skills.
The ability itself is unlike any of its component skills.
Complex skills are emergent properties of simpler skills and in turn become components of still more complex skills
![Page 10: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/10.jpg)
10
Who doesn’t use vertical scales? Why not?
In recent years, there have been challenges to the validity of vertical scales. Much of this criticism comes from the viewpoint of standards-based testing, including tests developed for NCLB purposes.
Many critical studies use no data, or use data simulated to exhibit multidimensionality.
![Page 11: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/11.jpg)
11
Assumptions-implicit and explicit
Subject matter at each grade forms a unique construct described in content standards documents.
Topics not explicitly covered in standards documents are not tested.
Content categories represent independent or quasi-independent abilities.
![Page 12: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/12.jpg)
12
How assumptions affect vertical scaling issues
Cross-grade linking blocks detract from grade-specific content validity
Changes in content descriptions indicate differences in dimensionality for different grades
Vertical linking connects unlike constructs to a scale that may be mathematically tractable but lacks validity
![Page 13: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/13.jpg)
13
Vertical scale critics ask:
“How can you put unlike structures together and expect to get meaningful scores and coherent achievement standards?”
Vertical scale proponents ask:
“If you believe the constructs are different how can you talk about change over time? Without growth modeling how can you get coherent achievement standards?”
![Page 14: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/14.jpg)
14
Criticism centers on two major issues
• Linking error
• Violations of dimensionality assumptions
![Page 15: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/15.jpg)
15
Issue #1: Linking creates error
There is some error associated with all measurement, but current methods of vertical scaling greatly reduce it. These methods include:
--triangulation with multiple forms or common person links
--comprehensive and well-distributed linking blocks
--continuous adjacent linking
--fixed parameter linking in adaptive context
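The idea of triangulating links can be illustrated with a small simulation. This is a minimal sketch, not the procedure used by any of the organizations named in this talk: it assumes Rasch difficulties, a simple common-item mean-shift link, and three hypothetical forms A, B, and C whose scale origins and noise levels are invented for the example. If the links are consistent, the shifts around the closed loop A to B to C to A should sum to roughly zero, and the size of the leftover is one indication of linking error.

```python
import numpy as np

def mean_shift(b_from, b_to):
    """Constant to add to the 'from' calibration so that the common
    items line up, on average, with the 'to' calibration."""
    return float(np.mean(np.asarray(b_to) - np.asarray(b_from)))

rng = np.random.default_rng(1)

# Hypothetical scale origins: each form is calibrated on its own scale.
origin = {"A": 0.0, "B": 0.7, "C": 1.5}

def calibrate(true_b, form):
    """Simulated difficulty estimates on a form's own scale (noise sd is assumed)."""
    return true_b - origin[form] + rng.normal(0.0, 0.05, size=true_b.shape)

# Hypothetical linking blocks: 20 common items for each pair of forms.
blocks = {
    ("A", "B"): rng.normal(0.0, 1.0, 20),
    ("B", "C"): rng.normal(1.0, 1.0, 20),
    ("C", "A"): rng.normal(0.5, 1.0, 20),
}

shifts = {}
for (src, dst), true_b in blocks.items():
    shifts[(src, dst)] = mean_shift(calibrate(true_b, src), calibrate(true_b, dst))

# Around a closed loop the shifts should cancel if the links are consistent.
closure = sum(shifts.values())
print({k: round(v, 3) for k, v in shifts.items()}, "closure:", round(closure, 3))
```

With more forms, the same closure check can be run on every loop in the linking design.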
![Page 16: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/16.jpg)
16
How do people actually create and maintain vertical scales?
• Harcourt: common-person linking for the Stanford Achievement Test (SAT) and comprehensive linking blocks
• CTB: methods include concurrent calibration, non-equivalent groups with anchor test (NEAT) designs, and innovative linking methods
• ETS (the king of NEAT): also uses an integrated IRT method (von Davier & von Davier)
![Page 17: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/17.jpg)
17
How do we do it?
The scale establishment method is described extensively in
Probability in the Measurement of Achievement
by George Ingebo
![Page 18: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/18.jpg)
18
How do we do it?
Extensive initial linking
[Figure: linking diagram with elements labeled A, B, C, and D and numbered 1 through 4]
![Page 19: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/19.jpg)
19
Adaptive Continuous Vertical Linking
[Figure: Benchmark X and Benchmark X+1]
![Page 20: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/20.jpg)
20
Issue #2: Dimensionality
Reading and mathematics at grade 3 look very different from those subjects at grade 8. In addition, the curricular topics differ at each grade.
How can they be on the same scale?
![Page 21: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/21.jpg)
21
Study of Dimensionality: Research Questions
1. Does essential unidimensionality hold throughout the scale?
2. Do content areas within scales form statistical dimensions?
![Page 22: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/22.jpg)
22
1. Does essential unidimensionality hold throughout the scale?
• Examine a set of items that comprised forms for state tests in reading and mathematics in grades 3 through 8
• Use Yen’s Q3 statistic in an exploratory study of dimensionality
![Page 23: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/23.jpg)
23
Basic concept: When the assumption of unidimensionality is satisfied, responses exhibit local independence. That is, when the effects of theta are taken into account, correlation between responses is zero.
Q3 is the correlation between residuals of response pairs.
![Page 24: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/24.jpg)
24
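Under the Rasch model, the probability of a correct response is:

$$P_i(\theta_k) = P(u_{ik} = 1 \mid \theta_k) = \frac{1}{1 + \exp\left(-(\theta_k - b_i)\right)}$$

where:
$u_{ik}$ is the score of the kth examinee on the ith item
$\theta_k$ is the ability of the kth examinee and $b_i$ is the difficulty of the ith item
$P_i(\theta_k)$ is as given in the Rasch model
$d_{ik}$ is the residual:

$$d_{ik} = u_{ik} - P_i(\theta_k)$$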
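As a quick numerical illustration of the residual, assume the standard Rasch form $P_i(\theta_k) = 1/(1 + \exp(-(\theta_k - b_i)))$ and take the made-up values $\theta_k = 1$ and $b_i = 0$ (these numbers are not from the study):

```latex
\begin{aligned}
P_i(\theta_k) &= \frac{1}{1 + \exp(-(1 - 0))} = \frac{1}{1 + 0.3679} \approx 0.731 \\
d_{ik} &= 1 - 0.731 = 0.269 \quad \text{if the response is correct } (u_{ik} = 1) \\
d_{ik} &= 0 - 0.731 = -0.731 \quad \text{if the response is incorrect } (u_{ik} = 0)
\end{aligned}
```

A correct answer from an examinee who was likely to succeed leaves a small positive residual, while an unexpected miss leaves a large negative one.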
![Page 25: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/25.jpg)
25
The correlation taken over examinees who have taken item i and item j is:

$$Q_{3ij} = r_{d_i d_j}$$

Fisher’s r to z' transformation gives the correlations an approximately normal distribution:

$$z' = 0.5\left[\ln(1 + r) - \ln(1 - r)\right]$$

Q3 values tend to be negative (Kingston & Dorans)
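The Q3 computation described on these slides can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the code used in the study: the function names, the simulated data, and the use of the true generating abilities and difficulties are all choices made for this example. In a real analysis both would be estimates, which is one reason Q3 values tend to come out slightly negative. Missing responses are coded as NaN so that each correlation is taken only over examinees who saw both items, as happens with adaptive tests.

```python
import numpy as np

def rasch_prob(theta, b):
    """Rasch probability of a correct response, persons x items."""
    return 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))

def q3_matrix(U, theta, b):
    """Q3 for every item pair. U is persons x items with np.nan for items not taken."""
    D = U - rasch_prob(theta, b)              # residuals d_ik (nan where not taken)
    n_items = U.shape[1]
    Q = np.full((n_items, n_items), np.nan)
    for i in range(n_items):
        for j in range(i + 1, n_items):
            both = ~np.isnan(D[:, i]) & ~np.isnan(D[:, j])
            if both.sum() > 2:                # need a few common examinees
                Q[i, j] = Q[j, i] = np.corrcoef(D[both, i], D[both, j])[0, 1]
    return Q

def fisher_z(r):
    """Fisher's r to z' transformation."""
    return 0.5 * (np.log(1.0 + r) - np.log(1.0 - r))

# Simulated unidimensional data (all values here are invented for the example).
rng = np.random.default_rng(0)
theta = rng.normal(size=2000)
b = np.linspace(-1.5, 1.5, 10)
U = (rng.random((2000, 10)) < rasch_prob(theta, b)).astype(float)

Q = q3_matrix(U, theta, b)
upper = np.triu_indices(len(b), k=1)          # each item pair once
z = fisher_z(Q[upper])
print("item pairs:", z.size, "mean z':", round(float(z.mean()), 4),
      "sd z':", round(float(z.std(ddof=1)), 4))
```

Because the simulated responses depend on a single ability, the z' values should cluster near zero; pairs with large positive values would be candidates for a shared secondary dimension.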
![Page 26: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/26.jpg)
26
Pairs of responses from adaptive tests – NWEA’s Measures of Academic Progress
Over 49 million response pairs per subject

| | Reading | Math |
|---|---|---|
| Number of items | 252 | 252 |
| Number of valid item pairs | 25,713 | 20,449 |
| Mean Fisher’s z' | -0.025 | -0.020 |
| Standard deviation of z' | 0.041 | 0.050 |
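Two pieces of arithmetic help put these figures in context. The sketch below assumes that "valid item pairs" counts unordered pairs among the 252 items (the slide does not define the term), and it back-transforms the mean z' to the correlation metric with the inverse of Fisher's transformation, r = tanh(z').

```python
import math

n_items = 252
possible_pairs = n_items * (n_items - 1) // 2    # unordered pairs of distinct items

# Figures copied from the slide.
valid_pairs = {"reading": 25713, "math": 20449}
mean_z = {"reading": -0.025, "math": -0.020}

share_valid = {}
mean_r = {}
for subject in valid_pairs:
    share_valid[subject] = valid_pairs[subject] / possible_pairs
    mean_r[subject] = math.tanh(mean_z[subject])  # inverse of Fisher's z'
    print(subject,
          f"valid share of possible pairs: {share_valid[subject]:.1%}",
          f"mean correlation: {mean_r[subject]:.4f}")
```

At values this close to zero, z' and r are nearly identical, so the mean z' can be read directly as an average residual correlation.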
![Page 27: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/27.jpg)
27
![Page 28: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/28.jpg)
28
![Page 29: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/29.jpg)
29
![Page 30: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/30.jpg)
30
![Page 31: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/31.jpg)
31
2. Do content areas within scales form statistical dimensions?
Used method from
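Bejar (1980). “A procedure for investigating the unidimensionality of achievement tests based on item parameter estimates.” Journal of Educational Measurement, 17(4), 283-296.
Calibrate each item twice: once using responses to all items on the test (the usual method), and again using only responses to items in the same goal area. If the goal areas are not distinct statistical dimensions, the two sets of difficulty estimates should agree within estimation error.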
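The two-calibration comparison can be sketched on simulated data. Several things here are assumptions made for illustration: the slides do not say which calibration software was used, so a simple pairwise conditional estimator for Rasch difficulties stands in for it; the goal-area assignment, sample size, and item difficulties are hypothetical; and the data are generated from a single ability, so the two calibrations are expected to agree.

```python
import numpy as np

def pairwise_difficulties(X):
    """Centered Rasch difficulties from a persons x items 0/1 matrix.

    n[i, j] counts examinees who answered item i correctly and item j
    incorrectly; under the Rasch model log(n[j, i] / n[i, j]) estimates
    b_i - b_j, and averaging over j gives b_i minus the mean difficulty.
    """
    X = np.asarray(X, dtype=float)
    n = X.T @ (1.0 - X)
    n = n + np.eye(n.shape[0])           # diagonal counts are 0; use 1 so log gives 0
    log_ratio = np.log(n.T) - np.log(n)
    return log_ratio.mean(axis=1)

# Simulated unidimensional test (all values are invented for this example).
rng = np.random.default_rng(42)
n_persons = 5000
true_b = np.linspace(-1.5, 1.5, 12)
goal_area = np.repeat([0, 1, 2], 4)      # hypothetical: 3 goal areas x 4 items
theta = rng.normal(size=n_persons)
p = 1.0 / (1.0 + np.exp(-(theta[:, None] - true_b[None, :])))
X = (rng.random(p.shape) < p).astype(int)

full = pairwise_difficulties(X)          # calibration using all items

max_gap = 0.0
for g in np.unique(goal_area):
    idx = np.flatnonzero(goal_area == g)
    within = pairwise_difficulties(X[:, idx])       # calibration within the goal area
    full_recentered = full[idx] - full[idx].mean()  # put both on the same origin
    gap = float(np.abs(within - full_recentered).max())
    max_gap = max(max_gap, gap)
    print(f"goal area {g}: largest difference = {gap:.3f} logits")
```

Each calibration fixes its own origin, so the full-test estimates are re-centered within the goal area before comparing. Differences well beyond estimation error for a goal area would suggest that its items measure something the rest of the test does not.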
![Page 32: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/32.jpg)
32
2. Do content areas within scales form statistical dimensions?
Data are from a fixed-form statewide accountability test of reading and mathematics.
![Page 33: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/33.jpg)
33
![Page 34: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/34.jpg)
34
![Page 35: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/35.jpg)
35
![Page 36: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/36.jpg)
36
![Page 37: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/37.jpg)
37
![Page 38: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/38.jpg)
38
![Page 39: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/39.jpg)
39
What we have found regarding skill development:
• New topics build on earlier ones and show up statistically as part of the construct
• Although they may not be specified in later standards, early topics and skills are embedded in later ones (e.g., phonemics, number sense)
• Essential unidimensionality (Stout’s terminology) holds throughout the scale with minor dimensions of interest
![Page 40: Item Response Theory and Longitudinal Modeling: The Real World is Less Complicated than We Fear](https://reader036.fdocuments.us/reader036/viewer/2022062323/5681522b550346895dc073b3/html5/thumbnails/40.jpg)
40
Thank you for your attention.
Marty McCall
Northwest Evaluation Association
5885 SW Meadows Road, Suite 200
Lake Oswego, Oregon 97035-3256
Phone: 503-624-1951
FAX: 503-639-7873