Estimating Growth when Content Specifications Change:

Estimating Growth when Content Specifications Change: A Multidimensional IRT Approach Mark D. Reckase Tianli Li Michigan State University


Page 1: Estimating Growth when Content Specifications Change:

Estimating Growth when Content Specifications Change: A Multidimensional IRT Approach

Mark D. Reckase, Tianli Li

Michigan State University

Page 2: Estimating Growth when Content Specifications Change:

The Problem

State curriculum frameworks often change from one grade to the next, reflecting the addition of new instructional content.

For example, at grade 7 algebra may be introduced as an instructional goal.

At grade 6, algebra is not an important component of the curriculum.

Tests at the two grades reflect the instructional content, so the 6th grade test does not include algebra and the 7th grade test does.

How can the score scales of these tests be linked?

Page 3: Estimating Growth when Content Specifications Change:

Research Questions

What do changes on the linked score scale mean, when the scale is produced using the usual unidimensional IRT models?

Can multidimensional IRT be used to form vertical scales? If so, how do the results compare to the unidimensional results?

Page 4: Estimating Growth when Content Specifications Change:

The Approach

State testing data were analyzed using multidimensional IRT to develop a realistic model for the test data at two grade levels.

The results of the real data analyses were idealized to create the specifications for simulating the tests at two grade levels.

Data with known structure were then simulated to determine how unidimensional and multidimensional procedures function.

Page 5: Estimating Growth when Content Specifications Change:

The Simulated Data Design

Grade 6 – two major constructs: Arithmetic, Problem Solving

Grade 7 – three major constructs: Arithmetic, Problem Solving, Algebra

Page 6: Estimating Growth when Content Specifications Change:

Simulated Test Structure

Test Level   Algebra   Arithmetic   Problem Solving   Total
Grade 6      0         17 (4)       23 (6)            40 (10)
Grade 7      11 (0)    11 (4)       18 (6)            40 (10)

Note: The numbers in parentheses are the common items between the two forms of the tests.

Page 7: Estimating Growth when Content Specifications Change:

Mean Vectors at each Grade Level

Class Level   Algebra        Arithmetic   Problem Solving
Grade 6       -1.5 (-1.50)   .5 (.51)     -.2 (-.21)
Grade 7       0 (.03)        .7 (.73)     0 (.01)

Note: Values in parentheses are the observed means from the simulated data.

Page 8: Estimating Growth when Content Specifications Change:

Covariance Matrices

Covariance Matrix for Grade 6

                  Algebra      Arithmetic   Problem Solving
Algebra           .25 (.25)    0 (.00)      0 (.00)
Arithmetic        0 (.00)      .8 (.84)     .7 (.76)
Problem Solving   0 (.00)      .7 (.76)     1.2 (1.29)

Covariance Matrix for Grade 7

                  Algebra      Arithmetic   Problem Solving
Algebra           1 (1.05)     .4 (.42)     .6 (.64)
Arithmetic        .4 (.42)     .6 (.60)     .3 (.32)
Problem Solving   .6 (.64)     .3 (.32)     1 (1.02)

Note: Values in parentheses are estimated from the simulated data.
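The generating design on the last two slides can be sketched in a few lines of Python. This is only an illustration, not the authors' simulation code. It assumes the abilities were drawn from multivariate normal distributions, uses the sample size of 2000 per grade mentioned later in the deck, and assumes the flattened mean vector reads column by column as Algebra, Arithmetic, Problem Solving. The names `MEANS`, `COVS`, and `simulate_thetas` are hypothetical.

```python
import numpy as np

# Dimension order assumed throughout: algebra, arithmetic, problem solving.
# Assumption: the garbled mean-vector slide reads column by column.
MEANS = {
    6: np.array([-1.5, 0.5, -0.2]),
    7: np.array([0.0, 0.7, 0.0]),
}
COVS = {
    6: np.array([[0.25, 0.0, 0.0],
                 [0.0,  0.8, 0.7],
                 [0.0,  0.7, 1.2]]),
    7: np.array([[1.0, 0.4, 0.6],
                 [0.4, 0.6, 0.3],
                 [0.6, 0.3, 1.0]]),
}

def simulate_thetas(grade, n=2000, seed=0):
    """Draw n ability vectors for one grade (multivariate normal is an assumption)."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(MEANS[grade], COVS[grade], size=n)

theta6 = simulate_thetas(6, seed=6)
theta7 = simulate_thetas(7, seed=7)

# Observed moments, comparable to the parenthesized values on the slides.
obs_mean6 = theta6.mean(axis=0)
obs_cov6 = np.cov(theta6, rowvar=False)
```

The observed means and covariances will differ slightly from the targets, in the same way the parenthesized values on the slides differ from the specified ones.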

Page 9: Estimating Growth when Content Specifications Change:

Orientation of Items

[Figure: plot of item orientations; only the axis tick values survived extraction.]

Page 10: Estimating Growth when Content Specifications Change:

Effect Size Built into Data

Algebra   Arithmetic   Problem Solving
1.9       .26          .21

Page 11: Estimating Growth when Content Specifications Change:

Unidimensional Basis for Comparison

Imagine that the full set of 70 items from both test levels is administered to the students at both grade levels.

The matrix of 2000 + 2000 students from the two grades by 70 items can be analyzed with the unidimensional models to serve as a basis for comparison for the vertical scaling result.

Analyze the matrix using the 2PL and Rasch models.

Page 12: Estimating Growth when Content Specifications Change:

2PL Solution

[Figure: plot of the 2PL solution; only the axis tick values survived extraction.]

Page 13: Estimating Growth when Content Specifications Change:

Rasch Model Solution

[Figure: plot of the Rasch model solution; only the axis tick values survived extraction.]

Page 14: Estimating Growth when Content Specifications Change:

Vertical Scaling Analysis

Common-item concurrent calibration with BILOG-MG

Off-grade items coded as not reached

Both the 2PL and Rasch models used for analysis

Determine the effect size of the difference in means of the two grade levels
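The data layout implied by concurrent calibration can be sketched as follows. This is an assumption-laden illustration, not the BILOG-MG input format: each 40-item form is taken to have 30 unique items plus the 10 common items (70 distinct items in total), `NaN` stands in for the "not reached" code, and the column ordering and the name `stack_for_concurrent` are hypothetical. The responses are random placeholders, not the study data.

```python
import numpy as np

N_ITEMS_TOTAL = 70
# Hypothetical column layout: [grade-6-only | common | grade-7-only]
G6_COLS = np.arange(0, 40)    # 30 grade-6-only items + 10 common items
G7_COLS = np.arange(30, 70)   # 10 common items + 30 grade-7-only items

def stack_for_concurrent(resp6, resp7):
    """Stack two 40-item response matrices into one students-by-70 matrix.

    Items a student never saw (off-grade items) are left as NaN,
    a stand-in for the 'not reached' coding.
    """
    n6, n7 = len(resp6), len(resp7)
    full = np.full((n6 + n7, N_ITEMS_TOTAL), np.nan)
    full[:n6, G6_COLS] = resp6
    full[n6:, G7_COLS] = resp7
    return full

# Placeholder 0/1 responses for 2000 students per grade.
rng = np.random.default_rng(0)
resp6 = rng.integers(0, 2, size=(2000, 40))
resp7 = rng.integers(0, 2, size=(2000, 40))
full = stack_for_concurrent(resp6, resp7)
```

Only the 10 common-item columns are observed for both grades, so the link between the two scales rests entirely on those items.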

Page 15: Estimating Growth when Content Specifications Change:

Vertically Scaled Effect Sizes

                    2PL Model     Rasch Model   2PL Model      Rasch Model
                    70 Items      70 Items      Concurrent     Concurrent
Mean (SD) Grade 6   -.54 (.78)    -.42 (.93)    -.22 (1.16)    -.14 (1.06)
Mean (SD) Grade 7   .56 (1.13)    .45 (1.15)    .26 (1.20)     .21 (1.38)
Effect Size         1.13          .83           .41            .28
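The slides do not state how the effect size was computed. As a sketch under that caveat, dividing the mean difference by the root mean square of the two grade-level standard deviations reproduces all four tabled values to two decimals. The function name `effect_size` is hypothetical.

```python
import math

def effect_size(mean_lo, sd_lo, mean_hi, sd_hi):
    """Mean difference divided by the root mean square of the two SDs.

    Assumption: this pooled-SD form is inferred from the tabled values,
    not stated in the slides.
    """
    pooled_sd = math.sqrt((sd_lo ** 2 + sd_hi ** 2) / 2)
    return (mean_hi - mean_lo) / pooled_sd

# (Grade 6 mean, Grade 6 SD, Grade 7 mean, Grade 7 SD) from the table.
TABLE = {
    "2PL, 70 items":     (-0.54, 0.78, 0.56, 1.13),
    "Rasch, 70 items":   (-0.42, 0.93, 0.45, 1.15),
    "2PL, concurrent":   (-0.22, 1.16, 0.26, 1.20),
    "Rasch, concurrent": (-0.14, 1.06, 0.21, 1.38),
}

effects = {name: effect_size(*row) for name, row in TABLE.items()}
for name, value in effects.items():
    print(f"{name}: {value:.2f}")
```

Because equal group sizes (2000 per grade) were used, this form coincides with the usual pooled-variance standardized mean difference.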

Page 16: Estimating Growth when Content Specifications Change:

Vertically Scaled Effect Sizes

Linked effect size is smaller than full data effect size.

Rasch effect size is less than 2PL effect size.

Full data set effect size is less than modeled effect size.

Page 17: Estimating Growth when Content Specifications Change:

Alternative Linking Method

Common-item, separate calibration

Common item parameter relationship was poor

[Figure: scatterplot of common-item b-parameters; axes labeled "b-parameters Grade x" and "b-parameters Grade x + 1".]

Page 18: Estimating Growth when Content Specifications Change:

MIRT Analysis

Full data analysis with TESTFACT

Three-dimensional analysis

Determine effect size for each dimension

Correlate each estimated θ with the generating θs to determine the meaning of the results.

Page 19: Estimating Growth when Content Specifications Change:

MIRT Effect Sizes

                    θ1           θ2           θ3
Mean (SD) Total     .01 (.95)    -.01 (.90)   .05 (.72)
Mean (SD) Grade 6   -.57 (.54)   .16 (.99)    .03 (.74)
Mean (SD) Grade 7   .60 (.90)    -.19 (.77)   .06 (.69)
Effect Size         1.56         -.40         .05

Page 20: Estimating Growth when Content Specifications Change:

Correlation between True and Estimated Values

          Est θ1   Est θ2   Est θ3
True θ1   .92      -.08     .02
True θ2   .47      .50      -.18
True θ3   .46      .80      -.03
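A table of this kind can be computed by correlating every generating dimension with every estimated dimension. The sketch below uses synthetic stand-in data rather than the study's estimates, and the function name `cross_correlations` is hypothetical.

```python
import numpy as np

def cross_correlations(true_theta, est_theta):
    """Return a (true dims x estimated dims) matrix of Pearson correlations."""
    k = true_theta.shape[1]
    # corrcoef on the column-stacked variables gives a (2k x 2k) matrix;
    # the upper-right block holds the true-versus-estimated correlations.
    full = np.corrcoef(true_theta, est_theta, rowvar=False)
    return full[:k, k:]

# Synthetic illustration only (not the study's data).
rng = np.random.default_rng(1)
true_theta = rng.standard_normal((2000, 3))
est_theta = np.column_stack([
    -true_theta[:, 0],                    # sign-reversed copy of dimension 1
    true_theta[:, 1] + true_theta[:, 2],  # blend of dimensions 2 and 3
    rng.standard_normal(2000),            # unrelated noise
])
table = cross_correlations(true_theta, est_theta)
print(np.round(table, 2))
```

Rows index the generating θs and columns the estimated θs. A large negative entry signals a reversed axis, and a column with two sizable entries signals an estimated dimension that blends two constructs.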

Page 21: Estimating Growth when Content Specifications Change:

Interpretation of MIRT Solution

Results are difficult to interpret because of the default procedures in TESTFACT.

Solution needs to be rotated to have axes align with content dimensions.

Current solution shows that θ1 is related to algebra and shows the big algebra effect.

θ2 is a combination of arithmetic and problem solving with the emphasis on problem solving. Most likely it has the sign of the a-parameters reversed.
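One way to align estimated axes with known generating dimensions in a simulation is an orthogonal Procrustes rotation. The slides leave the choice of rotation open, so the following is a sketch of one candidate technique, not the authors' method. It requires the generating θs as a target, which are available only in simulated data, and the function name `procrustes_rotation` is hypothetical.

```python
import numpy as np

def procrustes_rotation(est, target):
    """Orthogonal matrix R minimizing ||est_c @ R - target_c|| (Frobenius norm).

    Both inputs are centered first. R may include reflections, so a
    sign-reversed axis is flipped back as part of the solution.
    """
    est_c = est - est.mean(axis=0)
    target_c = target - target.mean(axis=0)
    u, _, vt = np.linalg.svd(est_c.T @ target_c)
    return u @ vt

# Synthetic check: rotate known abilities by a random orthogonal matrix,
# add a little noise, then recover the inverse rotation.
rng = np.random.default_rng(2)
true_theta = rng.standard_normal((2000, 3))
q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # random orthogonal matrix
est_theta = true_theta @ q.T + 0.01 * rng.standard_normal((2000, 3))

rotation = procrustes_rotation(est_theta, true_theta)
aligned = (est_theta - est_theta.mean(axis=0)) @ rotation
```

An orthogonal rotation keeps the axes at right angles. Since the generating constructs are correlated, an oblique rotation might fit the content dimensions better.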

Page 22: Estimating Growth when Content Specifications Change:

Concurrent MIRT Analysis

Use concurrent calibration of data from the two grade levels.

Three-dimensional solution

No rotation

Determine effect sizes and correlations with true values.

Page 23: Estimating Growth when Content Specifications Change:

Concurrent MIRT Calibration

                    θ1           θ2           θ3
Mean (SD) Total     .06 (.75)    -.09 (.57)   -.38 (1.01)
Mean (SD) Grade 6   -.02 (.87)   -.29 (.56)   .18 (.64)
Mean (SD) Grade 7   .14 (.59)    .10 (.50)    -.94 (.99)
Effect Size         .22          .74          -1.34

Page 24: Estimating Growth when Content Specifications Change:

Concurrent MIRT Calibration

          Est θ1   Est θ2   Est θ3
True θ1   .16      .57      -.87
True θ2   .54      .02      -.40
True θ3   .77      -.05     -.43

Page 25: Estimating Growth when Content Specifications Change:

Concurrent MIRT Calibration

Scale on Dimension 3 is reversed and it has a large effect size (algebra).

Dimension 1 is most related to arithmetic and problem solving with a moderate effect size.

Dimension 2 is moderately related to algebra and has a large effect size.

The overall result gives a reasonable estimate of effects, but the dimensions need to be rotated to match the constructs.

Page 26: Estimating Growth when Content Specifications Change:

Conclusions

Unidimensional linking of the two test levels underestimates the effect size.

Rasch model gives a smaller effect size than the two parameter logistic model.

MIRT solution shows promise.

Need to determine how to rotate solution to match constructs.

TESTFACT has problems converging on estimates because of mismatch between assumptions and reality.