
Page 1: Scoring Technology Enhanced Items Sue Lottridge Director of Machine Scoring Amy Burkhardt Senior Research Associate of Machine Scoring.

Scoring Technology Enhanced Items

Sue Lottridge, Director of Machine Scoring

Amy Burkhardt, Senior Research Associate of Machine Scoring

Page 2:

Technology Enhanced Items

• Seeing more TEIs in assessments
  – Consortia
  – Formative assessments

• Decisions around TEIs
  – Count-based (e.g., 25 MCs, 2 CRs, 3 TEIs)
  – Content-based

Page 3:

Drag and Drop TEIs

• Select
  – Drag N objects to a single drop target
  – Similar to ‘Check all that apply’ Selected Response Items

• Categorize
  – Drag N objects to M drop targets
  – Limits: whether an object can be dragged to multiple targets, or to none

• Order
  – Drag N objects to M drop targets in the proper order

• Composites (multi-part)
  – Dependencies

Page 4:

TEI Considerations

• Claims
  – Choice of TEI
  – Justification

• Creation
  – Environment
  – Format
  – Complexity
  – Constraints

• Interoperability
  – Rendering
  – Data storage
  – Porting

• Performance
  – Response time
  – Latency
  – Efficiency

• Cost
  – Time to develop
  – Permissions
  – Storage
  – QA

• Scoring
  – Combinatorics
  – Who sets rules

Page 5:

TEIs Live in the “Grey Area” between MCs and CRs

(Diagram: TEIs sit on a continuum between Multiple Choice Items and Constructed Response Items.)

Page 6:

Evaluating TE Item Scoring

• Classical Test Theory methods (p-value, score distribution, point-biserial)

• Analyze trends in responses
  – Frequency of response patterns
  – Counts of object choices
  – Proportion of ‘blank’ responses
  – Frequent, incorrect responses

• Analysis may
  – Suggest where examinees may not understand the item
  – Highlight alternative correct answers
  – Suggest a need for partial credit or for collapsing categories
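These checks are straightforward to script. Below is a minimal Python sketch (the function names and the sample data are illustrative, not from the presentation) that tallies response-pattern frequencies and computes the point-biserial correlation between an item score and the total test score:

```python
import math
from collections import Counter

def pattern_frequencies(responses):
    """Count how often each distinct response pattern occurs, most common first."""
    return Counter(tuple(r) for r in responses).most_common()

def point_biserial(item_scores, total_scores):
    """Pearson correlation between an item score and the total test score."""
    n = len(item_scores)
    mx = sum(item_scores) / n
    my = sum(total_scores) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(item_scores, total_scores)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in item_scores) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in total_scores) / n)
    return cov / (sx * sy)

# Hypothetical data: each response is the tuple of objects an examinee dragged.
responses = [("2/3", "1/3"), ("2/3", "1/3"), ("1/3", "2/3"), ()]
print(pattern_frequencies(responses))
```

Blank responses show up as the empty tuple, so the proportion of blanks falls out of the same tally.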

Page 7:

TEI Scoring and Performance Factors

• Item Design
  – Structure
  – Clarity
  – Constraints

• Examinee
  – “Gets” the item
  – Facility with tools
  – Experience with item type

• Scoring
  – Rubric alignment
  – Rubric clarity
  – Scoring quality

Page 8:

Item 1

Key: 2 points if the response matches the key. 1 point if the top or bottom row matches the key. 0 otherwise.

There are 19,531 ways to answer a single part, and so 381,459,961 ways to answer both parts.
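The slide's counts can be reproduced under one plausible reading of the item layout. The item image is not in the transcript, so the structure below is an inference from the totals: each part offers 5 distinct drag objects with infinite wells and up to 6 slots, and the empty response counts as one way.

```python
# Assumed layout (inferred from the slide's totals, not shown in the transcript):
N_OBJECTS = 5   # distinct drag objects per part, with infinite wells
N_SLOTS = 6     # maximum number of filled slots per part

# Ordered responses per part: 5^0 + 5^1 + ... + 5^6, including the empty response.
ways_one_part = sum(N_OBJECTS ** k for k in range(N_SLOTS + 1))
ways_both_parts = ways_one_part ** 2

print(ways_one_part)    # 19531, as stated on the slide
print(ways_both_parts)  # 381459961
```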

Page 9:

What do the data tell us? Response pattern frequencies

More students dragged 2/3 and then 1/3 into boxes than answered the item correctly.

Page 10:

Part 1 and Part 2 Frequencies

Page 11:

Summation versus expression representation?

Page 12:

Summation versus expression representation?

Score     Original Rubric      New Rubric
          Count    Percent     Count    Percent
0         2432     81%         2257     75%
1         212      7%          335      11%
2         375      12%         427      14%
p-value            .16                  .20

• 190 examinees would have received a higher score
  – 138 moved from 0 to 1
  – 37 moved from 0 to 2
  – 15 moved from 1 to 2

Page 13:

Item 1 Summary

• Item Design
  – Clarify question
  – Clarify directions
  – Review drag target size
  – Revisit number of drag objects

• Examinee
  – Enable practice with infinite wells
  – Observe examinees answering the item

• Scoring
  – Summation versus expression?
  – 14% of responses are blank; why?

Page 14:

Item 2

Score    Number of Correct Objects Present    Number of Incorrect Objects Present
2        4                                    0
1        4                                    1 or 2
1        3                                    1
1        2                                    0
0        otherwise

Ignoring order, there are 2^10 (1,024) possible answers. Preserving order, there are about 10,000,000 possible answers.

Ignoring order, there were 573 unique answers. Preserving order, there were 2,961 unique answers.
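A quick sketch of where those totals come from, assuming the item offers 10 distinct drag objects and any subset of them may be dragged:

```python
from math import perm  # math.perm requires Python 3.8+

N = 10  # distinct drag objects offered by the item

# Ignoring order: each of the 10 objects is either dragged or not.
unordered = 2 ** N  # 1024

# Preserving order: sum over k of the ordered arrangements of k of the 10 objects.
ordered = sum(perm(N, k) for k in range(N + 1))

print(unordered)  # 1024
print(ordered)    # 9864101 -- "about 10,000,000" as the slide says
```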

Page 15:

Response pattern frequencies

Page 16:

What objects are chosen by examinees?

Object     Mean    Correlation with Item Score
3(x)       87%     .13
x+x+x      69%     .26
x^3        65%     -.52
5x-2x      46%     .35
x+3        43%     -.37
3x+3       37%     -.36
3(2x-x)    33%     .17
x/3        55%     -.49
5(x-2)     26%     -.18
x-x-x      23%     -.25

Page 17:

Object selection by score

Object     Score 0 (N=5814)    Score 1 (N=1212)    Score 2 (N=312)
3(x)       85%                 94%                 100%
x+x+x      62%                 92%                 100%
x^3        78%                 20%                 0%
5x-2x      37%                 73%                 100%
x+3        53%                 7%                  0%
3x+3       46%                 2%                  0%
3(2x-x)    31%                 24%                 100%
x/3        68%                 6%                  0%
5(x-2)     30%                 13%                 0%
x-x-x      28%                 1%                  0%

Page 18:

New Scoring Rules

• Student needs to drag more correct objects than incorrect objects to earn a score of 1

Scores     Original Rubric    New Rubric
0          79%                63%
1          17%                33%
2          4%                 4%
p-value    .12                .21
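The revised rule is simple to state in code. Here is a minimal sketch using the ten objects from the earlier tables; the slide only states the new score-1 rule, so the score-2 condition (all four correct objects and no incorrect ones) is an assumption carried over from the original rubric:

```python
# Objects from the slide's tables; the four equivalent to 3x are the correct ones.
CORRECT = {"3(x)", "x+x+x", "5x-2x", "3(2x-x)"}
INCORRECT = {"x^3", "x+3", "3x+3", "x/3", "5(x-2)", "x-x-x"}

def score_revised(response):
    """Score a response (the set of dragged objects) under the revised rubric.

    2: all correct objects and no incorrect ones (assumed, carried over)
    1: more correct objects than incorrect objects (the new rule on the slide)
    0: otherwise
    """
    n_correct = len(response & CORRECT)
    n_incorrect = len(response & INCORRECT)
    if n_correct == len(CORRECT) and n_incorrect == 0:
        return 2
    if n_correct > n_incorrect:
        return 1
    return 0

print(score_revised({"3(x)", "x+x+x", "5x-2x", "3(2x-x)"}))  # 2
print(score_revised({"3(x)", "x+x+x", "x^3"}))               # 1
print(score_revised({"x+3", "x/3"}))                         # 0
```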

Page 19:

Relationship of parts to item score

Object     Percent    Original Correlation    New Correlation
3(x)       87%        .13                     .12
x+x+x      69%        .26                     .30
x^3        65%        -.52                    -.53
5x-2x      46%        .35                     .29
x+3        43%        -.37                    -.52
3x+3       37%        -.36                    -.50
3(2x-x)    33%        .17                     .04
x/3        55%        -.49                    -.62
5(x-2)     26%        -.18                    -.24
x-x-x      23%        -.25                    -.36

Page 20:

Object Selections by Score Point

           Original Rubric                      Revised Rubric
Object     0          1          2              0          1          2
           (N=5814)   (N=1212)   (N=312)        (N=4624)   (N=2402)   (N=312)
3(x)       85%        94%        100%           85%        91%        100%
x+x+x      62%        92%        100%           58%        85%        100%
x^3        78%        20%        0%             84%        38%        0%
5x-2x      37%        73%        100%           36%        57%        100%
x+3        53%        7%         0%             64%        10%        0%
3x+3       46%        2%         0%             57%        4%         0%
3(2x-x)    31%        24%        100%           35%        19%        100%
x/3        68%        6%         0%             79%        16%        0%
5(x-2)     30%        13%        0%             34%        15%        0%
x-x-x      28%        1%         0%             35%        2%         0%

Page 21:

Item 2 Summary

• Item Design
  – Review drag target size
  – Revisit number of drag objects

• Examinee
  – Examinees appeared to understand the task

• Scoring
  – Are more generous rules aligned with the standard/claim?
  – Other rules?

Page 22:

Item 3

Student earns a 2 if she drags 4 or 5 correct steps in order and the last step is x-3.
Student earns a 1 if she drags 3 correct steps in order and the last step is x-3.
Student earns a 0 otherwise.

There are 19,081 ways to answer this item:
• 20 ways to earn a 2
• 16 ways to earn a 1
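The rubric above can be sketched as a scoring function. The actual step objects are not shown in the transcript, so the solution path below is hypothetical (only its last step, x-3, comes from the slide), and "in order" is interpreted as a longest in-order run of path steps, which is an assumption:

```python
import bisect

# Hypothetical ordered solution path; only the final step, x-3, is from the slide.
CORRECT_PATH = ["step1", "step2", "step3", "step4", "x-3"]

def score_item3(response):
    """Score an ordered list of dragged steps against the Item 3 rubric.

    2: 4 or 5 correct steps in order, last step is x-3
    1: 3 correct steps in order, last step is x-3
    0: otherwise
    """
    if not response or response[-1] != "x-3":
        return 0
    # Map each dragged step to its position in the path; ignore off-path steps.
    positions = [CORRECT_PATH.index(s) for s in response if s in CORRECT_PATH]
    # Longest increasing run of path positions = most correct steps in order.
    ends = []  # ends[k] = smallest path position ending an in-order run of length k+1
    for p in positions:
        i = bisect.bisect_left(ends, p)
        if i == len(ends):
            ends.append(p)
        else:
            ends[i] = p
    in_order = len(ends)
    if in_order >= 4:
        return 2
    if in_order == 3:
        return 1
    return 0
```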

Page 23:

Page 24:

Response Frequencies (1108 unique responses)

Page 25:

Score distributions

Score      Original Rubric      Revised Rubric
           N        %           N        %
0          3891     75%         3758     73%
1          40       1%          173      3%
2          1227     24%         1227     24%
p-value             .24                  .25

Revised rubric: allows partial-credit scoring when the student response contains the correct path but the student drags ‘extra’ objects to fill the remaining spaces.

775 responses (13%) were blank.

Page 26:

Item 3 Summary

• Item
  – Remove infinite wells
  – Add ‘distractors’?
  – Remove borders around drop targets, or make them dynamic

• Examinee
  – Students seem compelled to drag objects to fill all spaces
  – Students do not reduce to a final answer

• Scoring
  – Combinatorics: complicated scoring rules
  – Reversals?
  – Same-level transformations?

Page 27:

Conclusions

• A review of responses and frequencies can reveal areas of misunderstanding, potential for item revision, or uncaptured correct responses

• Complexity of the item leads to complexity in scoring
  – More ‘objects’ = more possible correct responses!
  – Object content influences scoring

• Placing constraints on the item can help
  – Infinite wells
  – Size and number of objects

• Changes to scoring don’t always add value