IRT Model Specifications and Scale Characteristics
Lecture #3, ICPSR Item Response Theory Workshop
Lecture #3: 1 of 37

Lecture Overview
IRT model specifications
Scaling characteristics

Purpose of IRT
The main purpose of IRT is to create a scale for the interpretation of tests with useful properties
Some of the properties of IRT will allow us to describe the characteristics of that scale in a more meaningful way

Model Specifications
Logistic models are used to link person trait and item response probabilities
The probability of a correct response is a monotonically increasing function of the trait being measured, theta (θ)
Conditional probability of item performance is available all along the scale of the trait being measured

Scale Characteristics
As with many (most?) metrics, the scale itself in IRT is arbitrarily chosen
Once a scale is chosen, the model has some very useful properties:
  Test items and person trait levels are referenced to the same interval scale
  Person and item statistics are not dependent on one another

Dichotomous Models
These models are used when test items are binary: scored as either incorrect or correct, 0 or 1
The three-parameter logistic (3PL) model describes the relationship between examinee ability and the probability of a correct response with 3 parameters: difficulty, discrimination, and guessing

3PL IRT Model
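For reference, the three-parameter logistic model is commonly written as below. This is the standard textbook form rather than a transcription of the slide. It assumes no scaling constant (some texts multiply the exponent by D = 1.7), and the item subscript i is notation introduced here.

```latex
P(X_i = 1 \mid \theta) = c_i + (1 - c_i)\,
  \frac{\exp\bigl(a_i(\theta - b_i)\bigr)}{1 + \exp\bigl(a_i(\theta - b_i)\bigr)}
```

Here θ is the examinee's ability, b_i the item difficulty, a_i the discrimination, and c_i the guessing parameter.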

Other Models
2PL: no guessing (c) parameter
  It is assumed that guessing is not a factor in responding to an item
1PL: no c or slope (a) parameter
  It is also assumed that all items are equally discriminating
  A.K.A. the Rasch model

2PL IRT Model
2PL: c = 0

1PL IRT Model
1PL: a = common, c = 0
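The nesting of the three models can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, no scaling constant D is applied, and the common 1PL slope defaults to 1.0 purely for convenience.

```python
import math


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response (no D scaling constant assumed)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def p_2pl(theta, a, b):
    """2PL: the 3PL with the guessing parameter fixed at c = 0."""
    return p_3pl(theta, a, b, 0.0)


def p_1pl(theta, b, a_common=1.0):
    """1PL: c = 0 and one slope shared by all items (default 1.0 is illustrative)."""
    return p_3pl(theta, a_common, b, 0.0)


# Compare the three models for an item with b = 0 at a few ability levels.
for theta in (-2.0, 0.0, 2.0):
    print(theta,
          round(p_3pl(theta, 1.0, 0.0, 0.2), 3),
          round(p_2pl(theta, 1.0, 0.0), 3),
          round(p_1pl(theta, 0.0), 3))
```

Writing the 2PL and 1PL as calls into the 3PL function makes the constraints (c = 0, common a) explicit instead of repeating the formula.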

IRT Parameters
Ability (θ): generally scaled with a mean of 0 and SD of 1 (like a z-score)
The effective range of θ is therefore from about -4 to +4
This scale is arbitrary, but once chosen it:
  Is used to identify the model
  Determines the scale of the item parameters

Item Parameters
b: difficulty or location
  Same scale as θ, generally -4 ≤ b ≤ +4
a: discrimination or slope
  Often bounded below by 0, generally a ≤ 2.0
c: guessing or lower asymptote
  Bounded by 0 and 1, generally c ≤ 0.25
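What each parameter does to the curve can be checked numerically. The sketch below assumes the standard 3PL form without a scaling constant, and the parameter values are illustrative rather than taken from the lecture.

```python
import math


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response (no D scaling constant assumed)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def slope_at_b(a, c):
    """Analytic slope of the ICC at theta = b, which is a * (1 - c) / 4."""
    return a * (1.0 - c) / 4.0


a, b, c = 1.5, 0.5, 0.2  # illustrative values only

# b locates the curve: at theta = b the probability is halfway between c and 1.
p_at_b = p_3pl(b, a, b, c)

# c is the lower asymptote: far below b the probability approaches c.
p_far_below = p_3pl(b - 20.0, a, b, c)

# a controls steepness: compare a central-difference slope with the formula.
h = 1e-5
numeric_slope = (p_3pl(b + h, a, b, c) - p_3pl(b - h, a, b, c)) / (2.0 * h)

print(p_at_b, p_far_below, numeric_slope, slope_at_b(a, c))
```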

Important Assumptions
Unidimensionality of the test
Local independence
Nature of the item characteristic curve (ICC)
Parameter invariance

Arbitrariness of the Scale
Parameters in an IRT model are invariant, but also scale indeterminate
A scale must be chosen to identify an IRT model
  That scale is only defined up to a linear transformation
Choosing a mean of 0 and SD of 1 for θ identifies a scale for interpretation, and determines the scale of item parameters
Any linear transformation of θ, with a corresponding transformation for item parameters, would provide the same ICCs

Parameter Invariance
This assumption states that parameters are invariant up to a linear transformation
  Accounts for the arbitrariness of the scale chosen to identify the model
Once the scale is chosen, this assumption can be tested

Ability Scale
Because response probabilities (ICCs) are maintained through a linear transformation, the ability scale can be (and often is) transformed after calibration to create a more convenient scale for interpretation, usage, and score reporting
Example: GRE (μ = 500, σ = 100)

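If \theta_{new} = x\theta + y, then
\[ b_{new} = x b + y, \qquad a_{new} = \frac{a}{x}, \qquad c_{new} = c \]
These transformations preserve the probability:
\[ P(X = 1 \mid \theta_{new}, a_{new}, b_{new}, c_{new}) = P(X = 1 \mid \theta, a, b, c) \]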
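The invariance can be checked numerically. The sketch below assumes the standard 3PL form without a scaling constant. It uses x = 100 and y = 500 to mimic the GRE example, and the item parameters are made up for illustration.

```python
import math


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response (no D scaling constant assumed)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def rescale(theta, a, b, c, x, y):
    """Apply theta_new = x * theta + y with the matching item-parameter changes."""
    return x * theta + y, a / x, x * b + y, c


x, y = 100.0, 500.0                 # GRE-like scale: mean 500, SD 100
theta, a, b, c = 1.0, 1.2, 0.3, 0.2  # illustrative values only

theta_new, a_new, b_new, c_new = rescale(theta, a, b, c, x, y)

p_old = p_3pl(theta, a, b, c)
p_new = p_3pl(theta_new, a_new, b_new, c_new)

# The two probabilities agree because a_new * (theta_new - b_new)
# equals a * (theta - b): the factor x cancels and y drops out.
print(theta_new, b_new, a_new, p_old, p_new)
```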

Ability Scores
Ability is often the label used to describe what the test measures in educational contexts.
A more general term would be Trait, which would also encompass psychological (noncognitive) measures
The Trait or Ability is used to define what is being measured by the pool of items from which the test items were drawn
We can't actually sample the entire universe of possible test items, so we are often interested in adding meaning to the trait scale to get a better understanding of the construct being measured

Adding Meaning to Ability
IRT allows us to increase the meaning and interpretability of scaled scores through:
Item Mapping
  Identifying ability levels that correspond to particular levels of item performance
Benchmarking
  Determining anchor points that give meaning to the scale

Item Mapping
Determine a particular level of Response Probability (RP) that represents mastery and map the ability level that corresponds to this RP value for each item
Examples: RP50, RP65, RP85
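Mapping an item at a chosen RP amounts to inverting its ICC. A minimal sketch, assuming the standard 3PL form without a scaling constant; the function name `theta_at_rp` and the item parameters are hypothetical.

```python
import math


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response (no D scaling constant assumed)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def theta_at_rp(rp, a, b, c=0.0):
    """Ability at which the item is answered correctly with probability rp.

    Solving rp = c + (1 - c) * logistic(a * (theta - b)) for theta gives
    theta = b + ln((rp - c) / (1 - rp)) / a.
    """
    if not (c < rp < 1.0):
        raise ValueError("rp must lie strictly between c and 1")
    return b + math.log((rp - c) / (1.0 - rp)) / a


a, b, c = 1.3, 0.4, 0.2  # illustrative item parameters
for rp in (0.50, 0.65, 0.85):
    theta = theta_at_rp(rp, a, b, c)
    # Round trip: plugging theta back into the ICC recovers rp.
    print(f"RP{int(rp * 100)}: theta = {theta:.3f}, check = {p_3pl(theta, a, b, c):.3f}")
```

An RP value at or below the guessing parameter has no solution, because the curve never drops that low, which is why the function rejects it.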

[Figure: Item Mapping. Probability (0.0 to 1.0) against the GRE Achievement Scale (200 to 800) for two items, with RP85 = 600 and RP85 = 700.]

[Figure: Item Mapping. Probability (0.0 to 1.0) against Ability (θ, -3 to 3) for the same two items, with RP85 = 1 and RP85 = 2.]

Item Mapping
Through examination of the item itself, a test maker may then relate the particular RP value representing mastery to a given θ level
  "Here is the kind of item that someone with a 600 GRE score has mastered"

Anchor Points
Determine test score levels that correspond to meaningful categorizations of Ability (e.g., Basic, Proficient, Advanced)
Other examples of Benchmarks:
  Last year's average score
  Location of best or worst schools
  Location of the average student's score
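Once anchor points are set, placing an examinee in a category is a simple lookup. The sketch below borrows the cut points 0.0, 1.4, and 1.8 from the B, P, and A markers in the anchor-point figure of these slides. Treating them as lower bounds of Basic, Proficient, and Advanced, and calling everything lower "Below Basic", are assumptions made here for illustration.

```python
from bisect import bisect_right

# Assumed lower bounds (on the theta scale) of Basic, Proficient, Advanced.
CUTS = [0.0, 1.4, 1.8]
# "Below Basic" is a hypothetical label for abilities under the first cut.
LABELS = ["Below Basic", "Basic", "Proficient", "Advanced"]


def performance_level(theta):
    """Return the category whose range contains theta.

    bisect_right counts how many cut points are <= theta, so an examinee
    sitting exactly on a cut point falls in the higher category.
    """
    return LABELS[bisect_right(CUTS, theta)]


for theta in (-0.5, 0.0, 1.0, 1.4, 1.79, 2.5):
    print(theta, performance_level(theta))
```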

[Figure: Anchor Points for a TCC. Expected score (0.0 to 1.0) against Ability (θ, -3 to 3), with anchor points B, P, and A at θ = 0.0, 1.4, and 1.8; the curve is labeled 37.1%, 82.0%, and 90.3%.]

Excel Spreadsheet Demo
Show Excel spreadsheet containing four items, their ICCs, and the associated TCC
Specify different item parameters and determine how changes affect the resulting ICCs

Example Items
Parameter  Item 1  Item 2  Item 3  Item 4
b             0.0    -1.0     1.0     1.0
a             1.0     0.5     1.0     2.0
c             0.2     0.0     0.0     0.1
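The spreadsheet demo can be approximated in Python. The sketch below evaluates the four example items on a grid of ability values, assuming the standard 3PL form without a scaling constant; the slides do not say which form the spreadsheet uses, so the numbers may differ from the plotted curves.

```python
import math

# Item parameters copied from the Example Items table.
ITEMS = {
    1: {"b": 0.0, "a": 1.0, "c": 0.2},
    2: {"b": -1.0, "a": 0.5, "c": 0.0},
    3: {"b": 1.0, "a": 1.0, "c": 0.0},
    4: {"b": 1.0, "a": 2.0, "c": 0.1},
}


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response (no D scaling constant assumed)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def icc_table(thetas):
    """Return {item number: [P(correct) at each theta]} for all four items."""
    return {item: [p_3pl(t, **params) for t in thetas]
            for item, params in ITEMS.items()}


thetas = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
table = icc_table(thetas)
for item, probs in table.items():
    print(item, [round(p, 3) for p in probs])
```

Editing the values in `ITEMS` and rerunning plays the same role as changing parameters in the spreadsheet.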

[Figure: ICCs for Items 1 to 4. Probability of Correct Response (0.0 to 1.0) against Ability (θ, -3 to 3).]

Test Characteristic Curve
A test characteristic curve (TCC) is created by summing each ICC across the ability continuum
The vertical axis now reflects the expected score on the test for a subject with a given ability level
Since P(X = 1 | θ) is the expected score for the item, the TCC is the expected score, E(Y), for the test
  How many items we expect a subject with a particular ability level to answer correctly
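Summing the ICCs is a one-liner once the item probabilities are available. A minimal sketch using the four example items, assuming the standard 3PL form without a scaling constant; the function name `tcc` is introduced here.

```python
import math

# (a, b, c) for Items 1 to 4, copied from the Example Items table.
ITEMS = [
    (1.0, 0.0, 0.2),
    (0.5, -1.0, 0.0),
    (1.0, 1.0, 0.0),
    (2.0, 1.0, 0.1),
]


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response (no D scaling constant assumed)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def tcc(theta):
    """Expected number-correct score at theta: the sum of the item ICCs."""
    return sum(p_3pl(theta, a, b, c) for a, b, c in ITEMS)


for theta in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(theta, round(tcc(theta), 3))
```

The curve is bounded below by the sum of the guessing parameters (0.3 here) and above by the number of items (4).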

[Figure: Test characteristic curve for the four example items. Expected Score (0 to 4) against Ability (θ, -3 to 3).]
We expect that examinees with ability θ = 0.49 will, on average, answer 2 of the 4 items correctly.
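Reading an ability level off the TCC for a target expected score is a root-finding problem, and because the TCC is monotonically increasing, bisection suffices. The sketch below uses the four example items and assumes the standard 3PL form without a scaling constant. The slides do not state the exact parameterization behind the plotted curve, so the root found here need not reproduce the 0.49 quoted above.

```python
import math

# (a, b, c) for Items 1 to 4, copied from the Example Items table.
ITEMS = [
    (1.0, 0.0, 0.2),
    (0.5, -1.0, 0.0),
    (1.0, 1.0, 0.0),
    (2.0, 1.0, 0.1),
]


def tcc(theta):
    """Expected number-correct score at theta (3PL, no D constant assumed)."""
    return sum(c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
               for a, b, c in ITEMS)


def theta_for_expected_score(target, lo=-6.0, hi=6.0, iterations=100):
    """Bisection search for the theta at which tcc(theta) equals target."""
    if not (tcc(lo) < target < tcc(hi)):
        raise ValueError("target is outside the range of the TCC on [lo, hi]")
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if tcc(mid) < target:
            lo = mid   # the root lies above mid
        else:
            hi = mid   # the root lies at or below mid
    return (lo + hi) / 2.0


root = theta_for_expected_score(2.0)
print(root, tcc(root))
```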

CONCLUDING REMARKS

Wrapping Up
Item response theory is a powerful method that can be used to build and assess scales
The method is flexible and accommodates many types of testing items and situations
Today was the introduction to IRT concepts
  Over the rest of the week we will expand upon each of these

Next Lab
Today (computer time): Introduction to Mplus
  1PL and 2PL examples with syntax
Tomorrow morning: Polytomous Data
We will discuss what happens when we have items scored with more than two categories
The IRT models used will be generalizations of the dichotomous models presented here