Caveon Webinar Series: Using Decision Theory for Accurate Pass/Fail Decisions

Post on 16-Apr-2017


Upcoming Caveon Events

• Caveon Webinar Series: Next session, June 19
  – Protecting your Tests Using Copyright Law
  – Presenters include Intellectual Property Attorney Kenneth Horton and a member of the Caveon Web Patrol team
  – Register at: http://bit.ly/protectingip
• NCSA – June 19-21, National Harbor, MD
  – Dr. John Fremer is co-presenting Preventing, Detecting, and Investigating Test Security Irregularities: A Comprehensive Guidebook On Test Security For States
  – Visit the Caveon booth!

Latest Publications

• Handbook of Test Security – Now available for purchase! We’ll share a discount code before end of session.

• TILSA Guidebook for State Assessment Directors on Data Forensics – coming soon!

Caveon Online

• Caveon Security Insights Blog
  – http://www.caveon.com/blog/
• twitter
  – Follow @Caveon
• LinkedIn
  – Caveon Company Page
  – “Caveon Test Security” Group
    • Please contribute!
• Facebook
  – Will you be our “friend?”
  – “Like” us!

www.caveon.com

“Using Decision Theory to Score Accurate Pass/Fail Decisions”

Lawrence M. Rudner, Ph.D., MBA
Vice President and Chief Psychometrician
Research and Development
GMAC®

May 15, 2013

Caveon Webinar Series:

Jamie Mulkey, Ed.D.
Vice President and General Manager
Test Development Services
Caveon

Agenda for today

• Role of decision theory

• Examples

• Logic

• Tools

• Adaptive Testing

Goal of Measurement Decision Theory

Classify an examinee into one of K groups

– master / non-master
– below basic / basic / proficient / advanced
– A / B / C / D / F

Poll #1

Are you involved with any classification tests as part of your work?

Attendee Responses:

Yes – Pass/Fail – 49%
Yes – Multiple categories, e.g. A,B,C,D,F – 39%
No – 11%

Poll #2

How familiar are you with Item Response Theory?

Attendee Responses:

Very – I understand and routinely apply IRT formulas – 37%
Somewhat – I understand the logic and concepts – 38%
A little – I have heard of it – 20%
Not at all – I have never heard of it – 5%

Poll #3

What is your primary job function?

Attendee Responses:

Teacher or Content Expert – 6%
Item Writer – 8%
Psychometrician – 30%
Manager and I am a non Psychometrician – 35%
Manager and I am a Psychometrician – 21%

Usual Approach

[Figure: Population Distribution, plotted over an ability scale from -3 to 3 with a vertical axis from 0 to 1]

New Thinking

Probability of being a Master or a Non-Master

[Figure: bar chart of the probability (0.0 to 1.0) for Non-Master and Master]

A Different Question

Old: Your score was 76 which is above the passing score of 72. You passed.

vs

New: Probability of this response pattern for a master is 85% and the probability for a non-master is 15%. You passed.

IRT Approach

Probability of a correct response to Question 123 given ability level

[Figure: Question 123, probability of a correct response (0.0 to 1.0) plotted against ability level (-3 to 3)]

New Thinking

Probability of a correct response to Question 123 for Masters and Non-Masters

[Figure: Question 123, probability of a correct response (0.0 to 1.0) for Non-Master and Master]

Advantages

• Simple framework
• Small number of items
• Small calibration sample sizes
• Classifies as well as or better than IRT
• Effective for adaptive testing
• Well developed science

Applications

• Intelligent Tutoring Systems
• Diagnostic Testing
• Personality Assessment
• Automated Essay Scoring
• Certification Examinations
• End-of-course examinations

Examples

A Certification Examination

MDT

Logic

Notation

• K - # of mastery states

• P(mk) - Prob of a randomly drawn examinee being in each mastery state k

• z - an individual’s response vector z1,z2,…,zN, with zi ∈ {0,1}, for N questions

Want

P(mk | z )

The probability of each mastery state k, mk, given the response vector z.

The probability of being a master given z
The probability of being a non-master given z

Do you recognize these people?

Bayes Theorem

• P(a|b)*P(b) = P(b|a)*P(a)

• $P(m_k \mid \mathbf{z}) \, P(\mathbf{z}) = P(\mathbf{z} \mid m_k) \, P(m_k)$

Mastery state (using Bayes Theorem)

$P(m_k \mid \mathbf{z}) = c \, P(\mathbf{z} \mid m_k) \, P(m_k)$

where c is the normalizing constant that makes the probabilities sum to 1 over the K mastery states.

But there are too many possible response vectors z

Simplifying assumption (item responses are independent given the mastery state)

$P(\mathbf{z} \mid m_k) = \prod_{i=1}^{N} P(z_i \mid m_k)$
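The simplifying assumption treats the item responses as independent given the mastery state, so the probability of a whole response vector is a product of per-item probabilities. A minimal Python sketch of that product follows; the function name `likelihood` and its arguments are illustrative, not part of the presenter's tools.

```python
from math import prod


def likelihood(z, p_correct):
    """P(z | m_k) under the conditional-independence assumption.

    z         -- response vector of 0/1 item scores
    p_correct -- P(z_i = 1 | m_k) for each item, in the same order as z
    """
    if len(z) != len(p_correct):
        raise ValueError("z and p_correct must have the same length")
    # A correct answer contributes p; an incorrect answer contributes 1 - p.
    return prod(p if zi == 1 else 1.0 - p for zi, p in zip(z, p_correct))


# Illustrative call: three items, response vector [1, 1, 0].
value = likelihood([1, 1, 0], [0.8, 0.8, 0.6])
print(round(value, 3))
```

The payoff of the assumption is that only N probabilities per mastery state need to be calibrated, rather than one probability for each of the 2^N possible response vectors.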

Basic Concept

Conditional probabilities of a correct response, P(zi=1|mk)

                    Item 1   Item 2   Item 3
Masters (m1)          .8       .8       .6
Non-masters (m2)      .3       .6       .5

Examinee 1

Response Vector [1,1,0]

Probability of the response vector z for each mastery state is:

P(z| m1) = .8 * .8 * (1-.6) = .26
P(z| m2) = .3 * .6 * (1-.5) = .09

Normalized (rescaled so the two values sum to 1)

P(z| m1) = .26 / (.26 + .09) = .74
P(z| m2) = .09 / (.26 + .09) = .26

Examinee 2

Response Vector [0,0,1]

Probability of the response vector z for each mastery state is:

P(z| m1) = .2 * .2 * .6 = .024
P(z| m2) = .7 * .4 * .5 = .14

Normalized

P(z| m1) = .024 / (.024 + .14) = .15
P(z| m2) = .14 / (.024 + .14) = .85

Check Yourself: Examinee 3

Response Vector [1,0,1]

Poll: 1. Master 2. Non-master

Probability of the response vector z for each mastery state is:

P(z| m1) = .8 * (1-.8) * .6 = .096
P(z| m2) = .3 * (1-.6) * .5 = .06

Normalized

P(z| m1) = .096 / (.096 + .06) = .62
P(z| m2) = .06 / (.096 + .06) = .38
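The three worked examinees can be reproduced with a few lines of Python. This is a sketch, not the presenter's Excel tool: the item probabilities are the ones in the slides' table, while the names `P_CORRECT` and `normalized_likelihoods` are made up for illustration.

```python
from math import prod

# P(z_i = 1 | m_k) for Items 1-3, taken from the slides' table.
P_CORRECT = {
    "master": [0.8, 0.8, 0.6],
    "non-master": [0.3, 0.6, 0.5],
}


def likelihood(z, p_correct):
    """Probability of response vector z given one mastery state."""
    return prod(p if zi == 1 else 1.0 - p for zi, p in zip(z, p_correct))


def normalized_likelihoods(z):
    """Likelihood of z under each state, rescaled to sum to 1."""
    raw = {state: likelihood(z, p) for state, p in P_CORRECT.items()}
    total = sum(raw.values())
    return {state: value / total for state, value in raw.items()}


examinees = {
    "Examinee 1": [1, 1, 0],
    "Examinee 2": [0, 0, 1],
    "Examinee 3": [1, 0, 1],
}
for name, z in examinees.items():
    probs = normalized_likelihoods(z)
    decision = max(probs, key=probs.get)  # state with the larger value
    rounded = {state: round(p, 2) for state, p in probs.items()}
    print(name, z, rounded, "->", decision)
```

Rounded to two decimals, the sketch gives .74/.26, .15/.85 and .62/.38, matching the slides, so Examinees 1 and 3 are classified as masters and Examinee 2 as a non-master.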

Decision Criteria

Decision Rule – Maximum Likelihood

• Probability of the response vector, z, for each mastery state is:
  P(z| m1) = .8 * .8 * (1-.6) = .26
  P(z| m2) = .3 * .6 * (1-.5) = .09
• Select the mastery state with the largest P(z|mk): here, Master.

[Figure: bar chart of P(z|mk) for Master and Non-Master, vertical axis 0 to 0.3]

Decision Rule – Maximum a posteriori probability

• Probability of each mastery state, with priors P(m1) = .7 and P(m2) = .3, is:
  P(m1|z) = c * .26 * .7 = c * .18 = .87
  P(m2|z) = c * .09 * .3 = c * .027 = .13

[Figure: bar chart of P(mk|z) for Master and Non-Master, vertical axis 0 to 0.9]
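The maximum a posteriori rule weights each likelihood by the prior probability of its mastery state and then normalizes. A short sketch, assuming the priors of .7 and .3 that appear in the slide's arithmetic; the function name `posterior` is illustrative.

```python
def posterior(likelihoods, priors):
    """Bayes' theorem: P(m_k | z) = c * P(z | m_k) * P(m_k)."""
    unnormalized = {k: likelihoods[k] * priors[k] for k in likelihoods}
    c = 1.0 / sum(unnormalized.values())  # normalizing constant
    return {k: c * v for k, v in unnormalized.items()}


# Likelihoods of response vector [1, 1, 0] from the worked example.
likelihoods = {
    "master": 0.8 * 0.8 * (1 - 0.6),
    "non-master": 0.3 * 0.6 * (1 - 0.5),
}
priors = {"master": 0.7, "non-master": 0.3}

post = posterior(likelihoods, priors)
decision = max(post, key=post.get)
print({k: round(v, 2) for k, v in post.items()}, "->", decision)
```

With equal priors the same function returns the normalized likelihoods, so the maximum likelihood rule is the special case in which every mastery state is assumed equally common.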

Decision Criteria

Bayes Risk

Given a set of item responses z and the costs associated with each decision, select dk to minimize the total expected cost.
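A Bayes risk rule can be sketched by weighting each possible decision's costs by the posterior probabilities. The posterior below is the .87/.13 result from the worked example; the cost tables are hypothetical numbers chosen only to show that costs can change the decision.

```python
def bayes_decision(posterior, costs):
    """Pick the decision with the smallest expected cost.

    posterior -- {state: P(state | z)}
    costs     -- {decision: {state: cost of that decision if state is true}}
    """
    expected = {
        decision: sum(costs[decision][state] * p for state, p in posterior.items())
        for decision in costs
    }
    best = min(expected, key=expected.get)
    return best, expected


posterior = {"master": 0.87, "non-master": 0.13}

# Hypothetical costs: passing a non-master is 5 times as costly as failing a master.
mild = {
    "pass": {"master": 0.0, "non-master": 5.0},
    "fail": {"master": 1.0, "non-master": 0.0},
}
# Hypothetical costs: passing a non-master is 10 times as costly.
strict = {
    "pass": {"master": 0.0, "non-master": 10.0},
    "fail": {"master": 1.0, "non-master": 0.0},
}

print(bayes_decision(posterior, mild)[0])    # expected costs 0.65 vs 0.87
print(bayes_decision(posterior, strict)[0])  # expected costs 1.30 vs 0.87
```

Under the milder costs the examinee passes; under the stricter costs the same posterior leads to a fail, because the expected cost of a false pass now outweighs the expected cost of a false fail.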

Tools

Tools and Resources

http://edres.org/mdt

• Paper
• Java Applet
• Download Excel tool
• Tools for
  – Data Generation
  – Item Calibration
  – Scoring
  – CAT simulation (in progress)

http://bit.ly/pareonline

Example

Adaptive Testing

1. Sequentially select items to maximize certainty,

2. Administer and score item,

3. Update the estimated mastery state classification probabilities,

4. Evaluate whether there is enough information to terminate testing,

5. Back to Step 1 if needed.
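The five steps above can be sketched as a short loop. This is one possible reading, not the presenter's implementation: "maximize certainty" is interpreted as choosing the item with the lowest expected posterior entropy, the stopping rule is a posterior threshold, and `answer` is a hypothetical callback that returns the examinee's 0/1 score for an item.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)


def update(posterior, p_correct, response):
    """Step 3: Bayes update of the state probabilities after one item."""
    unnorm = [
        prior * (p if response == 1 else 1.0 - p)
        for prior, p in zip(posterior, p_correct)
    ]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def expected_entropy(posterior, p_correct):
    """Expected entropy of the posterior if this item were given next."""
    p_right = sum(prior * p for prior, p in zip(posterior, p_correct))
    result = 0.0
    for response, p_resp in ((1, p_right), (0, 1.0 - p_right)):
        if p_resp > 0:
            result += p_resp * entropy(update(posterior, p_correct, response))
    return result


def adaptive_test(items, priors, answer, threshold=0.9, max_items=None):
    """items[i] lists P(correct | state) for item i, one entry per state."""
    posterior = list(priors)
    remaining = set(range(len(items)))
    administered = []
    limit = len(items) if max_items is None else max_items
    # Step 4 is the loop condition: stop once one state is probable enough.
    while remaining and len(administered) < limit and max(posterior) < threshold:
        # Step 1: select the unused item expected to leave the least uncertainty.
        nxt = min(remaining, key=lambda i: expected_entropy(posterior, items[i]))
        remaining.remove(nxt)
        # Step 2: administer and score the item.
        response = answer(nxt)
        posterior = update(posterior, items[nxt], response)
        administered.append(nxt)
    return posterior, administered


# Items as [P(correct | master), P(correct | non-master)], from the slides' table.
items = [[0.8, 0.3], [0.8, 0.6], [0.6, 0.5]]
scores = [1, 1, 0]  # Examinee 1's responses
final, order = adaptive_test(items, [0.5, 0.5], lambda i: scores[i], threshold=0.99)
print([round(p, 2) for p in final], order)
```

With only three items the 0.99 threshold is never reached, so every item is administered and the result equals the fixed-form answer of .74/.26. The first item chosen is Item 1, whose probabilities differ most between masters and non-masters.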

Sequential Testing

Claude Shannon

Entropy

A measure of the disorder of a system.

How many bits of information are needed to send

a) 1,000,000 random signals

b) 1,000,000 zeros

$H(S) = -\sum_{k=1}^{K} p_k \log_2 p_k$

Less peaked = more uncertainty = more entropy

[Figure: two bar charts of the probabilities (0.0 to 1.0) of Non-Master and Master. The evenly split distribution has H(S) = 1.00; the more peaked distribution has H(S) = 0.72.]
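The two entropy values can be checked directly. In the sketch below the 50/50 split follows from H(S) = 1.00, which for two states only occurs when both are equally likely; the 80/20 split is an assumption, chosen because it yields 0.72, since the exact bar heights are not recoverable from the slide.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits: H(S) = -sum(p_k * log2(p_k))."""
    # Terms with p = 0 contribute nothing, so they are skipped.
    return -sum(p * log2(p) for p in probs if p > 0)


flat = entropy([0.5, 0.5])    # maximum uncertainty for two states
peaked = entropy([0.8, 0.2])  # assumed split, see note above
print(round(flat, 2), round(peaked, 2))
```

A distribution that puts all its probability on one state has entropy 0, which is the answer to the "1,000,000 zeros" case: nothing about the signal is uncertain.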

Adaptive Testing

[Figure: Percent classified vs accuracy as a function of the maximum number of items administered (NAEP items). Horizontal axis: Max No of items, 0 to 50. Vertical axis: Proportion, 0.2 to 1. Series: Accuracy, Classified.]

Recap

• Simple framework

• Small number of items

• Classifies as well as or better than much more complicated IRT

• Effective for adaptive testing

• Small sample sizes

• Well developed science

Option For

• Small certification programs

• Large certification programs

• Embedded in instructional systems

• Test preparation

HANDBOOK OF TEST SECURITY

• Editors - James Wollack & John Fremer
• Published March 2013
• Preventing, Detecting, and Investigating Cheating
• Testing in Many Domains
  – Certification/Licensure
  – Clinical
  – Educational
  – Industrial/Organizational
• Don’t forget to order your copy at www.routledge.com
  – http://bit.ly/HandbookTS (Case Sensitive)
  – Save 20% - Enter discount code: HYJ82

Questions?

Please type questions for our presenters in the GoToWebinar control panel on your screen

THANK YOU!

- Follow Caveon on twitter @caveon
- Check out our blog…www.caveon.com/blog
- LinkedIn Group – “Caveon Test Security”
