Psychometrics 101: Know what your assessment data is telling you
An ExamSoft Client Webinar
Psychometrics 101: Know What Your Exam Data is Telling You
Eric Ermie, Director of Client Solutions, ExamSoft; formerly Program Manager for Assessment and Evaluation, The Ohio State University College of Medicine.
AGENDA
• Types of stats
• Interpreting the item analysis report
• General statistical guidelines
• Examples
TYPES OF STATS
Common Stats:
• Item Difficulty/p Value: decimal representation of difficulty using the percentage of students who got the item correct. The lower the decimal, the higher the difficulty.
• Upper 27%: the percentage of the top 27% of performers who got the question correct.
• Lower 27%: the percentage of the bottom 27% of performers who got the question correct.
Common Stats Cont’d:
• Discrimination index: the difference in performance between the Upper 27% and the Lower 27%.
• Point-Biserial: a discrimination statistic that indicates whether doing well on that specific item correlated with doing well on the exam overall; in other words, whether the item was a good or bad predictor of overall performance on the exam.
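The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not ExamSoft's implementation: the function name `item_stats`, the 0/1 scoring matrix, and the tie handling when ranking students are all assumptions, and the point-biserial here uses the total score including the item itself.

```python
from math import sqrt

def item_stats(responses, item, group_fraction=0.27):
    """responses: one list of 0/1 item scores per student; item: column index.
    Hypothetical helper for illustration only."""
    n = len(responses)
    totals = [sum(row) for row in responses]
    scores = [row[item] for row in responses]

    # Item difficulty (p value): proportion of students who got the item correct.
    p = sum(scores) / n

    # Rank students by total score, then take the top and bottom 27%.
    # Ties are broken by original order (an assumption of this sketch).
    order = sorted(range(n), key=lambda i: totals[i], reverse=True)
    g = max(1, round(n * group_fraction))
    upper = sum(scores[i] for i in order[:g]) / g
    lower = sum(scores[i] for i in order[-g:]) / g

    # Point-biserial: correlation between the 0/1 item score and the total score.
    mean_t = sum(totals) / n
    sd_t = sqrt(sum((t - mean_t) ** 2 for t in totals) / n)
    if 0 < p < 1 and sd_t > 0:
        mean_correct = sum(t for t, s in zip(totals, scores) if s) / sum(scores)
        rpb = (mean_correct - mean_t) / sd_t * sqrt(p / (1 - p))
    else:
        rpb = 0.0  # undefined when everyone (or no one) is correct; shown as 0 here

    return {"p": p, "upper": upper, "lower": lower,
            "disc_index": upper - lower, "point_biserial": rpb}

# Made-up scores for 8 students on 4 items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 1, 0],
    [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1],
]
stats = item_stats(responses, item=0)
print(stats)
```

With real exam data the same function would be called once per item; the made-up matrix is only there to make the sketch runnable.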
ITEM ANALYSIS REPORT
But with any statistic it is important to remember context matters!
ITEM ANALYSIS EXAMPLES
Item #: 1   Correct Answer: E   Diff(p): 0.98   Upper 27%: 100.00%   Lower 27%: 96.15%   Disc. Index: 0.04   Point Biserial: 0.10
Response Frequencies (* indicates correct answer)
                             A       B       C       D       E
Responses                    0       1       0       1    *178
% Selected                0.00    0.55    0.00    0.55   98.34
Point Biserial (rpb)      0.00    0.02    0.00   -0.10    0.10
Disc. Index               0.00    0.00    0.00   -0.02    0.02
Upper 27%                 0.00    0.00    0.00    0.00    1.00
Lower 27%                 0.00    0.00    0.00    0.02    0.98
ITEM ANALYSIS EXAMPLES
Item #: 7   Correct Answer: D   Diff(p): 0.66   Upper 27%: 82.00%   Lower 27%: 46.15%   Disc. Index: 0.36   Point Biserial: 0.28
Response Frequencies (* indicates correct answer)
                             A       B       C       D       E
Responses                    7      17      28    *120       9
% Selected                3.87    9.39   15.47   66.30    4.97
Point Biserial (rpb)     -0.11   -0.19   -0.12    0.28   -0.07
Disc. Index              -0.04   -0.19   -0.09    0.36   -0.04
Upper 27%                 0.00    0.00    0.12    0.82    0.06
Lower 27%                 0.04    0.19    0.21    0.46    0.10
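As a quick sanity check on how to read the item 7 report, the per-option discrimination index should equal the Upper 27% proportion minus the Lower 27% proportion. The short sketch below uses the values from the item 7 example; the variable names are illustrative only.

```python
# Option-level proportions for item 7, copied from the example report.
upper = {"A": 0.00, "B": 0.00, "C": 0.12, "D": 0.82, "E": 0.06}
lower = {"A": 0.04, "B": 0.19, "C": 0.21, "D": 0.46, "E": 0.10}
reported = {"A": -0.04, "B": -0.19, "C": -0.09, "D": 0.36, "E": -0.04}

# Discrimination index per option = upper proportion - lower proportion.
computed = {opt: round(upper[opt] - lower[opt], 2) for opt in upper}

# Allow for rounding in the printed report.
matches = all(abs(computed[opt] - reported[opt]) < 0.005 for opt in upper)
print(computed, matches)
```

The correct answer (D) has a positive index and every distractor has a negative one, which is the pattern you want from a discriminating item.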
ITEM ANALYSIS EXAMPLES
Item #: 22   Correct Answer: D   Diff(p): 0.36   Upper 27%: 52.00%   Lower 27%: 26.92%   Disc. Index: 0.25   Point Biserial: 0.22
Response Frequencies (* indicates correct answer)
                             A       B       C       D       E
Responses                   35      34      21     *66      25
% Selected               19.34   18.78   11.60   36.46   13.81
Point Biserial (rpb)     -0.09    0.04   -0.20    0.22   -0.06
Disc. Index              -0.15    0.07   -0.15    0.25   -0.02
Upper 27%                 0.10    0.24    0.04    0.52    0.10
Lower 27%                 0.25    0.17    0.19    0.27    0.12
ITEM ANALYSIS EXAMPLES
Item #: 82   Correct Answer: D   Diff(p): 0.55   Upper 27%: 25.00%   Lower 27%: 82.50%   Disc. Index: -0.57   Point Biserial: -0.43
Response Frequencies (* indicates correct answer)
                             A       B       C       D       E
Responses                    7      17      28    *120       9
% Selected                3.87    9.39   37.54   55.00    7.46
Point Biserial (rpb)     -0.11   -0.19   -0.12   -0.43    0.00
Disc. Index              -0.04   -0.19   -0.09   -0.57    0.00
Upper 27%                 0.00    0.00    0.75    0.25    0.00
Lower 27%                 0.00    0.00    0.17    0.83    0.00
ITEM ANALYSIS EXAMPLES
Item #: 24   Correct Answer: C   Diff(p): 0.52   Upper 27%: 64.00%   Lower 27%: 42.31%   Disc. Index: 0.22   Point Biserial: 0.18
Response Frequencies (* indicates correct answer)
                             A       B       C       D       E
Responses                   61      21     *94       5       0
% Selected               33.70   11.60   51.93    2.76    0.00
Point Biserial (rpb)     -0.10   -0.19    0.18    0.12    0.00
Disc. Index              -0.12   -0.13    0.22    0.04    0.00
Upper 27%                 0.26    0.04    0.64    0.06    0.00
Lower 27%                 0.38    0.17    0.42    0.02    0.00
ITEM ANALYSIS EXAMPLES
Item #: 34   Correct Answer: B   Diff(p): 0.71   Upper 27%: 90.00%   Lower 27%: 55.77%   Disc. Index: 0.34   Point Biserial: 0.31
Response Frequencies (* indicates correct answer)
                             A       B       C       D       E
Responses                    0    *129       1      30      21
% Selected                0.00   71.27    0.55   16.57   11.60
Point Biserial (rpb)      0.00    0.31   -0.16   -0.25   -0.11
Disc. Index               0.00    0.34   -0.02   -0.23   -0.09
Upper 27%                 0.00    0.90    0.00    0.06    0.04
Lower 27%                 0.00    0.56    0.02    0.29    0.13
GENERAL GUIDELINES
Desired statistical ranges: opinions differ, but the most commonly used are:
• Item Difficulty/p Value: acceptable item difficulty is not a set number but depends on the intention of the question. If you intended the item to be a mastery item, you want the difficulty as close to 1.00 as possible. If you wanted a discriminating question, significantly lower values are acceptable.
• Upper 27%: if less than 60% of your top performers are getting a question correct, further analysis is needed to see whether there are issues with the question. If fewer of your upper 27% than your lower 27% get a question correct, there is also an issue.
• Lower 27%: generally you never want it to be higher than the upper 27%. As low as 0% can be acceptable, and as high as 100% can be acceptable if it is a mastery question.
GENERAL GUIDELINES
Desired statistical ranges: opinions differ, but the most commonly used are:
• Discrimination index: some set specific numbers for acceptable and unacceptable values; I would argue the more accurate guide is that the lower the p value, the higher the discrimination index needs to be. Generally, at 0.2 or above the item is considered to have discriminated; less than that is considered no discrimination. 0.3 or greater is considered highly discriminating.
• Point-Biserial: as with the discrimination index, some set specific numbers for acceptable and unacceptable values. Generally, 0.2 and above is considered to show discrimination and a positive association with overall performance on the assessment; lower levels are acceptable for mastery questions, and 0.3+ would be desired for discriminating questions.
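The guidelines above could be turned into a simple review checklist, sketched below. The function name `review_flags` and the choice to skip the discrimination checks for mastery items are assumptions made for illustration; the thresholds (60%, 0.2) are the ones quoted in the guidelines, and a flag means "take a closer look", not "the item is bad".

```python
def review_flags(upper, lower, disc_index, point_biserial, mastery=False):
    """Return reasons an item may deserve a second look.
    upper/lower are proportions (0.82 means 82%). Hypothetical helper."""
    flags = []
    if upper < lower:
        flags.append("upper 27% scored below lower 27%")
    if upper < 0.60:
        flags.append("fewer than 60% of top performers correct")
    # Assumption: mastery items are not expected to discriminate,
    # so the 0.2 cutoffs are only applied to discriminating items.
    if not mastery:
        if disc_index < 0.20:
            flags.append("discrimination index below 0.2")
        if point_biserial < 0.20:
            flags.append("point-biserial below 0.2")
    return flags

# Summary values taken from the item analysis examples.
item_1 = review_flags(1.00, 0.9615, 0.04, 0.10, mastery=True)
item_7 = review_flags(0.82, 0.4615, 0.36, 0.28)
item_22 = review_flags(0.52, 0.2692, 0.25, 0.22)
item_82 = review_flags(0.25, 0.825, -0.57, -0.43)
for name, flags in [("1", item_1), ("7", item_7), ("22", item_22), ("82", item_82)]:
    print(name, flags)
```

Item 82 trips every check, which matches the intuition that an item your strongest students miss and your weakest students get right needs review, for example for a miskeyed answer.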
GENERAL GUIDELINES
KR-20: used as an overall measure of reliability for the assessment. Typically reported on a scale from 0.0 to 1.0, with 0.0 being very poor and 1.0 being excellent.
Quick notes:
• Heavily influenced by the number of questions in the assessment
• Heavily influenced by the number of students taking the assessment
• The combination can FREQUENTLY lead to false positive and false negative KR-20 values.
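For reference, the standard Kuder-Richardson 20 formula is KR-20 = (k / (k - 1)) * (1 - sum(p*q) / variance of total scores), where k is the number of items, p is each item's proportion correct, and q = 1 - p. The sketch below is a minimal illustration with made-up data; the function name `kr20` is hypothetical, and it assumes 0/1 scoring and the population variance of total scores (some tools use the sample variance, which shifts the result slightly).

```python
def kr20(responses):
    """responses: one list of 0/1 item scores per student. Hypothetical helper."""
    n = len(responses)
    k = len(responses[0])
    if k < 2:
        raise ValueError("KR-20 needs at least two items")

    # Variance of students' total scores (population variance, an assumption).
    totals = [sum(row) for row in responses]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    if var_t == 0:
        raise ValueError("KR-20 is undefined when all total scores are equal")

    # Sum of p*q across items, where p is the item's proportion correct.
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        pq_sum += p * (1 - p)

    return (k / (k - 1)) * (1 - pq_sum / var_t)

# Made-up data: 4 students, 3 items.
example = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
reliability = kr20(example)
print(round(reliability, 2))
```

Because k and the score variance both appear in the formula, a very short exam or a very small class can move the value a lot, which is the caution raised in the quick notes.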
EXTRANEOUS FACTORS
Stats alone do not tell the whole story:
• Student behavior
  – Cheating
  – Return on investment
• Conflicting content/faculty
• “six degrees from Sunday”
Ways to increase the accuracy/usefulness of your stats:
• Item review process
  – Format
  – Level of difficulty
  – Alternative correct options
• Historical item analysis
  – Across assessments
  – Across versions
• Reuse/Recycle
WHERE DO WE FIT IN?
• Simplified and detailed versions of item analysis reports
• Historical item analysis data by version, assessment and in aggregate
• Ability to pull item analysis by discipline/question author/category
EXAMSOFT FIT
THE DATA YOU NEED
For More Information:
Call: 1.866.429.8889
Email: [email protected]
Visit: learn.examsoft.com