Reliability
Discrepancies between true ability and measurement of ability constitute errors of measurement.
In Psychological Testing, ERROR does not imply that a mistake has been made. It implies that there will always be inaccuracy in measurements.
Tests that are relatively free of measurement error are deemed to be reliable.
Tests that have too much error are deemed to be unreliable.
It is assumed that each person has a true score that would be obtained if there were no errors in measurement.
The difference between the true score and the observed score results from measurement error.
X - T = E
Where:
X - observed score
T - true score
E - error
It is assumed that the true score for an individual will not change with repeated applications of the same test.
Because of random error, however, repeated applications of the same test can produce different scores.
The standard deviation of the scores obtained over these repeated applications is the standard error of measurement.
Remember that the standard deviation tells us about the average deviation around the mean.
The standard error of measurement tells us, on average, how much an observed score varies from the true score.
In practice, the standard deviation of the observed score and the reliability of the test are used to estimate the standard error of measurement.
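The estimate described above is commonly written as SEM = S * sqrt(1 - r), where S is the standard deviation of the observed scores and r is the reliability of the test. A minimal Python sketch follows; the function name and the example values (S = 15, r = .91) are illustrative assumptions, not figures from these slides.

```python
import math


def standard_error_of_measurement(sd_observed: float, reliability: float) -> float:
    """Estimate the SEM as S * sqrt(1 - r)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd_observed * math.sqrt(1.0 - reliability)


# Hypothetical test: observed-score SD of 15 and a reliability of .91
sem = standard_error_of_measurement(15.0, 0.91)
print(round(sem, 2))  # 4.5
```

The closer the reliability is to 1, the smaller the standard error of measurement becomes.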
Federal government guidelines require that a test be reliable before one can use it to make employment and educational placement decisions (Heubert and Hauser, 1999).
Models of Reliability
Time Sampling: The Test-Retest Method
Used to evaluate the error associated with administering a test at two different times.
Administer the same test on two well-specified occasions and find the correlation between the scores from the two administrations.
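The test-retest procedure can be sketched in Python as a correlation between two score lists. The helper name pearson_r and the scores for five people are made up for illustration; only the standard library is assumed.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return covariation / math.sqrt(ss_x * ss_y)


# Made-up scores for the same five people on two occasions
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 13, 14, 19, 21]

test_retest_reliability = pearson_r(time_1, time_2)
print(round(test_retest_reliability, 2))
```

The same calculation applies whenever two sets of scores from the same people are compared.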
Item Sampling: The Parallel Forms Method (also called Equivalent Forms Reliability)
• Determines the error variance that is attributable to the selection of one particular set of items
• Compares two equivalent forms of a test that measure the same attribute
• The Pearson product-moment correlation between scores on the two forms is used as the reliability estimate
Split-Half Method
• A test is given and is divided into halves that are scored separately. The results of one half of the test are then compared with the results of the other.
• Odd-even system
• Correlation between the two halves
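As a rough illustration of the odd-even system, the Python sketch below scores the odd-numbered and even-numbered items separately and correlates the two half scores. The helper names and the 5-person by 6-item response matrix are invented for the example.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return covariation / math.sqrt(ss_x * ss_y)


def odd_even_half_scores(item_scores):
    """Score each person's odd-numbered and even-numbered items separately."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return odd_half, even_half


# Made-up responses: one row per person, one column per item (1 = right, 0 = wrong)
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]

odd_half, even_half = odd_even_half_scores(responses)
half_test_correlation = pearson_r(odd_half, even_half)
print(odd_half, even_half, round(half_test_correlation, 3))
```

Note that this correlation describes a test only half as long as the original, so it tends to understate the reliability of the full test.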
Kuder-Richardson 20 Formula (KR20)
• Used to calculate the reliability of a test in which the items are dichotomous, scored 0 or 1 (usually for right or wrong)
• Uses the sum, across items, of the proportion of people passing each item (p) times the proportion of people failing that item (q)
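The usual form of the formula is KR20 = [N / (N - 1)] * [(S² - Σpq) / S²], where N is the number of items and S² is the variance of the total test scores. The Python sketch below is one way to compute it; the function name and the response matrix are illustrative assumptions, and the population variance is used so that S² is on the same footing as each p times q term.

```python
from statistics import pvariance


def kr20(item_scores):
    """KR20 for a people-by-items matrix of dichotomous (0/1) item scores."""
    n_people = len(item_scores)
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("KR20 needs at least 2 items")

    total_scores = [sum(row) for row in item_scores]
    total_variance = pvariance(total_scores)  # S^2 of the total scores
    if total_variance == 0:
        raise ValueError("KR20 is undefined when total scores do not vary")

    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_people  # proportion passing
        q = 1.0 - p                                        # proportion failing
        sum_pq += p * q

    return (n_items / (n_items - 1)) * (total_variance - sum_pq) / total_variance


# Made-up responses: one row per person, one column per item
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]

kr20_estimate = kr20(responses)
print(round(kr20_estimate, 3))
```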
Split-Half Method: Spearman-Brown Formula
• Used to correct the split-half correlation for the halved length of the test
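For two halves, the Spearman-Brown correction is usually written as corrected r = 2r / (1 + r), where r is the correlation between the halves. A small Python sketch, with a hypothetical function name and an illustrative half-test correlation of .60:

```python
def spearman_brown_corrected(half_test_r: float) -> float:
    """Estimate full-length reliability from the correlation between two halves."""
    if half_test_r <= -1.0:
        raise ValueError("correction is undefined for r <= -1")
    return 2.0 * half_test_r / (1.0 + half_test_r)


# Hypothetical correlation of .60 between the two halves
full_test_reliability = spearman_brown_corrected(0.60)
print(round(full_test_reliability, 2))  # 0.75
```

For any positive half-test correlation below 1, the corrected estimate is higher than the half-test correlation, reflecting the longer full test.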
Kuder-Richardson 21 (KR21)
A special case of the reliability formula that does not require the calculation of the p's and q's; instead, it uses the mean test score.
Assumes that all items are of equal difficulty, or that the average difficulty level is 50%.
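KR21 is commonly written as KR21 = [N / (N - 1)] * [1 - M(N - M) / (N * S²)], where N is the number of items, M is the mean total score, and S² is the variance of the total scores. The sketch below assumes that form; the function name and the five total scores on a 6-item test are made up for illustration.

```python
from statistics import mean, pvariance


def kr21(total_scores, n_items):
    """KR21 from total test scores only (no item-level p's and q's needed)."""
    if n_items < 2:
        raise ValueError("KR21 needs at least 2 items")
    m = mean(total_scores)
    s2 = pvariance(total_scores)  # population variance of the total scores
    if s2 == 0:
        raise ValueError("KR21 is undefined when total scores do not vary")
    return (n_items / (n_items - 1)) * (1 - m * (n_items - m) / (n_items * s2))


# Made-up total scores for five people on a 6-item test
total_scores = [6, 4, 3, 2, 1]

kr21_estimate = kr21(total_scores, n_items=6)
print(round(kr21_estimate, 3))
```

When the items in fact differ in difficulty, this shortcut gives a lower value than KR20 would on the same data.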
Coefficient Alpha
Cronbach's Alpha
The most general method of finding estimates of reliability through internal consistency.
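Coefficient alpha is usually written as α = [N / (N - 1)] * [(S² - ΣSᵢ²) / S²], where N is the number of items, S² is the variance of the total scores, and Sᵢ² is the variance of each item. Unlike KR20, it does not require items scored 0 or 1. A minimal Python sketch, using a hypothetical function name and invented ratings on three 5-point items:

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a people-by-items matrix of numeric item scores."""
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least 2 items")

    total_scores = [sum(row) for row in item_scores]
    total_variance = pvariance(total_scores)
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")

    # Variance of each item (column) across people
    item_variances = [
        pvariance([row[j] for row in item_scores]) for j in range(n_items)
    ]
    return (n_items / (n_items - 1)) * (
        (total_variance - sum(item_variances)) / total_variance
    )


# Made-up ratings: one row per person, three items rated 1 to 5
ratings = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [1, 2, 2],
]

alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

On items scored 0 or 1, each item variance equals p times q, so this calculation gives the same value as KR20.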
How reliable is reliable?
What is “high enough”?
The answer depends on the use of the test.
Reliability estimates in the range of .70 to .80 are good enough for the purposes of basic research.
In CLINICAL SETTINGS, a reliability of .90 may not be good enough; a reliability greater than .95 should be required.