RELIABILITY OF DISEASE CLASSIFICATION
Nigel Paneth
TERMINOLOGY
Reliability is analogous to precision
Validity is analogous to accuracy
Reliability is how well an observer classifies the same individual under different circumstances.
Validity is how well a given test reflects another test of known greater accuracy.
RELIABILITY AND VALIDITY
Reliability includes:
• assessments of the same observer at different times - INTRA-OBSERVER RELIABILITY
• assessments of different observers at the same time - INTER-OBSERVER RELIABILITY
Reliability assumes that all tests or observers are equal; Validity assumes that there is a gold standard to which a test or observer should be compared.
ASSESSING RELIABILITY
How do we assess reliability?
One way is to look simply at percent agreement.
Percent agreement is the proportion of all diagnoses classified the same way by two observers.
EXAMPLE OF PERCENT AGREEMENT
Two physicians are each given a set of 100 X-rays to look at independently and asked to judge whether pneumonia is present or absent. When both sets of diagnoses are tallied, it is found that 95% of the diagnoses are the same.
IS PERCENT AGREEMENT GOOD ENOUGH?
Do these two physicians exhibit high diagnostic reliability?
Can there be 95% agreement between two observers without really having good reliability?
Compare the two tables below:
Table 1
                MD#1
              Yes    No
MD#2   Yes      1     3
       No       2    94

Table 2
                MD#1
              Yes    No
MD#2   Yes     43     3
       No       2    52
In both instances, the physicians agree 95% of the time. Are the two physicians equally reliable in the two tables?
• What is the essential difference between the two tables?
• The problem arises from the ease of agreement on common events (e.g. not having pneumonia in the first table).
• So a measure of agreement should take into account the “ease” of agreement due to chance alone.
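The point made in the bullets above can be checked numerically. The following is a minimal sketch, assuming the tables are stored as nested lists with MD#2 as rows and MD#1 as columns; the function names are illustrative and not part of the original slides.

```python
# Rows are MD#2 (Yes, No); columns are MD#1 (Yes, No), as in the slides.
table_1 = [[1, 3], [2, 94]]
table_2 = [[43, 3], [2, 52]]


def percent_agreement(table):
    """Share of all observations on the agreement diagonal (Yes/Yes + No/No)."""
    total = sum(sum(row) for row in table)
    agreed = table[0][0] + table[1][1]
    return agreed / total


def share_of_agreement_on_no(table):
    """Fraction of the agreements that are joint 'No' ratings."""
    agreed = table[0][0] + table[1][1]
    return table[1][1] / agreed


for name, table in [("Table 1", table_1), ("Table 2", table_2)]:
    print(name, percent_agreement(table), round(share_of_agreement_on_no(table), 3))
```

Both tables give the same percent agreement, but in Table 1 almost all of the agreement comes from the common "No" diagnosis, whereas in Table 2 it is spread across both diagnoses.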
USE OF THE KAPPA STATISTIC TO ASSESS RELIABILITY
Kappa is a widely used measure of inter- or intra-observer agreement (or reliability) which corrects for chance agreement.
KAPPA VARIES FROM +1 TO -1
+1 means that the two observers are perfectly reliable. They classify everyone exactly the same way.
0 means there is no relationship at all between the two observers’ classifications, above the agreement that would be expected by chance.
-1 means the two observers classify exactly the opposite of each other. If one observer says yes, the other always says no.
GUIDE TO USE OF KAPPAS IN EPIDEMIOLOGY AND MEDICINE
Kappa > .80 is considered excellent
Kappa .60 - .80 is considered good
Kappa .40 - .60 is considered fair
Kappa < .40 is considered poor
1st WAY TO CALCULATE KAPPA
1. Calculate observed agreement (number of observations in the cells in which the observers agree / total number of observations). In both Table 1 and Table 2 it is 95%.
2. Calculate expected agreement (chance agreement) based on the marginal totals
Table 1’s marginal totals are:

OBSERVED
                MD#1
              Yes    No    Total
MD#2   Yes      1     3        4
       No       2    94       96
       Total    3    97      100
• How do we calculate the N expected by chance in each cell?
• We assume that each cell should reflect the marginal distributions, i.e. the proportion of yes and no answers should be the same within the four-fold table as in the marginal totals.
OBSERVED
                MD#1
              Yes    No    Total
MD#2   Yes      1     3        4
       No       2    94       96
       Total    3    97      100

EXPECTED (cells still to be filled in)
                MD#1
              Yes    No    Total
MD#2   Yes                     4
       No                     96
       Total    3    97      100
To do this, we find the proportion of answers in either the column marginal totals (3% yes and 97% no for MD #1) or the row marginal totals (4% yes and 96% no for MD #2), and apply one of the two sets of proportions to the other marginal total. For example, 96% of MD #2’s answers are “No”. Therefore, by chance, 96% of MD #1’s 97 “No” answers should also be “No” answers for MD #2. 96% of 97 is 93.12.
EXPECTED
                MD#1
              Yes    No    Total
MD#2   Yes                     4
       No          93.12      96
       Total    3    97      100
By subtraction, all other cells fill in automatically, and each yes/no distribution reflects the marginal distribution. Any cell could have been used to make the calculation, because once one cell is specified in a 2x2 table with fixed marginal distributions, all other cells are also specified.
EXPECTED
                MD#1
              Yes      No    Total
MD#2   Yes   0.12    3.88        4
       No    2.88   93.12       96
       Total    3      97      100
Now you can see that just by the operation of chance, 93.24 of the 100 observations should have been agreed to by the two observers (93.12 + 0.12).
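The expected table can also be filled in all at once: each expected cell is its row total times its column total, divided by the grand total. The sketch below assumes the same nested-list layout as the slides' tables (MD#2 as rows, MD#1 as columns); `expected_counts` is an illustrative name, not something from the original lecture.

```python
def expected_counts(table):
    """Chance-expected cell counts implied by the marginal totals."""
    row_totals = [sum(row) for row in table]        # MD#2: Yes, No
    col_totals = [sum(col) for col in zip(*table)]  # MD#1: Yes, No
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Table 1 from the slides: rows are MD#2, columns are MD#1.
table_1 = [[1, 3], [2, 94]]
expected = expected_counts(table_1)

# Agreement expected by chance is the sum of the Yes/Yes and No/No cells.
expected_agreement = expected[0][0] + expected[1][1]

for row in expected:
    print([round(cell, 2) for cell in row])
print(round(expected_agreement, 2))
```

This reproduces the cells obtained by filling in one cell and subtracting, because fixing one cell of a 2x2 table with fixed marginals determines the rest.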
Let’s now compare the actual agreement with the expected agreement.
• Expected agreement is 6.76 percentage points from perfect agreement of 100% (100 – 93.24).
• Actual agreement is 5.0 percentage points from perfect agreement (100 – 95).
• So our two observers were 1.76 percentage points better than chance, but if they had agreed perfectly they would have been 6.76 percentage points better than chance. So they achieved only about ¼ of the possible improvement over chance (1.76/6.76).
Below is the formula for calculating Kappa from expected agreement:

$$\kappa = \frac{\text{Observed agreement} - \text{Expected agreement}}{1 - \text{Expected agreement}}$$

$$\kappa = \frac{95\% - 93.24\%}{1 - 93.24\%} = \frac{1.76\%}{6.76\%} = .26$$
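As a quick check of the arithmetic, the formula can be written as a small function. This is a sketch only; it assumes both agreements are passed as proportions between 0 and 1 rather than as percentages, and the function name is illustrative.

```python
def kappa_from_agreement(observed, expected):
    """Kappa = (observed - expected) / (1 - expected), with proportions in [0, 1]."""
    return (observed - expected) / (1 - expected)


# Table 1: 95% observed agreement, 93.24% expected by chance.
kappa_table_1 = kappa_from_agreement(0.95, 0.9324)
print(round(kappa_table_1, 2))

# Sanity checks on the ends of the scale described earlier:
# perfect agreement gives +1, and agreement equal to chance gives 0.
print(kappa_from_agreement(1.0, 0.9324))
print(kappa_from_agreement(0.9324, 0.9324))
```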
How good is a Kappa of 0.26?
Kappa > .80 is considered excellent
Kappa .60 - .80 is considered good
Kappa .40 - .60 is considered fair
Kappa < .40 is considered poor
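The guide can be expressed as a simple lookup. The guide does not say which category the boundary values .60 and .80 belong to, so the sketch below makes an assumption: .80 counts as good (since excellent requires more than .80), .60 counts as good, and .40 counts as fair (since poor requires less than .40). The function name is illustrative.

```python
def interpret_kappa(kappa):
    """Label a kappa value using the guide's cut-offs.

    Boundary handling is an assumption, not stated in the guide:
    exactly .80 -> good, exactly .60 -> good, exactly .40 -> fair.
    """
    if kappa > 0.80:
        return "excellent"
    if kappa >= 0.60:
        return "good"
    if kappa >= 0.40:
        return "fair"
    return "poor"


print(interpret_kappa(0.26))
```

By this guide, the Kappa of 0.26 from Table 1 is poor, despite the 95% agreement.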
In the second example, the observed agreement was also 95%, but the marginal totals were very different.
ACTUAL (marginal totals)
                MD#1
              Yes    No    Total
MD#2   Yes                    46
       No                     54
       Total   45    55      100
Using the same procedure as before, we calculate the expected N in any one cell, based on the marginal totals. For example, the lower right cell is 54% of 55, which is 29.7.
EXPECTED
                MD#1
              Yes     No    Total
MD#2   Yes                     46
       No           29.7       54
       Total   45     55      100

And, by subtraction, the other cells are as below. The cells which indicate agreement (Yes/Yes and No/No) add up to 50.4 of the 100 observations (20.7 + 29.7), so the expected agreement is 50.4%.

EXPECTED
                MD#1
              Yes     No    Total
MD#2   Yes   20.7   25.3       46
       No    24.3   29.7       54
       Total   45     55      100
Enter the two agreements into the formula:

$$\kappa = \frac{\text{Observed agreement} - \text{Expected agreement}}{1 - \text{Expected agreement}}$$

$$\kappa = \frac{95\% - 50.4\%}{1 - 50.4\%} = \frac{44.6\%}{49.6\%} = .90$$
In this example, the observers have the same % agreement as before, but their agreement is now far greater than expected by chance. A Kappa of 0.90 is considered excellent.
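The whole first method (observed agreement, expected agreement from the marginal totals, then the Kappa formula) can be sketched as one function and applied to both tables. The layout (MD#2 as rows, MD#1 as columns) follows the slides; the function name `cohen_kappa` is an illustrative choice.

```python
def cohen_kappa(table):
    """Kappa for a square agreement table, via observed and expected agreement."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    k = len(table)
    # Observed agreement: proportion of observations on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    # Expected agreement: diagonal of the chance-expected table, as a proportion.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
    return (observed - expected) / (1 - expected)


table_1 = [[1, 3], [2, 94]]
table_2 = [[43, 3], [2, 52]]

print(round(cohen_kappa(table_1), 2))
print(round(cohen_kappa(table_2), 2))
```

Identical percent agreement yields very different Kappa values once chance agreement is taken into account.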
A 2nd WAY TO CALCULATE THE KAPPA STATISTIC

$$\kappa = \frac{2(AD - BC)}{N_1 N_4 + N_2 N_3}$$

where the Ns are the marginal totals, labeled thus:

                MD#1
              Yes    No
MD#2   Yes      A     B     N1
       No       C     D     N2
               N3    N4     total
Look again at the tables on slide 7.

For Table 1:

$$\kappa = \frac{2(94 \times 1 - 2 \times 3)}{4 \times 97 + 3 \times 96} = \frac{176}{676} = .26$$

For Table 2:

$$\kappa = \frac{2(52 \times 43 - 3 \times 2)}{46 \times 55 + 45 \times 54} = \frac{4460}{4960} = .90$$
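The shortcut formula can be checked against the first method with a short sketch. The cell labels A, B, C, D and the marginal totals N1 to N4 follow the labeled table in this section; the function names are illustrative.

```python
def kappa_shortcut(a, b, c, d):
    """Kappa for a 2x2 table via 2(AD - BC) / (N1*N4 + N2*N3)."""
    n1, n2 = a + b, c + d  # row totals (MD#2: Yes, No)
    n3, n4 = a + c, b + d  # column totals (MD#1: Yes, No)
    return 2 * (a * d - b * c) / (n1 * n4 + n2 * n3)


def kappa_first_way(a, b, c, d):
    """Kappa via observed and chance-expected agreement, for comparison."""
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)


for cells in [(1, 3, 2, 94), (43, 3, 2, 52)]:  # Table 1, Table 2
    print(round(kappa_shortcut(*cells), 2), round(kappa_first_way(*cells), 2))
```

The two routes agree for both tables, which is expected because the shortcut is an algebraic rearrangement of the first method for the 2x2 case.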
Note parallels between:
• THE ODDS RATIO
• THE CHI-SQUARE STATISTIC
• THE KAPPA STATISTIC
Note that the cross-products of the four-fold table, and their relation to marginal totals, are central to all three expressions.
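To make the parallel concrete, the sketch below computes all three quantities from the same cross-products AD and BC. The slides give only the Kappa formula; the odds ratio (AD/BC) and the Pearson chi-square for a 2x2 table without continuity correction, N(AD - BC)^2 / (N1 N2 N3 N4), are the standard textbook forms and are supplied here as assumptions about what the lecture has in mind. The function name is illustrative, and the odds ratio assumes B and C are both nonzero.

```python
def two_by_two_summaries(a, b, c, d):
    """Odds ratio, Pearson chi-square (no continuity correction) and kappa."""
    n1, n2 = a + b, c + d  # row totals
    n3, n4 = a + c, b + d  # column totals
    n = n1 + n2
    ad, bc = a * d, b * c  # the two cross-products
    odds_ratio = ad / bc   # assumes bc != 0
    chi_square = n * (ad - bc) ** 2 / (n1 * n2 * n3 * n4)
    kappa = 2 * (ad - bc) / (n1 * n4 + n2 * n3)
    return odds_ratio, chi_square, kappa


for name, cells in [("Table 1", (1, 3, 2, 94)), ("Table 2", (43, 3, 2, 52))]:
    odds_ratio, chi_square, kappa = two_by_two_summaries(*cells)
    print(name, round(odds_ratio, 2), round(chi_square, 2), round(kappa, 2))
```

The odds ratio is identical for the two tables here, since both have AD/BC = 15.67 and 372.67 only by coincidence of their cell values, while chi-square and Kappa additionally depend on how the cross-products relate to the marginal totals.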