ROC Tutorial

  • 8/10/2019 ROC Tutorial

    1/62

    Diagnostic tests: ROC curves, sensitivity, and specificity

    Michael Walker


  • 8/10/2019 ROC Tutorial

    2/62

    Tutorial articles

    Spitalnic, S. Test Properties 1: Sensitivity, Specificity and Predictive Values. Hospital Physician, September 2004, p. 27-31.

    Spitalnic, S. Test Properties 2: Likelihood ratios, Bayes formula, and receiver operating characteristic curves. Hospital Physician, October 2004, p. 53-58.

    www.turner-white.com

  • 8/10/2019 ROC Tutorial

    3/62

    Clinical objectives

    Describe the diagnostic accuracy of a test.

    Compare accuracy of different tests.

  • 8/10/2019 ROC Tutorial

    4/62

    Suppose we wish to know if the expression level of a gene in a tumor can predict if patients will have a recurrence of their cancer.

    High expression => high probability of recurrence

  • 8/10/2019 ROC Tutorial

    5/62

    [Figure: gene expression distributions for the "Not cancer" and "Cancer" groups, with a threshold marked]

    In this case, gene expression separates cancer from non-cancer perfectly.

  • 8/10/2019 ROC Tutorial

    6/62

    [Figure: gene expression distributions for the "Not cancer" and "Cancer" groups]

    Gene expression levels for cancer and non-cancer overlap.

    What threshold should we choose to predict recurrence?

  • 8/10/2019 ROC Tutorial

    7/62

    Choose a high threshold: few false positive predictions, but lots of false negatives.

    [Figure: "Not cancer" and "Cancer" distributions with a high threshold marked]

  • 8/10/2019 ROC Tutorial

    8/62

    Choose a low threshold: many false positive predictions, but few false negatives.

    [Figure: distributions of non-diseased and diseased cases with a low threshold marked]

  • 8/10/2019 ROC Tutorial

    9/62

    Notation for conditional probability

    Suppose that the patient has the disease (according to the gold standard).

    Notation to specify the probability that the test for the patient is positive, given that the patient has the disease:

    P(Test positive | patient has disease)

    P(T+ | D+)

    This notation describes the conditional probability.

  • 8/10/2019 ROC Tutorial

    10/62

    Sensitivity and Specificity

    We want the test to be positive when the patient has the disease:

    Sensitivity = P(Test positive | patient has disease)

    We want the test to be negative when the patient does not have the disease:

    Specificity = P(Test negative | patient does not have disease)

  • 8/10/2019 ROC Tutorial

    11/62

    Sensitivity and Specificity

    Sensitive => find ALL disease

    Sensitivity = P(Test positive | patient has disease)

    Specific => find ONLY disease

    Specificity = P(Test negative | patient does not have disease)

  • 8/10/2019 ROC Tutorial

    12/62

    Sensitivity and specificity depend on:

    How well the test separates the two groups

    What threshold we choose
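
    As a rough illustration of the threshold dependence, the sketch below computes sensitivity and specificity at a few thresholds. The expression values and the helper name sens_spec are invented for this example and do not come from the slides; the test is assumed to be positive when expression is at or above the threshold.

```python
# Hypothetical gene expression values, invented for illustration only.
not_cancer = [1.0, 2.0, 2.5, 3.0, 4.0, 5.0]
cancer = [3.5, 4.5, 5.5, 6.0, 7.0, 8.0]


def sens_spec(threshold, diseased, healthy):
    """Return (sensitivity, specificity), calling the test positive
    when the expression level is >= threshold (an assumed convention)."""
    true_pos = sum(x >= threshold for x in diseased)
    true_neg = sum(x < threshold for x in healthy)
    return true_pos / len(diseased), true_neg / len(healthy)


for t in (3.0, 4.5, 6.0):
    sens, spec = sens_spec(t, cancer, not_cancer)
    print(f"threshold={t}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

    With these made-up values, raising the threshold lowers sensitivity and raises specificity.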

  • 8/10/2019 ROC Tutorial

    13/62

    [Figure: gene expression distributions for the "Not cancer" and "Cancer" groups, with a threshold marked]

    If gene expression separates cancer from non-cancer perfectly:

    Good sensitivity = P(Test positive | patient has disease)

    Good specificity = P(Test negative | patient does not have disease)

  • 8/10/2019 ROC Tutorial

    14/62

    [Figure: gene expression distributions for the "Not cancer" and "Cancer" groups]

    If gene expression levels for cancer and non-cancer overlap:

    Sensitivity and specificity depend on what threshold we choose

  • 8/10/2019 ROC Tutorial

    15/62

    High threshold: few false positive predictions, but lots of false negatives.

    [Figure: "Not cancer" and "Cancer" distributions with a high threshold marked]

    Poor sensitivity = P(Test positive | patient has disease): many false negatives

    Good specificity = P(Test negative | patient does not have disease): few false positives

  • 8/10/2019 ROC Tutorial

    16/62

    Low threshold: many false positive predictions, but few false negatives.

    [Figure: "Not cancer" and "Cancer" distributions with a low threshold marked]

    Good sensitivity = P(Test positive | patient has disease): few false negatives

    Poor specificity = P(Test negative | patient does not have disease): many false positives

  • 8/10/2019 ROC Tutorial

    17/62

    Sensitivity and Specificity

    Sensitivity = P(Test positive | patient has disease) = P(T+ | D+) = True positive rate

    Specificity = P(Test negative | patient does not have disease) = P(T- | D-) = True negative rate

  • 8/10/2019 ROC Tutorial

    18/62

    False positive rate (FPR) = 1 - specificity

    False negative rate (FNR) = 1 - sensitivity

  • 8/10/2019 ROC Tutorial

    19/62

    High threshold: few false positive predictions, but lots of false negatives.

    [Figure: "Not cancer" and "Cancer" distributions with a high threshold marked]

    Poor sensitivity = P(Test positive | patient has disease): many false negatives

    Good specificity = P(Test negative | patient does not have disease): few false positives

    Low false positive rate (FPR) = 1 - specificity

    High false negative rate (FNR) = 1 - sensitivity

  • 8/10/2019 ROC Tutorial

    20/62

    Sensitivity and specificity tell us about the test result, given that we know if the patient has the disease or not.

    In clinic, we don't know if the patient has the disease; that's what we want the test to tell us.

  • 8/10/2019 ROC Tutorial

    21/62

    PPV and NPV

    Positive predictive value (PPV) = P(patient has disease | Test positive) = P(D+ | T+)

    Negative predictive value (NPV) = P(patient does not have disease | Test negative) = P(D- | T-)

  • 8/10/2019 ROC Tutorial

    22/62

    Sensitivity and Specificity

    Sensitivity = P(T+ | D+)

    Specificity = P(T- | D-)

    PPV = P(D+ | T+)

    NPV = P(D- | T-)

  • 8/10/2019 ROC Tutorial

    23/62

    PPV and NPV are a function of the prevalence (the proportion of the population that has the disease), as we will see shortly.

    Sensitivity and specificity do not depend on the prevalence. They are conditional on the patient either having or not having the disease.
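
    A small sketch of this dependence: holding sensitivity and specificity fixed, PPV and NPV are recomputed for several prevalences. The function name ppv_npv and the chosen numbers (sensitivity 0.9, specificity 0.6) are assumptions for illustration, not values prescribed at this point in the slides.

```python
def ppv_npv(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from the expected fractions of a population
    falling in each cell of the 2x2 table."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Illustrative values only: the same test applied at three prevalences.
for prevalence in (0.01, 0.2, 0.5):
    ppv, npv = ppv_npv(0.9, 0.6, prevalence)
    print(f"prevalence={prevalence}: PPV={ppv:.3f}, NPV={npv:.3f}")
```

    In this sketch the test itself never changes, yet PPV rises and NPV falls as the disease becomes more common.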

  • 8/10/2019 ROC Tutorial

    24/62

    Spitalnic DVT Example 1

                Disease +ve   Disease -ve   Total
    Test +ve         90           160        250
    Test -ve         10           240        250
    Total           100           400        500

  • 8/10/2019 ROC Tutorial

    25/62

    Example

                Disease +ve   Disease -ve   Total
    Test +ve         90           160        250
    Test -ve         10           240        250
    Total           100           400        500

                Disease +ve   Disease -ve   Total
    Test +ve    True +ve      False +ve     Total T+
    Test -ve    False -ve     True -ve      Total T-
    Total       Total D+      Total D-      Total pts

  • 8/10/2019 ROC Tutorial

    26/62

                Disease +ve   Disease -ve   Total
    Test +ve    A             B             A+B
    Test -ve    C             D             C+D
    Total       A+C           B+D           A+B+C+D

                Disease +ve   Disease -ve   Total
    Test +ve         90           160        250
    Test -ve         10           240        250
    Total           100           400        500

  • 8/10/2019 ROC Tutorial

    27/62

    Sensitivity = A/(A+C) = 0.9

    Specificity = D/(B+D) = 0.6

    FPR = 1 - Specificity = B/(B+D) = 0.4

    FNR = 1 - Sensitivity = C / (A+C) = 0.1

    PPV = P(D+|T+) = A/(A+B) = 0.36

    NPV = P(D-|T-) = D/(C+D) = 0.96
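
    The same arithmetic can be checked with a few lines of Python. The cell counts A, B, C, D are the ones from the 2x2 table of the DVT example; the variable names are just a convenient choice for this sketch.

```python
# Cell counts from the DVT example:
# A = true positives, B = false positives, C = false negatives, D = true negatives.
A, B, C, D = 90, 160, 10, 240

sensitivity = A / (A + C)   # P(T+ | D+)
specificity = D / (B + D)   # P(T- | D-)
fpr = B / (B + D)           # 1 - specificity
fnr = C / (A + C)           # 1 - sensitivity
ppv = A / (A + B)           # P(D+ | T+)
npv = D / (C + D)           # P(D- | T-)

for name, value in [("Sensitivity", sensitivity), ("Specificity", specificity),
                    ("FPR", fpr), ("FNR", fnr), ("PPV", ppv), ("NPV", npv)]:
    print(f"{name} = {value:.2f}")
```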

  • 8/10/2019 ROC Tutorial

    28/62

    Later, we'll see how to calculate PPV and NPV using Bayes' rule

  • 8/10/2019 ROC Tutorial

    29/62

    Tutorial article

    Spitalnic, S. Test Properties 2: Likelihood ratios, Bayes formula, and receiver operating characteristic curves. Hospital Physician, October 2004, p. 53-58.

  • 8/10/2019 ROC Tutorial

    30/62

    Pre-test probability = estimated probability that the patient has the disease before getting the diagnostic test result.

    The pre-test probability = prior probability

    Post-test probability = estimated probability that the patient has the disease after getting the diagnostic test result.

    The post-test probability = posterior probability

  • 8/10/2019 ROC Tutorial

    31/62

    ROC curves

    We want to be able to compare the accuracy of diagnostic tests.

    Sensitivity and specificity are candidate measures for accuracy, but have some problems, as we'll see.

    ROC curves are an alternative measure

  • 8/10/2019 ROC Tutorial

    32/62

    ROC curves

    We plot sensitivity against 1 - specificity to create the ROC curve for a test
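
    One way to trace out such a curve is to sweep the threshold over every observed value and record the (1 - specificity, sensitivity) pair at each step. The sketch below does this on a tiny invented data set; the function name roc_points, the data, and the "positive when value >= threshold" convention are assumptions for illustration.

```python
def roc_points(diseased, healthy):
    """Return ROC points as (FPF, TPF) pairs, one per candidate threshold,
    calling the test positive when the value is >= threshold."""
    thresholds = sorted(set(diseased) | set(healthy), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every value: nothing is called positive
    for t in thresholds:
        tpf = sum(x >= t for x in diseased) / len(diseased)  # sensitivity
        fpf = sum(x >= t for x in healthy) / len(healthy)    # 1 - specificity
        points.append((fpf, tpf))
    return points


# Hypothetical measurements, invented for illustration only.
healthy = [1, 2, 3, 4]
diseased = [3, 4, 5, 6]

points = roc_points(diseased, healthy)
for fpf, tpf in points:
    print(f"FPF={fpf:.2f}  TPF={tpf:.2f}")
```

    The lowest threshold calls every patient positive, so the curve always ends at (1, 1).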

  • 8/10/2019 ROC Tutorial

    33/62

    [Figure: gene expression distributions for the "Not cancer" and "Cancer" groups, with a threshold marked]

    A test that perfectly separates the two groups

    Sensitivity = P(Test positive | patient has disease) = 1.0

    Specificity = P(Test negative | patient does not have disease) = 1.0

  • 8/10/2019 ROC Tutorial

    34/62

    ROC curve for a perfect test

    [Figure: ROC plot; y-axis: TPF (sensitivity), x-axis: FPF (1 - specificity)]

    Sensitivity = 1

    Specificity = 1

    1 - specificity = 0

  • 8/10/2019 ROC Tutorial

    35/62

    For a single diagnostic test, sensitivity and

    specificity vary with the threshold we use.

  • 8/10/2019 ROC Tutorial

    36/62

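    High threshold: good specificity = P(T- | D-), medium sensitivity = P(T+ | D+)

    [Figure: "Not cancer" and "Cancer" distributions with a high threshold marked, beside an ROC plot (y-axis: TPF, sensitivity; x-axis: FPF, 1 - specificity) with the point labelled "High threshold"]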

  • 8/10/2019 ROC Tutorial

    37/62

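    Medium threshold: medium specificity = P(T- | D-), medium sensitivity = P(T+ | D+)

    [Figure: "Not cancer" and "Cancer" distributions with a medium threshold marked, beside an ROC plot (y-axis: TPF, sensitivity; x-axis: FPF, 1 - specificity) with the point labelled "Medium threshold"]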

  • 8/10/2019 ROC Tutorial

    38/62

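    Low threshold: medium specificity = P(T- | D-), good sensitivity = P(T+ | D+)

    [Figure: "Not cancer" and "Cancer" distributions with a low threshold marked, beside an ROC plot (y-axis: TPF, sensitivity; x-axis: FPF, 1 - specificity) with the point labelled "Low threshold"]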

  • 8/10/2019 ROC Tutorial

    39/62

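    Extreme low threshold: no specificity = P(T- | D-), perfect sensitivity = P(T+ | D+)

    [Figure: "Not cancer" and "Cancer" distributions with an extremely low threshold marked, beside an ROC plot (y-axis: TPF, sensitivity; x-axis: FPF, 1 - specificity)]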

  • 8/10/2019 ROC Tutorial

    40/62

    For a test that cannot separate the two classes, the ROC curve is a straight 45 degree line (the chance line).

    Good tests approach the top left corner of the ROC plot.

    [Figure: ROC plot with the chance line; y-axis: TPF (sensitivity), x-axis: FPF (1 - specificity)]

  • 8/10/2019 ROC Tutorial

    41/62

    Good tests approach the top left corner of the ROC plot.

    [Figure: ROC plot with the chance line; y-axis: TPF (sensitivity), x-axis: FPF (1 - specificity)]

    The area under the ROC curve describes test accuracy

  • 8/10/2019 ROC Tutorial

    42/62

    The area under the ROC curve describes test accuracy

    [Figure: ROC plot with the chance line; y-axis: TPF (sensitivity), x-axis: FPF (1 - specificity)]

    Poor test: ROC area near 0.5

    Good test: ROC area near 1.0
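
    The area can be approximated from a list of ROC points with the trapezoid rule. This is a minimal sketch: the function name roc_area and the three example curves are assumptions made for illustration, and the trapezoid rule is one common choice, not necessarily the method used in the tutorial.

```python
def roc_area(points):
    """Approximate the area under an ROC curve given (FPF, TPF) points,
    using the trapezoid rule between consecutive points."""
    pts = sorted(points)  # order by FPF, then TPF
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


chance_line = [(0.0, 0.0), (1.0, 1.0)]               # test cannot separate the groups
perfect_test = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # passes through the top left corner
# Hypothetical in-between curve, invented for illustration only.
example_test = [(0.0, 0.0), (0.0, 0.5), (0.25, 0.75), (0.5, 1.0), (1.0, 1.0)]

print(roc_area(chance_line))
print(roc_area(perfect_test))
print(roc_area(example_test))
```

    The chance line gives an area of 0.5 and the perfect test gives 1.0, matching the two reference values above.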

  • 8/10/2019 ROC Tutorial

    43/62

    Sensitivity and specificity don't always make it clear which of two diagnostic tests is better

    Which test is better?

  • 8/10/2019 ROC Tutorial

    44/62

    Which test is better?

    [Figure: ROC curves for Test A and Test B; x-axis: False Positive Fraction = 1.0 - Specificity (0.0 to 1.0); y-axis: True Positive Fraction = Sensitivity (0.0 to 1.0)]

    Different ROC for each test

  • 8/10/2019 ROC Tutorial

    45/62

    [Figure: ROC curves for Test A and Test B; x-axis: False Positive Fraction = 1.0 - Specificity (0.0 to 1.0); y-axis: True Positive Fraction = Sensitivity (0.0 to 1.0)]

    Test B is better: greater ROC area

    Different ROC for each test

  • 8/10/2019 ROC Tutorial

    46/62

    [Figure: ROC plot with Test A and Test B; x-axis: False Positive Fraction = 1.0 - Specificity (0.0 to 1.0); y-axis: True Positive Fraction = Sensitivity (0.0 to 1.0)]

    Test A is better: greater ROC area

    Both tests are on the same ROC

  • 8/10/2019 ROC Tutorial

    47/62

    Both tests are on the same ROC

    [Figure: ROC plot with Test A and Test B on one curve; x-axis: False Positive Fraction = 1.0 - Specificity (0.0 to 1.0); y-axis: True Positive Fraction = Sensitivity (0.0 to 1.0)]

    Tests have same area under ROC

  • 8/10/2019 ROC Tutorial

    48/62

    ROC curves may cross.

    In this case, total area under the ROC curve may not be a good measure for comparing tests.

  • 8/10/2019 ROC Tutorial

    49/62

    Potential ROC issues

    Lack of gold standard for diagnosis

    Lack of reproducibility: e.g., disagreement among pathologists

    Bias in sample selection and in the spectrum of disease used in evaluating the test: e.g., choosing the sickest patients and healthy controls

    Problems in ascertainment: e.g., genetic disease may not be manifest

    Can't always reliably measure ROC area

  • 8/10/2019 ROC Tutorial

    50/62

    Bayes' rule

    How to use Bayes' rule to determine the posterior probability

  • 8/10/2019 ROC Tutorial

    51/62

    Bayes' rule

    P(D+) (prior probability): the prior probability that the patient has the disease in the absence of any test data (prevalence)

    P(T+): probability of a positive test result (including both true positives and false positives)

    P(D|T) (posterior probability): the probability of disease given the test result

  • 8/10/2019 ROC Tutorial

    52/62

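    Derivation of Bayes' rule

    Both sides below equal the joint probability of D and T:

    $$P(D \mid T)\,P(T) = P(T \mid D)\,P(D)$$

    Dividing by P(T) gives:

    $$P(D \mid T) = \frac{P(T \mid D)\,P(D)}{P(T)}$$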

  • 8/10/2019 ROC Tutorial

    53/62

    Bayes' rule

    $$P(D \mid T) = \frac{P(T \mid D)\,P(D)}{P(T)}$$

    $$P(D^{+} \mid T^{+}) = \frac{P(T^{+} \mid D^{+})\,P(D^{+})}{P(T^{+} \mid D^{+})\,P(D^{+}) + P(T^{+} \mid D^{-})\,P(D^{-})}$$

  • 8/10/2019 ROC Tutorial

    54/62

    P(D+|T+) = probability of disease given test +ve

    P(T+|D+) = sensitivity

    P(D+) = prevalence

    P(T+|D-) = 1 - specificity

    P(D-) = 1 - prevalence

    $$P(D^{+} \mid T^{+}) = \frac{P(T^{+} \mid D^{+})\,P(D^{+})}{P(T^{+} \mid D^{+})\,P(D^{+}) + P(T^{+} \mid D^{-})\,P(D^{-})}$$

  • 8/10/2019 ROC Tutorial

    55/62

    Example: sensitivity = 0.9, specificity = 0.8, prevalence = 0.5

    P(T+|D+) = sensitivity = 0.9

    P(D+) = 0.5

    P(T+|D-) = 1 - specificity = 1 - 0.8 = 0.2

    P(D-) = 1 - prevalence = 1 - 0.5 = 0.5

    $$P(D^{+} \mid T^{+}) = \frac{P(T^{+} \mid D^{+})\,P(D^{+})}{P(T^{+} \mid D^{+})\,P(D^{+}) + P(T^{+} \mid D^{-})\,P(D^{-})}$$

    $$P(D^{+} \mid T^{+}) = \frac{0.9 \times 0.5}{0.9 \times 0.5 + 0.2 \times 0.5} \approx 0.82$$
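
    The worked example can be reproduced in a few lines. The function name post_test_probability is an assumption for this sketch; the inputs are the sensitivity, specificity, and prevalence from the example.

```python
def post_test_probability(sensitivity, specificity, prevalence):
    """P(D+ | T+) from Bayes' rule for a positive test result."""
    true_pos = sensitivity * prevalence               # P(T+ | D+) P(D+)
    false_pos = (1 - specificity) * (1 - prevalence)  # P(T+ | D-) P(D-)
    return true_pos / (true_pos + false_pos)


p = post_test_probability(sensitivity=0.9, specificity=0.8, prevalence=0.5)
print(round(p, 2))  # 0.82
```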

  • 8/10/2019 ROC Tutorial

    56/62

    Likelihood ratio version of Bayes' rule

    This section is optional

    We can express Bayes' rule in terms of odds and likelihood ratios

    We'll start by defining odds and likelihood ratios

  • 8/10/2019 ROC Tutorial

    57/62

    Odds

    Probability of an event is in the interval [0,1]

    $$\text{Odds} = \frac{P}{1 - P}$$

    If the probability of an event is P = 0.5, then the odds = P/(1-P) = 0.5/(1-0.5) = 0.5/0.5 = 1
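
    A short sketch of the conversion in both directions. The function names are hypothetical; the inverse formula P = odds / (1 + odds) is not stated on the slide but follows from solving odds = P/(1-P) for P.

```python
def prob_to_odds(p):
    """Odds = P / (1 - P); undefined for P = 1."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Inverse conversion: P = odds / (1 + odds)."""
    return odds / (1 + odds)


print(prob_to_odds(0.5))  # 1.0
print(odds_to_prob(1.0))  # 0.5
```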

  • 8/10/2019 ROC Tutorial

    58/62

  • 8/10/2019 ROC Tutorial

    59/62

    Likelihood ratio (LR)

  • 8/10/2019 ROC Tutorial

    60/62

    Likelihood ratio (LR)

    Likelihood ratio for a negative test (LR-)

    $$LR^{-} = \frac{P(T^{-} \mid D^{+})}{P(T^{-} \mid D^{-})} = \frac{1 - \text{sensitivity}}{\text{specificity}}$$

  • 8/10/2019 ROC Tutorial

    61/62

    Likelihood version of Bayes' rule

    Post-test odds = LR * pre-test odds

    See example in Spitalnic, S. Test Properties 2: Likelihood ratios, Bayes formula, and receiver operating characteristic curves. Hospital Physician, October 2004, p. 53-58.
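
    As a rough check that the odds form agrees with the probability form, the sketch below applies it to a positive test with sensitivity 0.9, specificity 0.8, and prevalence 0.5. The function names are hypothetical, and the positive likelihood ratio LR+ = sensitivity / (1 - specificity) is the standard definition, assumed here because it does not appear in the surviving slide text.

```python
def lr_positive(sensitivity, specificity):
    """LR+ = P(T+ | D+) / P(T+ | D-), the standard definition (assumed here)."""
    return sensitivity / (1 - specificity)


def lr_negative(sensitivity, specificity):
    """LR- = P(T- | D+) / P(T- | D-)."""
    return (1 - sensitivity) / specificity


def post_test_prob_from_lr(likelihood_ratio, pre_test_prob):
    """Post-test odds = LR * pre-test odds, converted back to a probability."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = likelihood_ratio * pre_odds
    return post_odds / (1 + post_odds)


lr_pos = lr_positive(0.9, 0.8)
post_prob = post_test_prob_from_lr(lr_pos, pre_test_prob=0.5)
print(round(post_prob, 2))  # 0.82
```

    With these inputs the result matches the 0.82 obtained from the probability form of Bayes' rule.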

  • 8/10/2019 ROC Tutorial

    62/62

    Power and sample size for ROC curve

    Example using NCSS PASS