Measures of Disease Incidence Used

download Measures of Disease Incidence Used

of 8

description

Epidemiologia

Transcript of Measures of Disease Incidence Used

  • 5/19/2018 Measures of Disease Incidence Used

    1/8

    International Journal

    of

    Epidemiology

    O Oxford University Preu

    1980

    Vol. 9 No. 1

    Printed

    In

    Great Britain

    Measures

    of

    Disease Incidence Used

    in Epidemiologic Research

    H A L M OR GEN ST ER N

    1

    2

    D A VID G K L EI N B A U M

    3

    4

    and LAWRENCE LKUPPER

    4

    Morgenstern

    H

    [ Dep ar t m en t

    of

    Epidemiology

    and

    Public Hea lth, School

    of

    Medicine, Yale University,

    60

    College

    Street,

    New

    Haven,

    CT

    06 51 0 ] , K le in b au m

    D G and

    Kupper

    L L

    Measure*

    of

    disease incidence used

    in

    epidemio-

    logic research. International Journal

    o

    Epidemiology 1 9 8 0 ,

    9:

    9 7 1 0 4 .

    This paper distinguishes between

    2

    concepts

    for

    measuring

    the

    incidence

    of

    disease: risk

    and

    rate. Alternative

    procedures

    for

    estimating these measures from epidem iologic data

    are

    reviewed

    and

    illustrated.

    An

    a t tempt

    is

    made

    to

    integrate statistical principles with epidemiologic methods while minimizing

    the

    use

    of

    higher mathematics.

    Several theoretical

    and

    practical criteria

    are

    discussed

    for

    choosing

    the

    appro priate incidence measure

    in the

    plan-

    ningof a study and forselecting thebest m ethod ofest imation inthe analysis.

    In a follow-up study of human populations, the

    fundamental concerns

    to the

    investigator

    are the

    measurement, analysis, and interpretation of an

    observed setofnew even ts: incident casesofdisease,

    deaths from one

    or

    more causes, or new occurrences

    of any health-related event. Despite the long history

    of this type

    of

    research, dating back well over

    a

    cen-

    tury, muchofthe current literature on epidemiologic

    methods (including textbooks) regards disease in-

    cidence as asingle concept that is readily operation-

    alized from longitudinal data. The purpose of

    this paper

    is to

    distinguish

    2

    basic concepts

    for

    quantifying incidence

    and to

    compare alternative

    methods

    for

    estimating each

    of

    them. While most

    of

    the ideas that follow are notnew,it isfelt that the

    theoretical work of statisticians has not been

    adequately appliedtothe understand ing of empirical

    research. Perhaps,oneexplanationforthis s ituation

    is the increasing lack of mutual understanding

    between health researchers

    who are

    primarily

    concerned with findings

    and

    statisticians

    who are

    more concerned with analytic methods. An attempt

    1

    Department of Epidemiology and Public Health,

    School of Medicine, Yale University, 60 College

    Street, New Haven, CT 06510, USA.

    1

    Center for Health Studies, Institution for Social and

    Policy Studies, Yale University, 89 Trumbull Street,

    New Haven, CT 06520, USA.

    3

    Department of Epidemiology, School of Public

    Health, University of North Carolina, Chapel Hill,

    NC 27514, USA.

    4

    Department of BioiUtistics, School of Public Health,

    University of North Carolina, Chapel Hill, NC 27514,

    USA.

    Tbe work on this paper was partly funded by two grants

    from:

    tbe National Cancer Institute W1-PO1-CA-16359-O5)

    and tbe Henry J. Kaiser Family Foundation.

    is made in this paper to merge statistical principles

    with epidemiologic methods while minimizing

    the

    useofhigher m athematics.

    BASIC THEORETICAL CONCEPTS

    Two distinct concepts

    may be

    employed

    to

    assess

    the expected occurrence

    of new

    (incident) events:

    risk and rate. Risk is the probability ofan individual

    developing a given disease or health status change

    over a specified period, conditional on not dying

    from anyother causes during that period. Thus, risk

    can vary between zero and one and is dimensionless.

    It

    is

    important

    to

    recognize that the co ncept

    of

    risk

    requires

    a

    specific period referent

    e.g.,

    the

    5 year

    risk of developing lung cancer. Defining risk as a

    probability is required not because wenecessarily

    believe that disease occurrence is a random event,

    but rather to express ourignoranceof thecausal

    process

    and how to

    observe

    it. Of

    course,

    it is

    recognized that

    the

    term 'risk'

    or its

    derivatives

    (e.g., risky, at-risk, risk factor,

    etc.) is

    commonly

    used in a much broader framework than suggested

    by the above definition. Nevertheless, in order to

    give the concept scientific meaning,it isimportant

    to treat risk as aconditional (a priori) probability

    referringtoan individual.

    The concept

    of

    rate involves

    a

    more complex

    formulation than does risk

    and

    therefore

    is

    some-

    what more difficult

    to

    understand.

    In

    general,

    a

    rate is aninstantaneous potential forchange inon e

    quantity peTunit change in another quantity

    where, typically, the second quantity istime (e.g.,

    velocity). The(instantaneous) rateofdevelopment

    of

    a

    particular disease

    in a

    population might

    be

    expressed

    as the

    number

    of

    new cases per year. The

    97

  • 5/19/2018 Measures of Disease Incidence Used

    2/8

    98

    INTERNATIONAL JOURNAL OF EPIDEMIOLOGY

    limitation of the latter quantity is that it does not

    reflect the size or experience of the candidate

    population (at-risk) from which the cases develop.

    Consequently, epidemiologists are more concerned

    with the 'relative rate' (1), which is defined as the

    instantaneous potential for change in disease status

    (i.e., the occurrence of new cases) per unit of time,

    relative to the size of the candidate population at

    time t. More descriptively, the (relative) incidence

    rate reflects the 'force of morbidity' (or mortality)

    on t he p opula tion and therefore has no direct

    interpretation on the individual level. Also, in

    contrast to the concept of risk, the incidence

    rate refers to a 'point in time' and has no period

    referent. Furtherm ore, the rate is no t dimensionless

    but is expressed in units of I/ tim e (e.g., years

    1

    ).

    As a result, the incidence rate, unlike risk, does not

    have an upper limit of one but, in fact, theoretically

    can approach infinity. This feature can be seen by

    simply altering the unit of time. For example, an

    incidence rate of .05 weeks

    1

    is equivalent to 2.6

    years

    1

    , which is greater than one. Suppose an

    earthqua ke killed every mem ber of a village at

    exactly the same moment; the death rate at that

    m om ent' wo uld be infinite. Yet, the risk of a village

    member being killed by the earthquake on that day,

    month, year, or any period of time that includes the

    disaster cannot exceed one.

    ESTIMATION OF RISK AND RATE

    The most commonly used method for estimating

    risk from follow-up data is to calculate the pro-

    portion of candidate subjects at the onset, who

    develop the disease during the subsequent follow-up

    period. The numerator is the number of newly

    detected cases (I), and the denominator is the

    number of disease-free subjects at the beginning

    of follow-up (N

    o

    ). While this fraction is generally

    called an incidence 'rate' by researchers, it should

    be noted that the term is a misnomer since the

    prop ortion is an estima te of a probability (1 ). We

    will adopt the term 'cumulative incidence' (2) to

    express the proportion of a 'fixed' cohort for

    which the disease is detected during a subsequent

    follow -up peripd (At); thus, the risk (R) is estima ted

    by calculating the cumulative incidence (CI),

    (1)

    The coh ort is 'fixed' in the sense that no entries

    are allowed during the follow-up period. Vet, the

    size of the cohort is likdy to be reduced during the

    follow-up period as a result of deaths and other

    sources of attrition. This presents a problem

    however, since risk was defined as a conditiona

    probab ility; and strictly speaking, a subject shoul

    not be included in the denominator (N

    o

    ) unles

    he/she is followed for the entire duration of At o

    is known to develop the disease of interest. Con

    sequently, even assuming no unnecessary loss t

    follow-up and complete autopsies done on a

    deaths, we do n ot k now whether a subject, wh

    died of another cause, would have developed th

    disease of interest during the remaining follow-u

    period. This problem is often referred to as one o

    'competing risks' or 'competing causes of death

    and becomes increasingly a more relevant issue a

    the follow-up period gets longer.

    Ano ther difficulty in using the CI to estimat

    risk is that subjects may be followed for differen

    durations as a result of the study design. For ex

    ample, a large clinical trial typically enrols subject

    into the study throughout the data collectio

    period, which may last for years. Thus, regardles

    of the attrition and mortality rates, the length o

    follow-up will vary considerably among subject

    Therefore, any attempt to estimate risk by applyin

    equation (1) will not effectively make use of in

    formation from all subjects (3).

    Closely associated with the above issues is th

    distinction between a 'fixed' cohort and a 'dynamic

    population. While no subjects may be added to

    fixed cohort after the onset of follow-up, th

    comp osition of a dynamic population is mor

    fluid, allowing for ad ditions during the follow-u

    period. Very often, a fixed cohort is selected from

    a dynamic, source population to which causa

    inferences are made. If the source population is

    geographically defined region (e.g., a county

    the study cohort is likely to become increasingl

    dissimilar to or unrepresentative of the sourc

    population as the follow-up proceeds. For instance

    after 10 years, the remaining cohor t will have age

    10 years while the mean age or the distribution o

    ages in the dynamic source population may not hav

    changed. Consequently, the cases that develop i

    the study cohort will become increasingly un

    representative of the cases that develop in th

    source population. The reason for this difference i

    that while subjects are not added to the stud

    cohort after the onset of follow-up, the compositio

    of the source p opulation is dynamic and may there

    fore remain in equilibrium.

    As a hypothetical illustration of the inheren

    theoretical difficulties in using the CI to estimat

    risk, consider a fixed coho rt of 5 healthy subject

    who are followed for 5 years (or less) for detection

  • 5/19/2018 Measures of Disease Incidence Used

    3/8

    MEASURES OF DISEASE INCIDENCE

    99

    1