Measures of Disease Incidence Used

5/19/2018 Measures of Disease Incidence Used

1/8

International Journal

of

Epidemiology

O Oxford University Preu

1980

Vol. 9 No. 1

Printed

In

Great Britain

Measures

of

Disease Incidence Used

in Epidemiologic Research

H A L M OR GEN ST ER N

1

2

D A VID G K L EI N B A U M

3

4

and LAWRENCE LKUPPER

4

Morgenstern

H

[ Dep ar t m en t

of

Epidemiology

and

Public Hea lth, School

of

Medicine, Yale University,

60

College

Street,

New

Haven,

CT

06 51 0 ] , K le in b au m

D G and

Kupper

L L

Measure*

of

disease incidence used

in

epidemio-

logic research. International Journal

o

Epidemiology 1 9 8 0 ,

9:

9 7 1 0 4 .

This paper distinguishes between

2

concepts

for

measuring

the

incidence

of

disease: risk

and

rate. Alternative

procedures

for

estimating these measures from epidem iologic data

are

reviewed

and

illustrated.

An

a t tempt

is

made

to

integrate statistical principles with epidemiologic methods while minimizing

the

use

of

higher mathematics.

Several theoretical

and

practical criteria

are

discussed

for

choosing

the

appro priate incidence measure

in the

plan-

ningof a study and forselecting thebest m ethod ofest imation inthe analysis.

In a follow-up study of human populations, the

fundamental concerns

to the

investigator

are the

measurement, analysis, and interpretation of an

observed setofnew even ts: incident casesofdisease,

deaths from one

or

more causes, or new occurrences

of any health-related event. Despite the long history

of this type

of

research, dating back well over

a

cen-

tury, muchofthe current literature on epidemiologic

methods (including textbooks) regards disease in-

cidence as asingle concept that is readily operation-

alized from longitudinal data. The purpose of

this paper

is to

distinguish

2

basic concepts

for

quantifying incidence

and to

compare alternative

methods

for

estimating each

of

them. While most

of

the ideas that follow are notnew,it isfelt that the

theoretical work of statisticians has not been

adequately appliedtothe understand ing of empirical

research. Perhaps,oneexplanationforthis s ituation

is the increasing lack of mutual understanding

between health researchers

who are

primarily

concerned with findings

and

statisticians

who are

more concerned with analytic methods. An attempt

1

Department of Epidemiology and Public Health,

School of Medicine, Yale University, 60 College

Street, New Haven, CT 06510, USA.

1

Center for Health Studies, Institution for Social and

Policy Studies, Yale University, 89 Trumbull Street,

New Haven, CT 06520, USA.

3

Department of Epidemiology, School of Public

Health, University of North Carolina, Chapel Hill,

NC 27514, USA.

4

Department of BioiUtistics, School of Public Health,

University of North Carolina, Chapel Hill, NC 27514,

USA.

Tbe work on this paper was partly funded by two grants

from:

tbe National Cancer Institute W1-PO1-CA-16359-O5)

and tbe Henry J. Kaiser Family Foundation.

is made in this paper to merge statistical principles

with epidemiologic methods while minimizing

the

useofhigher m athematics.

BASIC THEORETICAL CONCEPTS

Two distinct concepts

may be

employed

to

assess

the expected occurrence

of new

(incident) events:

risk and rate. Risk is the probability ofan individual

developing a given disease or health status change

over a specified period, conditional on not dying

from anyother causes during that period. Thus, risk

can vary between zero and one and is dimensionless.

It

is

important

to

recognize that the co ncept

of

risk

requires

a

specific period referent

e.g.,

the

5 year

risk of developing lung cancer. Defining risk as a

probability is required not because wenecessarily

believe that disease occurrence is a random event,

but rather to express ourignoranceof thecausal

process

and how to

observe

it. Of

course,

it is

recognized that

the

term 'risk'

or its

derivatives

(e.g., risky, at-risk, risk factor,

etc.) is

commonly

used in a much broader framework than suggested

by the above definition. Nevertheless, in order to

give the concept scientific meaning,it isimportant

to treat risk as aconditional (a priori) probability

referringtoan individual.

The concept

of

rate involves

a

more complex

formulation than does risk

and

therefore

is

some-

what more difficult

to

understand.

In

general,

a

rate is aninstantaneous potential forchange inon e

quantity peTunit change in another quantity

where, typically, the second quantity istime (e.g.,

velocity). The(instantaneous) rateofdevelopment

of

a

particular disease

in a

population might

be

expressed

as the

number

of

new cases per year. The

97


2/8

98

INTERNATIONAL JOURNAL OF EPIDEMIOLOGY

limitation of the latter quantity is that it does not

reflect the size or experience of the candidate

population (at-risk) from which the cases develop.

Consequently, epidemiologists are more concerned

with the 'relative rate' (1), which is defined as the

instantaneous potential for change in disease status

(i.e., the occurrence of new cases) per unit of time,

relative to the size of the candidate population at

time t. More descriptively, the (relative) incidence

rate reflects the 'force of morbidity' (or mortality)

on t he p opula tion and therefore has no direct

interpretation on the individual level. Also, in

contrast to the concept of risk, the incidence

rate refers to a 'point in time' and has no period

referent. Furtherm ore, the rate is no t dimensionless

but is expressed in units of I/ tim e (e.g., years

1

).

As a result, the incidence rate, unlike risk, does not

have an upper limit of one but, in fact, theoretically

can approach infinity. This feature can be seen by

simply altering the unit of time. For example, an

incidence rate of .05 weeks

1

is equivalent to 2.6

years

1

, which is greater than one. Suppose an

earthqua ke killed every mem ber of a village at

exactly the same moment; the death rate at that

m om ent' wo uld be infinite. Yet, the risk of a village

member being killed by the earthquake on that day,

month, year, or any period of time that includes the

disaster cannot exceed one.

ESTIMATION OF RISK AND RATE

The most commonly used method for estimating

risk from follow-up data is to calculate the pro-

portion of candidate subjects at the onset, who

develop the disease during the subsequent follow-up

period. The numerator is the number of newly

detected cases (I), and the denominator is the

number of disease-free subjects at the beginning

of follow-up (N

o

). While this fraction is generally

called an incidence 'rate' by researchers, it should

be noted that the term is a misnomer since the

prop ortion is an estima te of a probability (1 ). We

will adopt the term 'cumulative incidence' (2) to

express the proportion of a 'fixed' cohort for

which the disease is detected during a subsequent

follow -up peripd (At); thus, the risk (R) is estima ted

by calculating the cumulative incidence (CI),

(1)

The coh ort is 'fixed' in the sense that no entries

are allowed during the follow-up period. Vet, the

size of the cohort is likdy to be reduced during the

follow-up period as a result of deaths and other

sources of attrition. This presents a problem

however, since risk was defined as a conditiona

probab ility; and strictly speaking, a subject shoul

not be included in the denominator (N

o

) unles

he/she is followed for the entire duration of At o

is known to develop the disease of interest. Con

sequently, even assuming no unnecessary loss t

follow-up and complete autopsies done on a

deaths, we do n ot k now whether a subject, wh

died of another cause, would have developed th

disease of interest during the remaining follow-u

period. This problem is often referred to as one o

'competing risks' or 'competing causes of death

and becomes increasingly a more relevant issue a

the follow-up period gets longer.

Ano ther difficulty in using the CI to estimat

risk is that subjects may be followed for differen

durations as a result of the study design. For ex

ample, a large clinical trial typically enrols subject

into the study throughout the data collectio

period, which may last for years. Thus, regardles

of the attrition and mortality rates, the length o

follow-up will vary considerably among subject

Therefore, any attempt to estimate risk by applyin

equation (1) will not effectively make use of in

formation from all subjects (3).

Closely associated with the above issues is th

distinction between a 'fixed' cohort and a 'dynamic

population. While no subjects may be added to

fixed cohort after the onset of follow-up, th

comp osition of a dynamic population is mor

fluid, allowing for ad ditions during the follow-u

period. Very often, a fixed cohort is selected from

a dynamic, source population to which causa

inferences are made. If the source population is

geographically defined region (e.g., a county

the study cohort is likely to become increasingl

dissimilar to or unrepresentative of the sourc

population as the follow-up proceeds. For instance

after 10 years, the remaining cohor t will have age

10 years while the mean age or the distribution o

ages in the dynamic source population may not hav

changed. Consequently, the cases that develop i

the study cohort will become increasingly un

representative of the cases that develop in th

source population. The reason for this difference i

that while subjects are not added to the stud

cohort after the onset of follow-up, the compositio

of the source p opulation is dynamic and may there

fore remain in equilibrium.

As a hypothetical illustration of the inheren

theoretical difficulties in using the CI to estimat

risk, consider a fixed coho rt of 5 healthy subject

who are followed for 5 years (or less) for detection


3/8

MEASURES OF DISEASE INCIDENCE

99

1

Measures of Disease Incidence Used

Documents

Transcript of Measures of Disease Incidence Used