Introduction to baselines & significance for interpretation

Introduction to baselines & significance for interpretation Sheena Sullivan, MPH, PhD


Page 1: Introduction to baselines  & significance for interpretation

Introduction to baselines & significance for interpretation

Sheena Sullivan, MPH, PhD

Page 2: Introduction to baselines  & significance for interpretation

Baselines and thresholds

Baseline
  The usual or average level of influenza activity that occurs during a typical year

Threshold
  The level of influenza activity that signals the occurrence of a specific activity

  Seasonal threshold
    The level of influenza activity that signals the start and end of the annual influenza season

  Alert threshold
    A level above which influenza activity is higher than in most years

Page 3: Introduction to baselines  & significance for interpretation

Baselines and thresholds

Used as a point of reference to detect
  Start/end of the season
  Severity of season
  Outbreaks

Useful to
  Inform public health actions
  Improve clinical diagnosis
  Stimulate diagnosis
  Encourage early prescription of antivirals
  Indicate uptake and timeliness of vaccine

Used retrospectively and prospectively

Page 4: Introduction to baselines  & significance for interpretation

Example: Victorian sentinel data

[Figure: admission proportion per 100,000 and ILI proportion per 100,000 by year, 2005 to 2011, for the VAED, GPSS and MMDS datasets, with annotations marking "outbreak start" and "worse". Source: Tay et al (submitted)]

Page 5: Introduction to baselines  & significance for interpretation

Data considerations

Baselines are difficult to establish
  Rely on very stable data collected over a long period of time
  Multiple years of data are needed to account for fluctuations in activity from year to year
    Severity of virus types
    Changes in data collection methods, diagnosis coding, participating clinical facilities, insurance eligibility, testing practices

Most surveillance systems cannot adjust quickly to changing baselines, so utilization shifts may trigger false alarms; public health crises and major public events may undermine health surveillance systems at the very times they are needed most

Page 6: Introduction to baselines  & significance for interpretation

Data considerations

Five years of surveillance data collected consistently is considered standard for a reliable baseline
  3 years minimum
  Several months may be sufficient with sophisticated modelling (e.g. Cowling 2006)

Difficult in tropical areas or areas with many different causes of ILI

Page 7: Introduction to baselines  & significance for interpretation

Sources of data

ILI
  E.g. number of consultations at sentinel GPs among all patients seen
  E.g. proportion of total outpatient visits

SARI
  E.g. hospital admissions

Deaths
  E.g. number or rate of influenza or pneumonia deaths

Laboratory notifications
  E.g. percentage of influenza-positive specimens among respiratory tests

Over the counter pharmaceutical sales

Call volume to information/advice lines

Frequency of online searches (e.g. Google Flu Trends, Ginsberg et al 2009)

Page 8: Introduction to baselines  & significance for interpretation

Sources of data

Multiple sources of data can be combined to develop composite indicators of baseline and thresholds
  E.g. start of season = week in which ILI crosses a certain value and the percentage of specimens testing positive reaches a certain value
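A composite start-of-season rule of this kind can be sketched in a few lines of Python. The function name, the threshold values and the weekly figures below are all made up for illustration; the slide does not specify them.

```python
from typing import Optional, Sequence


def composite_season_start(
    ili_rate: Sequence[float],
    pct_positive: Sequence[float],
    ili_threshold: float,
    positivity_threshold: float,
) -> Optional[int]:
    """Return the index of the first week in which BOTH indicators reach
    their thresholds, or None if no week qualifies."""
    for week, (ili, pos) in enumerate(zip(ili_rate, pct_positive)):
        if ili >= ili_threshold and pos >= positivity_threshold:
            return week
    return None


# Hypothetical weekly data (not from the presentation).
ili = [0.2, 0.3, 0.6, 0.9, 1.4]          # ILI per 100 patients seen
positive = [2.0, 4.0, 6.0, 12.0, 20.0]   # % of specimens testing positive

start = composite_season_start(
    ili, positive, ili_threshold=0.5, positivity_threshold=10.0
)
```

Requiring both conditions means a rise in ILI alone (week index 2 in the made-up data) does not declare the season open until laboratory positivity also reaches its value.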

Page 9: Introduction to baselines  & significance for interpretation

Types of data

Count
  May be used where denominators are not known, or cases are rare
  Requires very stable data
  An average number of cases may be used where denominators are unknown

Proportion
  Better parameter to use
  Can correct for shifts in both numerator and denominator activity (e.g. in disease activity and in health care utilization)
  Helps to clarify/magnify an outbreak signal
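The difference between counts and proportions can be illustrated with a small Python sketch. The numbers are invented: total visits double in the third week (for example because a new clinic starts reporting), so the ILI count doubles even though the share of visits due to ILI is unchanged.

```python
def ili_proportion(ili_counts, total_visits):
    """ILI visits as a percentage of all visits, week by week.

    Weeks with no visits are returned as NaN rather than dividing by zero.
    """
    return [
        100.0 * c / n if n else float("nan")
        for c, n in zip(ili_counts, total_visits)
    ]


# Hypothetical data: the count series jumps, the proportion does not.
counts = [10, 12, 24, 22]
visits = [1000, 1200, 2400, 2200]
props = ili_proportion(counts, visits)
```

A count-based threshold would flag the third week here, while the proportion stays flat, which is the sense in which proportions correct for shifts in health care utilization.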

Page 10: Introduction to baselines  & significance for interpretation

Example: counts v proportions

[Figure: time series of total number of visits and counts of ILI. Source: Burkom et al. 2008]

Page 11: Introduction to baselines  & significance for interpretation

Individual site vs. aggregated site

Baselines may be calculated for individual sites, to monitor activity in particular locations, or may use data compiled from all surveillance sites
  In locations with few sites, individual baselines help control for non-reporting sites, regional variation within a country, or relative representativeness of sites of the national or regional populations
  In locations with many sites, data aggregated from all sites might be the best way to set a baseline

Page 12: Introduction to baselines  & significance for interpretation

Data represented

The type of threshold calculated depends on data available and usage
  Indicate start of season
  Comparison of this week’s value with the expected value
  Outbreak indicator

Page 13: Introduction to baselines  & significance for interpretation

Timeliness of data

Depends on needs
  Early-warning systems require near-real-time data
  If wanting to report what happened in a season, timeliness is less important

Real-time
  E.g. China

Weekly or fortnightly reports
  E.g. US

Death certificates
  May take months to verify

Page 14: Introduction to baselines  & significance for interpretation

Methods for determining thresholds

Method: Visual
  Description: Based on a visual analysis of past data, define baseline, off-seasonal baseline, threshold and seasonal threshold values
  Examples: Graphically based [19]; model based [3]
  Advantages: Very simple to implement and understand
  Disadvantages: Overly simplified, will not capture any trend changes over time

Method: Averaging
  Description: Usually involves calculating a median or mean of data
  Examples: Moving Epidemic Method (MEM) [18]
  Advantages: Simple to implement
  Disadvantages: Can allow a past season's or week's aberrant values to influence current time prediction of baseline and threshold values

Page 15: Introduction to baselines  & significance for interpretation

Methods for determining thresholds

Method: Process control
  Description: Based on similar processes to those used in detecting anomalies in industrial production processes. Most methods rely on some method of setting an upper control limit. Some methods also involve looking at the rate of change in the data series.
  Examples: Shewhart charts [9,16]; CUSUM charts [4,13,14]; exponentially weighted moving average charts [13,16]
  Advantages: Best for detection of start of season and unusual patterns. Works well in situations where rates are low. May be best method to use in tropical climates. Good at detecting the start of the season when the start is slow.
  Disadvantages: Not as accurate as time series methods. May be sensitive to small changes in reporting efficiency.

Method: Other
  Examples: Regression [3,4,15]; time series [4]
  Advantages: Account well for seasonal variations. Good for assessing severity. Useful for predicting activity.
  Disadvantages: Difficult for the non-statistician
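As one illustration of the process control family, the following is a minimal sketch of a one-sided (upper) CUSUM on standardised values. The reference value k = 0.5 and decision limit h = 4 are conventional textbook choices assumed here, not values given in the presentation, and the weekly counts are invented.

```python
def cusum_alerts(values, mean, sd, k=0.5, h=4.0):
    """One-sided (upper) CUSUM on standardised values.

    S_t = max(0, S_{t-1} + (y_t - mean) / sd - k); alert when S_t > h.
    `mean` and `sd` describe the in-control (baseline) period.
    """
    s = 0.0
    alerts = []
    for t, y in enumerate(values):
        s = max(0.0, s + (y - mean) / sd - k)
        if s > h:
            alerts.append(t)
    return alerts


# Hypothetical slowly rising weekly counts.
weekly = [5, 6, 5, 7, 8, 9, 10, 12]
alerts = cusum_alerts(weekly, mean=5.0, sd=2.0)
```

Because the statistic accumulates small excesses, it signals at week index 6 in this made-up series, whereas a single-week limit of mean + 3 SD (11) would first be exceeded at week index 7. This matches the slide's point about slow season starts.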

Page 16: Introduction to baselines  & significance for interpretation

Types of baselines

Static (flat)
  Does not change with changes in seasonal patterns found in data
  Uses data for a specified period of time (may be for a whole year, may be for only the period when surveillance is conducted)

[Figure 2. ILI presentation rates at metropolitan and rural general practice sentinel sites, 1997 to 2012. Source: The 2012 Victorian Influenza Vaccine Effectiveness Audit Report: http://www.victorianflusurveillance.com.au]

  Baseline: defines the start and end of an influenza season
  Average and above average: describe the intensity of a season

Page 17: Introduction to baselines  & significance for interpretation

Visual inspection

[Figure: monthly hospital admissions (scale 0 to 500) and ILI consultations per 100 patients seen (scale 0.0 to 5.0), January 1994 to November 2000. Source: Watts et al., 2003]

Threshold bands (ILI per 100 patients seen)
  >3.5 epidemic
  1.5-3.5 higher than expected seasonal activity
  0.25-1.5 normal seasonal activity
  <0.25 baseline activity
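The bands in the figure can be applied mechanically, as in the sketch below. The function name is hypothetical, and the handling of values that fall exactly on a boundary (0.25, 1.5, 3.5) is an assumption, since the slide does not say which band they belong to.

```python
def classify_ili_rate(rate_per_100):
    """Map an ILI rate (per 100 patients seen) to the bands shown above.

    Boundary handling is an assumption: 0.25 and 1.5 go to the upper band,
    3.5 stays in 'higher than expected'.
    """
    if rate_per_100 < 0.25:
        return "baseline activity"
    if rate_per_100 < 1.5:
        return "normal seasonal activity"
    if rate_per_100 <= 3.5:
        return "higher than expected seasonal activity"
    return "epidemic"


# Hypothetical weekly rates.
labels = [classify_ili_rate(r) for r in (0.1, 1.0, 2.0, 4.0)]
```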

Page 18: Introduction to baselines  & significance for interpretation

Shewhart charts

[Figure 1. Victorian weekly laboratory notifications of influenza 2002-2008 with Shewhart chart threshold of 6.5.]

Simplest control chart
  Developed by industry for QA/QC
  Assumes normal distribution
    Binomial (Bernoulli) & Poisson variants

Mean (µ) based on previous data

Control limits
  Upper limit = µ + kσ
  k either predetermined (usually 2 or 3), the upper limit of the 95% CI, or set based on recent observations

Alert declared when observations exceed this upper limit, i.e. |yt − µ| > kσ

False alarms
  Mean and SD should be estimated from a large dataset to avoid false alarms

Steiner et al 2010
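A Shewhart-style upper limit can be computed as in the sketch below, which assumes a historical series is available to estimate the mean and SD. The function names and the data are invented, and only the upper side of the |yt − µ| > kσ rule is checked, since surveillance alerts concern increases.

```python
from statistics import mean, stdev


def shewhart_upper_limit(history, k=3.0):
    """Upper control limit: mean + k * SD of the historical observations."""
    return mean(history) + k * stdev(history)


def shewhart_alerts(history, current, k=3.0):
    """Indices of current observations that exceed the upper control limit."""
    limit = shewhart_upper_limit(history, k)
    return [t for t, y in enumerate(current) if y > limit]


# Hypothetical weekly notification counts.
history = [2, 3, 1, 2, 4, 3, 2, 3]
current = [3, 4, 9, 12]

limit = shewhart_upper_limit(history)
alerts = shewhart_alerts(history, current)
```

With only eight historical points the estimated SD is itself unstable, which is the slide's reason for estimating the mean and SD from a large dataset.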

Page 19: Introduction to baselines  & significance for interpretation

Example: US ILI Baseline

Plots % of patient visits to healthcare providers for ILI reported each week, weighted by state population and compared with a baseline
  % visits = n(ILI) / N, where N = total patients for that week
  y = % visits * w, where w = weight for the state

Baseline = mean % of patient visits for ILI during non-influenza weeks for the previous three seasons plus 2 standard deviations (+2 SD)
  Non-influenza week = two or more consecutive weeks in which each week accounted for less than 2% of the season’s total number of specimens that tested positive for influenza

National baseline = 2.2%, but each region has its own baseline also

Does not include summer data; do not know if activity is outside the norm for the summer months

http://www.cdc.gov/flu/weekly/overview.htm#Outpatient
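The baseline definition above can be sketched as follows. This is an illustration of the stated rule only, not CDC's implementation: state population weighting is omitted, the function names are hypothetical, and the single season of data is made up.

```python
from statistics import mean, stdev


def non_influenza_weeks(positives, share=0.02, min_run=2):
    """Flag weeks lying in runs of >= min_run consecutive weeks, each of which
    holds less than `share` of the season's positive specimens."""
    total = sum(positives)
    low = [p < share * total for p in positives]
    flags = [False] * len(low)
    start = None
    for i, is_low in enumerate(low + [False]):  # sentinel closes the last run
        if is_low and start is None:
            start = i
        elif not is_low and start is not None:
            if i - start >= min_run:
                for j in range(start, i):
                    flags[j] = True
            start = None
    return flags


def ili_baseline(seasons, n_sd=2.0):
    """seasons: list of (pct_ili_by_week, positives_by_week) pairs.
    Returns mean + n_sd * SD of % ILI over all non-influenza weeks."""
    quiet = []
    for pct_ili, positives in seasons:
        flags = non_influenza_weeks(positives)
        quiet.extend(p for p, f in zip(pct_ili, flags) if f)
    return mean(quiet) + n_sd * stdev(quiet)


# Hypothetical single season (the slide uses the previous three seasons).
pct_ili = [1.0, 1.2, 1.1, 4.0, 5.0, 3.0, 1.3, 0.9]
positives = [0, 1, 1, 40, 50, 8, 1, 0]
baseline = ili_baseline([(pct_ili, positives)])
```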

Page 20: Introduction to baselines  & significance for interpretation

[Figure labels: National Baseline; Outbreak/Epidemic Activity]

Page 21: Introduction to baselines  & significance for interpretation

Types of baselines

Cyclical (seasonal)
  Good for data with regular seasonality
  May be inappropriate in regions with unclear seasons (e.g. tropics)

Cyclical changes in baseline reflect seasonal pattern of disease activity
  Distinguishes disease-related increases from normal seasonal increases in a syndrome (e.g. pneumonia)

Different methods used:
  Regression models, moving averages, time series

[Figure. Rate of deaths classified as influenza and pneumonia from the NSW Registered Death Certificates, 1 January 2007 to 21 September 2012, with "Unusual activity" annotated. Source: Australia Influenza Surveillance Report, Oct 2012]

Page 22: Introduction to baselines  & significance for interpretation

Example: 122 Cities Mortality Baseline

Weekly report of total death certificates & total for which pneumonia or influenza (P&I) was listed as underlying cause of death, by age group

Percentage of deaths due to P&I is compared with seasonal baseline and epidemic threshold values calculated for each week

Seasonal baseline is calculated using a periodic regression model that incorporates a robust regression procedure applied to data from the previous five years

An increase of 1.645 standard deviations above the seasonal baseline is considered the “epidemic threshold”
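The idea of a seasonal baseline with a 1.645 SD threshold can be sketched with a deliberately simplified model. The code below fits a single annual harmonic with no trend term by ordinary least squares, not the robust procedure the slide mentions; it assumes 52-week years and a series covering a whole number of years, in which case the least-squares coefficients have the closed form used here. Function names and data are hypothetical.

```python
import math
from statistics import mean


def fit_seasonal_baseline(y, period=52):
    """Fit y_t ~ a + b*sin(2*pi*t/period) + c*cos(2*pi*t/period).

    Assumes len(y) is a whole number of periods, so the three regressors are
    orthogonal and the least-squares solution is a simple projection.
    Returns (a, b, c, residual_sd).
    """
    n = len(y)
    if n == 0 or n % period != 0:
        raise ValueError("series must cover a whole number of cycles")
    w = 2.0 * math.pi / period
    a = mean(y)
    b = 2.0 / n * sum(v * math.sin(w * t) for t, v in enumerate(y))
    c = 2.0 / n * sum(v * math.cos(w * t) for t, v in enumerate(y))
    fitted = [a + b * math.sin(w * t) + c * math.cos(w * t) for t in range(n)]
    # Residual SD, allowing for the three fitted parameters.
    sd = math.sqrt(sum((v - f) ** 2 for v, f in zip(y, fitted)) / (n - 3))
    return a, b, c, sd


def epidemic_threshold(t, a, b, c, sd, period=52, z=1.645):
    """Return (seasonal baseline, baseline + z * SD) for week index t."""
    w = 2.0 * math.pi / period
    baseline = a + b * math.sin(w * t) + c * math.cos(w * t)
    return baseline, baseline + z * sd


# Hypothetical five years of weekly % P&I deaths: a pure seasonal curve.
series = [7.0 + 1.5 * math.cos(2.0 * math.pi * t / 52) for t in range(5 * 52)]
a, b, c, sd = fit_seasonal_baseline(series)
base0, thresh0 = epidemic_threshold(0, a, b, c, sd)
```

A production model would usually add a linear trend and exclude or down-weight past epidemic weeks, which is what the robust procedure in the slide is for.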

Page 23: Introduction to baselines  & significance for interpretation


Example: 122 Cities Mortality Baseline

http://www.cdc.gov/flu/weekly/

Page 24: Introduction to baselines  & significance for interpretation

Interpreting data: which method is best?

Depends on the application
  Methods which use a static baseline may be better for defining the beginning/end of a season
  Methods that rely on seasonality may be inappropriate for the tropics

Can formally evaluate based on:
  Sensitivity: true alarm rate
  Specificity: true negative rate, i.e. 1 minus the false alarm rate
  Positive predictive value: ratio of true positive epidemic alarms over the total number of alerts
  Timeliness: how quickly the method signals an outbreak (run-length)
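These four measures can be computed from a weekly alarm series and a reference ("true") epidemic series, as in the following sketch. The function name and the data are hypothetical, and timeliness is taken here as the number of weeks between the first epidemic week and the first alarm at or after it, which is one of several possible definitions.

```python
def evaluate_alarms(alarms, epidemic):
    """Compare weekly alarms (bool) with reference epidemic weeks (bool)."""
    tp = sum(a and e for a, e in zip(alarms, epidemic))
    fp = sum(a and not e for a, e in zip(alarms, epidemic))
    fn = sum(not a and e for a, e in zip(alarms, epidemic))
    tn = sum(not a and not e for a, e in zip(alarms, epidemic))

    def ratio(num, den):
        return num / den if den else float("nan")

    # Delay from the first epidemic week to the first alarm at or after it.
    delay = None
    if True in epidemic:
        onset = epidemic.index(True)
        for t in range(onset, len(alarms)):
            if alarms[t]:
                delay = t - onset
                break

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "delay_weeks": delay,
    }


# Hypothetical ten-week example.
epidemic = [False, False, False, True, True, True, True, False, False, False]
alarms = [False, True, False, False, True, True, True, False, False, False]
result = evaluate_alarms(alarms, epidemic)
```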

Page 25: Introduction to baselines  & significance for interpretation

Interpreting data

Knowing your data is key to interpreting
  Time series data may show anomalies associated with holidays, long weekends, etc.
  Interpretation of those anomalies is dependent on the data analyst’s knowledge of trends not accounted for in the detection algorithm

Baselines may be influenced by changes in
  Data collection methods
  Provider participation
  Case definitions
  Population/health care use

Page 26: Introduction to baselines  & significance for interpretation


Example: understanding and interpreting data

Ungchusak et al. 2012

Page 27: Introduction to baselines  & significance for interpretation

Example: Categories of influenza season in Victoria for six surveillance datasets, 2002 - 2011

[Figure: six panels plotted by year, each banded into Baseline, Average, Above Average and Alert categories: GPSS; MMDS; Test Positive Influenza (percentage); GPSS Composite (proportion per 1,000); MMDS Composite (proportion per 1,000); VAED (proportion per 100,000). Source: Tay et al (submitted)]

Page 28: Introduction to baselines  & significance for interpretation

Understanding source data

Lab-confirmed influenza
  May not be available

Mortality data
  Lag time between influenza circulation and death

Hospitalisation
  Issues with coding
  Lag time between infection and rise in presentations to hospital

Weekly versus daily data
  Greater variation with daily reports

Page 29: Introduction to baselines  & significance for interpretation

Example: understanding and interpreting data

Page 30: Introduction to baselines  & significance for interpretation

Summary

Baselines and epidemic thresholds help to understand the significance of increased influenza activity
  When to know a flu season has begun
  When to know if a spike in activity is a real spike
  An indicator that activity is unusual or outside the norm
  An indicator for public health action

Graphical representation of current surveillance data compared with baseline data and previous years’ data provides a meaningful snapshot to public health practitioners, policy makers, and others regarding current activity

Page 31: Introduction to baselines  & significance for interpretation

References

1. Burkom, et al. Developments in the Roles, Features, and Evaluation of Alerting Algorithms for Disease Outbreak Monitoring. Johns Hopkins APL Technical Digest. 2008.
2. Clothier HJ, et al. A comparison of data sources for the surveillance of seasonal and pandemic influenza in Victoria. Commun Dis Intell. 2006;30(3):345-9.
3. Cooper DL, et al. Can syndromic thresholds provide early warning of national influenza outbreaks? J Public Health. 2009;31(1):17-25.
4. Cowling BJ, et al. Methods for monitoring influenza surveillance data. Int J Epidemiol. 2006;35(5):1314-21.
5. Dedman D, Watson J. The use of thresholds to describe levels of influenza activity. PHLS Microbiol Dig. 1997;14:206-8.
6. Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L. Detecting influenza epidemics using search engine query data. Nature. 2009 Feb 19;457(7232):1012-4.
7. Goldstein E, et al. Improving the estimation of influenza-related mortality over a seasonal baseline. Epidemiology. 2012;23(6):829-38.
8. Goldstein E, et al. Predicting the epidemic sizes of influenza A/H1N1, A/H3N2, and B: a statistical method. PLoS Med. 2011;8(7):e1001051.
9. Hashimoto S, et al. Detection of epidemics in their early stage through infectious disease surveillance. Int J Epidemiol. 2000;29(5):905-10.
10. Health Protection Agency (HPA). Surveillance of influenza and other respiratory viruses in the UK: 2010-2011. London: HPA; May 2011. Available from: http://www.hpa.org.uk/Publications/InfectiousDiseases/Influenza/1105influenzareport/
11. Hutwagner LC, Maloney EK, Bean NH, Slutsker L, Martin SM. Using laboratory-based surveillance data for prevention: an algorithm for detecting salmonella outbreaks. Emerg Infect Dis. 1997;3:395-400.
12. Kelly HA, et al. The significance of increased influenza notifications during spring and summer of 2010-11 in Australia. Influenza Other Respi Viruses. 2012.
13. Kuang J, et al. Epidemic features affecting the performance of outbreak detection algorithms. BMC Public Health. 2012;12(1):418.
14. O'Brien SJ, Christie P. Do CuSums have a role in routine communicable disease surveillance? Public Health. 1997;111(4):255-8.
15. Serfling RE. Methods for current statistical analysis of excess pneumonia-influenza deaths. Public Health Reports. 1963;78:494-506.
16. Steiner SH, et al. Detecting the start of an influenza outbreak using exponentially weighted moving average charts. BMC Med Inform Decis. 2010;10.
17. Ungchusak et al. Lessons Learned from Influenza A(H1N1)pdm09 Pandemic Response in Thailand. EID 2012:18,
18. Vega T, et al. Influenza surveillance in Europe: establishing epidemic thresholds by the Moving Epidemic Method. Influenza Other Respi Viruses. 2012. & http://cran.r-project.org/web/packages/mem/mem.pdf
19. Watts CG, et al. Establishing thresholds for influenza surveillance in Victoria. Aust N Z J Public Health. 2003;27(4):409-12.
20. World Health Organization. WHO Interim Global Epidemiological Surveillance Standards for Influenza. Geneva: Global Influenza Programme, Surveillance and Monitoring team, World Health Organization, 2012. http://www.wpro.who.int/emerging_diseases/documents/docs/GuideforDesigningandConductingInfluenzaStudies.pdf

Page 32: Introduction to baselines  & significance for interpretation

Exercise - Defining baseline curves and alert threshold

Determining the baseline

1. Align the transmission peaks of several years' data around the median week of peak reporting

2. Calculate an average weekly number for each week centred on the median peak week of transmission
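The two steps above can be sketched in Python as follows. This is an illustration under assumptions: each season is a list of weekly values of equal length, the peak is taken as the single highest week, weeks shifted outside the window are dropped, and a fractional median is truncated to a whole week. The function name and data are hypothetical.

```python
from statistics import mean, median


def aligned_baseline(seasons):
    """Align each season's peak on the median peak week, then average by week.

    seasons: equal-length lists of weekly values, one list per year.
    Returns (median_peak_week, baseline); baseline[i] is None if no season
    contributes a value to week i after shifting.
    """
    n_weeks = len(seasons[0])
    peaks = [s.index(max(s)) for s in seasons]   # step 1: peak week per year
    target = int(median(peaks))                  # median week of peak reporting
    columns = [[] for _ in range(n_weeks)]
    for season, peak in zip(seasons, peaks):
        shift = target - peak
        for week, value in enumerate(season):
            pos = week + shift
            if 0 <= pos < n_weeks:               # drop weeks shifted off the ends
                columns[pos].append(value)
    baseline = [mean(col) if col else None for col in columns]  # step 2
    return target, baseline


# Hypothetical three seasons of five weeks each.
seasons = [
    [1, 2, 5, 2, 1],
    [1, 4, 2, 1, 1],
    [1, 1, 2, 6, 2],
]
target, baseline = aligned_baseline(seasons)
```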

Page 33: Introduction to baselines  & significance for interpretation


Page 34: Introduction to baselines  & significance for interpretation

Defining alert threshold

Display the lowest and highest season, excluding exceptional events (e.g. pandemic)

Page 35: Introduction to baselines  & significance for interpretation

Defining alert threshold

Calculate the standard deviation of the mean for each week and create a curve based on those values
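A threshold curve of this kind might be built as in the sketch below. It assumes the seasons have already been aligned on their peak week, interprets the slide's "standard deviation" as the between-season SD for each week, and uses a multiplier of 2 purely for illustration, since the exercise does not state one. Function names and data are hypothetical.

```python
from statistics import mean, stdev


def threshold_curve(aligned_seasons, n_sd=2.0):
    """Weekly mean, between-season SD, and mean + n_sd * SD.

    aligned_seasons: equal-length lists of weekly values, already aligned
    on the peak week, with at least two seasons.
    """
    weeks = list(zip(*aligned_seasons))
    means = [mean(w) for w in weeks]
    sds = [stdev(w) for w in weeks]
    upper = [m + n_sd * s for m, s in zip(means, sds)]
    return means, sds, upper


def weeks_above(current, upper):
    """Indices of weeks in the current year that exceed the threshold curve."""
    return [t for t, (y, u) in enumerate(zip(current, upper)) if y > u]


# Hypothetical aligned seasons and a current year to compare against them.
aligned = [
    [1, 4, 2],
    [2, 6, 2],
    [3, 5, 2],
]
means, sds, upper = threshold_curve(aligned)
flagged = weeks_above([2, 8, 2], upper)
```

Comparing the current year against the curve in code corresponds to plotting the current year's data on the curve and reading off the weeks that sit above it.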

Page 36: Introduction to baselines  & significance for interpretation

Examining data

Plot the current year’s data on the curve