John Hargrove, Brian Williams. DAIDD Workshop, December 2013, U Florida, Gainesville

Data collection, analysis, modelling, publication ……. and beyond

Lessons learned from the analysis of HIV prevalence and incidence data from Zimbabwe.

John Hargrove, Brian Williams

DAIDD Workshop, December 2013, U Florida, Gainesville

One of last year’s DAIDD participants had the following thoughts, which we will bear in mind during this discussion:

What kinds of data are useful for modeling? How to collect/access data important in modeling disease systems (network data for contact patterns, weather data, disease incidence data, etc)?

When is it OK to take data or parameter estimates from other studies and use them in my model?

If DAIDD is supposed to be for epidemiology-oriented people who are here to learn about mathematical modeling in order to collaborate with modelers and speak their language, it would be helpful to know how we can actually use our epi methods training to contribute to model development.

The thrust of this workshop is an effort to encourage those already invested in (quantitative) medical research to consider ways in which mathematical modelling might add value to their research.

The Randomised Controlled Trial is quite rightly regarded as the gold standard for a clinical trial. RCTs are often used to test the efficacy and/or effectiveness of various types of medical intervention within a patient population.

Strictly speaking the analysis of such an RCT interests itself solely in deciding whether pre-defined null hypotheses can or cannot be rejected. [Some studies even preclude other analyses being applied to the data.] But restricting one’s view in this way can mean that one is wasting valuable information that can shed light on other areas of interest.

The ZVITAMBO (Zimbabwe Vitamin A for Mothers and Babies) study was such an RCT carried out in Harare, Zimbabwe between November 1997 and January 2000.

14,110 mothers and their babies were recruited within 72 hours of delivery.

The RCT tested for the efficacy of a single large dose of vitamin A in reducing maternal and neonatal mortality among HIV positive and negative cases, HIV incidence in mothers, and mother-to-baby transmission of the virus. The Trial was suggested following a demonstration in India that vitamin A could reduce perinatal mortality even in settings where there was no HIV.

The ZVITAMBO study found no effect at all (neither positive nor negative) of vitamin A treatment on any of the six medical outcomes investigated.

The Trial might thus be viewed as a disappointment – even if it did, at least, provide an unequivocal answer to the research question.

But this disappointment was entirely over-ridden by the spin-off, which steadily emerged from the analysis of all of the data collected in the process.

The study demonstrated unequivocally the importance of exclusive breastfeeding in minimising mother-to-child transmission of HIV and optimising disease-free infant survival. It also:

• Demonstrated a marked genetic predisposition to HIV infection among sub-groups of the population.

• Showed that HIV positive women were at significantly increased risk of dying – regardless of CD4 count.

• Was used to validate the BED assay for application to clade C virus, and is currently being similarly used to validate more effective avidity bio-markers to be used in HIV incidence estimation.

In order to estimate the effect of vitamin A treatment on HIV acquisition it was necessary to test all mothers and babies for HIV, at recruitment and then at 3-month intervals for up to two years.

As a consequence the Trial produced an interesting picture of HIV prevalence and incidence as a function of time and of maternal age.

None of the above results depended (primarily) on mathematical modelling. We now look at a further example where simple statistical analysis was not sufficient and where modelling was necessary, and useful …

In what follows we will try to see what we can learn from such data (first) without using any mathematical modelling, and then try to see what further juice we can get through the use of the mathematical press.

First law of statistics? Look at your data.

Second law of statistics? Play with your data.

The thrust of what we are trying to get across in this workshop is that we want to engage with data.

If it’s good enough for Isaac it’s good enough for me.

PLAY with your data.

“I was like a boy playing on the sea-shore, and diverting myself now and then finding a smoother pebble or a prettier shell than ordinary, whilst the great ocean of truth lay all undiscovered before me.”

A pre-requisite for a good (data-based) modelling exercise is a good data set.

So first clean your data. Data on age is just one example …

Data on parity is another. The cleaning process can be tedious, but it is necessary.

“There’s never time to do it right … but there’s always time to do it over.”
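As a rough illustration of what cleaning age and parity data can involve, the sketch below flags records whose values look implausible. The field names (`age`, `parity`), the plausibility limits and the example records are all assumptions made for this illustration; they are not taken from the ZVITAMBO data set.

```python
def flag_suspect(record, min_age=12, max_age=55, max_parity=15):
    """Return a list of fields in one record that deserve a second look.

    The limits are illustrative assumptions, not ZVITAMBO rules.
    """
    problems = []
    age = record.get("age")
    parity = record.get("parity")
    # Age must be a number inside a plausible maternal age range.
    if not isinstance(age, (int, float)) or not (min_age <= age <= max_age):
        problems.append("age")
    # Parity must be a whole number inside a plausible range.
    if not isinstance(parity, int) or not (0 <= parity <= max_parity):
        problems.append("parity")
    # Cross-check: more births than years since the minimum age is suspicious.
    if not problems and parity > age - min_age + 1:
        problems.append("parity_vs_age")
    return problems


# Hypothetical records, invented purely to exercise the checks.
records = [
    {"id": 1, "age": 24, "parity": 2},
    {"id": 2, "age": 2, "parity": 1},      # age probably mistyped
    {"id": 3, "age": 19, "parity": 12},    # parity implausible for this age
    {"id": 4, "age": None, "parity": 0},   # age missing
]
suspects = {r["id"]: flag_suspect(r) for r in records}
```

Flagging rather than deleting keeps the decision about each suspect record with someone who can go back to the original forms.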

Now pool on age and see whether there is any relationship between HIV prevalence and calendar date.

Is there any trend in the prevalence with date of recruitment??
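Pooling on age and tabulating prevalence by month of recruitment might be sketched as follows. The input format (pairs of month index and HIV status) and the use of a simple Wald interval are assumptions chosen for brevity, not a description of the analysis actually carried out.

```python
import math
from collections import defaultdict


def prevalence_by_month(records):
    """Pool over age and return {month: (n, prevalence, lower, upper)}.

    `records` is an iterable of (month_index, hiv_positive) pairs.
    The interval is a simple Wald approximation to the 95% c.i.
    """
    counts = defaultdict(lambda: [0, 0])  # month -> [number tested, number positive]
    for month, positive in records:
        counts[month][0] += 1
        counts[month][1] += int(bool(positive))
    result = {}
    for month in sorted(counts):
        n, k = counts[month]
        p = k / n
        half_width = 1.96 * math.sqrt(p * (1 - p) / n)
        result[month] = (n, p, max(0.0, p - half_width), min(1.0, p + half_width))
    return result


# Tiny invented example: four women recruited in month 0, four in month 1.
example = [(0, True), (0, False), (0, False), (0, True),
           (1, True), (1, False), (1, False), (1, False)]
table = prevalence_by_month(example)
```

Keeping the sample size and the interval alongside each monthly estimate makes it easier to judge whether an apparent trend is more than noise.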

[Figure: Mean prevalence vs month of recruitment. x-axis: Month of recruitment (0 to 28); y-axis: Prevalence at recruitment (22 to 40).]

Decide between the following possibilities:

1. Prevalence is increasing
2. Prevalence is decreasing
3. Prevalence is not changing significantly with time
4. Something else is going on
5. The dog ate my homework
6. I have a headache
7. I don’t like you anyway so I’m not going to answer any of your questions

[Figure: Temperature (Celsius) vs Year, 1960 to 2010, with the temperature axis running from 0 to 15.]

What is the relationship?

[Figure: Median Global Temperature During the Past 50 Years. x-axis: Year (1960 to 2010); y-axis: Temperature (Celsius), 13.8 to 14.6.]

What do we now think about the scales in this figure? How quickly would we expect HIV prevalence to change?


[Figure: Prevalence at recruitment vs month of recruitment. x-axis: 0 to 28; y-axis: 22 to 40.]

For the ZVITAMBO Trial, HIV prevalence increased significantly during 1998; thereafter it declined significantly.

We have fitted a parabola to the data. Is that a good model?

What happens for very small, or very large, values of time? What does the prevalence pattern actually look like pre/post ZVITAMBO?
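One way to see the problem with a parabola is to fit one and then extrapolate. The sketch below does this with a small least-squares routine; the monthly prevalence values are invented for illustration (they are generated from an arbitrary parabola with a peak near the middle of a 28-month window) and are not the ZVITAMBO estimates.

```python
def fit_parabola(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x**2 via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]                 # power sums of x
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]   # augmented matrix
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def predict(coeffs, x):
    """Evaluate the fitted parabola at x."""
    return coeffs[0] + coeffs[1] * x + coeffs[2] * x * x


# Invented prevalence values (percent) at two-month intervals.
months = list(range(0, 28, 2))
prevalence = [28 + 0.9 * m - 0.035 * m * m for m in months]

coeffs = fit_parabola(months, prevalence)
ten_years_on = predict(coeffs, 120)   # extrapolate 120 months past the start
```

Within the observation window the curve behaves sensibly, but far enough outside it the fitted prevalence becomes negative, which no real prevalence can do.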


[Figure: Prevalence at recruitment vs month and year of recruitment, November 1997 to January 2000; y-axis: 22 to 40.]

When the ZVITAMBO data are amalgamated with other data from Harare ANC sites, prevalence appears to have peaked at the end of 1998 and seems to have been declining ever since.

Why should this be? Is it a natural consequence of epidemic development? Is it just due to deaths? Do the same changes occur in all age groups? Perhaps older people are dying off, leaving just young women with (relatively) low prevalence?

Look at age effects.

HIV prevalence initially increases with age, peaking at a horrendous level of 50% for women aged about 30, then declines sharply. Why the decline?

Is it due to decreasing incidence in older women? Or is it due to deaths? If it is due to death among older women, we would expect a decline in mean age. Perhaps this fits with declining prevalence over time?

[Figure: Prevalence (percent) vs age. x-axis: Age (13 to 49); y-axis: 0 to 55.]

Age structure did shift towards younger women. From 1991 to 2002, teenage pregnancies increased from 11% to 23%; >35s decreased from 13% to 3%; mean age fell from 27.4 to 24.6 yrs.

But since that time there has been a reversal in the age trend.

Need to look at age-specific HIV incidence and prevalence.

[Figure: Harare ANC data, mean ages. x-axis: Year (1990 to 2010); y-axis: Mean age (23.0 to 27.0).]

Only two estimates of the age-incidence function. Why so few?

The shapes of the two age-incidence graphs are similar and consistent with the idea that risk of HIV infection has, over much of the epidemic, been a decreasing function of age.

The women for the Mbizvo study were recruited in 1991/2, 7-9 years before ZVITAMBO.

Why does the age-incidence curve seem to be so much less variable in the Mbizvo study?

[Figure: Percent infected per year vs age at recruitment. x-axis: Age at recruitment (18 to 50); y-axis: 0 to 7. Legend: Mbizvo (1992) 4.8/100 py [3.1-6.5]; ZVITAMBO (1998/99) 3.4% pa [3.0-3.8].]
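Incidence figures of the form "x per 100 py [lower-upper]" come from counting new infections over person-years of follow-up. The sketch below shows one common approximation (a normal interval on the log rate); it is not necessarily the method used in either study, and the counts are hypothetical.

```python
import math


def incidence_per_100py(events, person_years):
    """Incidence per 100 person-years with an approximate 95% c.i.

    Uses a normal approximation on the log scale, which assumes the
    number of events is Poisson distributed and not too small.
    """
    rate = 100.0 * events / person_years
    factor = math.exp(1.96 / math.sqrt(events))
    return rate, rate / factor, rate * factor


# Hypothetical follow-up data: the same rate seen in a small and a large cohort.
small = incidence_per_100py(20, 500)
large = incidence_per_100py(200, 5000)
```

The two cohorts give the same point estimate, but the interval narrows as the number of observed infections grows, which is one reason incidence estimates need long follow-up of many women.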

[Figure: Prevalence (%; 95% c.i.) vs year, 1987 to 2007, in five age-group panels; y-axis: 0 to 50. Data sources: Mahomed, Mbizvo, ZVITAMBO, MoHCW.
Age < 20: Median = 1993, Peak = 25.4%
Age 20 - 24: Median = 1996, Peak = 34.2%
Age 25 - 29: Median = 1997, Peak = 47.1%
Age 30 - 34: Median = 1998, Peak = 44.0%
Age 35 +: Median = 1999, Peak = 33.5%]

Look how the height and timing of peak prevalence change with age.

How do we explain these changes? What is the significance of prevalence changes in teenage mothers? What about in older women?

Changes in prevalence with time – whether pooled or stratified on age – are very nicely fitted using a “double logistic” function. In this function one rate parameter sets the initial rate of increase in prevalence to a peak level proportional to a; prevalence then converges, at a second rate, to b ≥ 0 for large t; and an offset parameter decides the timing of the peak in prevalence.
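A function with the properties described can be written as the product of a rising logistic and a falling logistic plus a floor. The form below is an assumed reconstruction that matches the verbal description, not necessarily the exact expression on the slide; the names `alpha`, `beta` and `tau` for the two rates and the offset are labels chosen here, and the parameter values are arbitrary.

```python
import math


def double_logistic(t, a, b, alpha, beta, tau):
    """Assumed 'double logistic' form for prevalence at time t.

    rise: logistic increase at rate alpha, centred on the offset tau.
    fall: logistic decline at rate beta from a + b towards the floor b.
    Intended for t in calendar years with modest rates; very large
    exponents would overflow math.exp.
    """
    rise = 1.0 / (1.0 + math.exp(-alpha * (t - tau)))
    fall = a / (1.0 + math.exp(beta * (t - tau))) + b
    return rise * fall


# Arbitrary illustrative parameters, not fitted values.
params = dict(a=0.4, b=0.1, alpha=0.8, beta=0.3, tau=1995.0)
curve = [double_logistic(year, **params) for year in range(1980, 2021)]
peak = max(curve)
```

With these parameters the curve starts near zero, rises to a peak around the offset year and then settles towards b, which is the qualitative shape seen in the age-stratified prevalence panels.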

So, now we have a nice fit to all of the available ANC HIV data in Harare, both for pooled and age-distributed data.

So should we go right ahead and publish?

Why might we not want to do that … or at least not just yet? What does the statistical model tell us about changes in HIV incidence?

What does it tell us about the mechanisms behind the observed changes in HIV prevalence?

It’s becoming difficult to understand, explain and describe (in words) what is going on.

Perhaps we are (finally) at the point where we NEED a dynamic (mathematical) model?

Mortality in Harare.

With the end of the war in Zimbabwe in 1980 there was a large influx of foreign aid, jobs were created, and health and education services were improved.

Mortality in Harare declined, until the effects of the HIV-AIDS epidemic made themselves felt.

[Figure: Mortality in Harare, Zimbabwe. x-axis: Year (1980 to 2005); y-axis: Mortality per 1000 (-2 to 14).]

We start with a very simple “box car” model where the probability of infection is a constant for all ages of women and at all times.

We keep the population constant, and have AIDS mortality modelled as a Weibull function.

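[Diagram: two boxes, S and I, with N = S + I. Legend: birth rate; infection rate; Weibull mortality of I. Flow labels: births in proportion to N, mS out of S, an infection term in SI/N from S to I, and mortality out of I.]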
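A model of this kind can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the population is scaled to N = 1 and held constant by replacing every death with a new susceptible, survival after infection follows a Weibull curve, and the transmission parameter, median survival, shape and seed prevalence are all arbitrary values chosen for the example.

```python
import math


def weibull_survival(age, median=10.0, shape=2.0):
    """Probability of surviving `age` years after infection (Weibull)."""
    scale = median / math.log(2.0) ** (1.0 / shape)
    return math.exp(-((age / scale) ** shape))


def simulate(rate=0.5, years=40.0, dt=0.1, seed=0.001, median=10.0, shape=2.0):
    """Prevalence over time for a constant-population S-I model with N = 1.

    New infections per step are rate * S * I * dt; each cohort of new
    infections then decays according to the Weibull survival curve.
    """
    steps = int(round(years / dt))
    surv = [weibull_survival(k * dt, median, shape) for k in range(steps + 1)]
    new_inf = [seed]       # fraction of the population infected at each step
    prevalence = [seed]
    for k in range(1, steps + 1):
        # Surviving members of every earlier infection cohort.
        infected = sum(new_inf[j] * surv[k - j] for j in range(k))
        susceptible = 1.0 - infected   # constant population: deaths are replaced
        new_inf.append(rate * susceptible * infected * dt)
        prevalence.append(infected)
    return prevalence


prev = simulate()
```

With a constant transmission parameter the simulated prevalence rises and then stays high, so this simplest version cannot by itself reproduce a peak followed by a sustained decline.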

[Figure: P(surviving) vs Time (years), 0 to 30. Curves: Normal (Weibull 2) and Exponential (Weibull 1).]

[Figure: model output vs Year, 1980 to 2020. Axes: Prevalence (0 to 1.0) and Incidence/mortality (0 to 0.20).]

Heterogeneity in sexual behaviour

[Diagram: the S and I model with N = population and Weibull mortality of I; the infection rate is now an exponentially declining function of prevalence P (shown as e–P).]

[Figure: Relative transmission (0 to 1.0) vs Prevalence (%), 0 to 30.]
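The idea that transmission falls off as prevalence rises, because those at highest risk are infected first, can be sketched with an exponential factor. The coefficient name `alpha`, its value, and the base rate are assumptions made for this illustration, and prevalence is expressed as a fraction of a population scaled to N = 1.

```python
import math


def relative_transmission(prevalence, alpha=10.0):
    """Factor by which transmission is reduced at a given prevalence (fraction)."""
    return math.exp(-alpha * prevalence)


def incidence(prevalence, base_rate=0.5, alpha=10.0):
    """New infections per year as a fraction of the population (N = 1)."""
    susceptible = 1.0 - prevalence
    return base_rate * relative_transmission(prevalence, alpha) * susceptible * prevalence


# Compare incidence at 20% prevalence with and without the heterogeneity factor.
with_heterogeneity = incidence(0.2, alpha=10.0)
without_heterogeneity = incidence(0.2, alpha=0.0)
```

Setting `alpha` to zero recovers the constant-transmission case, so the same function can be used to compare the two assumptions directly.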

[Figure: model output vs Year, 1980 to 2020. Axes: Prevalence (0 to 0.4) and Incidence/mortality (0 to 0.06).]

Including control

[Diagram: the S and I model with N = population; the infection rate is multiplied by a time-dependent control function C(t).]

[Figure: Relative transmission (0 to 1.0) vs Year, 1985 to 2000.]

[Figure: model output vs Year, 1980 to 2020. Axes: Prevalence (0 to 0.4) and Incidence/mortality (0 to 0.06).]

Mortality leads to behaviour change

[Diagram: the S and I model with N = population; the infection rate is now an exponentially declining function of mortality M (shown as e–M).]

[Figure: Relative transmission (0 to 1.0) vs Annual mortality (%), 0 to 4.]

[Figure: model output vs Year, 1980 to 2020. Axes: Prevalence (0 to 0.4) and Incidence/mortality (0 to 0.06).]

So things seem to have been changing for the better, on the HIV front at least, in Zimbabwe. Why?

• Natural consequence of epidemic development?
• Economic meltdown?
• Emigration?
• Better educated population?
• Greater proportion of people married?
• Greater awareness leading to behaviour change?

The number of condoms distributed in Zimbabwe has risen steadily since 1994, as has the proportion purchased rather than donated.

[Figure: Condoms distributed in Zimbabwe, 1990-2004. x-axis: Year (1990 to 2004); y-axis: Condoms distributed (millions), 0 to 90. Series: Public sector, Social marketing.]

[Figure: Prevalence (%; 95% c.i.) vs year, 1987 to 2007, repeated for the five age groups (< 20, 20 - 24, 25 - 29, 30 - 34, 35 +). Data sources: Mahomed, Mbizvo, ZVITAMBO, MoHCW.]

Before we get TOO excited and self-satisfied … recall that we have fitted prevalence data for the age-pooled situation. Why do you think that might be?

What kinds of data are useful for modeling? Data from well-designed, well-executed trials/experiments.

How to collect/access data important in modeling disease systems (network data for contact patterns, weather data, disease incidence data, etc)?

When is it OK to take data or parameter estimates from other studies and use them in my model?

In the approach here we have stood this question on its head. We did NOT start with a model and then look for data. We started with the data set: we played with it, we thought about it, we interpreted it and then, and only then, we derived a model. Because we NEEDED a model.

If DAIDD is supposed to be for epidemiology-oriented people who are here to learn about mathematical modeling in order to collaborate with modelers and speak their language, it would be helpful to know how we can actually use our epi methods training to contribute to model development.

This presentation has tried to show how the use of standard “classical” epidemiological techniques was critical to getting a basic understanding of what was going on. This then suggested the kind of model that was required to improve that understanding.
