John Hargrove, Brian Williams, DAIDD Workshop, December 2013, U Florida, Gainesville
Data collection, analysis, modelling, publication ……. and beyond
Lessons learned from the analysis of HIV prevalence and incidence data from Zimbabwe.
John Hargrove, Brian Williams
DAIDD Workshop, December 2013, U Florida, Gainesville
The thrust of this workshop is an effort to encourage those already invested in (quantitative) medical research to consider ways in which mathematical modelling might add value to their research.
One of last year’s DAIDD participants had the following thoughts, which we will bear in mind during this discussion:
What kinds of data are useful for modeling? How to collect/access data important in modeling disease systems (network data for contact patterns, weather data, disease incidence data, etc)?
When is it OK to take data or parameter estimates from other studies and use them in my model?
If DAIDD is supposed to be for epidemiology-oriented people who are here to learn about mathematical modeling in order to collaborate with modelers and speak their language, it would be helpful to know how we can actually use our epi methods training to contribute to model development.
The Randomised Control Trial (RCT) is quite rightly regarded as the gold standard for a clinical trial. RCTs are often used to test the efficacy and/or effectiveness of various types of medical intervention within a patient population.
Strictly speaking, the analysis of such an RCT interests itself solely in deciding whether or not pre-defined null hypotheses can be rejected. [Some studies even preclude other analyses being applied to the data.] But restricting one’s view in this way can mean that one is wasting valuable information that can shed light on other areas of interest.
The ZVITAMBO (Zimbabwe Vitamin A for Mothers and Babies) study was such an RCT, carried out in Harare, Zimbabwe between November 1997 and January 2000.
14,110 mothers and their babies were recruited within 72 hours of giving birth.
The RCT tested the efficacy of a single large dose of vitamin A in reducing maternal and neonatal mortality among HIV-positive and HIV-negative cases, HIV incidence in mothers, and mother-to-baby transmission of the virus. The trial was suggested following a demonstration in India that vitamin A could reduce perinatal mortality even in settings where there was no HIV.
The ZVITAMBO study found no effect at all (neither positive nor negative) of vitamin A treatment on any of the six medical outcomes investigated.
The Trial might thus be viewed as a disappointment – even if it did, at least, provide an unequivocal answer to the research question.
But this disappointment was entirely over-ridden by the spin-off, which steadily emerged from the analysis of all of the data collected in the process.
The study:
• Demonstrated unequivocally the importance of exclusive breastfeeding in minimising mother-to-child transmission of HIV and optimising disease-free infant survival.
• Demonstrated a marked genetic predisposition to HIV infection among sub-groups of the population.
• Showed that HIV-positive women were at significantly increased risk of dying – regardless of CD4 count.
• Was used to validate the BED assay for application to clade C virus, and is currently being used in the same way to validate more effective avidity biomarkers for HIV incidence estimation.
In order to estimate the effect of vitamin A treatment on HIV acquisition it was necessary to test all mothers and babies for HIV – at recruitment and then at 3-month intervals for up to two years.
As a consequence the Trial produced an interesting picture of HIV prevalence and incidence as a function of time and of maternal age.
None of the above results depended (primarily) on mathematical modelling. We now look at a further example where simple statistical analysis was not sufficient and where modelling was necessary, and useful …
In what follows we will try to see what we can learn from such data (first) without using any mathematical modelling.
And then try to see what further juice we can get through the use of the mathematical press.
First law of statistics? Look at your data.
Second law of statistics? Play with your data.
The thrust of what we are trying to get across in this workshop is that we want to engage with data.
PLAY with your data.
“I was like a boy playing on the sea-shore, and diverting myself in now and then finding a smoother pebble or a prettier shell than ordinary, whilst the great ocean of truth lay all undiscovered before me.”
If it’s good enough for Isaac it’s good enough for me.
A pre-requisite for a good (data-based) modelling exercise is a good data set.
So first clean your data. Data on age is just one example …
Data on parity is another. The cleaning process can be tedious, but it is necessary.
“There’s never time to do it right … but there’s always time to do it over.”
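As a minimal sketch of the kind of check that cleaning involves, the following flags records whose age or parity is missing or implausible. The field names (`age`, `parity`), the plausible ranges and the crude "parity versus age" rule are all assumptions made for illustration; they are not taken from the ZVITAMBO data set.

```python
def flag_implausible(records, age_range=(12, 55), max_parity=15):
    """Return (index, reason) pairs for records that need a second look.

    Field names and limits are hypothetical; choose limits that suit
    the study population.
    """
    problems = []
    for i, rec in enumerate(records):
        age = rec.get("age")
        parity = rec.get("parity")
        if age is None:
            problems.append((i, "age missing"))
        elif not age_range[0] <= age <= age_range[1]:
            problems.append((i, "age out of range"))
        if parity is None:
            problems.append((i, "parity missing"))
        elif parity < 0 or parity > max_parity:
            problems.append((i, "parity out of range"))
        elif age is not None and parity > max(age - 10, 0):
            # Crude rule of thumb: more births than is credible at this age
            problems.append((i, "parity inconsistent with age"))
    return problems


# Made-up records, purely to show the checks in action
records = [
    {"age": 24, "parity": 1},
    {"age": 2, "parity": 1},       # probably a data-entry slip
    {"age": 31, "parity": None},
    {"age": 19, "parity": 12},
]
for index, reason in flag_implausible(records):
    print(index, reason)
```

Flagged records still need a human decision: go back to the source forms where possible rather than silently dropping or "correcting" values.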
Now pool on age and see whether there is any relationship between HIV prevalence and calendar date.
Is there any trend in the prevalence with date of recruitment?
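Pooling on age and tabulating prevalence by month of recruitment might look like the sketch below. The record layout (a month index and a True/False HIV result) is an assumption, the example data are made up, and the Wilson interval is just one reasonable choice of 95% confidence interval for a proportion.

```python
import math
from collections import defaultdict


def prevalence_by_month(records):
    """Pool all ages and return {month: (n, prevalence, ci_low, ci_high)}.

    `records` is an iterable of (month_of_recruitment, hiv_positive) pairs.
    The interval is a Wilson 95% confidence interval for a proportion.
    """
    counts = defaultdict(lambda: [0, 0])   # month -> [n tested, n positive]
    for month, positive in records:
        counts[month][0] += 1
        counts[month][1] += 1 if positive else 0

    z = 1.96
    result = {}
    for month in sorted(counts):
        n, k = counts[month]
        p = k / n
        denom = 1 + z * z / n
        centre = (p + z * z / (2 * n)) / denom
        half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
        result[month] = (n, p, centre - half, centre + half)
    return result


# Made-up data: 3 of 10 positive in month 0, 4 of 10 positive in month 1
records = ([(0, True)] * 3 + [(0, False)] * 7 +
           [(1, True)] * 4 + [(1, False)] * 6)
table = prevalence_by_month(records)
for month, (n, p, low, high) in table.items():
    print(month, n, round(p, 3), round(low, 3), round(high, 3))
```

Plotting the monthly estimates with their intervals, rather than the point estimates alone, makes it easier to judge whether an apparent trend is more than noise.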
[Figure: Mean prevalence vs month of recruitment. x-axis: month of recruitment (0 to 28); y-axis: prevalence at recruitment (22 to 40).]
Decide between the following possibilities:
1. Prevalence is increasing
2. Prevalence is decreasing
3. Prevalence is not changing significantly with time
4. Something else is going on
5. The dog ate my homework
6. I have a headache
7. I don’t like you anyway so I’m not going to answer any of your questions
[Figure: temperature (Celsius) vs year, 1960 to 2010; temperature axis from 0 to 15.]
What is the relationship?
[Figure: Median Global Temperature During the Past 50 Years. Temperature (Celsius) vs year, 1960 to 2010; temperature axis from 13.8 to 14.6.]
What do we now think about the scales in this figure?
How quickly would we expect HIV prevalence to change?
[Figure: Mean prevalence vs month of recruitment. x-axis: month of recruitment (0 to 28); y-axis: prevalence at recruitment (22 to 40).]
For the ZVITAMBO Trial, HIV prevalence increased significantly during 1998; thereafter it declined significantly.
We have fitted a parabola to the data. Is that a good model?
What happens for very small, or very large, values of time?
What does the prevalence pattern actually look like pre/post ZVITAMBO?
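One way to see the problem with the parabola is to fit it and then evaluate it outside the observation window. In the sketch below the "data" are illustrative values generated from a parabola, not the ZVITAMBO prevalences, and the helper names are hypothetical. The fit uses only the standard library (least squares via the normal equations).

```python
def fit_parabola(ts, ys):
    """Least-squares fit of y = c0 + c1*t + c2*t**2 via the normal equations."""
    s = [sum(t ** k for t in ts) for k in range(5)]                 # power sums
    b = [sum(y * t ** k for t, y in zip(ts, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [b[i]] for i in range(3)]   # augmented matrix
    for col in range(3):                                            # Gauss-Jordan elimination
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(3):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [x - f * y for x, y in zip(a[row], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def predict(coeffs, t):
    """Evaluate the fitted parabola at time t (months)."""
    c0, c1, c2 = coeffs
    return c0 + c1 * t + c2 * t * t


# Illustrative values only: prevalence (%) that rises and then falls
months = list(range(0, 27, 2))
prevalence = [30 + 0.5 * m - 0.02 * m * m for m in months]

coeffs = fit_parabola(months, prevalence)
for t in (-36, 0, 12, 26, 60):
    print(t, round(predict(coeffs, t), 1))
```

Inside the window the curve behaves sensibly, but a parabola with a peak must eventually fall below zero in both directions, which a prevalence cannot do. A good description of the observed months is therefore not a model of the epidemic.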
[Figure: Mean prevalence vs month of recruitment. x-axis: month and year of recruitment (November 1997 to January 2000); y-axis: prevalence at recruitment (22 to 40).]
When the ZVITAMBO data are amalgamated with other data from Harare ANC sites, prevalence appears to have peaked at the end of 1998 and seems to have been declining ever since.
Why should this be?
Is it a natural consequence of epidemic development?
Is it just due to deaths?
Do the same changes occur in all age groups?
Perhaps older people are dying off, leaving just young women with (relatively) low prevalence?
Look at age effects.
HIV prevalence initially increases with age, peaking at a horrendous level of 50% for women aged about 30. Then it declines sharply.
Why the decline?
Is it due to decreasing incidence in older women?
Or is it due to deaths?
If it were due to death among older women we would expect a decline in mean age. Perhaps this fits with declining prevalence over time?
[Figure: HIV prevalence (percent, 0 to 55) vs age (13 to 49).]
Age structure did shift towards younger women. From 1991 to 2002, the proportion of teenage pregnancies increased from 11% to 23%, the proportion of mothers aged over 35 decreased from 13% to 3%, and mean age fell from 27.4 to 24.6 years.
But since that time there has been a reversal in the age trend.
Need to look at age-specific HIV incidence and prevalence.
[Figure: Harare ANC data, mean ages. Mean age (23.0 to 27.0) vs year, 1990 to 2010.]
Only two estimates of the age-incidence function. Why so few?
The shapes of the two age-incidence graphs are similar and consistent with the idea that risk of HIV infection has, over much of the epidemic, been a decreasing function of age.
The women for the Mbizvo study were recruited in 1991/2, 7-9 years before ZVITAMBO.
Why does the age-incidence curve seem to be so much less variable in the Mbizvo study?
[Figure: percent infected per year (0 to 7) vs age at recruitment (18 to 50). Legend: Mbizvo (1992) 4.8/100 py [3.1-6.5]; ZVITAMBO (1998/99) 3.4% pa [3.0-3.8].]
[Figure: prevalence (%; 95% c.i., 0 to 50) vs year (1987 to 2007), one panel per age group. Data sources: Mahomed, Mbizvo, ZVITAMBO, MoHCW.]
Age < 20: Median = 1993, Peak = 25.4%
Age 20 - 24: Median = 1996, Peak = 34.2%
Age 25 - 29: Median = 1997, Peak = 47.1%
Age 30 - 34: Median = 1998, Peak = 44.0%
Age 35 +: Median = 1999, Peak = 33.5%
Look how the height and timing of peak prevalence change with age.
How do we explain these changes?
What is the significance of prevalence changes in teenage mothers?
What about in older women?
Changes in prevalence with time – whether pooled or stratified on age – are very nicely fitted using a “double logistic” function, in which one parameter sets the initial rate of increase in prevalence to a peak level proportional to a, a second sets the rate at which prevalence converges to a non-negative level b for large t, and an offset parameter decides the timing of the peak in prevalence.
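The exact formula used for these fits is not recoverable from this transcript. The sketch below shows one double logistic form that behaves in the way the parameters are described here: a rising logistic multiplied by a falling logistic plus a floor. The functional form, the names `alpha`, `beta` and `tau`, and the parameter values are all assumptions for illustration.

```python
import math


def logistic(x):
    """Numerically safe logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def double_logistic(t, a, b, alpha, beta, tau):
    """Assumed form: P(t) = L(alpha*(t - tau)) * (a * L(-beta*(t - tau)) + b).

    alpha: initial rate of increase; a: scales the peak;
    beta: rate of convergence to the long-run level b; tau: timing offset.
    """
    rise = logistic(alpha * (t - tau))
    fall = logistic(-beta * (t - tau))
    return rise * (a * fall + b)


# Illustrative parameter values only (prevalence in percent)
params = dict(a=60.0, b=10.0, alpha=0.6, beta=0.3, tau=1995.0)
years = list(range(1980, 2021))
curve = [double_logistic(y, **params) for y in years]
peak_year = years[curve.index(max(curve))]
print(peak_year, round(max(curve), 1), round(curve[-1], 1))
```

In practice the five parameters would be estimated from the prevalence data by least squares or maximum likelihood. A close fit still says nothing about why prevalence rose and fell.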
So, now we have a nice fit to all of the available ANC HIV data in Harare – both for pooled and age-distributed data.
So should we go right ahead and publish?
Why might we not want to do that … or at least not just yet?
What does the statistical model tell us about changes in HIV incidence?
What does it tell us about the mechanisms behind the observed changes in HIV prevalence?
It’s becoming difficult to understand, explain and describe (in words) what is going on.
Perhaps we are (finally) at the point where we NEED a dynamic (mathematical) model?
Mortality in Harare.
With the end of the war in Zimbabwe in 1980 there was a large influx of foreign aid, jobs were created, and health and education services were improved.
Mortality in Harare declined – until the effects of the HIV-AIDS epidemic made themselves felt.
[Figure: Mortality in Harare, Zimbabwe. Mortality per 1000 (-2 to 14) vs year, 1980 to 2005.]
We start with a very simple “box car” model where the probability of infection is a constant for all ages of women and at all times.
We keep the population constant.
And have AIDS mortality modelled as a Weibull function.
[Model diagram: compartments S and I, with N = S + I. Legend: birth rate, infection rate (infection term SI/N), Weibull mortality acting on I.]
[Figure: P(surviving) vs time (years), 0 to 30. Curves: Normal (Weibull 2), Exponential (Weibull 1).]
[Figure: prevalence (0.0 to 1.0) and incidence/mortality (0.00 to 0.20) vs year, 1980 to 2020.]
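The simple model described above (constant population, constant infection rate, Weibull AIDS mortality) can be sketched in a few lines. This is a discrete-time illustration under stated assumptions, not the presenters' implementation: the parameter values, the 10-year median survival, the step size and the seeding of the epidemic are all hypothetical.

```python
import math


def weibull_survival(time_since_infection, median=10.0, shape=2.0):
    """P(still alive) a given time after infection, for a Weibull distribution."""
    scale = median / math.log(2.0) ** (1.0 / shape)
    return math.exp(-((time_since_infection / scale) ** shape))


def simulate(infection_rate=0.5, years=40.0, dt=0.1, seed=0.001,
             median=10.0, shape=2.0):
    """Prevalence over time in a constant population (N = S + I = 1).

    New infections per step are infection_rate * S * I / N * dt, and each
    cohort of new infections is thinned by Weibull survival. Every death is
    replaced by a susceptible birth, which keeps N constant.
    """
    steps = int(round(years / dt))
    survival = [weibull_survival(k * dt, median, shape) for k in range(steps)]
    new_infections = [0.0] * steps
    new_infections[0] = seed               # initial infected fraction
    prevalence = []
    for k in range(steps):
        # Everyone infected at step j is still alive with probability survival[k - j]
        infected = sum(new_infections[j] * survival[k - j] for j in range(k + 1))
        prevalence.append(infected)
        if k + 1 < steps:
            susceptible = 1.0 - infected
            new_infections[k + 1] = infection_rate * susceptible * infected * dt
    return prevalence


prevalence = simulate()
for year in (0, 10, 20, 30):
    print(year, round(prevalence[year * 10], 3))
```

Setting `shape=1.0` gives exponential survival and `shape=2.0` gives the more peaked survival curve, which allows the two mortality assumptions to be compared directly.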
Heterogeneity in sexual behaviour
[Model diagram: as for the simple model (birth rate, N = population, Weibull mortality acting on I), with transmission now a declining exponential function of prevalence P.]
[Figure: relative transmission (0.0 to 1.0) vs prevalence (%), 0 to 30.]
[Figure: prevalence (0.0 to 0.4) and incidence/mortality (0.00 to 0.06) vs year, 1980 to 2020.]
Including control
[Model diagram: birth rate, N = population, mortality acting on I, with transmission now scaled by a control function C(t).]
[Figure: relative transmission (0.0 to 1.0) vs year, 1985 to 2000.]
[Figure: prevalence (0.0 to 0.4) and incidence/mortality (0.00 to 0.06) vs year, 1980 to 2020.]
Mortality leads to behaviour change
[Model diagram: birth rate, N = population, mortality acting on I, with transmission now a declining exponential function of mortality M.]
[Figure: relative transmission (0.0 to 1.0) vs annual mortality (%), 0 to 4.]
[Figure: prevalence (0.0 to 0.4) and incidence/mortality (0.00 to 0.06) vs year, 1980 to 2020.]
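The model variants above all work by multiplying the infection rate by a relative transmission factor. As a rough sketch of the first variant, the code below scales transmission by exp(-alpha * P), where P is current prevalence. The coefficient name `alpha`, its value and the other parameter values are assumptions for illustration; the control function C(t) or a mortality-driven factor could be substituted at the marked line.

```python
import math


def simulate(infection_rate, alpha=0.0, years=40.0, dt=0.1, seed=0.001,
             median=10.0, shape=2.0):
    """Constant-population S-I model with transmission scaled by exp(-alpha * P).

    P is the current prevalence; alpha = 0 recovers a constant infection rate.
    Survival after infection follows a Weibull distribution.
    """
    scale = median / math.log(2.0) ** (1.0 / shape)
    steps = int(round(years / dt))
    survival = [math.exp(-((k * dt / scale) ** shape)) for k in range(steps)]
    new_infections = [0.0] * steps
    new_infections[0] = seed
    prevalence = []
    for k in range(steps):
        infected = sum(new_infections[j] * survival[k - j] for j in range(k + 1))
        prevalence.append(infected)
        if k + 1 < steps:
            # Relative transmission: swap in C(t) or exp(-alpha * M) here
            relative_transmission = math.exp(-alpha * infected)
            rate = infection_rate * relative_transmission
            new_infections[k + 1] = rate * (1.0 - infected) * infected * dt
    return prevalence


constant = simulate(infection_rate=0.5, alpha=0.0)
heterogeneous = simulate(infection_rate=0.5, alpha=5.0)
print(round(max(constant), 2), round(max(heterogeneous), 2))
```

With these illustrative values the prevalence-dependent factor caps the epidemic at a much lower level than the constant-rate model, which is the qualitative effect that the heterogeneity assumption is meant to capture.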
So things seem to have been changing for the better, on the HIV front at least, in Zimbabwe. Why?
Natural consequence of epidemic development?
Economic melt down?
Emigration?
Better educated population?
Greater proportion of people married?
Greater awareness leading to behaviour change?
The number of condoms distributed in
Zimbabwe has risen steadily since 1994 – as has the proportion purchased rather than
donated.
[Figure: Condoms distributed in Zimbabwe, 1990-2004. Condoms distributed (millions, 0 to 90) vs year; series: public sector, social marketing.]
[Figure, shown again: prevalence (%; 95% c.i.) vs year (1987 to 2007) by age group, with the same data sources, medians and peaks as listed earlier.]
Before we get TOO excited and self-satisfied … Recall that we have fitted prevalence data only for the age-pooled situation. Why do you think that might be?
What kinds of data are useful for modeling?
Data from well-designed, well-executed trials/experiments.
How to collect/access data important in modeling disease systems (network data for contact patterns, weather data, disease incidence data, etc)?
When is it OK to take data or parameter estimates from other studies and use them in my model?
In the approach here we have stood this question on its head. We did NOT start with a model and then look for data. We started with the data set: we played with it, we thought about it, we interpreted it and then, and only then, we derived a model. Because we NEEDED a model.
If DAIDD is supposed to be for epidemiology-oriented people who are here to learn about mathematical modeling in order to collaborate with modelers and speak their language, it would be helpful to know how we
can actually use our epi methods training to contribute to model development.
This presentation has tried to show how the use of standard “classical” epidemiological techniques was critical to getting a basic understanding of what was going on. This then suggested the kind of model that was required to improve that understanding.