A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the...
Transcript of A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the...
![Page 1: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/1.jpg)
A Bayesian hierarchical model tointegrate dietary exposure and biomarkermeasurements for the risk of cancer
Marta Pittavino, PhD
Senior Research Associate, Research Center for StatisticsUNIGE, Geneva - Switzerland
1 / 23
![Page 2: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/2.jpg)
Summary
Outline on measurement error
- Combine biomarkers with self-reports
A Bayesian hierarchical model with three components
- An application from the EPIC study
Concluding remarks
2 / 23
![Page 3: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/3.jpg)
Outline on measurement error
- Difference between the actual true value and themeasured value
- Self-reported assessments (questionnaire: Q and24-HDR: R) of dietary exposure prone to random andsystematic measurement errors
- Estimates of the association between dietary factorsand risk of disease can be biased
It has been suggested to complement self-reports withobjective measurements, as dietary biomarkers (M)
3 / 23
![Page 4: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/4.jpg)
Outline on measurement error
- Difference between the actual true value and themeasured value
- Self-reported assessments (questionnaire: Q and24-HDR: R) of dietary exposure prone to random andsystematic measurement errors
- Estimates of the association between dietary factorsand risk of disease can be biased
It has been suggested to complement self-reports withobjective measurements, as dietary biomarkers (M)
3 / 23
![Page 5: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/5.jpg)
Outline on measurement error
- Difference between the actual true value and themeasured value
- Self-reported assessments (questionnaire: Q and24-HDR: R) of dietary exposure prone to random andsystematic measurement errors
- Estimates of the association between dietary factorsand risk of disease can be biased
It has been suggested to complement self-reports withobjective measurements, as dietary biomarkers (M)
3 / 23
![Page 6: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/6.jpg)
Outline on measurement error
- Difference between the actual true value and themeasured value
- Self-reported assessments (questionnaire: Q and24-HDR: R) of dietary exposure prone to random andsystematic measurement errors
- Estimates of the association between dietary factorsand risk of disease can be biased
It has been suggested to complement self-reports withobjective measurements, as dietary biomarkers (M)
3 / 23
![Page 7: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/7.jpg)
Why combine self-reports and biomarkers?
Objective biomarkers have long integrated self-reported
dietary (or physical activity) measurements
- Several validation studies with recovery (OPEN, NHS,
WHI, EPIC) and concentration biomarker
measurements (Ferrari et al., AJE, 2007)
- Biomarkers also integrated in diet-disease association
studies (Freedman et al., EPI, 2011; Tasevska et al., AJE, 2018;
Prentice et al., AJE, 2017)
4 / 23
![Page 8: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/8.jpg)
Motivating scenario
Plasma levels of Vit-B6 were inversely associated with
kidney cancer risk in EPIC (Johansson, JNCI, 2014)
- Serum levels of Vit-B6, unlike Q measurements, were
inversely related to lung cancer (Johansson, JAMA, 2010)
- Dietary folate inversely related to breast cancer (de
Battle, JNCI, 2014 ), unlike plasma levels (Matejcic, IJC,
2017)
5 / 23
![Page 9: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/9.jpg)
Application
Data from two EPIC nested case-control studies on
kidney (n=1,108) and lung (n=1,764) cancers
- Q, R and M measurements on vit-B6 and folate
R measurements, available on ∼10% of the sample,
were imputed
Data were log-transformed to approximate normality
6 / 23
![Page 10: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/10.jpg)
The EPIC Study
Prospective cohort with500,000 participants from23 centres
- Dietary and lifestyleexposures assessed atbaseline
- Biological samples collectedat baseline from 80%disease-free participants
7 / 23
![Page 11: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/11.jpg)
The setting
Develop a Bayesian latent factor hierarchical model to:
- Integrate self-reported (Q) and (R) with concentration
biomarker (M) measurements
- Evaluate the measurement error structure
- Estimate the unknown association between unknown
true dietary intakes (X) and risk of disease (Y)
8 / 23
![Page 12: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/12.jpg)
How did we get here?
Long time ago, Ray Caroll and Pietro Ferrari discussed
the idea of a Bayesian model for measurement errors
- DQs and 24-HDRs of energy and fat intakes were
related to the risk of breast cancer in EPIC
Pietro Ferrari and Martyn Plummer submitted an
applicationto WCRF in 2013.
First results were produced last year
9 / 23
![Page 13: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/13.jpg)
10 / 23
![Page 14: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/14.jpg)
How did we get here?
Long time ago, Ray Caroll and Pietro Ferrari discussed
the idea of a Bayesian model for measurement errors
- Combine DQs and 24-HDRs of energy and fat intakes,
and relate them to risk of breast cancer in EPIC
Pietro Ferrari and Martyn Plummer submitted an
applicationto WCRF in 2013
- First results were produced in 2017
11 / 23
![Page 15: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/15.jpg)
The Bayesian hierarchical model
1) Exposure model: p(X |µ,Σ)
2) Measurement model:p(Q,R ,M |X )
3) Disease model: p(Y |X )
µ Σ
XR
Q(1)
M
(2)
(3)
Y
12 / 23
![Page 16: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/16.jpg)
The Bayesian hierarchical model
1) Exposure model: p(X |µ,Σ)
2) Measurement model:p(Q,R ,M |X )
3) Disease model: p(Y |X )
µ Σ
XR
Q(1)
M
(2)
(3)
Y
13 / 23
![Page 17: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/17.jpg)
The Bayesian hierarchical model
1) Exposure model:p(X |µ,Σ)
2) Measurement model:p(Q,R ,M |X )
3) Disease model: p(Y |X )
µ Σ
XR
Q(1)
M
(2)
(3)
Y
14 / 23
![Page 18: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/18.jpg)
1. The exposure model
Xik : vector of unknown true (latent factor) dietaryintake of vit-B6 and folate (k = 1, 2), withi = 1, . . . , n:
Xik ∼ MVN(µk ,ΣX ) = MVN(0,ΣX )
Σ−1X ∼ Wishart(DxDxDx , rxrxrx)
- - - - - - - - - - - - - - - - - - - - - - - - - -
- Dx : scale matrix, (k × k)
- rx : rank of the Wishart distribution
15 / 23
![Page 19: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/19.jpg)
1. The exposure model
Xik : vector of unknown true (latent factor) dietaryintake of vit-B6 and folate (k = 1, 2), withi = 1, . . . , n:
Xik ∼ MVN(µk ,ΣX ) = MVN(0,ΣX )
Σ−1X ∼ Wishart(DxDxDx , rxrxrx)
- - - - - - - - - - - - - - - - - - - - - - - - - -
- Dx : scale matrix, (k × k)
- rx : rank of the Wishart distribution
15 / 23
![Page 20: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/20.jpg)
2. The measurement model
for i = 1, . . . , n and k = 1, 2:
Qik = αQk+ βQk
· Xik + εQk
Rik = Xik + εRk
Mik = αMk+ βMk
· Xik + εMk
- Assumptions:
cov(εQk, εRk
) 6= 0, cov(εQ1 , εQ2) 6= 0, cov(εR1 , εR2) 6= 0,
cov(εQk, εMk
) = 0, cov(εRk, εMk
) = 0, cov(εM1 , εM2) = 0
16 / 23
![Page 21: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/21.jpg)
2. The measurement model
for i = 1, . . . , n and k = 1, 2:
Qik = αQk+ βQk
· Xik + εQk
Rik = Xik + εRk
Mik = αMk+ βMk
· Xik + εMk
- Assumptions:
cov(εQk, εRk
) 6= 0, cov(εQ1 , εQ2) 6= 0, cov(εR1 , εR2) 6= 0,
cov(εQk, εMk
) = 0, cov(εRk, εMk
) = 0, cov(εM1 , εM2) = 0
16 / 23
![Page 22: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/22.jpg)
2. The measurement model
for i = 1, . . . , n and k = 1, 2:
Qik = αQk+ βQk
· Xik + εQk
Rik = Xik + εRk
Mik = αMk+ βMk
· Xik + εMk
- Assumptions:
cov(εQk, εRk
) 6= 0, cov(εQ1 , εQ2) 6= 0, cov(εR1 , εR2) 6= 0,
cov(εQk, εMk
) = 0, cov(εRk, εMk
) = 0, cov(εM1 , εM2) = 0
16 / 23
![Page 23: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/23.jpg)
2. The measurement model (ii)
for i = 1, . . . , n and k = 1, 2:
Qik = αQk+ βQk
· Xik + εQk
Rik = Xik + εRk
Mik = αMk+ βMk
· Xik + εMk
- Prior distributions:
αQk∼ N(0,σ2
αQkσ2αQkσ2αQk
), βQk∼ N(0,σ2
βQkσ2βQkσ2βQk
)
εQR ∼ MVN(0,ΣεQR), Σ−1
εQR∼ Wishart(DεQR
DεQRDεQR, rεQRrεQRrεQR
)
αMk∼ N(0,σ2
αMkσ2αMkσ2αMk
), βMk∼ N(0,σ2
βMkσ2βMkσ2βMk
)
εM ∼ MVN(0,ΣεM ), Σ−1εM∼ Wishart(DεMDεMDεM , rεMrεMrεM )
17 / 23
![Page 24: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/24.jpg)
2. The measurement model (ii)
for i = 1, . . . , n and k = 1, 2:
Qik = αQk+ βQk
· Xik + εQk
Rik = Xik + εRk
Mik = αMk+ βMk
· Xik + εMk
- Prior distributions:
αQk∼ N(0,σ2
αQkσ2αQkσ2αQk
), βQk∼ N(0,σ2
βQkσ2βQkσ2βQk
)
εQR ∼ MVN(0,ΣεQR), Σ−1
εQR∼ Wishart(DεQR
DεQRDεQR, rεQRrεQRrεQR
)
αMk∼ N(0,σ2
αMkσ2αMkσ2αMk
), βMk∼ N(0,σ2
βMkσ2βMkσ2βMk
)
εM ∼ MVN(0,ΣεM ), Σ−1εM∼ Wishart(DεMDεMDεM , rεMrεMrεM )
17 / 23
![Page 25: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/25.jpg)
2. The measurement model (ii)
for i = 1, . . . , n and k = 1, 2:
Qik = αQk+ βQk
· Xik + εQk
Rik = Xik + εRk
Mik = αMk+ βMk
· Xik + εMk
- Prior distributions:
αQk∼ N(0,σ2
αQkσ2αQkσ2αQk
), βQk∼ N(0,σ2
βQkσ2βQkσ2βQk
)
εQR ∼ MVN(0,ΣεQR), Σ−1
εQR∼ Wishart(DεQR
DεQRDεQR, rεQRrεQRrεQR
)
αMk∼ N(0,σ2
αMkσ2αMkσ2αMk
), βMk∼ N(0,σ2
βMkσ2βMkσ2βMk
)
εM ∼ MVN(0,ΣεM ), Σ−1εM∼ Wishart(DεMDεMDεM , rεMrεMrεM )
17 / 23
![Page 26: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/26.jpg)
3. The disease model
Yi ∈ (0, 1): disease indicator for the i th study subject
Yi ∼ Bin(1, πi), with πi the probability of developingthe disease
- A conditional logistic model is assumed as:
P(Yi |Xk ,Zp) = H(γ1 · Xi1 + γ2 · Xi2 + ZTip γ3)
Γ = (γ1, γ2, γ3) ∼ N(0,σ2Γ(k+p)σ2
Γ(k+p)σ2
Γ(k+p))
18 / 23
![Page 27: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/27.jpg)
3. The disease model
Yi ∈ (0, 1): disease indicator for the i th study subject
Yi ∼ Bin(1, πi), with πi the probability of developingthe disease
- A conditional logistic model is assumed as:
P(Yi |Xk ,Zp) = H(γ1 · Xi1 + γ2 · Xi2 + ZTip γ3)
Γ = (γ1, γ2, γ3) ∼ N(0,σ2Γ(k+p)σ2
Γ(k+p)σ2
Γ(k+p))
18 / 23
![Page 28: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/28.jpg)
Analytical steps
R measurements, available on ∼10% of the sample,were imputed as E (R |Q)
- Data were log-transformed to approximate normality
- Residuals were computed by country, study, smoking,batch, age and sex (M), country, age and sex (Q), ageand sex (R)
- Disease models were run separately by study
- Only results for kidney cancer disease model are shown
Analyses were run in R with JAGS (Martyn Plummer)
19 / 23
![Page 29: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/29.jpg)
3. Results: disease modelTable 3. Relative risk estimates ( ˆRRk = e γ̂k ).
Vit-B6 Folate
R̂R1 (95% CI1) R̂R2 (95% CI1)
Q 1.00 (0.88, 1.14) 0.96 (0.84, 1.10)
R - - - -
M 0.79 (0.69, 0.89) 0.95 (0.84, 1.07)
X 0.71 (0.58, 0.88) 0.85 (0.72, 1.02)
1Credible Intervals
20 / 23
![Page 30: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/30.jpg)
3. Results: disease modelTable 3. Relative risk estimates ( ˆRRk = e γ̂k ).
Vit-B6 Folate
R̂R1 (95% CI1) R̂R2 (95% CI1)
Q 1.00 (0.88, 1.14) 0.96 (0.84, 1.10)
R - - - -
M 0.79 (0.69, 0.89) 0.95 (0.84, 1.07)
X 0.71 (0.58, 0.88) 0.85 (0.72, 1.02)
1Credible Intervals
20 / 23
![Page 31: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/31.jpg)
Concluding remarks
Challenging to tackle the complexity of dietary andbiomarker measurements
- After measurement error correction Vit-B6 and folateintakes were inversely associated with the risk ofkidney cancer
Aspects still to tune up
- Great potential to use data from studies with availableconcentration biomarker data
21 / 23
![Page 32: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/32.jpg)
Concluding remarks (ii)
Are Bayesian modelling worth the trouble in nutritional
epidemiology?
- They are flexible and can be very informative
Need of informative data, possibly replicates of R
(and M) measurements
- Need of more biomarker measurements of dietary
exposure
22 / 23
![Page 33: A Bayesian hierarchical model to integrate dietary ... · R measurements, available on ˘10% of the sample, were imputed as E(RjQ)-Data were log-transformed to approximate normality-Residuals](https://reader033.fdocuments.us/reader033/viewer/2022050221/5f668e70ad6a4643724fe85f/html5/thumbnails/33.jpg)
Acknowledgments
Pietro Ferrari, Hannah Lennon, Martyn Plummer
- WCRF/AICR for funding the project
- Mattias Johansson, Veronique Chajes
- Ray Carroll, Paul Gustafson, Victor Kipnis, HeatherBowles
EPIC collaborators
23 / 23