Univariate Twin Analysis
description
Transcript of Univariate Twin Analysis
1
Univariate Twin Analysis
OpenMx Tc26 2012Hermine Maes, Elizabeth Prom-Wormley, Nick Martin
2
Overall Questions to be Answered
•Does a trait of interest cluster among related individuals?
•Can clustering be explained by genetic or environmental effects?
•What is the best way to explain the degree to which genetic and environmental effects affect a trait?
3
Family & Twin Study Designs
•Family Studies
•Classical Twin Studies
•Adoption Studies
•Extended Twin Studies
4
The Data
•Australian Twin Register
•18-30 years old, males and females
•Work from this session will focus on Body Mass Index (weight/height2) in females only
•Sample size
•MZF = 534 complete pairs (zyg = 1)
•DZF = 328 complete pairs (zyg = 3)
569 total MZ pairs351 total DZ pairs569 total MZ pairs351 total DZ pairs
5
A Quick Look at the Data
6
Classical Twin Studies Basic Background
•The Classical Twin Study (CTS) uses MZ and DZ twins reared together
•MZ twins share 100% of their genes
•DZ twins share on average 50% of their genes
•Expectation- Genetic factors are assumed to contribute to a phenotype when MZ twins are more similar than DZ twins
7
Classical Twin Study
Assumptions
•MZ twins are genetically identical
•Equal Environments of MZ and DZ pairs
8
Basic Data Assumptions
•MZ and DZ twins are sampled from the same population, therefore we expect :-
•Equal means/variances in Twin 1 and Twin 2
•Equal means/variances in MZ and DZ twins
•Further assumptions would need to be tested if we introduce male twins and opposite sex twin pairs
9
“Old Fashioned” Data Checking
MZ DZT1 T2 T1 T2
mean21.3
521.3
421.4
521.4
6variance 0.73 0.79 0.77 0.82covariance(T1-
T2)0.59 0.25
Nice, but how can we actually be sure that these means and variances are truly the same?
10
Univariate AnalysisA Roadmap1- Use the data to test basic assumptions
(equal means & variances for twin 1/twin 2 and MZ/DZ pairs)
Saturated Model
2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype
ACE or ADE Models
3- Test ACE (ADE) submodels to identify and report significant genetic and environmental contributions
AE or CE or E Only Models10
11
Saturated Twin Model
12
Saturated Code Deconstructed
mMZ1 mMZ2 mDZ1 mDZ2mean MZ = 1 x 2
matrixmean DZ = 1 x 2
matrix
13
Saturated Code Deconstructed
vMZ1 cMZ21
cMZ21 vMZ2
T1 T2
T1
T2
vDZ1 cDZ21
cDZ21 vDZ2
T1 T2
T1
T2
covMZ = 2 x 2 matrix
covDZ = 2 x 2 matrix
14
Time to Play... with OpenMx
Please Open the FiletwinSatCon.R
Estimated ValuesT1 T2 T1 T2Saturated Model
mean MZ DZcov T1 T1
T2 T210 Total Parameters Estimated
Standardize covariance matrices for twin pair correlations (covMZ & covDZ)
mMZ1, mMZ2, vMZ1,vMZ2,cMZ21mDZ1, mDZ2, vDZ1,vDZ2,cDZ21
16
Estimated Values
10 Total Parameters Estimated
Standardize covariance matrices for twin pair correlations (covMZ & covDZ)
mMZ1, mMZ2, vMZ1,vMZ2,cMZ21mDZ1, mDZ2, vDZ1,vDZ2,cDZ21
T1 T2 T1 T2Saturated Model
mean MZ 21.34 21.35 DZ 21.45 21.46cov T1 0.73 T1 0.77
T2 0.59 0.79 T2 0.24 0.82
17
Fitting Nested Models
•Saturated Model• likelihood of data without any constraints
• fitting as many means and (co)variances as possible
•Equality of means & variances by twin order• test if mean of twin 1 = mean of twin 2
• test if variance of twin 1 = variance of twin 2
•Equality of means & variances by zygosity• test if mean of MZ = mean of DZ
• test if variance of MZ = variance of DZ
Estimated Values
T1 T2 T1 T2Equate Means & Variances across Twin
Ordermean MZ DZcov T1 T1
T2 T2Equate Means Variances across Twin Order
& Zygositymean MZ DZcov T1 T1
T2 T2
19
Estimates
T1 T2 T1 T2Equate Means & Variances across Twin
Ordermean MZ 21.35 21.35 DZ 21.45 21.45cov T1 0.76 T1 0.79
T2 0.59 0.76 T2 0.24 0.79Equate Means Variances across Twin Order
& Zygositymean MZ 21.39 21.39 DZ 21.39 21.39cov T1 0.78 T1 0.78
T2 0.61 0.78 T2 0.23 0.78
20
StatsModel ep -2ll df AIC
diff -2ll
diffdf
p
Saturated 104055.9
3176
7521.9
3
mT1=mT2 8 4056176
9518 0.07 2
0.97
mT1=mT2 &varT1=VarT2 6
4058.94
1771
516.94
3.01 40.56
ZygMZ=DZ 4
4063.45
1773
517.45
7.52 60.28
No significant differences between saturated model and models where means/variances/covariances are equal by zygosity and between twins
21
Questions?
This is all great, but we still don’t understand the genetic and
environmental contributions to BMI
22
Patterns of Twin Correlation
rMZ = 2rDZAdditive
DZ twins on average
share 50% of additive
effects
rMZ = rDZShared
Environment
rMZ > 2rDZ
Additive &Dominanc
e
DZ twins on
average share 25%
of dominance effects
???
A = 2(rMZ-rDZ)C = 2rDZ - rMZ E = 1- rMZ
Additive & “Shared
Environment”
23
BMI Twin Correlations
0.30
0.78
A = 2(rMZ-rDZ)
C = 2rDZ - rMZ
E = 1- rMZA = 0.96
C = -0.20
E = 0.22
ADE or ACE?
24
Univariate AnalysisA Roadmap
1- Use the data to test basic assumptions inherent to standard ACE (ADE) models
Saturated Model
2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype
ADE or ACE Models
3- Test ADE (ACE) submodels to identify and report significant genetic and environmental contributions
AE or E Only Models
25
ADE ModelA Few Considerations
•Dominance refers to non-additive genetic effects resulting from interactions between alleles at the same locus
•ADE models also include effects of interactions between different loci (epistasis)
•In addition to sharing on average 50% of their DNA, DZ twins also share about 25% of effects due to dominance
26
ADE Model Deconstructed Path Coefficients
a1 x 1 matrix
d1 x 1 matrix
e1 x 1 matrix
27
ADE Model DeconstructedVariance Components
a1 x 1 matrix
d1 x 1 matrix
e1 x 1 matrix
aT
eT
*
*
*
dT
28
ADE Model DeconstructedMeans & Covariances
expMean
expMean
expMean= 1 x 2 matrix
A+D+E
A+D+E
T1T2
T2
T1
covMZ & covDZ= 2 x 2 matrix
29
ADE Model DeconstructedMeans & Covariances
A+D+E
0.5A A+D+E
T1T2
T2
T1
A+D+E
A A+D+E
T1T2
T2
T1
30
ADE Model DeconstructedMeans & Covariances
A+D+E0.5A+0.25
D
0.5A+0.25D
A+D+E
T1T2
T2
T1
A+D+E A+D
A+D A+D+E
T1T2
T2
T1
31
ADE Model Deconstructed
4 Parameters Estimated
ExpMeanVariance due to AVariance due to DVariance due to E
32
Univariate AnalysisA Roadmap
1- Use the data to test basic assumptions inherent to standard ACE (ADE) models
Saturated Model
2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype
ADE or ACE Models
3- Test ADE (ACE) submodels to identify and report significant genetic and environmental contributions
AE or E Only Models
33
Time to Play... with OpenMx
Please Open the FiletwinACECon.R
34
Testing Specific Questions
•‘Full’ ADE Model•Comparison against Saturated
•Nested Models•AE Model•test significance of D
•E Model vs AE Model•test significance of A
•E Model vs ADE Model•test combined significance of A
& D
35
AE Model Deconstructed
AeModel <- mxModel( AdeFit, name="AE" )
Object containing Full Model
New Object ContainingAE Model
AeModel <- omxSetParameters( AeModel, labels="c11", free=FALSE, values=0)
AeFit <- mxRun(AeModel)
The d path will not be estimated
Fixing the value of the c path to zero
36
Goodness of Fit Stats
ep -2ll df AICdiff -2ll
diffdf
p
Saturated
ADE
AE
DE no! no! no! no! no! no! no!
E
Goodness-of-Fit Statistics
ep -2ll df AICdiff -2ll
diffdf
p
Saturated
104055.9
3176
7521.9
3- - -
ADE 44063.4
5177
3517.4
57.52 6
0.28
AE 34067.6
6177
4519.6
64.21 1
0.06
E 24591.7
9177
51041.
79528.
32
0.00
38
What about the magnitudes of genetic and environmental
contributions to BMI?
39
Univariate AnalysisA Roadmap
1- Use the data to test basic assumptions inherent to standard ACE (ADE) models
Saturated Model
2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype
ADE or ACE Models
3- Test ADE (ACE) submodels to identify and report significant genetic and environmental contributions
AE or E Only Models
40
Estimated Values
a d e c a2 d2 e2 c2
ADE -
AE -
E -
Estimated Values
a d e c a2 d2 e2
ADE 0.57 0.54 0.41 - 0.41 0.37 0.22
AE 0.78 - 0.42 - 0.78 - 0.22
E - - 0.88 - - - 1.00
42
Okay, so we have a sense of which model
best explains the data...or do we?
43
BMI Twin Correlations
0.30
0.78
A = 2(rMZ-rDZ)
C = 2rDZ - rMZ
E = 1- rMZ
ADE or ACE?
44
Giving the ACE Model a Try
(if there is time)
45
Goodness-of-Fit Statistics
Model ep -2ll df AICdiff -2ll
diffdf
p
Saturated
ADE
ACE
46
Goodness-of-Fit Statistics
Model ep -2ll df AICdiff -2ll
diffdf
p
Saturated 104055.9
3176
7521.93 - - -
ADE 44063.4
5177
3517.45 7.52 6
0.28
ACE 44067.6
6177
3521.66
11.73
60.07
47
Estimated Values
a d e c a2 d2 e2 c2
ADE
AE
ACE
AE
E
Estimated Values
a d e c a2 d2 e2 c2
ADE 0.57 0.54 0.41 - 0.41 0.37 0.22 -
AE 0.78 - 0.42 - 0.78 - 0.22 -
ACE 0.78 0.00 0.42 - 0.78 0.00 0.22 -
AE - - 0.56 0.68 - - 0.41 0.59
E - - 0.88 - - - 1.00 -
49
Conclusions
•BMI in young OZ females (age 18-30)
•additive genetic factors: highly significant
•dominance: borderline significant
•specific environmental factors: significant
•shared environmental factors: not
50
Potential Complications
•Assortative Mating
•Gene-Environment Correlation
•Gene-Environment Interaction
•Sex Limitation
•Gene-Age Interaction
51
Thank You!