THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS James J. Lee University of Minnesota Twin Cities


Page 1: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

James J. Lee

University of Minnesota Twin Cities

Page 2: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THREE LAWS OF BEHAVIOR GENETICS

• First Law. All behavioral traits are heritable.

• Second Law. The effect of being raised in the same family is smaller than the effect of genes.

• Third Law. A substantial portion of the variance in behavioral traits is not accounted for by genes or families.

Eric Turkheimer, the coiner of the Three Laws of Behavior Genetics.

Page 3: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

EVIDENCE FROM CLASSICAL QUANTITATIVE GENETICS

The Minnesota Adolescent Adoption Study (Scarr & Weinberg, 1978; Scarr, 1997)

[Figure: two scatterplots of offspring IQ (y-axis, 80 to 130) against midparent IQ (x-axis, 90 to 130).]

BIOLOGICAL FAMILIES: β = 0.61 ± 0.07

ADOPTIVE FAMILIES: β = 0.13 ± 0.08

Page 4: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

EVIDENCE FROM CLASSICAL QUANTITATIVE GENETICS

[Figure panels: BIOLOGICAL FAMILIES; ADOPTIVE FAMILIES]

The Sibling Interaction and Behavior Study (McGue et al., 2007)

Page 5: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE SEARCH FOR CAUSAL VARIANTS AT THE DNA LEVEL

• If studies of twins and other kinships support the Three Laws, it seems justified to search for the causal loci at the DNA level.

• This is the aim of genome-wide association studies (GWAS).

A research subject provides DNA by spitting into a tube with a preservative.

Page 6: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

BACKLASH AGAINST GWAS OF PSYCHOLOGICAL TRAITS

• Correlations between common variants and phenotypes such as general cognitive ability (g) and schizophrenia have turned out to be very small.

• We have just reported three common SNPs that each account for ~0.02% of IQ variance (Rietveld et al., 2014).
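A rough power calculation shows why effects of this size are hard to detect. The sketch below is an illustration, not the calculation used in any cited study: it assumes a two-sided test at a genome-wide significance threshold of 5 × 10⁻⁸ and 80% power (both figures are assumptions, not taken from the slides), and it uses the large-sample approximation that a SNP explaining a fraction R² of the variance yields a z statistic with mean √(n·R²/(1 − R²)). The function name `required_n` is hypothetical.

```python
from statistics import NormalDist

def required_n(r2, alpha=5e-8, power=0.80):
    """Approximate sample size needed to detect a SNP explaining r2 of the variance.

    alpha and power are illustrative defaults (assumptions), not values from the slides.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return (z_alpha + z_power) ** 2 * (1 - r2) / r2

# A SNP accounting for ~0.02% of the variance, as on the slide.
n_needed = required_n(0.0002)
print(f"approximate n required: {n_needed:,.0f}")
```

Under these assumptions the required sample is on the order of 200,000 individuals for a single SNP of this size.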

Page 7: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

BACKLASH AGAINST GWAS OF PSYCHOLOGICAL TRAITS

• In response, a fellow at the Center for Genetics and Society wrote a blog post called “The Stupidity of Smart Genes.”

• Some academics are scarcely more charitable. Kevin Mitchell of Trinity College Dublin: “The idea that this trait is determined by common variants … is really unproven.”

Page 8: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

MY RESPONSE TO THE BACKLASH

• We seem to have a paradox: if traits are as heritable as implied by classical studies, then where are the genes?

• I argue that the heritability is hiding in plain sight: there are thousands of causal variants, each of which exerts a small effect—which means that it is difficult to find any single one.

Page 9: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

MY RESPONSE TO THE BACKLASH

• I provide an estimate of the total GWAS sample size required to capture the entire heritability (due to common variants) of a phenotype like g.

• Most importantly, I argue that chasing down thousands of DNA variants with small effects is a worthy scientific enterprise.

Page 10: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

HERITABILITY ESTIMATED DIRECTLY FROM DNA DATA

• A critic might claim that GWAS of psychological traits cannot be guaranteed to produce more “hits” as sample size grows.

• Perhaps “indirect” heritability estimates from studies of twins, adoptees, etc. are flawed and there are not that many causal loci after all.

Richard Nixon and Forrest Gump may be slightly less similar at the DNA level than most other random pairs of people.

Page 11: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

HERITABILITY ESTIMATED DIRECTLY FROM DNA DATA

• In recent years a new method, often called GCTA (after the software package Genome-wide Complex Trait Analysis), obtains “direct” estimates of heritability from DNA data.

• Think of two people who are not related to you.

Richard Nixon and Forrest Gump may be slightly less similar at the DNA level than most other random pairs of people.

Page 12: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

HERITABILITY ESTIMATED DIRECTLY FROM DNA DATA

• If we genotype/sequence all three individuals, you will turn out by chance to be slightly more similar at the genetic level to person A than to person B.

• Are you also phenotypically more similar to A than to B?

Richard Nixon and Forrest Gump may be slightly less similar at the DNA level than most other random pairs of people.

Page 13: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

HERITABILITY ESTIMATED DIRECTLY FROM DNA DATA

• In a large sample of unrelated people, we look at all pairs of people and calculate their genetic and phenotypic similarities.

• Higher heritability means that genetically similar people will tend to be phenotypically similar.

Richard Nixon and Forrest Gump may be slightly less similar at the DNA level than most other random pairs of people.

Page 14: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

HERITABILITY ESTIMATED DIRECTLY FROM DNA DATA

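\[
E(\mathbf{y}\mathbf{y}') = \mathbf{A}\sigma_A^2 + \mathbf{I}\sigma_E^2,
\qquad \text{where } A_{ij} = \frac{1}{p}\sum_{k=1}^{p} z_{ik} z_{jk}
\]

• $\mathbf{y}$: the vector of phenotypic values

• $\sigma_A^2$: additive genetic variance

• $\sigma_E^2$: residual variance

• $\mathbf{A}$: the matrix of “relatedness” coefficients

• $\mathbf{I}$: the identity matrix

• $z_{ik}$: the standardized gene count of person i at locus k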
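The off-diagonal part of this model says that the expected product of two people's phenotypes is proportional to their relatedness coefficient, with slope equal to the additive genetic variance. The sketch below illustrates that idea on simulated data. It uses Haseman-Elston regression, a simpler moment-based relative of the REML fit that GCTA performs, swapped in here for brevity. The sample size, number of SNPs, allele-frequency range, and true heritability of 0.5 are all assumptions chosen for illustration, and every SNP is treated as independent and causal.

```python
import random

def standardize(v):
    """Rescale a sequence to mean 0 and variance 1."""
    m = sum(v) / len(v)
    sd = (sum((x - m) ** 2 for x in v) / len(v)) ** 0.5
    return [(x - m) / sd for x in v]

def simulate(n=300, p=50, h2=0.5, seed=1):
    """Return standardized genotype columns Z (p lists of length n) and phenotypes y."""
    rng = random.Random(seed)
    freqs = [rng.uniform(0.2, 0.8) for _ in range(p)]
    # Gene counts (0, 1, or 2) for each person at each locus.
    counts = [[(rng.random() < f) + (rng.random() < f) for f in freqs]
              for _ in range(n)]
    Z = [standardize(col) for col in zip(*counts)]
    beta = [rng.gauss(0, 1) for _ in range(p)]
    g = standardize([sum(Z[k][i] * beta[k] for k in range(p)) for i in range(n)])
    e = standardize([rng.gauss(0, 1) for _ in range(n)])
    y = standardize([h2 ** 0.5 * gi + (1 - h2) ** 0.5 * ei for gi, ei in zip(g, e)])
    return Z, y

def he_regression(Z, y):
    """Regress y_i * y_j on A_ij over all pairs i < j; the slope estimates h2."""
    p, n = len(Z), len(y)
    rows = list(zip(*Z))  # one row of p standardized gene counts per person
    a_vals, yy_vals = [], []
    for i in range(n):
        for j in range(i + 1, n):
            a_vals.append(sum(a * b for a, b in zip(rows[i], rows[j])) / p)
            yy_vals.append(y[i] * y[j])
    m_a = sum(a_vals) / len(a_vals)
    m_yy = sum(yy_vals) / len(yy_vals)
    cov = sum((a - m_a) * (v - m_yy) for a, v in zip(a_vals, yy_vals))
    var = sum((a - m_a) ** 2 for a in a_vals)
    return cov / var

Z, y = simulate()
h2_hat = he_regression(Z, y)
print(f"simulated h2 = 0.50, Haseman-Elston estimate = {h2_hat:.2f}")
```

With only a few hundred simulated people the estimate is noisy, which is one reason real analyses of this kind rely on samples of thousands.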

Page 15: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

HERITABILITY ESTIMATED DIRECTLY FROM DNA DATA

• According to GCTA, the heritability of g is roughly 0.45 (Davies et al., 2011; Chabris et al., 2012).

• This is actually a lower bound on h2 because many causal variants (particularly those where one allele is rare) are probably not captured by SNP chips.

Peter Visscher, quantitative geneticist and a developer of GCTA.

Page 16: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

HERITABILITY ESTIMATED DIRECTLY FROM DNA DATA

• We have studied the conditions under which GCTA provides a valid estimate of SNP-based heritability (Lee & Chow, 2014).

• If the causal variants tend to be less well tagged (a realistic case), then GCTA will be biased downward.

• Thus, $h^2_{\mathrm{GCTA}} < h^2_{\mathrm{SNP}} < h^2$.

My postdoctoral supervisor, Carson Chow, goes to the supermarket.

Page 17: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

HERITABILITY ESTIMATED DIRECTLY FROM DNA DATA

[Figure: GREML heritability estimate (y-axis, 0.0 to 1.0) for five simulation conditions: VERY WEAKLY TAGGED, WEAKLY TAGGED, MODERATELY TAGGED, STRONGLY TAGGED, VERY STRONGLY TAGGED.]

The purple horizontal line corresponds to the true $h^2_{\mathrm{SNP}}$ in our simulations.

Page 18: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

HERITABILITY ESTIMATED DIRECTLY FROM DNA DATA

• “Direct” estimates based on DNA data and “indirect” estimates based on the correlations between relatives are thus fully consistent.

• “Our results unequivocally confirm that a substantial proportion of individual differences in human intelligence is due to genetic variation” (Davies et al., 2011).

Peter Visscher, quantitative geneticist and a developer of GCTA.

Page 19: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

HERITABILITY ESTIMATED DIRECTLY FROM DNA DATA

• Some trait-associated SNPs might only be correlated (in linkage disequilibrium) with untyped causal variants.

• How can we be sure that GCTA-estimated heritability reflects common variants?

• Basic principle of psychometrics. Two dichotomously scored items can show a strong correlation only if their pass rates are similar.

[Figure with panels A to D; recoverable axis labels: correlation size, frequency, SNP index i, SNP index j.]

A color-coded correlation matrix of SNPs on chromosome 22.

Page 20: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

HERITABILITY ESTIMATED DIRECTLY FROM DNA DATA

• This same principle also applies to genetics!

• Two SNPs can show strong linkage disequilibrium (LD) only if their allele frequencies are similar.

• Therefore, a substantial h2GCTA implies that common variants play a large role.

[Figure with panels A to D; recoverable axis labels: correlation size, frequency, SNP index i, SNP index j.]

A color-coded correlation matrix of SNPs on chromosome 22.

Page 21: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

[Figure: seven haplotypes typed at two loci.]

Haplotype   Locus 1   Locus 2
1           T         A
2           C         G
3           C         G
4           C         G
5           C         G
6           T         A
7           T         A

Locus 1 MAF = 3/7

Locus 2 MAF = 3/7

Page 22: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

[Figure: seven haplotypes typed at two loci.]

Haplotype   Locus 1   Locus 2
1           T         A
2           T         G
3           T         G
4           T         G
5           C         G
6           T         A
7           T         A

Locus 1 MAF = 1/7

Locus 2 MAF = 3/7
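The two haplotype diagrams can be checked numerically. The sketch below computes the squared correlation (r²) between the two loci for each set of seven haplotypes, along with the largest r² that a given pair of minor allele frequencies permits. The haplotype lists are read off the diagrams as best the extracted slide text allows, so treat them as an assumption; the function names are hypothetical.

```python
def r_squared(haplotypes, allele1, allele2):
    """Squared correlation between carrying allele1 at locus 1 and allele2 at locus 2."""
    n = len(haplotypes)
    p1 = sum(h[0] == allele1 for h in haplotypes) / n
    p2 = sum(h[1] == allele2 for h in haplotypes) / n
    p12 = sum(h == (allele1, allele2) for h in haplotypes) / n
    d = p12 - p1 * p2  # the linkage disequilibrium coefficient D
    return d * d / (p1 * (1 - p1) * p2 * (1 - p2))

def max_r_squared(maf_a, maf_b):
    """Upper bound on r^2 for two biallelic loci with the given minor allele frequencies."""
    lo, hi = sorted((maf_a, maf_b))
    return lo * (1 - hi) / (hi * (1 - lo))

# Haplotypes as read from the two slides (an assumption where the extraction is ambiguous).
equal_maf = [("T", "A"), ("C", "G"), ("C", "G"), ("C", "G"),
             ("C", "G"), ("T", "A"), ("T", "A")]
unequal_maf = [("T", "A"), ("T", "G"), ("T", "G"), ("T", "G"),
               ("C", "G"), ("T", "A"), ("T", "A")]

r2_equal = r_squared(equal_maf, "T", "A")      # minor alleles: T (3/7) and A (3/7)
r2_unequal = r_squared(unequal_maf, "C", "A")  # minor alleles: C (1/7) and A (3/7)
print(round(r2_equal, 3), round(r2_unequal, 3), round(max_r_squared(1 / 7, 3 / 7), 3))
```

When both minor allele frequencies are 3/7 the loci can be perfectly correlated (r² = 1). With frequencies of 1/7 and 3/7 the observed r² is 0.125, and no arrangement of the haplotypes could push it above about 0.22.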

Page 23: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE NUMBER OF CAUSAL VARIANTS: THE “POLY” IN POLYGENIC

• The simulations and mathematical arguments by Lee and Chow (2014) show that GCTA can be valid even if there is just one trait-associated SNP.

• Can we find other evidence supporting the notion that missing heritability is distributed among many variants of very small effect?

Peter Visscher, quantitative geneticist and a developer of GCTA.

Page 24: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE NUMBER OF CAUSAL VARIANTS: THE “POLY” IN POLYGENIC

• GCTA has an advantage over classical pedigree-based methods. It can partition h2 among different parts of the genome.

• E.g., we can determine how much heritability is contributed by each chromosome.

Peter Visscher, quantitative geneticist and a developer of GCTA.

Page 25: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE NUMBER OF CAUSAL VARIANTS: THE “POLY” IN POLYGENIC

• Basic idea. Calculate separate realized genetic similarities for different parts of the genome.

• Suppose that there are many causal loci on chr1, but none on chr2. Then chr1 genetic similarity will predict phenotypic similarity, whereas chr2 genetic similarity will not.

Peter Visscher, quantitative geneticist and a developer of GCTA.

Page 26: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

PARTITIONING SCHIZOPHRENIA HERITABILITY AMONG CHROMOSOMES

Lee et al. (2012)

Page 27: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

PARTITIONING SCHIZOPHRENIA HERITABILITY AMONG CHROMOSOMES

• The remarkable correlation between chromosome length and heritability contribution suggests that many loci contribute to SCZ liability (Gottesman & Shields, 1967).

• E.g., if there were only ten loci, each on a different chromosome, we would not see such a relationship.

Prof. Emeritus Irving Gottesman, a pioneer in the genetic study of mental illness.
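A small simulation makes the contrast concrete. The sketch below scatters causal loci across the 22 autosomes with probability proportional to chromosome length, gives every locus the same effect, and then correlates chromosome length with the number of loci each chromosome received (a stand-in for its heritability contribution). The chromosome lengths are rounded approximations in megabases, and the equal-effects and uniform-placement assumptions are simplifications made here for illustration.

```python
import random

# Approximate autosome lengths in megabases, chromosomes 1 to 22 (rounded; an assumption).
CHR_MB = [249, 243, 198, 191, 181, 171, 159, 146, 141, 135, 135,
          133, 115, 107, 102, 90, 81, 78, 59, 63, 48, 51]

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def length_vs_share(n_loci, seed=0):
    """Correlation between chromosome length and number of causal loci per chromosome."""
    rng = random.Random(seed)
    hits = rng.choices(range(22), weights=CHR_MB, k=n_loci)
    counts = [hits.count(c) for c in range(22)]
    return pearson(CHR_MB, counts)

r_many = length_vs_share(8000)  # highly polygenic scenario
r_few = length_vs_share(10)     # only ten causal loci
print(f"8000 loci: r = {r_many:.2f}; 10 loci: r = {r_few:.2f}")
```

With thousands of loci the per-chromosome contribution tracks length closely, whereas with ten loci most chromosomes carry nothing and the relationship is much weaker.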

Page 28: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE NUMBER OF CAUSAL VARIANTS: THE “POLY” IN POLYGENIC

• We know that there are many causal variants. But can we get more precise?

• Even if a GWAS dataset has too little power to yield many “hits,” it still contains substantial information about the trait’s genetic architecture.

Naomi Wray and Peter Visscher introduced a method to estimate parameters of genetic architectures in their 2009 study of schizophrenia.

Page 29: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE NUMBER OF CAUSAL VARIANTS: THE “POLY” IN POLYGENIC

• We have seen how GCTA exploits this information in the estimation of heritability.

• It is possible to get out more than just h2.

• Approximate Bayesian polygenic analysis (ABPA) estimates the total number of genotyped SNPs that are associated with the trait (Stahl et al., 2012).

Naomi Wray and Peter Visscher introduced a method to estimate parameters of genetic architectures in their 2009 study of schizophrenia.

Page 30: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE NUMBER OF CAUSAL VARIANTS: THE “POLY” IN POLYGENIC

• Suppose that we estimate SNP regression coefficients in a GWAS and use them to predict the phenotypes of individuals in a new sample.

• The cross-validation R2 is the predictive power of the estimated coefficients in the new sample.

Eli Stahl introduced ABPA in 2012, extending a method devised by Visscher and colleagues.

Page 31: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE NUMBER OF CAUSAL VARIANTS: THE “POLY” IN POLYGENIC

• Suppose that we bin the SNP effects estimated in the GWAS (“training sample”) by p-value.

• If the GWAS results in every p-value bin—even in the bins corresponding to large p-values—show at least a small cross-validation R2, then the trait must be highly polygenic.

Eli Stahl introduced ABPA in 2012, extending a method devised by Visscher and colleagues.

Page 32: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE NUMBER OF CAUSAL VARIANTS: THE “POLY” IN POLYGENIC

• What if the heritability were due to just a few variants of large effect? These variants would be in a bin with low p-values, and all other bins would show no cross-validation.

• A failure to observe this pattern implies polygenicity.

Eli Stahl introduced ABPA in 2012, extending a method devised by Visscher and colleagues.

Page 33: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE NUMBER OF CAUSAL VARIANTS: THE “POLY” IN POLYGENIC

• This logic extends to larger sample sizes.

• What if the bins corresponding to p ≥ .05 no longer cross-validate? Then all trait-associated SNPs must have p < .05!

• The number of SNPs meeting the cutoff p < .05 is then an upper bound on the total number of SNPs with nonzero regression coefficients.

Eli Stahl introduced ABPA in 2012, extending a method devised by Visscher and colleagues.

Page 34: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE NUMBER OF CAUSAL VARIANTS: THE “POLY” IN POLYGENIC

• Simulations can be used to determine what values of summary statistics (e.g., cross-validation R2 values of different p-value bins) are likely given the parameters (e.g., number of trait-associated SNPs).

• Working backward from the simulation results leads to Bayesian posterior distributions.

Eli Stahl introduced ABPA in 2012, extending a method devised by Visscher and colleagues.
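The forward-simulation step can be sketched in a few lines. The code below is not ABPA itself, only a toy version of the simulation logic described on the preceding slides: it generates a training GWAS and a test sample under two assumed architectures (every SNP causal versus five causal SNPs), bins the estimated SNP effects by p-value, and reports the cross-validation R² of each bin. Independent SNPs with allele frequency 0.5, a heritability of 0.5, and the sample sizes are all illustrative assumptions.

```python
import math
import random

def genotype_row(rng, p):
    """Standardized gene counts at p independent SNPs, each with allele frequency 0.5."""
    return [((rng.random() < 0.5) + (rng.random() < 0.5) - 1) / math.sqrt(0.5)
            for _ in range(p)]

def sample(rng, n, beta, h2):
    """Draw n people: genotypes Z and phenotypes y with total variance about 1."""
    Z = [genotype_row(rng, len(beta)) for _ in range(n)]
    y = [sum(b * z for b, z in zip(beta, row)) + rng.gauss(0, math.sqrt(1 - h2))
         for row in Z]
    return Z, y

def corr_sq(a, b):
    """Squared Pearson correlation; 0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((x - mb) ** 2 for x in b)
    sab = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    return 0.0 if saa == 0 or sbb == 0 else sab * sab / (saa * sbb)

def cv_r2_by_bin(beta, h2=0.5, n_train=1000, n_test=500, seed=0,
                 bins=((0.0, 0.05), (0.05, 1.01))):
    """Cross-validation R^2 of the polygenic score built from each p-value bin."""
    rng = random.Random(seed)
    Z, y = sample(rng, n_train, beta, h2)
    p = len(beta)
    # Marginal regression coefficient and two-sided p-value for each SNP.
    b_hat = [sum(row[k] * yi for row, yi in zip(Z, y)) / n_train for k in range(p)]
    pval = [math.erfc(abs(b) * math.sqrt(n_train) / math.sqrt(2)) for b in b_hat]
    Zt, yt = sample(rng, n_test, beta, h2)
    out = []
    for lo, hi in bins:
        keep = [k for k in range(p) if lo <= pval[k] < hi]
        score = [sum(b_hat[k] * row[k] for k in keep) for row in Zt]
        out.append(corr_sq(score, yt))
    return out

def scaled(raw, h2=0.5):
    """Rescale effects so that they jointly explain h2 of the variance."""
    norm = math.sqrt(sum(b * b for b in raw))
    return [b * math.sqrt(h2) / norm for b in raw]

rng = random.Random(1)
p = 200
r2_poly = cv_r2_by_bin(scaled([rng.gauss(0, 1) for _ in range(p)]))
r2_oligo = cv_r2_by_bin(scaled([1.0] * 5 + [0.0] * (p - 5)))
print("polygenic   R2 by bin (p < .05, p >= .05):", [round(v, 3) for v in r2_poly])
print("five causal R2 by bin (p < .05, p >= .05):", [round(v, 3) for v in r2_oligo])
```

In the polygenic scenario even the p ≥ .05 bin predicts the phenotype in the new sample, while in the five-variant scenario that bin carries essentially no signal. Repeating such simulations over a grid of parameter values is the ingredient that an approximate Bayesian analysis works backward from.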

Page 35: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE POLYGENIC ARCHITECTURE OF SCHIZOPHRENIA

Application of ABPA to schizophrenia GWAS data has yielded an estimate of 8,300 common variants (Ripke et al., 2013).

Page 36: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

A FOURTH LAW OF BEHAVIOR GENETICS

• Results from GWAS of mental illness, education, and intelligence justify an additional “law.”

• Fourth Law. Genetic variation is caused by thousands of sites across the genome, all of which are individually responsible for a minuscule fraction of the variance (Chabris, Lee, Cesarini, Benjamin, & Laibson, in press).

My colleague Christopher Chabris, the coiner of the Fourth Law.

Page 37: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

A FOURTH LAW OF BEHAVIOR GENETICS

• The coiner of the original Three Laws has already commented on some of the evidence supporting our proposed Fourth Law (Turkheimer, 2012).

• Turkheimer suggests that this evidence points toward deemphasizing GWAS.

Eric Turkheimer, the coiner of the Three Laws of Behavior Genetics.

Page 38: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

A FOURTH LAW OF BEHAVIOR GENETICS

• Turkheimer’s arguments are important. They are related to recently expressed concerns regarding the trustworthiness of the scientific enterprise (Pashler & Wagenmakers, 2012).

• Close scrutiny, however, shows that these arguments do not apply to GWAS.

Eric Turkheimer, the coiner of the Three Laws of Behavior Genetics.

Page 39: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

ISSUE #1: REPLICABILITY OF GWAS FINDINGS

• Some have argued that GWAS findings show a poor track record of replication.

• Kernel of truth. The small effects described by the Fourth Law are difficult to distinguish from noise in poorly powered studies and require large samples to be replicated.

Page 40: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

ISSUE #1: REPLICABILITY

Given adequate sample sizes, however, the degree of quantitative replication in GWAS is nothing short of astounding.

Page 41: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

ISSUE #1: REPLICABILITY

The best-fitting straight line is close to the line of zero intercept and unit slope (Marigorta & Navarro, 2013).

Page 42: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

WILL REPLICABILITY EXTEND TO PSYCHOLOGICAL TRAITS?

• There have been few GWAS of behavioral traits in distinct populations.

• It is possible, however, to use GCTA to estimate the genetic correlation between populations with respect to a certain phenotype.

Page 43: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

WILL REPLICABILITY EXTEND TO PSYCHOLOGICAL TRAITS?

\[
Y_{\mathrm{EUR}} = \alpha_0 + \underbrace{X_1\alpha_1 + \cdots + X_L\alpha_L}_{\text{European breeding value}} + E
\]

\[
Y_{\mathrm{AFR}} = \beta_0 + \underbrace{W_1\beta_1 + \cdots + W_K\beta_K}_{\text{African breeding value}} + E
\]

• $Y_{\mathrm{EUR}}$: European individual’s SCZ liability

• $X_j$: number of SCZ “+” genes (0, 1, or 2) at the jth locus

• $\alpha_j$: average effect of gene substitution on SCZ liability at the jth locus

• $E$: individual’s “residual” with respect to SCZ liability, a composite of environmental effects, nonlinear (non-additive) interactions, etc.

Page 44: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

WILL REPLICABILITY EXTEND TO PSYCHOLOGICAL TRAITS?

\[
Y_{\mathrm{EUR}} = \alpha_0 + \underbrace{X_1\alpha_1 + \cdots + X_L\alpha_L}_{\text{European breeding value}} + E
\]

\[
Y_{\mathrm{AFR}} = \beta_0 + \underbrace{W_1\beta_1 + \cdots + W_K\beta_K}_{\text{African breeding value}} + E
\]

• $Y_{\mathrm{AFR}}$: African individual’s SCZ liability

• $W_j$: number of SCZ “+” genes (0, 1, or 2) at the jth locus

• $\beta_j$: average effect of gene substitution on SCZ liability at the jth locus

• $E$: individual’s “residual” with respect to SCZ liability, a composite of environmental effects, nonlinear (non-additive) interactions, etc.

Page 45: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

WILL REPLICABILITY EXTEND TO PSYCHOLOGICAL TRAITS?

The genetic correlation between two phenotypes is simply the correlation between their respective breeding values.

\[
Y_{\mathrm{EUR}} = \alpha_0 + \underbrace{X_1\alpha_1 + \cdots + X_L\alpha_L}_{\text{European breeding value}} + E
\]

\[
Y_{\mathrm{AFR}} = \beta_0 + \underbrace{W_1\beta_1 + \cdots + W_K\beta_K}_{\text{African breeding value}} + E
\]
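Written out, the definition of genetic correlation takes the following form. This is a sketch in notation introduced here rather than taken from the slides: $G_{\mathrm{EUR}}$ and $G_{\mathrm{AFR}}$ are shorthand for the European and African breeding values, and $r_G$ for their correlation.

```latex
\[
G_{\mathrm{EUR}} = \sum_{j=1}^{L} X_j \alpha_j, \qquad
G_{\mathrm{AFR}} = \sum_{j=1}^{K} W_j \beta_j, \qquad
r_G = \frac{\operatorname{Cov}(G_{\mathrm{EUR}}, G_{\mathrm{AFR}})}
           {\sqrt{\operatorname{Var}(G_{\mathrm{EUR}})\,\operatorname{Var}(G_{\mathrm{AFR}})}}
\]
```

On this reading, a value of $r_G$ near 1 would suggest that largely the same variants act in the same direction in both populations, while a value near 0 would suggest largely distinct genetic causes.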

Page 46: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

WILL REPLICABILITY EXTEND TO PSYCHOLOGICAL TRAITS?

de Candia et al. (2013) used GCTA to estimate that the correlation between European and African breeding values with respect to schizophrenia is greater than 0.60.

\[
Y_{\mathrm{EUR}} = \alpha_0 + \underbrace{X_1\alpha_1 + \cdots + X_L\alpha_L}_{\text{European breeding value}} + E
\]

\[
Y_{\mathrm{AFR}} = \beta_0 + \underbrace{W_1\beta_1 + \cdots + W_K\beta_K}_{\text{African breeding value}} + E
\]

Page 47: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

WILL REPLICABILITY EXTEND TO PSYCHOLOGICAL TRAITS?

• The latest GWAS meta-analysis of schizophrenia included a number of East Asian samples (Ripke et al., 2014).

• The concordance between Europeans and East Asians is strong.

Page 48: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

ISSUE #2: CORRELATION VS. CAUSATION

• GWAS of unrelated individuals can only tell us that a given SNP is correlated with the phenotype.

• But we want to know whether variation at the genomic site causes variation in the phenotype.

Sir Ronald Fisher, the founder of both population genetics and modern statistics.

Page 49: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

ISSUE #2: CORRELATION VS. CAUSATION

• Since a given SNP is correlated with many other variants in its genomic region, picking out the causal variant (if any) is a challenge.

• Here I address the problem of whether a GWAS signal might be attributable to confounding with an environmental variable.

Sir Ronald Fisher, the founder of both population genetics and modern statistics.

Page 50: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

ISSUE #2: CORRELATION VS. CAUSATION

• The simplest means of addressing confounding is the family-based design.

• By Mendel’s Law of Segregation, a parent passes on a random gene from each homologous pair to a given offspring.

father’s genome

offspring’s genome

mother’s genome

Page 51: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

ISSUE #2: CORRELATION VS. CAUSATION

• Whether a heterozygous parent (“+−”) passes on the “+” or “−” gene to its offspring is equivalent to randomized treatment status in experimental design.

• If there is no selection bias, a within-family correlation between “+” transmission and the phenotype means that the marker must be linked and associated with a causal variant.

father’s genome

offspring’s genome

mother’s genome

Page 52: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

ISSUE #2: CORRELATION VS. CAUSATION

• Within-family designs are not statistically powerful, but they can be used to check that studies of unrelated individuals are not unduly contaminated by confounding.

• So far, family-based studies have affirmed the results of standard GWAS (Rietveld et al., 2013).

father’s genome

offspring’s genome

mother’s genome
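The logic of the within-family check can be illustrated with a simulation in which a SNP has no causal effect at all but is confounded with environment. In the sketch below, two subpopulations differ in both allele frequency and mean environment, so a regression across unrelated-style pooled individuals shows a spurious association, while a regression on sibling differences does not. All numbers (allele frequencies of 0.2 and 0.8, an environmental gap of one standard deviation, 4,000 sibling pairs) are assumptions chosen for illustration.

```python
import random

def simulate(n_families=4000, seed=0):
    """Return a list of families, each a pair of (gene count, phenotype) siblings."""
    rng = random.Random(seed)
    families = []
    for _ in range(n_families):
        # Subpopulation determines both allele frequency and environment (the confound).
        freq, env = rng.choice([(0.2, 0.0), (0.8, 1.0)])
        parents = [[rng.random() < freq, rng.random() < freq] for _ in range(2)]
        sibs = []
        for _ in range(2):
            # Mendelian segregation: each parent transmits one of its two genes at random.
            g = sum(rng.choice(parent) for parent in parents)
            y = env + rng.gauss(0, 1)  # the SNP itself has zero causal effect
            sibs.append((g, y))
        families.append(sibs)
    return families

def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

families = simulate()
everyone = [sib for fam in families for sib in fam]
pop_slope = slope([g for g, _ in everyone], [y for _, y in everyone])
within_slope = slope([fam[0][0] - fam[1][0] for fam in families],
                     [fam[0][1] - fam[1][1] for fam in families])
print(f"population slope = {pop_slope:.2f}, within-family slope = {within_slope:.2f}")
```

The population slope is clearly nonzero even though the SNP does nothing, whereas the sibling-difference slope hovers around zero. The wider sampling error of the within-family estimate reflects the lower statistical power noted on the slide.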

Page 53: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

BUT WHY IS CAUSAL INFERENCE SO SIMPLE HERE?

SNP 1 SNP 2 SNP 3 SNP 4 SNP 5 SNP 6 SNP 7 SNP 8 SNP 9

phenotype

This is the simplest possible causal system (directed acyclic graph). If there is no confounding, every partial regression coefficient is equal to its corresponding average effect.

Page 54: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

BUT WHY IS CAUSAL INFERENCE SO SIMPLE HERE?

• Why are genetic and environmental causes not confounded more severely?

• Anthropomorphic answer. When Nature pushes up the frequencies of some alleles and pushes down others, she can only tell which alleles are correlated with fitness. She cannot tell which alleles cause higher fitness.

The Papilio caterpillar, which has evolved to look like a snake.

Page 55: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

BUT WHY IS CAUSAL INFERENCE SO SIMPLE HERE?

• Nevertheless, Nature seems to adjust allele frequencies in the correct way more often than not.

• She can only do this if gene-trait correlation is a robust guide to gene-trait causation. Be thankful that we live in such a universe!

The Papilio caterpillar, which has evolved to look like a snake.

Page 56: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

ISSUE #3: THE SCIENTIFIC WORTH OF SMALL EFFECTS

• One might object that only large effect sizes are scientifically significant (as opposed to statistically significant in a large enough sample).

• On this view the Fourth Law automatically discredits further inquiry into the genetic causes of behavior.

The clinical psychologist Paul Meehl, a vocal critic of significance testing.

Page 57: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

ISSUE #3: THE SCIENTIFIC WORTH OF SMALL EFFECTS

• This critique draws on the penetrating writings of Meehl (1978, 1990).

• Meehl thought that the null hypothesis is often a strawman because of ubiquitous biases and an abundance of alternative explanations.

The clinical psychologist Paul Meehl, a vocal critic of significance testing.

Page 58: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

ISSUE #3: THE SCIENTIFIC WORTH OF SMALL EFFECTS

• In such cases the rejection of the null hypothesis is not scientifically valuable.

• In GWAS, however, we have every reason to believe that the null hypothesis is true more often than not.

The clinical psychologist Paul Meehl, a vocal critic of significance testing.

Page 59: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE POLYGENIC ARCHITECTURE OF SCHIZOPHRENIA

~8,300 common variants may seem like a lot, but there are ~8 million common variants in the entire genome. That is roughly one variant in a thousand.

Page 60: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

ISSUE #3: THE SCIENTIFIC WORTH OF SMALL EFFECTS

• Against a large background of null effects, accepting the alternative hypothesis of a small effect is an inherently meaningful step toward the underlying biology.

• Perhaps to the surprise of some, the latest GWAS meta-analysis of schizophrenia implicates acquired immunity (Ripke et al., 2014).

The clinical psychologist Paul Meehl, a vocal critic of significance testing.

Page 61: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

WHAT KINDS OF ENHANCERS HARBOR SCHIZOPHRENIA VARIANTS?

Page 62: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

COMPRESSED SENSING: ADDRESSING THE N ≪ P PROBLEM

• Point 1. Heritability is not missing; it is hiding in plain sight among thousands of variants (many of them common).

• Point 2. Replicability crisis? Distinguishing causation from correlation? The Lykken-Meehl crud factor? Unlike much of behavioral science, GWAS is remarkably free from these problems.

Over a million people attend the Minnesota State Fair each year.

Page 63: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

COMPRESSED SENSING: ADDRESSING THE N ≪ P PROBLEM

• But it is one thing to say that there is scientific gold buried somewhere. It is quite another to dig it up!

• Can we identify enough variants to make meaningful scientific inferences without n greater than the number of protons in the Universe?

Over a million people attend the Minnesota State Fair each year.

Page 64: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

COMPRESSED SENSING: ADDRESSING THE N ≪ P PROBLEM

• In Statistics 101, many of us learned that the sample size (n) must exceed the number of RHS variables (p) for the partial regression coefficients to be identified.

• Recent work in the theory of compressed sensing (CS) has shown that coefficient recovery is possible in the n ≪ p case (Candes, Romberg, & Tao, 2006).

Terence Tao is the most distinguished SMPY participant and perhaps the most famous mathematician in the world.

Page 65: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

COMPRESSED SENSING: ADDRESSING THE N ≪ P PROBLEM

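Consider the noisy linear system $y = Ax + e$, where $A \in \mathbb{R}^{n \times p}$ is the design matrix and $x \in \mathbb{R}^{p}$ has $s$ nonzero elements. If $n > C s \log p$ for some constant $C$, then the solution of the LASSO problem

\[
\min_{\hat{x}} \left( \lVert y - A\hat{x} \rVert_{L_2}^{2} + \lambda \lVert \hat{x} \rVert_{L_1} \right)
\]

with a suitable choice of $\lambda$ obeys

\[
\lVert \hat{x} - x \rVert_{L_2}^{2} \lesssim \frac{\sigma_E^{2}}{n}\, s \,\operatorname{polylog} p,
\]

where $\sigma_E^{2}$ is the variance of the residuals in $e$.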
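A toy example shows recovery with fewer observations than variables. The sketch below solves the LASSO problem by coordinate descent on simulated data with n = 80, p = 200, and s = 5 nonzero coefficients. It is an illustration under stated assumptions, not the procedure of any cited paper: the design matrix has independent Gaussian entries (no LD), the objective is scaled as (1/2n)‖y − Ax‖² + λ‖x‖₁ (which matches the form above up to a rescaling of λ), and the values of λ, the noise level, and the effect sizes are arbitrary choices.

```python
import math
import random

def soft_threshold(v, t):
    """Shrink v toward zero by t, setting it to zero if |v| <= t."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def lasso(A, y, lam, n_sweeps=100):
    """Coordinate descent for (1/2n) * ||y - A x||^2 + lam * ||x||_1."""
    n, p = len(A), len(A[0])
    cols = [[row[j] for row in A] for j in range(p)]
    col_ms = [sum(a * a for a in c) / n for c in cols]  # mean square of each column
    x = [0.0] * p
    resid = list(y)  # residual y - A x, kept up to date as x changes
    for _ in range(n_sweeps):
        for j, c in enumerate(cols):
            # Correlation of column j with the residual, adding back its own contribution.
            rho = sum(a * r for a, r in zip(c, resid)) / n + col_ms[j] * x[j]
            new = soft_threshold(rho, lam) / col_ms[j]
            delta = new - x[j]
            if delta:
                resid = [r - a * delta for a, r in zip(c, resid)]
                x[j] = new
    return x

rng = random.Random(0)
n, p, s = 80, 200, 5  # far fewer observations than variables
A = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
support = sorted(rng.sample(range(p), s))
x_true = [0.0] * p
for j in support:
    x_true[j] = rng.choice([-1.0, 1.0])
y = [sum(a * b for a, b in zip(row, x_true)) + rng.gauss(0, 0.1) for row in A]

x_hat = lasso(A, y, lam=0.05)
top = sorted(sorted(range(p), key=lambda j: -abs(x_hat[j]))[:s])
print("true nonzero positions:     ", support)
print("largest estimated positions:", top)
```

Ordinary least squares has no unique solution here because n < p, yet the L1 penalty picks out the few nonzero coefficients. Real genotype data add correlation between neighboring variants, which this sketch does not model.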

Page 66: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

COMPRESSED SENSING: ADDRESSING THE N ≪ P PROBLEM

• Simply statable CS theorems assume that the RHS variables (e.g., genetic variants) are uncorrelated. But in reality a genetic variant is in LD with nearby genetic variants. So do CS ideas apply here?

• If you squint at the GWAS covariance matrix from a distance, it looks diagonal. So it might be reasonable to expect that LASSO will still perform well (up to GWAS precision).

[Figure with panels A to D; recoverable axis labels: correlation size, frequency, SNP index i, SNP index j.]

A color-coded correlation matrix of SNPs on chromosome 22.

Page 67: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

COMPRESSED SENSING: ADDRESSING THE N ≪ P PROBLEM

Vattikuti, Lee, Chang, Hsu, & Chow (2014)

Page 68: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

GIANT SNP

L1 SNP, proxy

L1 SNP, not proxy

MR SNP

Page 69: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

COMPRESSED SENSING: ADDRESSING THE N ≪ P PROBLEM

• Can we tell when a GWAS has crossed n > C s log p?

• Yes! Certain observable quantities (e.g., the typical p-value of called nonzeros) begin to decline sharply.

• Applying this method to real GWAS data indicates that for a trait with h2≈0.50, n > 30s triggers the phase transition to good performance.

The theoretical physicist Stephen Hsu entertains a visitor to Michigan State University.

Page 70: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

IMPORTANT SCIENTIFIC QUESTIONS: WHY TAKE THE ROAD TO 30S?

“Man may be excused for feeling some pride at having risen … to the very summit of the organic scale; and the fact of his having thus risen, instead of having been aboriginally placed there, may give him hope for a still higher destiny in the distant future.

Page 71: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

IMPORTANT SCIENTIFIC QUESTIONS: WHY TAKE THE ROAD TO 30S?

“[But] man with all his noble qualities, with sympathy which feels for the most debased, with benevolence, which extends not only to other men but to the humblest living creature, with his god-like intellect which has penetrated into the movements and constitution of the solar system …

Page 72: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

IMPORTANT SCIENTIFIC QUESTIONS: WHY TAKE THE ROAD TO 30S?

“… with all these exalted powers, Man still bears in his bodily frame the indelible stamp of his lowly origin.”—CHARLES DARWIN, THE DESCENT OF MAN

Page 73: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

IMPORTANT SCIENTIFIC QUESTIONS: WHY TAKE THE ROAD TO 30S?

• Darwin knew no genes; we do. Can we trace the genetic basis of the evolutionary change that Darwin described?

• Recent spectacular advances in the sequencing of ancient hominin DNA suggest that the answer may be yes.

Page 74: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE HUMAN FAMILY TREE

1.8 mya?

500 kya

380 kya

Prüfer et al. (2014)

Page 75: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE GENETICS OF ANCIENT HOMININS

• Usable DNA was recently recovered from a Denisovan-like hominin who died more than 300 kya (Meyer et al., 2014).

• I will now show you a comparison of sequences from Neanderthals and modern humans.

An artist’s reconstruction of a human-Neanderthal hybrid child.

Page 76: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

THE GENETICS OF ANCIENT HOMININS

TTCTTCCACTCACTCATCACCATAAA

This is the modern human sequence encompassing rs1487441, one of the “IQ hits” identified by Rietveld et al. (2014). A is the “plus” allele; G is the “minus” allele.

The ancestors of Neanderthals and Denisovans split from our lineage ~500 kya. Neanderthals probably did a lot of evolving since then … but it is still fun to ask: What allele did Neanderthals carry at this site?

TTCTTCCACTCACTC TCACCATAAAG

Page 77: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

IMPORTANT SCIENTIFIC QUESTIONS: WHY TAKE THE ROAD TO 30S?

Page 78: THE GENETIC ARCHITECTURES OF PSYCHOLOGICAL TRAITS

PLEASE CITE THESE PAPERS!

• Vattikuti S, Lee JJ, Chang CC, Hsu SDH, Chow CC (2014). Applying compressed sensing to genome-wide association studies. GigaScience, 3, 10.

• Lee JJ, Chow CC (2014). Conditions for the validity of SNP-based heritability estimation. Human Genetics, 133, 1011-1022.

• Rietveld CA, Esko T, Davies G, Pers TH, Benyamin B, Chabris CF, Emilsson V, Johnson AD, Lee JJ, de Leeuw C, et al. (2014). Common genetic variants associated with cognitive performance identified using the proxy-phenotype method. Proceedings of the National Academy of Sciences USA, 111, 13790-13794.

• Chabris CF, Lee JJ, Cesarini D, Benjamin DJ, Laibson DI (in press). The fourth law of behavior genetics. Current Directions in Psychological Science.