Adaptive Evolution of Gene Expression in Drosophila · Article Adaptive Evolution of Gene...
Transcript of Adaptive Evolution of Gene Expression in Drosophila · Article Adaptive Evolution of Gene...
Article
Adaptive Evolution of Gene
Expression in DrosophilaGraphical Abstract
Highlights
d Adaptive evolution of gene expression is pervasive in
Drosophila
d Stabilization and adaptation of gene expression follow
distinct molecular clocks
d Gene function determines the rate of expression adaptation
d Sex-specific adaptation of gene expression occurs
predominantly in males
Nourmohammad et al., 2017, Cell Reports 20, 1385–1395August 8, 2017 ª 2017 The Authors.http://dx.doi.org/10.1016/j.celrep.2017.07.033
Authors
Armita Nourmohammad,
Joachim Rambeau, Torsten Held,
Viera Kovacova, Johannes Berg,
Michael Lassig
[email protected] (A.N.),[email protected] (M.L.)
In Brief
Drosophila presents an evolutionary
conundrum: there is ubiquitous genomic
adaptation, yet it has been impossible to
identify system-wide signals of
adaptation for gene expression.
Nourmohammad et al. develop a method
to infer stabilizing and directional
selection from expression data. They
show that adaptation dominates the
evolution of gene expression in
Drosophila.
Cell Reports
Article
Adaptive Evolution of GeneExpression in DrosophilaArmita Nourmohammad,1,4,* JoachimRambeau,2 Torsten Held,2 Viera Kovacova,3 Johannes Berg,2 andMichael Lassig2,*1Joseph-Henri Laboratories of Physics and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA2Institut f€ur Theoretische Physik, Universitat zu Koln, Z€ulpicher Str. 77, 50937 Koln, Germany3CECAD, Universitat zu Koln, Joseph-Stelzmann-Str. 26, 50931 Koln, Germany4Lead Contact
*Correspondence: [email protected] (A.N.), [email protected] (M.L.)http://dx.doi.org/10.1016/j.celrep.2017.07.033
SUMMARY
Gene expression levels are important quantitativetraits that link genotypes to molecular functions andfitness. In Drosophila, population-genetic studieshave revealed substantial adaptive evolution at thegenomic level, but the evolutionary modes of geneexpression remain controversial. Here, we presentevidence that adaptation dominates the evolution ofgene expression levels in flies. We show that 64%of the observed expression divergence across sevenDrosophila species are adaptive changes driven bydirectional selection. Our results are derived fromtime-resolved data of gene expression divergenceacross a family of related species, using a probabi-listic inference method for gene-specific selection.Adaptive gene expression is stronger in specific func-tional classes, including regulation, sensory percep-tion, sexual behavior, and morphology. Moreover,we identify a large group of genes with sex-specificadaptation of expression, which predominantly oc-curs in males. Our analysis opens an avenue to mapsystem-wide selection on molecular quantitativetraits independently of their genetic basis.
INTRODUCTION
Several studies have found evidence for widespread adaptive
evolution of the Drosophila genome (Andolfatto, 2005; Mustonen
and Lassig, 2007; Sella et al., 2009). This includes adaptive
changes in the non-coding sequence, consistent with classical
ideas on the importance of regulatory evolution for phenotypic
adaptation (King and Wilson, 1975). Gene expression levels are
important molecular phenotypes that quantify the effects of regu-
lation on organismic traits and fitness. Insights on how genome
evolution affects gene expression have come from studies of
quantitative trait loci (QTLs); see Fraser (2011); Romero et al.
(2012), and Pai et al. (2015) for reviews. These studies compare
lineage- or species-specific difference in the expression QTLs,
in line with Orr’s sign test for selection on quantitative traits
(Orr, 1998). Due to the limited number of QTLs, the sign test is
only applicable to gene groups that have been pre-determined
based on criteria other than selection on expression levels. In
CellThis is an open access article under the CC BY-N
yeast, at least 10% of the genes have been inferred to undergo
adaptive evolution of expression (Fraser et al., 2010). By extend-
ing the sign test to include information on outgroup species, it has
been possible to identify lineage-specific positive selection on
cis-regulatory expression QTLs in functional gene classes of
mice (Fraser et al., 2011) and plants (Riedel et al., 2015).
A similar approach has been used to correlate population-spe-
cific environmental variables with expression SNPs; this has
shown that local adaptation of the human population is driven
by gene expression in a number of gene classes (Fraser, 2013).
In flies, expression-QTL analysis has been used to estimate cis
and trans effects on expression (Genissel et al., 2008; Wittkopp
et al., 2008) and to compare the evolution of expression and
that of the underlying regulatory sequence (Coolon et al., 2014);
related studies have been performed in yeast (Bullard et al.,
2010; Artieri and Fraser, 2014). These QTL studies have brought
specific insights into modes of gene expression evolution in spe-
cific functional classes. However, given the complexity of the reg-
ulatory genotype-to-phenotype map and the limited sensitivity of
QTL studies, our understanding of how genome-wide adaptive
changes relate to mRNA and protein levels has remained incom-
plete (Hoekstra and Coyne, 2007; Fraser, 2011; Pai et al., 2015).
An alternative approach is to analyze the evolution of gene
expression by methods of quantitative genetics, without
explicit reference to genetic evolution of the QTL (Rifkin et al.,
2003; Khaitovich et al., 2004, 2005; Lemos et al., 2005; Rifkin
et al., 2005; Gilad et al., 2006; Whitehead and Crawford,
2006; Zhang et al., 2007; Bedford and Hartl, 2009; Fraser
et al., 2011; Romero et al., 2012; Pai et al., 2015). These studies
compare the expression divergence across species, the varia-
tion within species, and the expected behavior for neutral evo-
lution (Lynch and Hill, 1986). A broad picture of evolutionary
constraint on gene expression levels caused by stabilizing
selection has emerged in a number of species, including
Drosophila (Rifkin et al., 2003; Lemos et al., 2005; Rifkin et al.,
2005; Gilad et al., 2006; Bedford and Hartl, 2009; Romero
et al., 2012). Mutation accumulation experiments in Drosophila
show that the neutral expression divergence generated by
random mutations in the lab significantly exceeds the natural
expression variation, indicating strong negative selection on
most random mutations affecting gene expression (Rifkin
et al., 2005). A comparative study between human and chim-
panzee has produced signatures of predominantly neutral evo-
lution of gene expression (Khaitovich et al., 2004, 2005). Other
studies in primates have identified stabilizing selection, as well
Reports 20, 1385–1395, August 8, 2017 ª 2017 The Authors. 1385C-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Figure 1. Phylogenetic Tree and Evolu-
tionary Distances of 7 Drosophila Species
Phylogeny of the Drosophila genus, as re-
constructed in Drosophila 12GenomesConsortium
et al. (2007) from synonymous sequence diver-
gence. Six clades are marked by colored triangles;
their ancestral nodes aremarkedby colored circles.
The table specifies the species contained in each of
the clades and the clade divergence time tC (see
Experimental Procedures).
as lineage- and tissue-specific directional expression changes
(Gilad et al., 2006; Blekhman et al., 2008; Brawand et al., 2011;
Romero et al., 2012). However, it has remained difficult to
demonstrate that positive selection, as opposed to relaxed
stabilizing selection, is the evolutionary cause of expression
divergence (Fraser, 2011). Thus, estimating the genome-wide
contribution of adaptation to the evolution of gene expression
is an outstanding problem.
In this paper, we show that adaptation is the prevalent evolu-
tionary mode of gene expression in the Drosophila genus. We
infer directional selection driving adaptation, together with con-
servation under stabilizing selection, and we show that these
forces act on different scales of evolutionary time. Our inference
is based on theoretical results on the evolution of molecular
quantitative traits (Held et al., 2014; Nourmohammad et al.,
2013a, 2013b), using solely the dependence of gene expression
divergence on the divergence time of 7 Drosophila species.
Moreover, the method only relies on the phenotypic observables
and does not depend on number and effects of the underlying
QTL; these molecular determinants of gene expression are often
unknown and vary considerably among genes.
RESULTS
Pattern of Gene Expression DivergenceWe use gene expression data from samples of males and females
(Zhang et al., 2007), which cover 6,332 orthologous genes in seven
Drosophila species. A phylogenetic tree of these species is shown
in Figure 1. The dataset of Zhang et al. (2007) is obtained from spe-
cies-specific microarrays, which makes it suited to cross-species
analysis. Gene expression levels are defined by a standard trans-
formation ofmRNAcounts,whichaccounts for differences in assay
1386 Cell Reports 20, 1385–1395, August 8, 2017
sensitivity among experimental probes
(Quackenbush, 2002). The transformation
method and its implications for evolu-
tionary analysis are detailed in Experi-
mental Procedures and Supplemental
Experimental Procedures (Figure S1). We
use these data to estimate the mean
expression level of a gene within each spe-
cies, its total heritable expression variance
D (referred to as expression diversity), and
its non-heritable expression variance be-
tween biological replicates. For each pair
of species, we obtain the cross-species
expression divergence D of a gene as the
squared difference between the species mean levels. Cross-spe-
cies differences in expression for a single gene are noisyand reflect
thephysiology of that gene,but averagesover all or large classesof
genes showaclear evolutionary pattern that can be comparedwith
model expectations. In particular, the time-dependent expression
divergence hDiji, where i, j labels a given pair of species and
angular brackets denote averages over genes, plays a central
role in our analysis, as explained in Box 1. We define the rescaled
divergence as
Uij =
Di j
D0
; (Equation 1)
where the trait scale D0 is defined such that Uijz1 for neutral
evolution in the limit of long divergence times (details of this defi-
nition are given in Experimental Procedures and Box 1). The evo-
lution of these divergencemeasures depends only weakly on the
effect distribution of expression QTL and on the amount of
recombination between these loci, which is key to quantitative
genetics approaches (Lynch and Hill, 1986; Leinonen et al.,
2013; Nourmohammad et al., 2013a, 2013b; Held et al., 2014).
To obtain a genome-wide evolutionary picture of gene
expression in Drosophila, we evaluate the aggregate time-
dependent divergence for all genes and species in our dataset
(Supplemental Experimental Procedures). Grouping the spe-
cies into 6 clades, we obtain a consistent pattern of divergence
UðtÞ as a function of divergence time t (Figure 2). We can attri-
bute this pattern to biological divergence of expression levels,
because the species-specific design of microarrays sup-
presses technical errors that depend on evolutionary distance
(Zhang et al., 2007). To test this prerequisite for evolutionary
analysis, we compare the mean expression levels for specific
Box 1. Trait Evolution in a Fitness Seascape
FITNESS MODEL
The schematic shows the evolution of a quantitative trait in a
single-peak fitness seascape (green curves). The distribution
of trait values within a species (gray curves) changes over a
macro-evolutionary period t, which can be observed as
cross-species divergence of the mean trait values DðtÞ (grayarrow). The fitness seascape constrains trait values around a
fitness peak by stabilizing selection, and evolutionary displace-
ments of this peak generate directional selection (green arrow).
The minimal fitness model has two parameters: the stabilizing
strength c is proportional to the inverse square width of the
fitness peak, and the driving rate ymeasures the mean square
displacement of the fitness peak per unit of evolutionary time
(see Experimental Procedures and Supplemental Information).
Lower plane: in a typical realization, the population mean trait
(black line) follows the moving fitness optimum (green line)
with delay and additional fluctuations.
TIME-DEPENDENT DIVERGENCE
The rescaled mean square displacement UðtÞ is plotted
against the rescaled divergence time t. Neutral evolution (gray):
UðtÞ reaches a saturation value of U0 = 1 with a relaxation time
of t0 1 (in units of the inverse mutation rate). Conservation
(blue): in a single-peak fitness landscape, UðtÞ has a smaller
saturation value, Ustab 1=c, which is reached faster than at
neutrality, tstab Ustab < 1. Adaptation (green): in a fluctuating
fitness seascape, there is a linear surplus UadðtÞ, which mea-
sures the amount of trait adaptation. We use the nonlinear rela-
tion between the trait divergence UðtÞ and the divergence time
t to infer the fitness parameters ðc; yÞ.
LINEAGE- AND GENE-SPECIFIC INFERENCE
Based on a joint probabilistic description of trait evolution
andfitness fluctuations,we can infer the likelihoodof the fitness
parameters, stabilizing strength c and driving rate y, for individ-
ual genes. The inference involves summingover all evolutionary
histories of mean and optimal trait values across the phylogeny
(black and green lines) that lead to the observed values
E1;E2;., at the terminal nodes (shown here for three species).
Over macro-evolutionary distances, this sum is dominated by
the most parsimonious lineage-specific evolutionary history
and can be evaluated analytically (see Experimental Proced-
ures and Supplemental Information). The evolutionary histories
on different branches mutually constrain each other because
they are connected at the branch points (yellow diamonds).
Cell Reports 20, 1385–1395, August 8, 2017 1387
Figure 2. Adaptive Evolution of Gene
Expression
The time-dependent divergence (rescaled) UðtÞfrom all genes is plotted against the divergence
time t for six partial species clades (small squares)
and for the entire Drosophila genus (large square).
Species clades and divergence times (scaled by
the rate of synonymous mutation) are defined by
the phylogeny of the Drosophila genus (Figure 1).
Trait divergence values are scaled by the asymp-
totic long-term limit under neutral evolution (see
text and Experimental Procedures). These data are
shown with theoretical curves UðtÞ under direc-
tional selection (green line), under stabilizing se-
lection (blue line), and for neutral evolution (gray
line). Inferred model parameters are stabilizing
strength c = 18:4 and driving rate y = 0:08 (Box 1)
(see Experimental Procedures and Supplemental
Experimental Procedures). We infer a time-
dependent adaptive component of the expression
divergence UadðtÞ (green shaded area); the com-
plementary component UeqðtÞ (blue shaded area)
is generated by genetic drift under stabilizing
selection. Adaptation accounts for a fraction uad =Uad=U= 64% of the expression divergence across the Drosophila genus ðtDros: = 1:4Þ. See Figure S3 for a
comparison of the data to models of time-independent stabilizing selection (Bedford and Hartl, 2009); see also Figures S1, S2, and S4–S7.
gene classes across species. We find no distance-dependent
differences, which provides strong evidence that our data are
free of technical divergence caused by a species bias in probe
sensitivity (Figure S1).
The rescaled expression divergence data in Figure 2
showmacro-evolution of expression levels. The average expres-
sion divergence has two distinct molecular clocks: a rapid
increase on timescales t below the D. melanogaster (D. mel)-
D. simulans (D. sim) divergence time is followed by a slower
increase on larger timescales. This pattern is clearly incompat-
ible with neutral evolution, where the rescaled divergence
would follow a uniform linear pattern on short timescales and
saturate to 1 on timescales given by the inverse point mutation
rate (gray line in Figure 2, to be compared with the aggregate
divergence plot in Box 1). The actual pattern shows stronger
evolutionary constraint, which is clearly visible already within
the D. mel-D. sim-D. yakuba (D. yak) clade: the species
pair D. mel-D. yak has about twice the divergence time but
only 1.2 times the expression divergence compared to the
pair D. mel-D. sim. Hence, the characteristic constraint time is
of the order tmelsim, about a factor of 10 shorter than the neutral
saturation time. This pattern indicates evolution under substan-
tial stabilizing selection, in qualitative agreement with previous
studies (Supplemental Experimental Procedures) (Rifkin et al.,
2003; Lemos et al., 2005; Bedford and Hartl, 2009) and with a
standard QST/FST analysis (Leinonen et al., 2013). However, the
expression divergence increases with the divergence time
throughout the Drosophila genus (green shaded area) and
does not show evidence of saturation for larger values of diver-
gence time t: This observation is in accordance with a similar
pattern of the expression divergence observed previously
(Zhang et al., 2007) and is backed up by our probabilistic analysis
reported later. In the following, we show that the increase of
expression divergence beyond tmelsim reflects adaptive evolu-
tion of gene expression over macro-evolutionary timescales,
1388 Cell Reports 20, 1385–1395, August 8, 2017
and we provide a parsimonious explanation for the separation
of molecular clocks.
Fitness Model for Gene ExpressionThe inference of adaptation is based on a minimal dynamical
model of selection: gene expression levels E evolve in a sin-
gle-peak fitness seascape fðE; tÞ= f const:3ðE EðtÞÞ2(Nourmohammad et al., 2013a; Held et al., 2014). This model
is illustrated in Box 1 and formally defined in Experimental Pro-
cedures. The fitness peak EðtÞ for a given gene performs a
random walk over macro-evolutionary periods, which maps
continual changes of the optimal expression of that gene.
Despite its simplicity, the seascape model combines two
salient features of selection on gene expression: stabilizing se-
lection generates evolutionary constraint, and directional se-
lection drives long-term adaptive changes. These selection
components are measured by two parameters: the stabilizing
strength c, which is proportional to the inverse square width
of the fitness peak, and the driving rate y, which measures the
mean square displacement of the peak position EðtÞ per unitof evolutionary time. The fitness peak value f is arbitrary,
because only fitness differences between individuals matter
for the evolution of a species.
The fitness seascapemodel captures distinct selective causes
of adaptive evolution. Long-term environmental shifts can lead
to changes in the optimal expression levels that accumulate
over macro-evolutionary periods. The co-evolution of genes in
networks acts in a similar way: changes in the expression of
one gene generate time-dependent selection on the expression
of functionally correlated genes. This time dependence broadly
describes cross-gene epistasis in regulatory and metabolic
pathways. The seascape model is not concerned with individual
environmental shifts or epistatic changes; it describes the evolu-
tionary system biology of cumulative effects over macro-evolu-
tionary periods and over groups of genes. These effects can
be inferred from our kind of dataset and give rise to a random
walk model for fitness peak positions EðtÞ. Later, we extend
this model to include larger, punctuated shifts of fitness peaks.
Inference of Adaptive EvolutionTime-dependent selection generates complex evolutionary dy-
namics of expression levels for a given gene. The top diagram
of Box 1 shows a typical pattern: the population mean level fol-
lows the fitness peak displacements with some delay; additional
deviations of mean and optimal trait value are generated by ge-
netic drift. The fitness seascape model provides a simple con-
ceptual and computational basis to infer these dynamics. The
simplest inference scheme is based on aggregate time-depen-
dent expression divergence data; a probabilistic extension to in-
dividual genes is discussed later. Box 1 shows the analytical
form of the rescaled trait divergence UðtÞ in a fitness seascape
with positive stabilizing strength and driving rate (c> 0; y> 0;
green solid line); the corresponding form UeqðtÞ in a fitness land-
scape of the same stabilizing strength and zero driving rate
(c> 0; y= 0; blue solid line) reaches the saturation value Ustab
(Nourmohammad et al., 2013a; Held et al., 2014). The saturation
time is inversely proportional to the stabilizing strength
tstab Ustab 1=c and is shorter than expected from neutral
evolution t 1 (gray solid line). The resulting decomposition
UðtÞ=UeqðtÞ+UadðtÞ (Equation 2)
determines the adaptive fraction uadðtÞ=UadðtÞ=UðtÞ=hDadðtÞi=hDðtÞi of the trait divergence, which is driven by direc-
tional selection; the complementary fraction 1 uadðtÞ is gener-ated by genetic drift under stabilizing selection. In the linear
regime UðtÞzUstab +UadðtÞ, which covers all species clades in
this dataset, the fitted amplitudes provide simple estimates of
the selection parameters (Held et al., 2014):
cz2
Ustab
; yz2UadðtÞ
t: (Equation 3)
Here we determine these parameters, together with the trait
scale D0, by a maximally conservative inference procedure,
which produces the smallest value of stabilizing strength c
compatible with the data (Experimental Procedures).
Howmuch adaptation is in the evolutionary process? This can
bemeasured by the fitness fluxF, which is the cumulative fitness
gain through adaptive changes over an evolutionary period
(Mustonen and Lassig, 2007, 2010; Held et al., 2014). For a
gene with evolving mean expression level GðtÞ, the fitness flux
at a given time t is defined as the rate dGðtÞ=dt of expressionchange multiplied by the fitness gradient vFðG; tÞ=vG, where
FðG; tÞ is the mean fitness of the population at time t (Supple-
mental Experimental Procedures). Positive fitness flux under
time-dependent selection does not imply a net gain in fitness,
because fitness gains through adaptation can be offset by losses
through displacements of the fitness peak. We can compare
these dynamics to walking on an escalator that moves in the
opposite direction. The quantity F corresponds to the walker’s
total number of uphill steps on the escalator but is unrelated to
an absolute height gain. In the seascape model, the fitness flux
is proportional to the driving rate y and to the stabilizing strength
c, because a population under stronger selection follows the
fitness peak more closely and accumulates more adaptation.
Hence, the cumulative fitness flux F over an evolutionary period
t has the expectation value h2NFðtÞi= 2 cy t (scaled by the
effective population size N) (Held et al., 2014). Equation 3 estab-
lishes a simple relation between fitness flux and the adaptive part
of the expression divergence:
h2NFðtÞiz 2uadðtÞ1 uadðtÞ: (Equation 4)
Thus, the time-dependent pattern of expression divergence
discriminates directional selection in a genuine fitness seascape
from purely stabilizing selection in a static fitness landscape. The
joint inference of these selection components provides a more
powerful signal of adaptation than QST/FST analysis (Leinonen
et al., 2013), which would infer only stabilizing selection from
these data. In Supplemental Experimental Procedures, we
discuss in detail the relationship between the test for selection
based on trait divergence and other trait-based selection tests:
the QST/FST test (Leinonen et al., 2013), Ornstein-Uhlenbeck
models (Hansen, 1997; Bedford and Hartl, 2009), and the
McDonald-Kreitman test (McDonald and Kreitman, 1991).
Fitness Seascape of Drosophila Gene ExpressionWe first use the aggregate time-dependent divergence data to
infer a gene-averaged fitness seascape of expression levels in
Drosophila (Figure 2). The least-square fitted seascape model
(green line) contains stabilizing and directional selection, as illus-
trated in the center plot of Box 1. This model explains the
observed pattern UðtÞ: the evolutionary constraint between
neighboring species (here D. mel, D. sim, and D. yak) is caused
by stabilizing selection, and the approximately linear long-term
increase signals adaptation.
Our inference also shows that stabilizing selection alone
cannot explain the Drosophila expression data. In a static fitness
landscape with substantial stabilizing strength, genetic drift gen-
erates a rapidly saturating pattern of DðtÞ that is not observed in
the data (Figure S3). A previous study of this dataset suggests a
pattern with slow saturation on timescales on the order of the
Drosophila genus divergence time (Bedford and Hartl, 2009);
such a pattern would imply weak stabilizing selection ðc< 1Þand near-neutral evolution of gene expression levels (see Sup-
plemental Experimental Procedures for a detailed discussion).
Compared to the seascape model, the weak-selection land-
scape model provides a suboptimal fit to the observed DðtÞdata (Figure S3). This ranking of models is confirmed and quan-
tified by the probabilistic analysis.
Probabilistic Inference of Adaptively Regulated GenesNext, we extend the inference of selection to the noisy patterns
of individual genes. Using probabilistic extension of the selection
test, we obtain gene-specific posterior likelihood distributions of
stabilizing strength and fitness fluxQðc;FÞ (Box 1) (Experimental
Procedures). Maximizing this function allows us to infer, for each
gene, the most likely values of stabilizing strength c and fitness
flux F (or, equivalently, of c and y) given its observed expression
Cell Reports 20, 1385–1395, August 8, 2017 1389
A B Figure 3. Probabilistic Inference of Adaptive
Gene Expression
(A) Distribution of maximum-likelihood values of the
scaled cumulative fitness flux, 2NF, inferred for indi-
vidual genes (see Experimental Procedures and Sup-
plemental Experimental Procedures). Our inference
classifies 54% of all genes as adaptively regulated
(2NF> 4, green shaded part of the distribution).
(B) Bayesian inference of fitness models. The posterior
log-likelihood score Sðc;FÞ favors the optimal
seascape model (c = 18:4, 2NF = 3:8; green square)
over the best landscape model (ceq = 16, F= 0; blue
square) and the neutral model (c= 0, F= 0; gray
square).
See also Figures S1, S2, S6, and S7.
values in 7 Drosophila species on the phylogeny of Figure 1 (see
also Box 1).
This analysis shows again the dominant role of adaptation in
gene expression: for 54% of all genes, we infer a significant
maximum-likelihood fitness flux F across the Drosophila genus;
we classify these genes as adaptively regulated (Table S1). Fig-
ure 3A shows the distribution of maximum-likelihood values of
the fitness flux for individual genes, which determines the in-
ferred clade-specific fraction of adaptive expression divergence
(Table 1). This fraction uadðtÞ increases with clade divergence
time, in accordance with the aggregate data of Figure 2. Be-
tween D. mel and D. sim, which diverged about 2–3 mya, 92%
of the expression divergence can be attributed to genetic drift
under stabilizing selection. Across the entire Drosophila genus,
which had its last common ancestor about 40 mya ago, we infer
64% of the expression divergence to be adaptive.
The Bayesian scheme also allows us to quantify the overall sta-
tistical significance of our selection inference. In Figure 3B, we
plot the cumulative log-likelihood score for all genes as a function
of stabilizing strength c and cumulative fitness flux F. As shown
by a log-likelihood test, the global maximum-likelihood seascape
model is strongly favored over themaximum-likelihood landscape
model ðP< 103600Þ and over neutral evolution ðP< 105400Þ (seeEquation 35 in Supplemental Experimental Procedures). This
analysis rejects neutral evolution and evolution under static stabi-
lizing selection in a robust way: it does not require model assump-
tions on the adaptive dynamics, and the ranking of models is
stable under alternate evaluation of species divergence times
(Figure S2). We conclude that the long-term increase of expres-
sion divergence beyond the D. mel-D. sim divergence time, as
observed in Figure 2, is a statistically significant signal of adaptive
evolution.
Drift and Adaptation Follow Distinct Molecular ClocksThe fitness seascape model interprets the two molecular clocks
observed in gene expression divergence in terms of distinct
evolutionary forces: the rapid short-term increase is caused by
genetic drift, whereas the slower long-term increase is caused
by adaptation. Because genetic drift and adaptation differ in
tempo, the relative contribution of adaptation to expression
divergence depends on evolutionary time: the adaptive part is
small for the youngest species clades, but adaptation becomes
dominant across the entireDrosophila genus (green shaded area
in Figure 2). This nonlinearity is a specific evolutionary feature of
1390 Cell Reports 20, 1385–1395, August 8, 2017
quantitative traits with a complex molecular basis, which can
have individual loci under weak selection (Sunyaev and Roth,
2013). Substitutions at these loci generate predominantly diffu-
sive divergence of expression on short timescales ðt < tstabÞ;this pattern is modified by stabilizing and directional selection
on longer timescales.
The existence of two molecular clocks has an immediate
consequence: the power of inference methods for trait adap-
tation increases with evolutionary time span. Quantitative ge-
netics studies over short divergence times cannot distinguish
neutral evolution from evolution under selection, because the
divergence pattern is dominated by the diffusive molecular
clock in both cases (Box 1). For example, the observation of
drift-dominated behavior in primates (Khaitovich et al., 2004)
is consistent with neutral evolution but cannot exclude stabiliz-
ing or directional selection. Thus, it is compatible with the
signal of adaptive evolution based on expression QTLs in
humans (Fraser, 2013). In contrast, the data from the
7 Drosophila species spans both short and long divergence
times and displays two molecular clocks. This divergence
pattern is no longer consistent with neutral evolution and is
indicative of macro-evolutionary adaptation of gene expres-
sion in Drosophila.
Testing Alternative Evolutionary ScenariosThe minimal seascape model explains the pattern of gene
expression divergence across the Drosophila genus in a parsi-
monious way. But are there equally parsimonious alternative
modes of selection or demography that are consistent with the
data? To assess the specificity and robustness of the
seascape-based inference, we characterize the statistics of
gene expression levels in a number of alternative modes of evo-
lution by analytical approximations and simulations, and we
compare the results to the Drosophila data.
First, demographic effects may increase or decrease the
effective population size in a specific lineage, which affects the
stabilizing strength c for all genes. As shown in Figure S4, line-
age-specific changes in effective population size that persist
over sufficiently long evolutionary periods can be traced in the
aggregate time-dependent divergence UðtÞ. Such effects are
not observed in our data, which suggests that long-term demo-
graphic effects do not play a dominant role in the evolution of
Drosophila gene expression levels (Figure S4). This result does
not exclude short-term changes of population size, which
Table 1. Selection Parameters and Amount of Adaptation
Gene classes (gene
number) c 2NFD. mel-D. sim D. mel-D. yak D. vir-D. moj D. mel-D. ana D. mel-D. pse Dros. (D. mel-D. moj)
uad (%) a (%)
All genes (6,332) 18.4 3.8 7 23 48 59 61 64 54
Broad codon
usage (1,176)
15.6 3.9 9 25 49 61 62 66 57
Narrow codon
usage (501)
18.0 2.4 5 15 36 48 49 53 18
High expression
(553)
14.3 1.7 1 8 27 39 40 44 0
Dros., Drosophila genus; c, maximum-likelihood stabilizing strength; 2NF, maximum-likelihood fitness flux; uad, clade-dependent adaptive fraction
of the gene expression divergence; a, fraction of adaptively regulated genes across the Drosophila genus, given by the condition 2NFa > 4.
occurred, for example, in the evolution of the D. mel lineage
(Lachaise et al., 1988). Such changes can be traced in sequence
polymorphism spectra (Glinka et al., 2003; Haddrill et al., 2005;
Stephan and Li, 2007; Thornton et al., 2007), but they have
only minor effects on gene expression levels (Figure S4).
Next, we ask whether theDrosophila data can be explained by
lineage- and gene-specific relaxation of stabilizing selection. We
consider a specific non-adaptive mode of expression changes:
functional genes evolve under stabilizing selection in a static
fitness landscape, but individual genes can (partially) lose func-
tion at a given point in their evolutionary history, which relaxes
selection on their expression. We model loss of function as sto-
chastic events occurring at a small rate, independently for each
gene and on each lineage. This model produces a divergence
function UðtÞ with a long-term nonlinearity that is not seen in
the U data (Figure S5). The most direct way to discriminate be-
tween relaxation of selection and adaptive evolution is to use a
directional bias: most functional genes are upregulated by stabi-
lizing selection (a similar bias has been exploited in expression
QTL studies) (Fraser et al., 2010; Fraser, 2011, 2013). Hence, in
the loss-of-function mode, a comparison of expression levels
for a given gene would show small cross-species differences at
higher expression levels (i.e., between the lineages with a func-
tional gene), together with large deviations at lower levels (i.e., in
the lineages with lost gene function). Accordingly, the distribution
of expression divergence values for a given species pair would
show a broad tail generated by the loss events (Figure S5). These
features are not observed in our data, indicating that relaxed sta-
bilizing selection alone cannot explain the evolution ofDrosophila
expression levels (Figure S5). Loss of gene function does happen
in our phylogeny, but affected genes will often lose expression
altogether and hence will be suppressed in our dataset.
We also compare the Drosophila data with alternative models
of adaptive evolution. For example, individual genes can undergo
a (partial) neo-functionalization that requires a major change
in their expression. We describe this mode of evolution by a
punctuated fitness seascape, inwhich large shifts of the peakpo-
sition are stochastic events occurring at a small rate (Held et al.,
2014). This process produces an aggregate divergence function
UðtÞ that is compatible with the data, but a broad tail in the distri-
bution of expression divergence values is not observed
(Figure S5). We conclude that gradual but continual changes in
optimal levels, as described by our minimal model, are the
dominant evolutionary force driving the adaptation of gene
expression in Drosophila.
Functional Determinants of SelectionBy applying our inference to specific classes of genes, we can
get a more detailed view on adaptation of gene expression in
Drosophila. First, we observe a strong correlation between
codon usage and adaptation: genes with specific codons
show strongly reduced adaptive expression divergence and
lower average fitness flux than genes with broad codon usage
(Figure 4; Table 1). Specific codon usage is known to be preva-
lent in highly expressed genes (Ikemura, 1985); consistently, we
find stronger conservation of expression and lower levels of
fitness flux in this class (Table 1). Different codons for the
same amino acid differ in their efficiency of translation (Ikemura,
1985; Shields et al., 1988), which implies that genes with broad
codon usage have a higher potential for adaptive changes at
the post-transcriptional level. Here we find stronger adaptation
at the mRNA level in this gene class, which suggests a two-tier
mode of evolution: adaptive mRNA changes lay the ground on
which coherent adaptive tuning of protein levels can build.
At the same time, we find no significant correlation between
fitness flux for expression changes and adaptation of the amino
acid sequence, as measured by a McDonald-Kreitman test (Fig-
ure S6) (McDonald and Kreitman, 1991). This decoupling makes
sense, because expression changes of a gene are caused by cis-
and trans-regulatory sequence changes but do not require
evolution of its coding sequence. At the broad level of analysis
afforded by our dataset, we conclude that for a given gene,
expression level and coding sequence evolve independently to
a large degree. For a metabolite or a transcription factor, adap-
tive changes of its cellular concentration are often coupled with
conservation of its function.
Our gene-specific inference can be used to detect functional
gene classes associated with adaptive evolution of regulation.
A full ranking of gene classes by enrichment in adaptively regu-
lated genes with associated p values is reported in Table S1.
Gene functions associated with enhanced adaptation of expres-
sion include sensory perception, regulation, neural maturation,
regulation of growth, aging, and morphology. Adaptively regu-
lated functions also include response to UV radiation, which
has been identified as an important climate-mediated trait in hu-
mans (Hancock et al., 2011; Fraser, 2013). Adaptive evolution of
Cell Reports 20, 1385–1395, August 8, 2017 1391
A B Figure 4. Adaptation of Gene Expression
Depends on Codon Usage Bias
(A) The aggregate time-dependent divergence
UðtÞ for genes with broad codon usage ðOÞ andfor genes with specific codon usage ðPÞ is
shown, together with theoretical curves under
directional selection (dashed and dashed-dotted
lines); the theoretical curve inferred for all genes is
shown for comparison (solid line; cf. Figure 2).
Codon usage ismeasured by the effective number
of codons (Supplemental Experimental Proced-
ures) (Wright, 1990); inferred model parameters
are listed in Table 1.
(B) The distribution of the cumulative fitness flux
2NF is plotted against the effective number of co-
dons n (circle, average; line, median; box, 50%
around median; bars, 70% around median) (Sup-
plemental Experimental Procedures) (Wright, 1990).
See also Figures S1, S2, S4, S5, and S7.
genes related to growth, regulation, and morphology has been
previously inferred by expression QTL and comparative studies
of gene regulation in other species (Fraser et al., 2011; Fraser,
2011; Romero et al., 2012). Here we identify these categories
from a quantitative, system-wide scan for adaptively regulated
genes. This points to the power of our phenotype-based infer-
ence scheme, which is not confounded by the combinatorial
complexity of cis-regulatory sequence in higher eukaryotes.
Sex-Specific Evolution of ExpressionWe test the role of expression differentiation between male and
female samples for adaptive evolution across the Drosophila
genus. The sex specificity of a given gene (Zhang et al., 2007),
defined as the difference between its male and female expres-
sion level Emf =Em Ef , is a distinct trait whose evolutionary
pattern can be analyzed by our method. We can distinguish
two modes of evolution: conservation of sex specificity main-
tained by stabilizing selection and sex-specific adaptation of
expression (Figure 5A). Most genes of our dataset have well-
conserved and often small sex specificity; these genes evolve
their expression levels coherently between males and females
(Zhang et al., 2007). The remaining 19% of the genes have a sig-
nificant cumulative fitness flux Fmf of their specificity trait; we
classify them as undergoing sex-specific adaptation of expres-
sion in theDrosophila genus. These genes cover all four chromo-
somes of the Drosophila genome.
Gene functions associated with sex-specific adaptation of
expression include regulation of translation, reproduction,
post-mating behavior, and (immune) response to biotic stimuli
(Table S2). To understand the distribution of these adaptive pro-
cesses between sexes, we apply our inference to classes of
genes with different species-averaged sex bias of expression
(Assis et al., 2012). For male-biased genes, the aggregate diver-
gence Umf signals substantial sex-specific adaptation (Fig-
ure 5B). Consistently, fitness flux Fmf is strongly enhanced in
genes that are predominantly expressed in males (Figure 5C).
Fitness flux is lower in other classes, including genes expressed
predominantly in females.
Altogether, we find a remarkable evolutionary asymmetry be-
tween sexes: male bias in expression is associated with adaptive
1392 Cell Reports 20, 1385–1395, August 8, 2017
evolution of expression (orange shaded areas in Figures 5B and
5C), whereas female bias in expression is under weaker direc-
tional selection and primarily reflects conserved physiological
differences between male and female organisms. This result
complements a previously observed evolutionary asymmetry at
the sequence level: genes with male-biased expression show
increased amino acid divergence (Zhang et al., 2007). As sug-
gested by a McDonald-Kreitman test, this increase can be asso-
ciated with adaptive evolution of gene function (Figure S6).
DISCUSSION
Wehave shown that adaptive regulation accounts for most of the
macro-evolutionary divergence in gene expression across the
Drosophila genus. Genes differ considerably in the amount of
adaptation, depending on their codon usage, sexual differentia-
tion, and functional class. These results provide evidence for
system-wide adaptation of gene regulation in Drosophila at the
primary level of transcription, notwithstanding further evolu-
tionary complexities at the level of translation (Romero et al.,
2012; Artieri and Fraser, 2014). It remains to be seen whether a
similar prevalence of adaptation in the evolution of expression
will be found in different species.
Our inference of adaptation exploits the complex dependence
of the expression divergence on the evolutionary distance be-
tween species. It reflects two fundamental evolutionary features
of quantitative traits. First, such traits generate a divergence
pattern with two distinct molecular clocks: at a short evolutionary
distance, the divergence is always near the expected value
under neutrality; at a longer distance, it depends jointly on stabi-
lizing and directional selection (Figure 2; Box 1). This feature
reconciles seemingly contradictory results of previous studies:
analysis of closely related species produces a signal of neutral
evolution (Khaitovich et al., 2004, 2005), whereas evolutionary
constraint becomes apparent for more distant species (Rifkin
et al., 2003; Lemos et al., 2005; Rifkin et al., 2005; Gilad et al.,
2006; Bedford and Hartl, 2009; Romero et al., 2012). Second,
the phenotypic evolution of gene expression decouples from de-
tails of its genetic basis. This explains why we find overall strong
selection on gene expression levels even though selection on
A
B C
Figure 5. Sex-Specific Evolution of Gene
Expression
(A) Schematic showing conservation of sex speci-
ficity (left panel) versus sex-specific adaptation of
expression (right panel). The sex-specificity trait
(brown line) is defined as the difference between
male and female expression levels (purple and blue
lines). The schematics show all three lines as func-
tions of evolutionary time.
(B) The time-dependent divergence of the sex-
specificity trait Umf for all genes ð,Þ, genes with
male-biased expression ðOÞ, and genes with fe-
male-biased expression ðPÞ is shown, together
with theoretical curves under directional selection.
(C) The distribution of the cumulative fitness flux for
the sex-specificity trait 2NFmf is plotted against
the species-averaged sex specificity Emf (circle,
average; line, median; box, 50% around median;
bars, 70% around median) (Supplemental Experi-
mental Procedures). Sex-specific adaptation
(2NFmf > 4:5, orange shaded part) occurs predom-
inantly in male-biased genes.
See also Figures S1 and S4–S7.
individual QTL is often weak (Sunyaev and Roth, 2013). The
probabilistic extension of our inference scheme, which is based
on gene-specific expression divergence, identifies functional
gene classes associated with adaptive evolution of regulation.
The selection model underlying our analysis is a single-peak
fitness seascape, which contains components of stabilizing
and directional selection on a quantitative trait (Box 1). These
components are well-established notions of quantitative ge-
netics on micro-evolutionary timescales. Each of them can
provide a snapshot of the predominant selection pressure in a
population. However, the description of selection remains
incomplete as a description of selection over macro-evolu-
tionary periods. If selection on a trait is directional at a given
evolutionary time, will that selection relax after the trait value
has significantly adapted in the direction of selection? If selec-
tion is stabilizing, can we assume the optimal trait value will
remain invariant in the context of a different species? To
address these questions, we need a conceptual and quantita-
tive synthesis of stabilizing and directional selection. The sin-
gle-peak seascape model arguably provides the simplest
such synthesis. It also provides a simple picture of continual
adaptation over macro-evolutionary periods: a species follows
a moving fitness peak, and this process generates positive
fitness flux but no net increase in fitness.
Our method of selection inference can be applied to a spec-
trum of molecular quantitative traits with a complex genetic
basis, provided that comparative data from multiple, sufficiently
diverged species are available. Such traits include genome-
wide protein levels, protein-DNA binding interactions, and enzy-
matic activities. For most of these traits, we have only partial
knowledge of the underlying genetic loci and their effects on trait
and fitness. Our method complements QTL studies and opens a
way to infer quantitative phenotype-fitness maps at the systems
level.
EXPERIMENTAL PROCEDURES
Sequence Data and Evolutionary Tree
A synonymous genome sequence is used to estimate the species divergence
times ti j (scaled in units of the inverse point mutation rate m1 (Drosophila
12 Genomes Consortium et al., 2007). The resulting phylogeny (Drosophila
12 Genomes Consortium et al., 2007) is shown in Figure 1, where we also
compare the divergence times ti j with divergence measures based on amino
acid distances.
Expression Data and Primary Analysis
We use genome-wide expression data from 7 Drosophila species (D. mela-
nogaster [D. mel], D. simulans [D. sim], D. yakuba [D. yak], D. ananassae
[D. ana], D. pseudoobscura [D. pse], D. virilis [D. vir], and D. mojavensis
[D. moj]), obtained in Zhang et al. (2007) (GEO: GSE6640). These data contain
mRNA intensity measurements for a number of biological replicates from adult
(5–7 days post-eclosion) males and females in each species. For quality
assessment, a number of technical replicates were obtained for each biolog-
ical replicate. Moreover, specific microarray platforms were designed for each
of these species, which allows for a reliable comparison of expression levels
across species (Figure S1) (Zhang et al., 2007). We restrict the analysis to
the 6,332 genes that have unambiguous one-to-one orthologs across all lines
and are tested by at least four probes in eachmicroarray platform (see Supple-
mental Information for details).
The expression levels Eai;s;k are labeled by gene a; species i; sex s, and bio-
logical replicates k: The levels are transformed to mean 0 and cross-gene vari-
ance 1 for each replicate (Z-transformation of microarrays) (Quackenbush,
2002). In the Supplemental Experimental Procedures, we show that this trans-
formation accurately captures evolutionary information and that our results are
robust under quantitative details of the transformation (Figure S1). Further-
more, the assays do not generate a spurious signal of expression divergence
across species (Figure S1). For a given gene a, we estimate the variance
across biological replicates dai and genetic mean Gai in each species and the
divergence Daij = ðGa
i Gaj Þ2 between any two species i, j. These divergence
data inform our primary inference of adaptation. For each pair of species,
we evaluate the aggregate divergence hDiji and the rescaled divergence
hUiji given by Equation 1, using a trait scale D0 obtained by our model fit
(described later). For each clade C in our phylogeny, we obtain the aggregate
data ðtC;UCÞ shown in Figures 2, 4A, and 5B by averaging ti j and Uij over all
Cell Reports 20, 1385–1395, August 8, 2017 1393
pairs of species i; j˛C that are connected via the root of the clade. Unlike
pairwise divergence between species, clade-specific divergence data allow
unbiased error analysis and model ranking, because they are only weakly
correlated through the structure of the phylogeny (Figure 1).
Inference of Selection
The minimal fitness seascape for a given gene takes the form
fðE; tÞ= f c
2NE20
ðE EðtÞÞ2;
where the optimal trait value EðtÞ performs an Ornstein-Uhlenbeck process
with mean square displacement yE20 per unit m1 of evolutionary time
(Box 1); E20 is the average genetic variation of expression in the long-term limit
of neutral evolution (Nourmohammad et al., 2013b).
We use the time dependence of the clade-specific aggregate expression
divergence ðtC; hDCðtÞiÞ to infer the fitness parameters of stabilizing strength
c and driving rate y. We treat the trait scale D0 as an additional fit parameter
and assume that the saturation due to stabilizing selection occurs at the latest
possible time (i.e., for the largest Ustab) consistent with the data, resulting in a
best model with conservative estimates of stabilizing strength c. This assump-
tion is necessary because the resolution of the data on short evolutionary time-
scales is bounded by the D. mel-D. sim divergence. Using this procedure, we
infer a global fitness seascape with parameters ðc = 18:4; y =0:08Þ and a re-
sulting average fitness flux 2NF = 3:8 per gene across the Drosophila genus
ðtDros: = 1:4Þ (Figure 2). The fitness flux is independent of the trait scale D0
and hence of the preceding assumption. Moreover, we use the trait scale D0
and the neutral sequence diversity p0 determined from synonymous polymor-
phisms (Begun et al., 2007) to estimate the expected aggregate trait diversity
within a given species hDi p0D0 (Supplemental Experimental Procedures).
This estimate of hDi determines the sampling error of the observed expression
divergence D (Figure 2 shows error-corrected divergence data). Conversely,
the trait scaleD0 can be inferred directly fromdata of hDi (Supplemental Exper-
imental Procedures), but such data are not available for all species in the pre-
sent set.
Control fits of the same data to equilibriummodels, including the well-known
Ornstein-Uhlenbeck dynamics for the population mean trait (Hansen, 1997;
Bedford and Hartl, 2009), are shown in Figure S3. To account for the expres-
sion noise due to the limited number of biological replicates in each species,
we use a probabilistic extension of this test. We evaluate the Bayesian poste-
rior probability distribution for the stabilizing strength and fitness flux in individ-
ual genesQðc;F jEaÞ, given their samplemean data Ea = ðEa1 ;.;Ea
7 Þ. This pro-duces gene-specific expectation values ca and Fa (Figures 3A, 4B, and 5C).
This method uses gene-specific trait scales D0, which account for differences
in mutational variance between genes. We use a conservative condition on
fitness flux 2NFa > 4 to infer adaptively regulated genes (Table S1). The cumu-
lative log-likelihood score Sðc;FÞ=Pa log Qðc;F jEaÞ quantifies the statistical
significance of our inference (Figure 3B).
Analysis of Alternative Evolutionary Scenarios
To test for lineage-specific demographic effects, we compare the aggregate
rescaled divergence U= hDðtÞi=D0 data to theoretical functions Uðt; tiÞcomputed for an alternative model with a change in effective population size
on the phylogenetic branch of species i (Figure S4).We also examine two alter-
native selection scenarios: relaxed stabilizing selection by partial loss of func-
tion (ca switches to a reduced value with rate g) and punctuated fitness peak
shifts (Ea jumps by an amount on the order of E0 with a rate on the order of ym)
(Figure S5). The observed distributions of cross-species expression differ-
ences are consistent with the minimal seascape model but at variance with
both alternative models (Figure S5).
Analysis of Specific Gene Classes
To infer sex-specific evolution, we define specificity traits as differences
between male and female expression levels Eamf ;i =Ea
m;i Eaf;i for each gene
(Zhang et al., 2007). Genes with sex-specific adaptive evolution of expression
are identified by a condition on the cumulative fitness flux for the specificity
1394 Cell Reports 20, 1385–1395, August 8, 2017
trait 2NFamf > 4:5 (Table S2). Genes with male- and female-biased expression
are identified using the results of Assis et al. (2012).
Simulation Tests
We simulate Fisher-Wright evolution to validate our probabilistic inference
scheme and to establish its robustness under trait epistasis (Figure S7).
SUPPLEMENTAL INFORMATION
Supplemental Information includes Supplemental Experimental Procedures,
seven figures, and two tables and can be found with this article online at
http://dx.doi.org/10.1016/j.celrep.2017.07.033.
AUTHOR CONTRIBUTIONS
A.N., J.R., T.H., V.K., J.B., and M.L. designed the research and analyzed the
data. A.N., T.H., and M.L. wrote the article.
ACKNOWLEDGMENTS
We acknowledge discussions with P. Andolfatto, N. Barton, A. Beyer, H.
Fraser, M. quksza, L.B. Oliver, J. Plotkin, S. Schiffels, P. Shah, and D. Sturgill.
This work has been supported by the James S. McDonnell Foundation (A.N.),
U.S. National Science Foundation grant PHY1305525 (A.N.), and Deutsche
Forschungsgemeinschaft grant SFB 680. We acknowledge the U.S. National
Science Foundation grant PHY1125915 to the Kavli Institute of Theoretical
Physics (UCSB), where part of this work was performed.
Received: September 6, 2016
Revised: April 15, 2017
Accepted: July 13, 2017
Published: August 8, 2017
REFERENCES
Andolfatto, P. (2005). Adaptive evolution of non-coding DNA inDrosophila. Na-
ture 437, 1149–1152.
Artieri, C.G., and Fraser, H.B. (2014). Evolution at two levels of gene expression
in yeast. Genome Res. 24, 411–421.
Assis, R., Zhou, Q., and Bachtrog, D. (2012). Sex-biased transcriptome evolu-
tion in Drosophila. Genome Biol. Evol. 4, 1189–1200.
Bedford, T., and Hartl, D.L. (2009). Optimization of gene expression by natural
selection. Proc. Natl. Acad. Sci. USA 106, 1133–1138.
Begun, D.J., Holloway, A.K., Stevens, K., Hillier, L.W., Poh, Y.P., Hahn, M.W.,
Nista, P.M., Jones, C.D., Kern, A.D., Dewey, C.N., et al. (2007). Population ge-
nomics: whole-genome analysis of polymorphism and divergence in
Drosophila simulans. PLoS Biol. 5, e310.
Blekhman, R., Oshlack, A., Chabot, A.E., Smyth, G.K., and Gilad, Y. (2008).
Gene regulation in primates evolves under tissue-specific selection pressures.
PLoS Genet. 4, e1000271.
Brawand, D., Soumillon, M., Necsulea, A., Julien, P., Csardi, G., Harrigan, P.,
Weier, M., Liechti, A., Aximu-Petri, A., Kircher, M., et al. (2011). The evolution of
gene expression levels in mammalian organs. Nature 478, 343–348.
Bullard, J.H., Mostovoy, Y., Dudoit, S., and Brem, R.B. (2010). Polygenic and
directional regulatory evolution across pathways in Saccharomyces. Proc.
Natl. Acad. Sci. USA 107, 5058–5063.
Coolon, J.D., McManus, C.J., Stevenson, K.R., Graveley, B.R., and Wittkopp,
P.J. (2014). Tempo and mode of regulatory evolution in Drosophila. Genome
Res. 24, 797–808.
Drosophila 12 Genomes Consortium, Clark, A., Eisen, M., Smith, D., Bergman,
C., Oliver, B., Markow, T., Kaufman, T., Kellis, M., Gelbart, W., Iyer, V., et al.
(2007). Evolution of genes and genomes on the Drosophila phylogeny. Nature
450, 203–218.
Fraser, H.B. (2011). Genome-wide approaches to the study of adaptive gene
expression evolution: systematic studies of evolutionary adaptations involving
gene expressionwill allowmany fundamental questions in evolutionary biology
to be addressed. BioEssays 33, 469–477.
Fraser, H.B. (2013). Gene expression drives local adaptation in humans.
Genome Res. 23, 1089–1096.
Fraser, H.B., Moses, A.M., and Schadt, E.E. (2010). Evidence for widespread
adaptive evolution of gene expression in budding yeast. Proc. Natl. Acad. Sci.
USA 107, 2977–2982.
Fraser, H.B., Babak, T., Tsang, J., Zhou, Y., Zhang, B., Mehrabian, M., and
Schadt, E.E. (2011). Systematic detection of polygenic cis-regulatory evolu-
tion. PLoS Genet. 7, e1002023.
Genissel, A., McIntyre, L.M., Wayne, M.L., and Nuzhdin, S.V. (2008). Cis and
trans regulatory effects contribute to natural variation in transcriptome of
Drosophila melanogaster. Mol. Biol. Evol. 25, 101–110.
Gilad, Y., Oshlack, A., Smyth, G.K., Speed, T.P., and White, K.P. (2006).
Expression profiling in primates reveals a rapid evolution of human transcrip-
tion factors. Nature 440, 242–245.
Glinka, S., Ometto, L., Mousset, S., Stephan, W., and De Lorenzo, D. (2003).
Demography and natural selection have shaped genetic variation inDrosophila
melanogaster: a multi-locus approach. Genetics 165, 1269–1278.
Haddrill, P.R., Thornton, K.R., Charlesworth, B., and Andolfatto, P. (2005). Mul-
tilocus patterns of nucleotide variability and the demographic and selection
history of Drosophila melanogaster populations. Genome Res. 15, 790–799.
Hancock, A.M., Witonsky, D.B., Alkorta-Aranburu, G., Beall, C.M., Gebremed-
hin, A., Sukernik, R., Utermann, G., Pritchard, J.K., Coop, G., and Di Rienzo, A.
(2011). Adaptations to climate-mediated selective pressures in humans. PLoS
Genet. 7, e1001375.
Hansen, T.F. (1997). Stabilizing selection and the comparative analysis of
adaptation. Evolution 51, 1341–1351.
Held, T., Nourmohammad, A., and Lassig, M. (2014). Adaptive evolution ofmo-
lecular phenotypes. J. Stat. Mech. 9, P09029.
Hoekstra, H.E., and Coyne, J.A. (2007). The locus of evolution: evo devo and
the genetics of adaptation. Evolution 61, 995–1016.
Ikemura, T. (1985). Codon usage and tRNA content in unicellular and multicel-
lular organisms. Mol. Biol. Evol. 2, 13–34.
Khaitovich, P., Weiss, G., Lachmann, M., Hellmann, I., Enard, W., Muetzel, B.,
Wirkner, U., Ansorge, W., and Paabo, S. (2004). A neutral model of transcrip-
tome evolution. PLoS Biol. 2, E132.
Khaitovich, P., Hellmann, I., Enard, W., Nowick, K., Leinweber, M., Franz, H.,
Weiss, G., Lachmann, M., and Paabo, S. (2005). Parallel patterns of evolution
in the genomes and transcriptomes of humans and chimpanzees. Science
309, 1850–1854.
King, M.C., and Wilson, A.C. (1975). Evolution at two levels in humans and
chimpanzees. Science 188, 107–116.
Lachaise, D., Cariou, M., David, J., Lemeunier, F., Tsacas, L., and Ashburner,
M. (1988). Historical biogeography of the Drosophila melanogaster species
subgroup. In Evolutionary Biology, Volume 22, M. Hecht, B. Wallace, and G.
Prance, eds. (Springer), pp. 159–225.
Leinonen, T., McCairns, R.J., O’Hara, R.B., and Merila, J. (2013). Q(ST)-F(ST)
comparisons: evolutionary and ecological insights from genomic heterogene-
ity. Nat. Rev. Genet. 14, 179–190.
Lemos, B., Meiklejohn, C.D., Caceres, M., and Hartl, D.L. (2005). Rates of
divergence in gene expression profiles of primates, mice, and flies: stabilizing
selection and variability among functional categories. Evolution 59, 126–137.
Lynch, M., and Hill, W.G. (1986). Phenotypic evolution by neutral mutation.
Evolution 40, 915–935.
McDonald, J.H., and Kreitman, M. (1991). Adaptive protein evolution at the
Adh locus in Drosophila. Nature 351, 652–654.
Mustonen, V., and Lassig, M. (2007). Adaptations to fluctuating selection in
Drosophila. Proc. Natl. Acad. Sci. USA 104, 2277–2282.
Mustonen, V., and Lassig, M. (2010). Fitness flux and ubiquity of adaptive evo-
lution. Proc. Natl. Acad. Sci. USA 107, 4248–4253.
Nourmohammad, A., Held, T., and Lassig, M. (2013a). Universality and pre-
dictability in molecular quantitative genetics. Curr. Opin. Genet. Dev. 23,
684–693.
Nourmohammad, A., Schiffels, S., and Lassig, M. (2013b). Evolution of molec-
ular phenotypes under stabilizing selection. J. Stat. Mech. 1, P01012.
Orr, H.A. (1998). Testing natural selection vs. genetic drift in phenotypic evolu-
tion using quantitative trait locus data. Genetics 149, 2099–2104.
Pai, A.A., Pritchard, J.K., andGilad, Y. (2015). The genetic andmechanistic ba-
sis for variation in gene regulation. PLoS Genet. 11, e1004857.
Quackenbush, J. (2002). Microarray data normalization and transformation.
Nat. Genet. 32 (Suppl), 496–501.
Riedel, N., Khatri, B.S., Lassig, M., and Berg, J. (2015). Multiple-line inference
of selection on quantitative traits. Genetics 201, 305–322.
Rifkin, S.A., Kim, J., andWhite, K.P. (2003). Evolution of gene expression in the
Drosophila melanogaster subgroup. Nat. Genet. 33, 138–144.
Rifkin, S.A., Houle, D., Kim, J., and White, K.P. (2005). A mutation accumula-
tion assay reveals a broad capacity for rapid evolution of gene expression. Na-
ture 438, 220–223.
Romero, I.G., Ruvinsky, I., and Gilad, Y. (2012). Comparative studies of gene
expression and the evolution of gene regulation. Nat. Rev. Genet. 13, 505–516.
Sella, G., Petrov, D.A., Przeworski, M., and Andolfatto, P. (2009). Pervasive
natural selection in the Drosophila genome? PLoS Genet. 5, e1000495.
Shields, D.C., Sharp, P.M., Higgins, D.G., and Wright, F. (1988). ‘‘Silent’’ sites
in Drosophila genes are not neutral: evidence of selection among synonymous
codons. Mol. Biol. Evol. 5, 704–716.
Stephan,W., and Li, H. (2007). The recent demographic and adaptive history of
Drosophila melanogaster. Heredity (Edinb) 98, 65–68.
Sunyaev, S.R., and Roth, F.P. (2013). Systems biology and the analysis of ge-
netic variation. Curr. Opin. Genet. Dev. 23, 599–601.
Thornton, K.R., Jensen, J.D., Becquet, C., and Andolfatto, P. (2007). Progress
and prospects in mapping recent selection in the genome. Heredity (Edinb) 98,
340–348.
Whitehead, A., and Crawford, D.L. (2006). Neutral and adaptive variation in
gene expression. Proc. Natl. Acad. Sci. USA 103, 5425–5430.
Wittkopp, P.J., Haerum, B.K., and Clark, A.G. (2008). Regulatory changes un-
derlying expression differences within and between Drosophila species. Nat.
Genet. 40, 346–350.
Wright, F. (1990). The effective number of codons used in a gene. Gene 87,
23–29.
Zhang, Y., Sturgill, D., Parisi, M., Kumar, S., and Oliver, B. (2007). Constraint
and turnover in sex-biased gene expression in the genus Drosophila. Nature
450, 233–237.
Cell Reports 20, 1385–1395, August 8, 2017 1395
Cell Reports, Volume 20
Supplemental Information
Adaptive Evolution of Gene
Expression in Drosophila
Armita Nourmohammad, Joachim Rambeau, Torsten Held, Viera Kovacova, JohannesBerg, and Michael Lässig
Ω(τ)
Ω(τ)
Ω(τ)
mea
n ex
pres
sion
leve
lm
ean
expr
essi
on le
vel
(D)
D.sim*D. sim
D. melD. yak
D. anaD. pse
D. virD. moj
-0.25
0
0.25
0.5
D.sim*D. sim
D. melD. yak
D. anaD. pse
D. virD. moj
-0.5
0
0.5
1
1.5
2
0 2 4 6 80
0.2
0.4
0.6
0.8
1
0 2 4 6 80
0.2
0.4
0.6
0.8
1
0 2 40
0.2
0.4
0.6
0.8
1
0 2 4 6 80
0.2
0.4
0.6
0.8
1
0 2 4 6 80
0.2
0.4
0.6
0.8
1
0 2 4 6 80
0.2
0.4
0.6
0.8
1
mel-sim, τ ~ 0.11 µ−1 mel-yak, τ ~ 0.27 µ−1 moj-vir, τ ~ 0.71 µ−1
mel-ana, τ ~ 1.12 µ−1 mel-pseud, τ ~ 1.17 µ−1 Dros. genus, τ ~ 1.38 µ−1
clade-specific expression divergence, DC
cum
ulat
ive
dist
ribut
ion
func
tion
(E)
expr
essi
on le
vel
D.sim*D. sim
D. melD. yak
D. anaD. pse
D. virD. moj
8
9
10
11
(A)0 0.5 1 1.5
0
0.1
0.2
0.3
0.4
divergence time, τ
resc
aled
div
erge
nce,
expr
essi
on le
vel
expr
essi
on le
vel
D.sim*D. sim
D. melD. yak
D. anaD. pse
D. virD. moj-1
-0.5
0
0.5
1
D.sim*D. sim
D. melD. yak
D. anaD. pse
D. virD. moj
8
9
10
11
(B)
0 0.5 1 1.50
0.02
0.04
0.06
0.08
divergence time, τ
(C)
0 0.5 1 1.50
0.02
0.04
0.06
0.08
divergence time, τ
resc
aled
div
erge
nce,
resc
aled
div
erge
nce,
Ω(τ)
Figure S1: Transformation of expression data and testing for technical expression divergence. Related toFigures 2, 3, 4, 5.
1
Figure S1: Transformation of expression data and testing for technical expression divergence. Related toFigures 2, 3, 4, 5. The following statistics are compared between (A) raw intensities, (B) Z-transformed intensities,and (C) quantile-normalized intensities. Top panels: Average expression intensities across all genes are shown for allbiological replicates of female (circle) and male (triangle) organisms (error bars indicate standard deviation). Centerpanels: Clustering of expression intensities for all genes (horizontal axis), and all replicates (vertical axis, denoted by“species sex ID”) by Euclidean distance (see section 1 of SI). For raw intensities, the replicates of each species clustertogether but cross-species differences do not reflect evolutionary distances, as shown by the scrambled phylogenieson the right hand side. Z-transformed and quantile-normalized intensities recover the species clades of the sequence-based Drosophila phylogeny; cf. Fig. 1. Tree branches are colored by species, as in the top panels. Bottom panels:The aggregate (rescaled) divergence for clades, ΩC (filled squares) and for individual pairs of species, Ωij (emptysquares), is plotted against divergence time, τ (as in Fig. 2). The rescaling of the expression divergence is done by acommon denominator D0 consistent with Fig. 2 in the main text. The dependence of expression divergence on τ ismasked for raw intensities, but consistent for Z-transformed and quantile-normalized intensities. In addition, clade-based statistics is seen to substantially reduce the noise of the expression divergence data. We conclude that a Z- orquantile transformation of the data is essential to capture evolutionary information, but our results are robust undervariants of the transformation. See section 1 of SI. (D) The species-specific aggregate mean gene expression level〈Ei〉 is plotted for different classes of genes. Top panel: genes with varying level of sequence divergence across 7Drosophila species: 10% highest divergence (dark blue triangles), 20% medium divergence (medium blue squares)and 10% lowest divergence (light blue triangles); Bottom panel: highly expressed genes (green triangles), and male-biased gene (orange triangles). The error bars show the standard deviation of the mean in each class. In all classes,we find no significant species dependence of the class averages 〈Ei〉. (E) Cumulative distribution of clade-specificexpression divergence (unscaled) DC , estimated for Drosophila clades (Fig. 2) are indistinguishable in gene classeswith varying levels of sequence divergence; the color code is similar to (D, top panel). We conclude that the assay isfree of technical divergence; see section 1 of SI.
2
bio. replicatevariance (δ)
diversity (∆sim)
divergence(D)
cross-genevariance (V)
0.030.040.05
0.1
0.20.30.40.5
1.0
dimorphism(∆ mf )
D. melD. simD. yakD. anaD. pseD. virD. moj
(B)synonymous divergence, τ(A)
0 0.5 1 1.50
0.05
0.1
0.15
0.2
0.25
amin
o ac
id d
iver
genc
e, τ~
Figure S2: Sequence and gene expression variation. Related to Figures 2, 3, 4. (A) Pairwise amino acid sequencedivergence vs. divergence time from synonymous sequence (circle) and the clade-specific divergence times (squareswith color for clades as in Fig. 1). We conclude evolutionary trees based on amino acid distances are less suitablefor our analysis (cf. the discussion on control analysis of equilibrium models in section 2 of SI.) (B) Gene-averagedexpression variance across biological replicates 〈δ〉 (, equation 4), expression diversity 〈∆〉 (5, equation 6), male-female expression dimorphism 〈∆mf 〉 (, equation 7), clade divergence 〈D〉 (4, equation 8 and used in Figs. 2, 4),and cross-gene variance of expression V = 〈Γ2
i 〉 ≈ 1 (×). We find a clear ranking 〈δi〉 < 〈∆〉sim . 〈∆mf 〉 <〈Dij〉 < Vi. The color code for single-species data is shown in the legend, colors for clades are as in Fig. 1.
3
0 0.5 1 1.50
0.1
0.2
0.3
0.4
0.5
0.6
0 0.5 1 1.50
0.1
0.2
0.3
0.4
0.5
0.6
divergence time, τ rescaled amino acid distance , τ(A) (B)~
expr
essio
n di
verg
ence
, D
expr
essio
n di
verg
ence
, D
Figure S3: Fitness landscape models as control. Related to Figure 2. (A) Clade-specific gene expression diver-gence, DC (unscaled, filled squares), together with pairwise expression divergence, Dij (empty squares), is plottedagainst the divergence time estimated from four-fold synonymous sites (Drosophila 12 Genomes Consortium et al.,2007) (Fig. 1). The seascape model with the trait scale D0 as a fit parameter (green solid line; stabilizing strengthc∗ = 18.4, driving rate υ∗ = 0.08; as in Fig. 2) explains these data; this model is discussed in the main text. An al-ternative seascape model with the trait scale inferred from the D. simulans diversity data (dashed green line; c = 18.6,υ = 0.07) is very similar, which serves as a consistency check. The landscape models with the trait scale as a fitparameter (solid blue line; ceq < 1) and with the trait scale inferred from the diversity data (dashed blue line; ceq = 8)provide a significantly poorer fit; see section 2 of SI for the likelihood comparison of these models. In particular,neither of the equilibrium models can explain the evolution of expression in the youngest clades: the model withdiversity from data overestimates the divergence Dmel−yak and Dmel−sim, the model with inferred diversity over-estimates the relative divergence Dmel−yak/Dmel−sim. (B) The same clade-specific gene expression divergence DC(filled squares) and pairwise expression divergence Dij (empty squares) are plotted against the amino-acid sequencedistance of Fig. S2A (Bedford and Hartl, 2009), uniformly rescaled to give the same scaled genus divergence timeτDros. = 1.4 as in (A). We find the same ranking of models, but all fits become poorer due to the nonlinearities ofthe amino acid divergence times (cf. Fig. S2A). See section 2 of SI for a detailed comparison with the results ofref. (Bedford and Hartl, 2009).
4
Ω(τ)
resc
aled
div
erge
nce,
divergence time, τ(C)
0 0.5 1 1.50
0.03
0.06
0.09
0 0.5 1 1.50
0.03
0.06
0.09
(A)
(B)
0.0 0.5 1.0 1.50.00
0.02
0.04
0.06
0.0 0.5 1.0 1.50.00
0.02
0.04
0.06
0.0 0.5 1.0 1.50.00
0.01
0.02
0.0 0.5 1.0 1.50.00
0.01
0.02
0 0.5 1 1.50
0.03
0.06
0.09
0 0.5 1 1.50
0.03
0.06
0.09
0 0.5 1 1.50
0.03
0.06
0.09
0 0.5 1 1.50
0.03
0.06
0.09
0 0.5 1 1.50
0.03
0.06
0.09
Figure S4: Test of lineage-specific demography. Related to Figures 2, 4, 5. We compare the polarized (rescaled)divergence ΩC,i with species i as outgroup (equation 36,4) to background data from partial clades excluding speciesi (5); both quantities are plotted against the clade divergence time. (A) Left panel: Data for clades with outgroupD. melanogaster. Center and right panels: Evolution with a reduced or enhanced effective population size Ni in theoutgroup lineage. Analytical curves and simulation results are shown for Ni = 3N (dashed lines, N) Ni = N/2(dashed-dotted lines, 4) in a fitness landscape (stabilizing strength c = 20, driving rate υ = 0; center panel) andseascape (c = 20, υ = 0.09; right panel). (B) Same as (A), with outgroup D. mojavensis. (C) Data for each of theother five species chosen as outgroup. These data give no evidence of long-term lineage-specific demography. Theanalytical and simulation results show that lineage-specific demography under stabilizing selection does not confoundthe signal of adaptive evolution in the time-dependent divergence Ω, shown in Fig. 2. Lineage-specific demography isintroduced in section 3, simulation details are given in section 5 of SI.
5
Ω(τ)
Ω(τ)
expression difference
coun
ts
expression difference expression difference expression difference
fract
ion
fract
ion
fract
ion
(A)
(B) (C) (D)
divergence time, τ divergence time, τ divergence time, τ
-9 -6 -3 0 3 6 9
10-1
10-3
10-2
-9 -6 -3 0 3 6 9-9 -6 -3 0 3 6 9
10-1
10-3
10-2
10-1
10-3
10-2
−4 −2 0 2 4101
102
103
resc
aled
div
erge
nce,
0.0 0.5 1.0 1.50.00
0.10
0.20
0.0 0.5 1.0 1.50.00
0.10
0.20
0.0 0.5 1.0 1.50.00
0.10
0.20
resc
aled
div
erge
nce,
resc
aled
div
erge
nce,
Ω(τ)
Figure S5: Test of alternative selection scenarios. Related to Figures 2, 4, 5. (A) Distributions of clade-specificexpression level differences, PC(∆E) (equation 39, color code as in Fig. 1), standard-normalized to mean 0 andvariance 1. These distributions are approximately Gaussian (black line: standard normal distribution). (B) Minimalseascape model. Top panel: Time-dependent (rescaled) divergence Ω(τ) (bullets: simulation results; line: analyticalcurve as in Fig. 2). Bottom panel: Standard-normalized distributions of trait differences, Pτ (∆E), from simulationsfor τ = 0.21, 0.69 and 1.37 (green, orange, and blue bullets) are of Gaussian form (dotted line). The same quantitiesare shown for two alternative fitness models: (C) Loss-of-function model. Functional genes evolve in a static fitnesslandscape of stabilizing strength c = 4.5; individual genes lose function with rate γ = 0.04µ, resulting in reducedselection (c → 0.01 c). The loss events generate a nonlinearity in Ω(τ) and a broad tail in Pτ (∆E) that are notobserved in the data. (D) Punctuated fitness seascape. Individual genes jump to a new, uncorrelated fitness peakwith rate 0.16µ. These dynamics also generate a broad tail in Pτ (∆E). The Drosophila data of ΩC (Fig. 2) andof PC(∆E) together favor the minimal seascape model over both alternatives. The loss-of-function model and thepunctuated seascape model are introduced in section 3, simulation details are given in section 5 of SI.
6
1 2 3 4 5 6 7−1
−0.5
0
0.5
adap
tive
subs
titut
ions
, α seq
fitness flux, 2NΦ −2 −1 0 1 2
−1
−0.5
0
0.5
1
1.5
average sex specificity, Emf
adap
tive
subs
titut
ions
, α seq
(A) (B)
Figure S6: Adaptive gene expression versus adaptive evolution of protein sequence. Related to Figures 2, 3, 5.(A) The distribution of αseq = (DnPs/DsPn)− 1, denoting the fraction of adaptive amino acid substitutions (Smithand Eyre-Walker, 2002), is plotted against the cumulative fitness flux of gene expression, 2NΦ reported in Table S1and shown in Fig. 3A (circle: average; line: median; box: 50% around median; bars: 70% around median). Wefind no correlation between these statistics, which suggests that adaptive gene expression is an independent modeof evolution. (B) The distribution of αseq plotted against the average sex specificity Emf signals increased adaptiveprotein evolution in genes with sex-biased expression, which is strongest in male-biased genes (cf. the results ofref. (Zhang et al., 2007)). For the definition of sex-biased expression, see section 4 of SI.
7
input fitness flux, 2N Φin
infe
rred
fitne
ss fl
ux, 2
N Φ
input stabilizing strength, c in
infe
rred
stab
ilizin
g st
reng
th, c
fitne
ss fl
ux, 2
N Φ
epistasis strength, ε2
0
1
2
3
4
10110010-1020 3040
100
101
102
10532110-1
10-1
100
10-2
10-3
10-4
10-5
101
0.05 0.1 0.25 0.5 1 2.5 5 10
(A) (B) (C)
Figure S7: Simulation tests of the inference scheme. Related to Figures 2, 3, 4, 5. (A,B) Distributions ofthe cumulative fitness flux 2NΦα and stabilizing strength cα inferred from simulated expression data are plottedagainst the simulation input parameters 2NΦin and cin (red line: median, box: 50% around the median, bar: 75%around median). This simulation analysis supports that the inferred gene-specific maximum likelihood values (Φα, cα)reported in Table S1 and shown in Fig. 3B are on average conservative estimates of the underlying evolutionaryparameters (cin,Φin). See section 5 of SI for simulation details. (C) Selection inference for epistatic traits. Simulationresults of the actual fitness flux (4) are compared to flux values inferred by the standard test based on the time-dependent (rescaled) divergence Ω(τ) (, see section 2 of SI). Both quantities are plotted against the strength ofepistasis, ε2, defined as the ratio of epistatic and additive trait variance (section 5 of SI); horizontal lines show theactual fitness flux without epistasis (ε2 = 0). Simulations are shown for selection parameters (c = 4.5, υ = 0.4)(green) and (c = 4.5, υ = 0) (blue). We conclude that our inference of adaptive evolution based on the aggregaterescaled divergence Ω(τ) (Fig. 2) is not confounded by trait epistasis. See section 5 of SI for simulation details.
8
Supplemental Procedures
1. Data and primary analysis
Sequence data and phylogenetic tree. Our inference procedure requires the following global sequence-based information (which does not include expression QTL):
(a) A phylogenetic tree of the 7 Drosophila species included in this study. Here we use the tree of theDrosophila 12 Genome Consortium (Drosophila 12 Genomes Consortium et al., 2007), which is basedon genome-wide divergence at synonymous sequence sites. This tree determines six clades of phyloge-netically related species (Fig. 1), which are used in our analysis of time-dependent expression divergence(Figs. 2 and 4A,5B).
(b) Divergence times between all pairs of species, scaled in units of the inverse neutral point mutation rate.The tree of Fig. 1 uses a lineage-specific mutation rate to infer the length of its 12 branches. The scaleddivergence time τij for a given species pair (i, j) is the sum of the lengths of the branches connectingthese species. The scaled divergence time of a clade C is defined as an average over species pairs,
τC =1
|C1||C \ C1|∑i∈C1
∑j∈C\C1
τij , (1)
where C is the set of species in the clade and (C1, C2) is the partitioning of this set defined by the rootnode.
An accurate inference of divergence times is an important prerequisite for our evolutionary analysisof gene expression. The times τij have been inferred in ref. (Drosophila 12 Genomes Consortiumet al., 2007) from synonymous sequence divergence, accounting for saturation effects due to multiplemutations. We can compare these times with the analogous times τij inferred from amino acid sequencedivergence, which have been used in a previous study (Bedford and Hartl, 2009). Fig. S2A shows ascatter plot (τij , τij) for all species pairs, and the clade divergence time (τC , τC), as defined in eq. (1).Compared to the molecular clock of neutral evolution, the amino acid times τij are seen to suffer fromsignificant inhomogeneities within the Drosophila genus. We conclude that the τij values provide anonlinear measure of divergence times, which is less suitable for evolutionary analysis than the timesτij inferred from synonymous sequence.
Expression data. We use genome-wide expression data from 7 Drosophila species obtained by ref. (Zhanget al., 2007) (GEO: GSE6640). These data are well suited for our analysis. They cover several clades ofspecies that are well comparable at the organismic level and sufficiently diverged for adaptive evolution ofexpression to be detectable (section 2). Moreover, Drosophila has larger effective population size, highermutation rates, and shorter generation times than typical mammalian species (Gilad et al., 2006a), and adap-tive evolution has been detected at the genomic level by several methods (Andolfatto, 2005; Mustonen andLassig, 2007; Sella et al., 2009). Hence, compared to more recent data from other species (Brawand et al.,2011; Perry et al., 2012; Tsankov et al., 2010), the Drosophila expression data of Zhang et al. (Zhang et al.,2007) are a suitable target for the inference of adaptive evolution. These data contain mRNA intensity mea-surements for a number of biological replicates (4 − 7) from the adult (5 − 7 days post eclosion) malesand females in each species. Specific microarray platforms were designed for each of these species, al-lowing for a reliable comparison of expression levels across species. Each platform has an array of probes
9
mapped to assembled genome sequences and to GLEANR gene annotations by the Drosophila 12 GenomesConsortium (Drosophila 12 Genomes Consortium et al., 2007), which also provides sequence homologytables. For each species, at least four hybridizations, including technical (dye-flipped) replicates for eachof the biological replicates were performed. We restrict the analysis to the 6332 genes that have unam-biguous one-to-one orthologs across all lines and are tested by at least four probes in each microarrayplatform. We obtain a set of expression levels Eαi,s,κ (defined as log2 intensities) labelled by gene numberα ∈ 1, . . . , g=6332, species i ∈ mel, sim, yak, ana, pse, vir, moj (Fig. 1), sex s ∈ m, f, and biolog-ical replicates κ ∈ 1, . . . , ki,s = 4− 7; biological replicates contain similar amounts of genetic material.The data contain two strains of D. simulans from the Tucson Drosophila Stock Center: (D. sim: 14021-0251.011, and D. sim: 14021-0251.198), which are used to estimate the genetic variance of expression (seebelow).
Transformation of expression levels. A measured raw-probe microarray signal is largely influenced bynon-biological factors, such as varying total RNA abundances, labeling and hybridization efficiency, thataffect all probes on a chip. The data provided by Zhang et al. (Zhang et al., 2007) is log-2 transformation ofthe intensities after a primary batch correction, using the method of variance stabilizing transformation (Hu-ber et al., 2002). Similar to previous evolutionary analysis on the same dataset (Bedford and Hartl, 2009),we perform a standard Z-transform normalization for each replicate, by defining a linear transformation ofthe intensities (Quackenbush, 2002),
Eαi,s,κ →Eαi,s,κ − 〈Ei,s,κ〉√
Vi,s,κ, (2)
where 〈Ei,s,κ〉 and Vi,s,κ denote mean and variance of the expression across all genes in a given replicate(i, s, κ). The transformed levels Eαi,s,κ are shifted to mean 0 and normalized to variance 1 across all genesin each biological replicate.
Evolutionary implications of the transformation. From an evolutionary point of view, any transfor-mation is a heuristic to make quantitative trait data more comparable between species. Specifically, thetransformation should minimize the ratio of non-evolutionary noise compared to evolutionary signal. Thevalidity of a specific transformation scheme has to be judged from consistency of the results. Here we showthat the Z transformation (2) produces a consistent evolutionary signal and its results are robust under quan-titative details of the transformation; we also verify that the species-specific assay of ref. (Zhang et al., 2007)does not generate spurious signal of divergence across species.
(a) The Z transformation captures evolutionary information. As shown in Fig. S1A (top panel), the averageintensities of probes across all genes 〈Ei,s,κ〉 are comparable between biological replicates of a givenstrain, but differ substantially among species and even between the two strains of D. simulans. Clus-tering of the expression intensities based on Euclidian distance between homologues across biologicalreplicates shows the masking of evolutionary information in the raw intensities of the probes: these in-tensities cluster together for replicates of the same species; however, cross-species differences betweenintensities of homologues lead to a scrambled phylogeny (Fig. S1A, center panel). This masking canalso be seen in a plot of the aggregate time-dependent rescaled divergence Ω defined in equation (9)(Fig. S1A, bottom panel). In contrast, for Z-transformed data, the clustering of gene expression levelsproduces a phylogeny that recovers the species clades of the sequence-based Drosophila phylogeny,
10
and the aggregate rescaled expression divergence Ω shows a consistent dependence on divergence time(Fig. S1B (bottom panel), cf. Fig. 2). Therefore, the Z transform is essential to capture the evolutionaryinformation in the data (Quackenbush, 2002).
(b) Results are robust under variants of the transformation. In order to test the sensitivity of our resultsto the specific choice of transformation, we performed the commonly used quantile normalization onthe raw expression intensities (Bolstad et al., 2003) (implemented in the R-package “preprocessCore”as the function “normalize.quantiles”). Quantile normalization forces the observed distributions of rawintensities to be the same across all replicates, and equal to the distribution obtained by taking theaverage of each quantile across samples. For quantile normalized expression intensities, clusteringagain recovers the clades of the sequence-based Drosophila phylogeny, and we obtain aggregate time-dependent divergence Ω(τ) very similar to those of Z-transformed data (Fig. S1C). In this paper, weuse the Z-transformation as normalization method, because it is a more conservative choice in pre-processing of data and it does not homogenize the expression distributions across species.
(c) Absence of “technical divergence”. The expression levels were measured by species-specific microarrayplatforms (Zhang et al., 2007) designed to eliminate confounding effects of sequence divergence onhybridization and hence, to make expression levels suitable for cross-species comparison. We can testthis property in the Z-transformed data. If the assay has technical bias, we would expect its effects to bemore pronounced in genes with higher level of sequence divergence. Specifically, assume an imperfectassay with a hybridization bias towards a given species i∗ and stochastic degradation of sensitivity inother species. In a minimal model, the technical effect ∆E of an amino acid mutation is a stochasticvariable with mean A and variance B. The resulting observed expression level follows a biased randomwalk dependent on the sequence divergence (mismatch density) dαii∗ ,
Eαi = Eαi∗ −Adαii∗ + χαi with 〈χαi 〉 = 0, 〈χαi χαi 〉 = Bdαii∗ (3)
and positive constants A,B. A biased assay generates a “technical” aggregate expression divergence〈Dii∗〉 = A2〈d2
ii∗〉 + B〈dii∗〉 that would confound our evolutionary analysis. In addition, it leads tospecies-dependent aggregate mean expression levels 〈Ei〉 = 〈Ei∗〉 − A〈dii∗〉. The Z-transformationeliminates the aggregate bias over all genes, but species-dependent averages 〈Ei〉 would still be observ-able in classes of genes with high and low sequence divergence. In Fig. S1D (top), we plot 〈Ei〉 in theclasses of genes with 10% highest, 20% medium and 10% lowest level of sequence divergence (mea-sured by the total branch length of gene-specific phylogenies inferred from the divergence of synony-mous sites across orthologues (Drosophila 12 Genomes Consortium et al., 2007)). Fig. S1D (bottom)shows the same average for two classes of genes studied in this paper, highly-expressed genes and male-biased genes (as defined below). In all classes, we find no significant species dependence of the classaverages 〈Ei〉. In addition, Fig. S1E shows the distributions of clade-specific expression divergence DC(unscaled) for gene classes with varying levels of sequence divergence (similar to Fig. S1D). We find nosignificant difference in the expression divergence distributions across such gene classes, indicating thatour inference of adaptation based on gene expression divergence in Fig. 2 is not prompted by technicaldivergences in the assay. Overall, we conclude that the assay is free of technical divergence.
Expression statistics within and across species. Using the normalized expression levels, we can defineaverages and natural variation of expression at three different levels:
11
(a) The mean and (unbiased) variance of expression across biological replicates characterize the distributionof expression levels for a given genotype. Here we estimate these quantities from the data of eachreplicates,
Eαi,s =1
ki
∑κ
Eαi,s,κ, δαi,s =1
ki − 1
∑κ
(Eαi,s,κ − Eαi,s)2, (4)
and we define the sample mean and variance,
Eαi =1
2(Eαi,m + Eαi,f ), δαi =
1
4(δαi,m + δαi,f ). (5)
(b) The genetic mean and diversity of expression characterize the distribution of heritable expression dif-ferences in a population. Heritable components of quantitative traits are often inferred from “commongarden” breeding experiments under standardized environmental conditions. The genetic mean anddiversity for a given gene are defined in terms of the data within one species,
Γαi = Eαi ± SEΓ, ∆αi = VarEαi −
1
kiδαi , (6)
where SEΓ is the standard error for estimating the population mean expression from ni geneticallyindependent samples in species i, each of which is an average over ki independent biological replicates,SEΓ = ((∆α
i + δαi /ki)/ni)1/2. The unbiased estimate of variance among genetically distinct samples
is denoted by VarEαi ; the standard error for estimating the expected expression value of each geneticsample from its ki biological replicates propagates in evaluating the expression diversity within eachspecies ∆α
i as given by equation (6). The data of ref. (Zhang et al., 2007) limit the direct information ondiversity to a broad estimate from two D. simulans strains, ∆α
sim = 12(Eαsim1 − Eαsim2)2 − δαsim/ksim.
Therefore, we infer the aggregate diversity 〈∆〉 self-consistently from the model parameters, using thepattern of gene expression divergence (Fig. 2) and the sequence heterozygosity; see equation (20) below.We use the estimated diversity to determine the sampling error of the observed expression divergenceD. Consistently, the model estimate for the expression diversity in D. simulans is very similar to theobserved value 〈∆sim〉 (section 2).
Similarly, we define the expression dimorphism between males and females in each species,
∆αi,mf =
1
2(Eαi,m − Eαi,f )2 − 1
kiδαi . (7)
(c) The expression divergence is defined as the squared difference between population means,Dαij = (Γαi −
Γαj )2, and characterizes evolutionary expression differences between two species. Here we estimate thedivergence for a given gene from the cross-species data, accounting for propagation of error in evaluatingthe species average gene expression level,
Dαij = (Eαi − Eαj )2 − 2〈∆〉 − 1
kiδαi −
1
kjδαj . (8)
Here we have substituted the species expression diversity by the model fit parameter of aggregate diver-sity 〈∆〉 and have set the number of genetically independent samples to ni = 1.
12
Equations (6) and (8) follow Wright’s decomposition of the variance of a quantitative trait into intra- andinter-species components (Wright, 1950), which underlies the quantitative genetics summary statistics FSTand QST (see section 2). For the analysis of sex-specific evolution (section 4), we use the same rationale forthe sex-specificity traits Eαi,mf = Eαi,m − Eαi,f .
In Fig. S2B, we compare gene-averaged values of expression variance across biological replicates, diver-sity, dimorphism and divergence (these averages are denoted by angular brackets), as well as the cross-genevariance of expression. We find a clear ranking 〈δi〉 < 〈∆sim〉 . 〈∆i,mf 〉 < 〈Dij〉 < Vi for all species iand j, where Vi = 〈Γ2
i 〉 ≈ 1 by our normalization. In the Ω test for selection on gene expression, we usedivergence estimates given by equation (8) in aggregate measures across groups of species and classes ofgenes. However, our data set has a low number of genetic samples per species. Hence, single-gene estimatesof diversity and divergence are noisy, which calls for a probabilistic inference of selection. The Ω test andits probabilistic extension for individual genes are described in section 2.
Time-dependent aggregate (rescaled) divergence, Ω. The aggregate expression divergence Ωij for agiven species pair (i, j) is defined as
Ωij =〈Dij〉D0
(9)
The gene-specific expression divergence Dαij is given by equation (8). Angular brackets denote averages
over all genes in our dataset, 〈Dij〉 = 1g
∑αD
αij . The denominator D0 = limτ→∞〈D(τ, c = 0)〉 is chosen
such that the rescaled trait divergence Ωij = 1 for neutral evolution in the limit of long divergence times(section 2). The asymptotic averaged divergence in neutrality D0 (9) is related to the scale E2
0 , previouslydefined as the average genetic variation of trait in the long-term limit of neutral evolution in ref. (Nourmo-hammad et al., 2013b), by D0 = 2E2
0 . The rescaled divergence ΩC for a species clade C is defined as anaverage over species pairs,
ΩC =1
|C1||C \ C1|∑i∈C1
∑j∈C\C1
Ωij , (10)
in analogy with the definition (1) of clade divergence times. We also define aggregate divergence ΩGij andΩGC for specific gene classes G, using restricted averages 〈. . . 〉G .
2. Inference of selection on gene expression
Evolutionary model. We consider the evolution of gene expression levels under genetic drift, mutation,and selection given by a fitness model with peak displacements on macro-evolutionary time scales. In theminimal seascape model (Held et al., 2014; Nourmohammad et al., 2013a), the fitness of a given genedepends on its expression level E and on evolutionary time t,
f(E, t) = f∗ − c0
(E − E∗(t)
)2. (11)
The expression value of maximum fitness, E∗(t), performs an Ornstein-Uhlenbeck random walk with dif-fusion constant υ0, average value E and stationary mean square deviation r2D0/2, where r2 is a constant oforder 1, and D2
0 is the trait scale introduced in eq. 9. This process is defined by the Langevin equation
d
dtE∗(t) = − υ0
r2D0(E∗(t)− E) + η(t), (12)
13
where η(t) is the random variable of a delta-correlated Gaussian process with average 0 and variance υ0.These random variables are assumed to be independent for each gene and on each lineage. The Ornstein-Uhlenbeck fitness seascape should not be confused with a previous Ornstein-Uhlenbeck model for the evolu-tion of quantitative traits under stabilizing selection (Beaulieu et al., 2012; Bedford and Hartl, 2009; Butlerand King, 2004; Hansen, 1997; Hansen et al., 2008; Kalinka et al., 2010; Rohlfs et al., 2014) (a detailedcomparison is given below).
The minimal seascape model captures two kinds of selection on gene expression in a unified way:
(a) Stabilizing selection. This type of selection constrains the intra- and inter-population variation of ex-pression levels to values around E∗(t). We define the dimensionless stabilizing strength
c = N D0 c0, (13)
where N is the effective population size. In the limit case υ0 = 0, the fitness seascape reduces toa static fitness landscape, f(E) = f∗ − c0(E − E∗)2, and stabilizing selection is the only selectiveforce. This provides a simple interpretation of the selection parameter c: it compares the (hypothetical)genetic load c0D0/2 of a neutrally evolving trait evaluated in the landscape f(E) and the actual geneticload 1/2N in the same landscape, assuming a mutation-selection-drift equilibrium at low mutationrates (Nourmohammad et al., 2013a). This parameter signals the regimes of weak (c . 1) and strong(c & 1) stabilizing selection (Nourmohammad et al., 2013b).
(b) Directional selection. In a fitness seascape, this type of selection triggers adaptive response of thepopulation mean trait in the direction of fitness peak displacements. We define the scaled driving rate
υ =2υ0
µD0. (14)
This parameter measures mean square displacement of the fitness peak, in units of trait scale D0 andper unit 1/µ of evolutionary time. In macro-evolutionary seascapes, υ is sufficiently low for populationto follow fitness peak displacements; such seascapes are a joint model of stabilizing and directionalselection (Held et al., 2014). The values of υ inferred from our data fall in this regime (see section 2).Because the seascape dynamics is a short-range Markov process, the mean square peak displacementover a scaled evolutionary time τ is then simply D0 υτ/2. (Here we express υ in units of µ and τ inunits of 1/µ, which differs slightly from the notation in refs. (Held et al., 2014; Nourmohammad et al.,2013a).) In the long-term regime υτ r2, the fitness peak dynamics becomes stationary with meanE and variance r2D0/2. This regime turns out to be well beyond the divergence times in our speciessample. Hence, the statistics of Drosophila gene expression levels and our inference of selection areindependent of r2.
Fitness flux. This measure of adaptation is defined as the speed of movement on a fitness land- or seascapeby genotype or heritable phenotype changes in a population (Held et al., 2014; Mustonen and Lassig, 2010).The cumulative fitness flux associated with the population mean expression level Γ(t) of a gene in a fitnessseascape f(E, t) is given by
Φ(τ) =
∫ τ
t=0
∂f(Γ, t)
∂Γ
dΓ(t)
dtdt. (15)
This quantity measures the total amount of adaptation over a macro-evolutionary period τ in a populationhistory. This quantity satisfies the fitness flux theorem (Mustonen and Lassig, 2010), which generalizes the
14
Fisher’s fundamental theorem of natural selection to mutation-selection-drift processes. As shown by the fit-ness flux theorem, the average cumulative fitness flux over parallel evolutionary histories, in units of 1/2N ,measures the importance of adaptation compared to genetic drift: adaptation is substantial if 〈2NΦ(τ)〉 & 1.For a stationary adaptive process in the minimal seascape (11), the average scaled cumulative fitness fluxtakes the simple form (Held et al., 2014; Nourmohammad et al., 2013a)
〈2NΦ(τ)〉 ' 2cυ τ, (16)
up to factors of order π0. The exact functional form of the fitness flux is given in reference (Held et al.,2014). A population evolving under strong stabilizing selection i.e., in a sharply peaked fitness seascape(c 1), follows the movements of the fitness peak, measured by the driving rate υ, more closely, and,hence, accumulates a larger fitness flux over time. Therefore, it is intuitive that the averaged cumulativefitness flux of the population eq. 16 is proportional to the product of the stabilizing strength c and the drivingrate υ.
The cumulative fitness flux is closely related to the time-dependent fraction of expression divergencethat is adaptive, ωad(τ) (equation 24). We introduce the shorthand Φ = Φ(τDros.) with τDros. = 1.4 (Fig. 1);this quantity measures the amount of adaptation across the Drosophila genus. By the probabilistic inferencemethod discussed below, we obtain expectation values 2NΦα of the rescaled fitness flux for individualgenes over the divergence time of the Drosophila genus (equation 32). We use these values to describe theoverall statistics of expression adaptation (Fig. 3A), to infer differences in adaptation between gene classes(Fig. 4; Table 1), and to define significantly adaptive genes (using a threshold 2NΦα > 4; Table S1). For theanalysis of sex-specific adaptation (Fig. 5), we define an analogous fitness flux 2NΦmf for sex-specificitytraits (section 4).
Evolutionary modes of quantitative traits. In the minimal seascape model, the aggregate time-dependent(rescaled) divergence Ω defined by equation (9) depends on the divergence time τ and on the selectionparameters of stabilizing strength c and driving rate υ; the exact form of this function is given in ref. (Heldet al., 2014). We can use the behavior of the rescaled time-dependent divergence Ω to distinguish threemodes of evolution:
(a) Neutral evolution (c = 0). The rescaled trait divergence has an initially linear increase due to mutationsand genetic drift, and it approaches a maximum value 1 with a scaled relaxation time of 1,
Ω0(τ) 'τ for τ 1,1 for τ 1.
(17)
(b) Evolution under stabilizing selection (c & 1, υ = 0). In a static fitness landscape, the rescaled traitdivergence approaches a smaller maximum value, Ωstab(c) < 1, with a proportionally shorter relaxationtime (Held et al., 2014),
Ωeq(τ) '
[1 + G(c)] τ for τ Ωstab(c)Ωstab(c) for τ Ωstab(c).
(18)
Over a wide range of evolutionary parameters, which includes the inferred values for the data set of thisstudy, the maximum value depends on the stabilizing strength in a simple way, Ωstab(c) ∼ 1/(2c), withcorrections for weaker selection and for larger nucleotide sequence diversity (Nourmohammad et al.,
15
2013b). The factor [1+G(c)] captures the short-time constraint on the trait divergence due to stabilizingselection, compared to neutrality (eq. 17). The functional form of G(c) is given explicitly in ref. (Nour-mohammad et al., 2013b). Over a wide range of the stabilizing strength c, this constraint remains weakand Ω(τ) evolves near neutrality (Nourmohammad et al., 2013b), as long as τ Ωstab(c).
(c) Adaptive evolution under stabilizing and directional selection (c & 1, υ > 0). In a genuine fitnessseascape, the divergence acquires an adaptive component,
Ω(τ) = Ωeq(τ) + Ωad(τ) =
[1 + G(c)] τ for τ Ωstab(c)Ωstab(c) + 1
2υ [τ − 2Ωstab(c)], for τ Ωstab(c),(19)
with corrections for τ approaching the saturation time of fitness peak displacements, r2/υ. The fullanalytical form of the functions Ω0(τ) (equation 17), Ωeq(τ) (equation 18), and Ω(τ) (equation 19) isgiven in refs. (Held et al., 2014; Nourmohammad et al., 2013a).
Moreover, our analysis shows that the trait scale D0 equals, up to a selection-dependent coefficient, the ratioof the expected trait diversity 〈∆〉 and the neutral sequence diversity π0 within a given species,
D0 =〈∆〉(c)π0
[1 + G(c)]−1. (20)
Importantly, the relation (20) is robust under changes of the effective population size. We expect suchchanges to affect 〈∆〉 and π0 in the same way but to leave their ratio invariant. This is consistent withthe role of D0 in our macroevolutionary analysis. At neutrality, D0 = 〈∆〉0/π0 is simply the mutationalvariance of a quantitative trait, as defined in refs. (Chakraborty and Nei, 1982; Lynch and Hill, 1986; Lynchand Walsh, 1998), up to a rescaling of evolutionary time to units of the inverse point mutation rate 1/µ.This relation remains approximately valid under stabilizing selection over a wide range of parameters c, forwhich G(c) 1 (Nourmohammad et al., 2013b). This implies a universal quasi-neutral short-term behaviorof the divergence (Held et al., 2014; Nourmohammad et al., 2013a),
〈D(τ)〉 ' D0[1 + G(c)] =〈∆〉(c)π0
τ for τ Ωstab(c). (21)
Ω-test for selection on quantitative traits. The time-dependence of divergence provides a joint test forstabilizing and directional selection on quantitative traits. We can infer the selection parameters of a seascapemodel by fitting the function Ω(τ) (equation 19) and the corresponding trait scale D0 (equation 9) to data(τ,D). This method has the following properties:
(a) The inference of selection requires data on time-dependent divergence (τ,D) from species with differentdivergence times in the regime τ & Ωstab. In the quasi-neutral regime τ . Ωstab, the rescaled time-dependent divergence Ω is insensitive to selection (equations 17–19).
(b) By the decomposition (equation 19), the time-dependent ratio
ωad(τ) =Ωad(τ)
Ω(τ)≡ 〈Dad(τ)〉〈D(τ)〉
(22)
defines the adaptive fraction of the trait divergence. The complementary fraction, 1−ωad(τ), is attributedto genetic drift under stabilizing selection.
16
(c) We can approximate the divergence (equation 19) by the linear form Ω(τ) ≈ Ωstab + Ωad(τ) = Ωstab +υτ/2. Therefore, already a linear fit to data produces simple estimates of stabilizing strength and drivingrate,
c ≈ 1
2Ωstab, υ ≈ 2Ωad(τ)
τ, (23)
and infers the adaptive fraction of expression divergence, which is related to the average scaled fitnessflux (Held et al., 2014) (equation 16),
ωad(τ) ≈ Ω(τ)− Ωstab
Ω(τ), 〈2NΦ(τ)〉 ≈ 2Ωad(τ)
Ω− Ωad(τ)(24)
The quantities ωad(τ) and 〈2NΦ(τ)〉 are independent of the trait scale D0.
(d) The rescaled trait divergence Ω (equation 9) decouples from the genetic basis of the trait. Specifi-cally, it depends only weakly on the number and effect size of the underlying QTL (Held et al., 2014;Nourmohammad et al., 2013b), on the amount of recombination between these sites (Held et al., 2014;Nourmohammad et al., 2013b), and on the nonlinearities in the genotype-phenotype map (trait epista-sis; see section 5 and Fig. S7C). The time dependent Ω(τ) also decouples from details of the selectiondynamics; it can be applied to punctuated adaptive processes, which have fewer and larger peak dis-placements (Held et al., 2014) (section 3).
(e) A variant of the Ω test consists in directly inferring D0 from diversity data in any species of the dataset by equation 20, as discussed previously in ref. (Held et al., 2014). Given the scarce information ontrait diversity in our data set, we do not use this version of the test for our inference of selection in thepresent paper (however, we perform a consistency check based on diversity estimates in D. simulans).
Comparison of the Ω test with related methods. Our inference method for selection on quantitative traitscan be compared with three well-known selection tests for phenotypic and genomic data:
(a) QST/FST ratio test for selection on quantitative traits. The summary statistics FST and QST measure theexpected fraction of the total genetic variation harbored in a pair of populations that can be attributedto the divergence between these populations; the complementary fraction is attributed to the diversitywithin populations. FST refers to neutrally evolving sequence loci (Lande, 1992; Wright, 1943, 1950),which can be regarded as a “pseudo-trait” with aggregate divergence and diversity. QST is the analo-gous measure for quantitative traits under selection (Spitze, 1993). The expected dependence of thesemeasures on divergence time can be expressed in terms of the rescaled divergence Ω (equation 9),
FST(τ) =〈D(τ)〉0
〈D(τ)〉0 + 2〈∆〉0' Ω0(τ)
Ω0(τ) + 2π0(25)
QST(τ) =〈D(τ)〉
〈D(τ)〉+ 2〈∆〉' Ω(τ)
Ω(τ) + 2π0(26)
where we use expectation values 〈. . . 〉 in an ensemble of parallel-evolving populations and the subscript0 refers to neutral evolution. The QST/FST test (Leinonen et al., 2013) stipulates that a quantitative traitis evolving at neutrality if QST/FST = 1, under stabilizing selection if QST/FST < 1, and under direc-tional selection if QST/FST > 1. The data set of this study shows aggregate values QST/FST between0.6 for the mel-sim clade and 0.8 across the entire Drosophila genus; these values are obtained using
17
equations (10), (25), and (26). Hence, this test signals broad stabilizing selection but no directionalselection. In contrast, the time-dependent divergence test infers both stabilizing and directional selec-tion from the linear dependence Ω(τ) (Fig. 2 and equation 19). This inference shows a conceptuallyimportant point: stabilizing and directional selection are not mutually exclusive, but joint features ofselection on macro-evolutionary time scales.
The QST/FST test infers adaptive evolution under quite restrictive conditions: QST/FST > 1, or equiv-alently Ω > Ω0, implies that directional selection is the dominant selection component for short di-vergence times. In the seascape model, this requires large driving rates (υ & 1), sufficiently large peakdisplacement amplitudes r, and sufficiently large stabilizing strength c (Held et al., 2014). Hence, valuesQST/FST > 1 are most likely to be observed for individual traits that have undergone a large shift of theoptimal trait value in their recent evolutionary history, which is in accordance with data from a numberof studies; see (Le Corre and Kremer, 2012; Leinonen et al., 2013) for comprehensive review. In thisstudy, which uses aggregate data over large classes of genes, we do not expect, and do not find, valuesQST/FST > 1. We note that on sufficiently short time scales, the QST/FST and the test based on thetrait divergence are always insensitive to selection, because the trait divergence is in the quasi-neutralregime (equation 21).
(b) Ornstein-Uhlenbeck model for quantitative trait evolution. This phenomenological model describes aquantitative trait evolving under genetic drift and stabilizing selection (Beaulieu et al., 2012; Butlerand King, 2004; Hansen, 1997; Hansen et al., 2008) and has been applied to the evolution of geneexpression (Bedford and Hartl, 2009; Kalinka et al., 2010; Rohlfs et al., 2014) (a detailed comparisonwith the results of ref. (Bedford and Hartl, 2009) is given below). The model is defined by a Langevinequation for the population mean trait,
d
dtΓ(t) = −λ (Γ− E∗) + ηΓ(t), (27)
where ηΓ(t) is the random variable of a delta-correlated Gaussian process with average 0 and varianceσ2/N . The model constants λ and σ2 are usually regarded as independent fit parameters. The Ornstein-Uhlenbeck dynamics of the population mean trait Γ(t) around a fixed optimal trait value E∗ (equation27) should not be confused with the Ornstein-Uhlenbeck dynamics of the time-dependent optimumE∗(t) in our seascape model (equation 11).
A Langevin equation similar to (27) can be derived from more general population-genetic models forthe evolution of a quantitative trait E in a static fitness landscape f(E) = −c0 (E − E∗)2, which havebeen introduced in refs. (de Vladar and Barton, 2011; Nourmohammad et al., 2013b). In these models,the population mean trait follows the Ornstein-Uhlenbeck process
d
dtΓ(t) = −2〈∆〉 c0 (Γ− E∗)− 2µ (Γ− Γ0) + ηΓ(t), (28)
where Γ0 is the genetic mean trait in the long-term limit of neutral evolution and ηΓ(t) is the randomvariable of a delta-correlated Gaussian process with average 0 and variance 〈∆〉/N . Comparison withequation (27) determines the Ornstein-Uhlenbeck coefficients in terms of the stabilizing strength and theaverage trait diversity (λ = 2〈∆〉 c0, σ2 = 〈∆〉). Equation (28) contains an additional mutational term(−2µ)(Γ − Γ0), which implies that the expectation value 〈Γ〉 differs from the optimum trait value E∗.We note that the diffusion constant 〈∆〉/N determines the behavior of the trait divergence (equation 21),and of the QST/FST ratio in the quasi-neutral regime (τ Ωstab). The Ornstein-Uhlenbeck model has
18
been generalized to account for lineage-specific stabilizing selection in a phylogeny (Beaulieu et al.,2012; Butler and King, 2004; Hansen, 1997; Hansen et al., 2008; Kalinka et al., 2010; Rohlfs et al.,2014); however, inferring independent selection parameters for each lineage may lead to overfitting ofour data set. Instead, we use the seascape model (11) to infer lineage- and gene-specific changes of thetrait optimum E∗(t) using a single additional selection parameter υ.
(c) McDonald-Kreitman test for adaptive sequence evolution (McDonald and Kreitman, 1991). The sequence-based test for selection evaluates the ratio of the cross- to intra-species sequence variation for a sequenceclass under putative selection (e.g., non-synonymous mutations in protein-coding sequence) and com-pares it to the analogous ratio for bona fide neutral changes (e.g., synonymous mutations.) Positiveselection in the query sequence is inferred if ratio in the sequence class is larger than that of the neutralexpectation. In contrast, the selection test for quantitative traits based on the time-dependent divergenceΩ(τ) requires only data from traits under selection, but from three or more species with divergencetimes beyond the equilibrium relaxation time Ωstab. These differences highlight distinct evolutionarycharacteristics of quantitative traits. First, such traits have a quasi-neutral regime of macro-evolutionarydivergence times (equation 21) that has no direct analogue in sequence evolution (Nourmohammadet al., 2013b). Second, in most cases we do not have a gauge of neutrally evolving traits analogous tosynonymous sequence.
Application of the Ω test to gene expression data. We apply the Ω test to the Drosophila gene expressiondata of ref. (Zhang et al., 2007) as follows. To evaluate rescaled expression divergence data (τC ,ΩC) forsix Drosophila species clades (equations 1 and 10), we estimate the aggregate unscaled gene expressiondivergence across clades, 〈DC〉, and fit the model function 〈D(τ)〉 to these data. This fit contains the threeparameters (D0, c, υ). The time scale of stabilizing selection observed in the data (i.e., the first bend in thedivergence curve) equals 〈Dstab〉 ∼ D0/c. We treat the trait scale D0 as an additional fit parameter anddetermine this scale by assuming that the saturation due to stabilizing selection occurs at the latest possible(i.e., for the largest Ωstab) consistent with the data, resulting in a best model with conservative estimates ofstabilizing strength c. The rescaled expression divergence with the inferred trait scale, Ω(τ) = 〈D(τ)〉/D0,is plotted in Fig. 2. The best-fit seascape model has parameters (c∗ = 18.4, υ∗ = 0.08) (green line in Fig. 2);this model explains the divergence data and produces evidence for adaptive evolution of gene expression.Using the decomposition into adaptive and drift components (green and blue shaded areas), we obtain acumulative fitness flux 〈2NΦ〉 = 3.8 across the entire Drosophila genus (equations 23 and 24). We notethat the inference of fitness flux decouples from the scale D0. The probabilistic extension of this test toindividual genes is discussed below.
As a consistency check, we use the aggregate expression diversity in D. simulans, 〈∆〉sim (estimatedfrom two strains in this data set), and the average heterozygosity of synonymous sites in the D. simulanspopulation, π0 = 0.018 (Begun et al., 2007), to estimate the trait scale D0 from equation (20). The resultingoptimal seascape model has parameters (c = 18.6, υ = 0.07), which are very similar to the values quotedabove (see Fig. S3A).
Control analysis of equilibrium models. We can also compare the aggregate expression data (τC ,ΩC) tofitness landscape models of time-independent stabilizing selection:
(a) We can infer a landscape model from the divergence data. This model variant has two independentparameters, the stabilizing strength c and the trait scale D0, which we set to its maximum fitted value
19
to obtain a conservative estimate of stabilizing strength. It provides a significantly worse fit to the datathan the seascape model (Fig. S3A). In particular, it cannot explain the pattern of expression diver-gence between close species. The model predicts a quasi-neutral linear growth of the divergence withDmel−yak/Dmel−sim ≈ τmel−yak/τmel−sim ≈ 2 (equation 21), which drastically overestimates the ob-served ratio Dmel−yak/Dmel−sim ≈ 1.2. This model also fails to infer substantial stabilizing selection(ceq = 0.25).
(b) With divergence and diversity inferred from data, the landscape model has a single free parameter, thestabilizing strength c. In contrast to the seascape model, the best-fit landscape model provides a poor fitto the data (Fig. S3A). It captures the average rescaled divergence Ω across the Drosophila clades, butfails to describe the systematic amplitude differences between these clades. In particular, the landscapemodel drastically overestimates the divergence of close species, Dmel−yak and Dmel−sim. Compared tothe landscape model with fitted trait scale, this model variant also produces a worst fit to the data. Theprobabilistic analysis reported in equation (35) that both equilibrium models have a significantly lowerlikelihood than the seascape model.
Comparison with a previous study. Bedford and Hartl (BH) (Bedford and Hartl, 2009) analyze aggre-gate expression levels from the same data set and fit these data to an Ornstein-Uhlenbeck model of evolutionunder stabilizing selection (Hansen, 1997) (equation 27), which is closely related to our landscape modelinferred solely from divergence data. The Ornstein-Uhlenbeck model cannot infer adaptation; it assumesstabilizing selection and has fit parameters that determine the equilibration time and the level of satura-tion. Based on this model, BH infer stabilizing selection on expression levels, and they report an apparentsaturation of gene expression divergence. This saturation is at variance with the linear growth on timescales beyond the divergence time of D. melanogaster and D. simulans, which is inferred in Fig. 2 and inref. (Zhang et al., 2007). The analysis of ref. (Bedford and Hartl, 2009) presents the following issues of dataanalysis and of interpretation of the results:
(a) BH (Bedford and Hartl, 2009) use amino acid distances in their phylogeny. These distances are affectedby selection (Smith and Eyre-Walker, 2002). As shown in Fig. S2A, they produce a nonlinear measureof divergence times, τij , which is less suitable for evolutionary analysis than the times τij inferred fromsynonymous sequence (Drosophila 12 Genomes Consortium et al., 2007) that are used in this study. Totest the influence of amino acid sequence divergence on our inference of adaptive evolution, we repeatthe analysis with this variant of the phylogeny. We find the same ranking of models: the seascape modelexplains the data significantly better than landscape models, none of which provides a satisfactory fit tothe divergence between close species (Fig. S3B). The probabilistic analysis of equation (35) confirmsthis ranking. At the same time, it displays that amino acid times are suboptimal: they lead to a significantlikelihood cost for all models. We conclude that our inference of adaptive evolution is robust undervariations of the sequence-based phylogenies.
(b) BH (Bedford and Hartl, 2009) analyze expression divergence for pairs of species, while we group thespecies into clades (Fig. 2). These differences lead to a more noisy dependence of the expression di-vergence data on evolutionary time (Fig. S1C) and make a straightforward distinction of conservationand adaptation more difficult at the level of aggregate data. Moreover, pairwise expression divergencedata are strongly correlated through the structure of the phylogeny, which is apparent from the clus-tering of these data (open squares in Fig. S3A,B). Clade-specific divergence data are statistically moreindependent, which allows for meaningful error analysis and model ranking.
20
(c) The saturation of expression levels claimed in ref. (Bedford and Hartl, 2009) is ascribed to time scalessimilar to the neutral relaxation time (τ ∼ 1 in units of the inverse neutral point mutation rate, dashedline in Fig. S3B). Under any Ornstein-Uhlenbeck or landscape model, this pattern would imply weakstabilizing selection (ceq = 0.25 in the landscape model) and weak constraint on gene expression di-vergence. That is, gene expression would evolve near neutrality throughout the Drosophila genus: thedivergence would be 87% of the neutral divergence for τmel−sim and 68% for τDros..
Probabilistic inference of selection. Here we describe the extension of our selection inference methodto expression data of individual genes. A minimal seascape model is determined by the parameters (c, υ)or equivalently by (c,Φ), where Φ = 2cυτDros./2N denotes the expected cumulative fitness flux over thegenus divergence time (equation 16). We derive a posterior probability distribution Q(c,Φ |Eα), whereE = (Eα1 , . . . , E
α7 ) denotes the expression levels of gene α in the 7 species of our data set. This derivation
consists of three steps: we obtain the probability distribution Q(Γ | c,Φ) of population mean traits Γα =(Γα1 , . . . ,Γ
α7 ) in a given seascape model, we include sampling effects to determine the distributionQ(E |Φ),
and we use Bayes’ theorem to infer the posterior distribution Q(c,Φ |Eα).The basic building block of evolutionary statistics in the minimal seascape model has been derived pre-
viously (Held et al., 2014): the lineage propagator Gτ (Γ, E∗|Γa, E∗a) is the probability density of meanand optimal trait values (Γ, E∗), given the values (Γa, E
∗a) in an ancestral population at scaled evolution-
ary distance τ . The lineage propagator is related to the stationary distribution of the seascape dynamics,Qstat(Γ, E
∗) = limτ→∞Gτ (Γ, E∗|Γa, E∗a). Both distributions are Gaussian functions that depend on theseascape model parameters and on the neutral variance (trait scale) D0; their detailed analytical form isgiven in equations (30)–(33) and (A.15)–(A.20) of ref. (Held et al., 2014). The probability distribution ofpopulation mean traits across the Drosophila genus is the stationary distribution for its last common ancestormultiplied by the lineage propagators for all branches of the phylogeny; this expression is to be integratedover all unknown expression levels. Specifically, we obtain
Q(Γα | c,Φ, D0) =
∫Qstat(Γ
αl , E
∗l )
l−1∏i=1
Gτ(i)(Γαi , E
∗i |Γαa(i), E
∗a(i)) dΓαk+1 . . . dΓαl dE∗1 . . . dEl, (29)
where i = 1, . . . , k labels the extant species and i = k + 1, . . . , l the clade ancestor species (with l =2k − 1 = 13 and the index l referring to the last common ancestor of all species), a(i) denotes the closestancestor of species i, and τ(i) is the scaled length of the branch between i and a(i). The deviations of theexpression measurements Eαi from the population mean trait Γαi can be described by a Gaussian samplingerror model with variance ∆α
i + (δαi /ki), as given by equation (6). We obtain
Q(Eα | c,Φ, D0) =1
Z
∫Q(Γα | c,Φ) exp
[−1
2
(Eαi − Γαi )2
∆αi + (δαi /ki)
]dΓα1 . . . dΓαk , (30)
where Z is a normalization factor. This multi-variate Gaussian integral can be evaluated in a straightforwardway by the saddle point method. Here we approximate the noisy cross-replicate variance of individual genesby the species average δi. Due to the limited data on heritable species-specific expression diversity ∆α
i , weuse the expected functional form of the trait diversity dependent on the trait scaleD0 and stabilizing strengthc, as given by eq. 20 and ref. (Nourmohammad et al., 2013b). Finally, Bayes’ theorem gives the posteriordistribution
Q(c,Φ |Eα) =
∫Q(Eα | c,Φ)P0(c,Φ) dD0∫
Q(Eα | c,Φ)P0(c,Φ) dD0 dc dΦ, (31)
21
where P0(c,Φ) denotes the prior distribution of seascape parameters. This distribution determines the max-imum likelihood posterior values of stabilizing strength, fitness flux, and adaptive fraction of expressiondivergence,
(cα,Φα) = arg maxc,Φ
Q(c,Φ |Eα), ωαad(τ) =(τ/τDros.) Φα
(τ/τDros.) Φα + 1/N; (32)
see equation (24). In equation (31), we use a prior distribution P0(c,Φ) ∼ exp(−ac − bΦ) with Lagrangemultipliers a, b that calibrate the average posterior values 〈c〉 and 〈Φ〉 over all genes to our inference fromaggregate data (see above). This choice generates a conservative inference of gene-specific seascape param-eters that reflects two statistical features of our data. First, gene data E explained by a seascape model withparameters (c,Φ) and a neutral trait variance D0 (see text above and ref. (Nourmohammad et al., 2013b))have a similar likelihood in a family of models with parameters (λc,Φ) and neutral trait variance λD0,where λ > 0 is a rescaling factor, as long as the stabilizing strength is above some minimum value. In otherwords, there is a residual freedom in model parameters that leaves the fitness flux Φ invariant. This freedomexists because the gene-specific diversity values ∆α
i are too noisy to be included in the inference. Our priordistribution favors posterior values c close to the minimum stabilizing strength, which are consistent withthe inference from aggregate data. Second, the distribution (29) has an algebraic tail, Q(E | c,Φ) ∼ Φ−1 for2NΦ 1, which is caused by the diffusive dynamics of the fitness peak. Our prior distribution suppressesthis tail and favors posterior values Φ close to the maximum-likelihood value Φ∗. The validation of thisinference scheme by simulation tests is described in section 5.
Statistical significance of the inference. The probabilistic extension of the Ω plays an important role inour global inference: to quantify the statistical significance of our evidence for adaptive evolution underdirectional selection. Specifically, we evaluate the cumulative log-likelihood score for all genes of ourdataset as a function of the evolutionary variables of stabilizing strength c and cumulative fitness flux Φ,
S(c,Φ) =
g∑α=1
logQ(c,Φ |Eα), (33)
where Q(c,Φ |Eα) is given by equation (31). This function is shown in Fig. 3B with its maximum shiftedto 0. The global maximum-likelihood seascape model with divergence and diversity estimated from datahas parameters
(c∗,Φ∗) = arg maxc,Φ
S(c,Φ) =
(18.4,
3.8
2N
), (34)
We can use the log-likelihood difference ∆S = S(c,Φ)− S(0, 0) to rank all other models against theircorresponding neutral model. For the cases discussed in this paper, we find the following ∆S values:
landscape, landscape,seascape 〈∆〉 inferred 〈∆〉 from data
syn. phylogeny 12464 6194 4068aa. phylogeny 9810 5452 1970
(35)
In all inference schemes (divergence times inferred from synonymous or amino acid sequence divergence),we find the same ranking: the optimal seascape model is significantly more likely than the optimal landscapemodel, and the neutral model. The landscape model with the diversity 〈∆〉 (or equivalently, trait scale D0)as a fit parameter has a higher likelihood that the landscape model with the diversity estimated from theD. simulans strains. By a log-likelihood test, the score differences ∆S translate into P values as reportedin the main text. Maximum-likelihood values analogous to equation (34) can also be defined for classes ofgenes (Table 1).
22
3. Analysis of alternative evolutionary scenarios
Lineage-specific demography. Demographic effects, such as population bottlenecks, affect the patternsof sequence variation in Drosophila (Aquadro et al., 2001; Glinka et al., 2003; Lachaise et al., 1988; Stephanand Li, 2007; Thornton et al., 2007). Here we examine the effects of strong, long-term demographic hetero-geneities on the divergence and diversity of expression levels. Specifically, we consider changes in effectivepopulation size to a valueNi = λN in a given Drosophila lineage i, which is defined by the terminal branchof species i in the phylogeny and extends over a scaled evolutionary time τi (Fig. 1). A depletion of effectivepopulation size leads to a global relaxation of stabilizing selection on gene expression, given by a reducedstabilizing strength λc in the fitness seascape (equation 11). For each clade C with i ∈ C, we define thepolarized rescaled divergence,
ΩC,i =1
|C \ C1|∑
j∈C\C1
Ωij , (36)
where (C1, C \C1) is the partitioning of clade C defined by its root node and we assume i ∈ C1. The pairwiserescaled divergence Ωij is given by equation (9). Similarly, we define the polarized divergence time,
τC,i =1
|C \ C1|∑
j∈C\C1
τij . (37)
In Fig. S4A,B, we plot polarized data (τC,i,ΩC,i) together with background data (τC ,ΩC) from partial cladesexcluding species i. Under a change of population size in lineage i with τi & Ωstab(λc), the polarized datashould follow a pattern with reduced (λ < 1) or increased (λ > 1) long-term constraint,
Ω(τ, τi) = Ωeq(τ, τi) + Ωad(τ, τi) (38)
=
[1 + G(c)] τ for τ Ωstab(λc)
12(Ωstab(λc) + Ωstab(c)) + 1
2υτ + F(λ, c) for τ τi + Ωstab(c),
where the shift F(λ, c) is generated by the demographic inhomogeneity on intermediate time scales; thispattern is shown in Fig. S4A,B for λ = 1/2 and λ = 3. A similar calculation shows that short-term pop-ulation bottlenecks have a negligible effect on the statistics of trait divergence Ω. We observe no deviationbetween polarized and background Ω data, indicating the absence of strong demographic effects shaping theevolution of expression levels. Equation (38) also shows that demographic effects do not confound the testof selection based on the time-depended divergence Ω for adaptive evolution under directional selection.For time-independent optimal trait value (υ = 0), global relaxation of stabilizing selection increases thedivergence as noted in previous studies (Fraser, 2011; Gilad et al., 2006b; Khaitovich, 2005); however, itdoes not generate the linear increase Ωad(τ) ' υτ/2 characteristic of fitness peak displacements (Fig. S4).
Gene-specific relaxation of stabilizing selection. We can also test for lineage- and gene-specific relax-ation of stabilizing selection on gene expression, which arises, for example, from a partial loss of genefunction. We model the loss dynamics by a stochastic process: with a small rate γ, individual genes switchthe stabilizing strength of their fitness seascape to a reduced value λc (with λ < 1). We choose the model pa-rameters of switch rate γ and stabilizing strength c so as to approximately match the observed Ω(τ) pattern.To discriminate between relaxed stabilizing selection and directional selection, we can use the distribu-tions of clade-specific expression differences ∆EαC , which are defined as averages over pairwise differences
23
∆Eαij = Eαi − Eαj in analogy to equation (10). The observed distributions are of approximately Gaussianform,
PC(∆E) =1√
2πDCexp
[−(∆E)2
2DC
], (39)
as shown by the collapse plot of Fig. S5A. This is in accordance with the minimal seascape model, whichpredicts a Gaussian distribution Pτ (∆E) with variance 〈D(τ)〉. In contrast, stochastic relaxation of stabi-lizing selection generates broad non-Gaussian tails increasing with divergence time τ that are not observedin the data (Fig. S5C, bottom). Furthermore, the loss dynamics generates a nonlinear time-dependent Ω(τ)(Fig. S5C, top), which is not observed in the data (Fig. 2). We conclude that relaxed stabilized selectionalone cannot explain the observed statistics of Drosophila gene expression levels. This does not excludethat relaxation of selection affects some genes in our data set and more broadly genes with complete loss offunction, which are suppressed in the set of conserved orthologs.
Punctuated directional selection. The Ornstein-Uhlenbeck dynamics of fitness peaks in the minimalseascape model (equation 27) describes the accumulation of small but continual changes of optimal ex-pression levels. Larger peak shifts can be caused by discrete ecological events, including major migrationsand speciations, and by gene-specific factors such as neo-functionalization (Lynch and Force, 2000). Herewe model such events by a punctuated fitness seascape (Held et al., 2014): with a small rate υµ/(2r2), indi-vidual genes are subject to fitness peak shifts by an amount of order D0. This stochastic model differs fromprevious models of lineage-specific selection (Bedford and Hartl, 2009; Brawand et al., 2011; Butler andKing, 2004; Hansen, 1997; Hansen et al., 2008; Kalinka et al., 2010; Rohlfs et al., 2014), where fitness peakshifts are constrained to known branch points of the phylogeny. Evolution in a punctuated fitness seascapegenerates rescaled time-dependent divergence Ω(τ) of the form (equation 19); adaptation is signalled bythe same term Ωad(τ) ' υτ/2 as in a minimal seascape of the same driving rate υ (Fig. S5D, top). Todiscriminate between the two models, we use again the distributions PC(∆E) of clade-specific expressiondifferences. In a punctuated seascape, these distributions have broad non-Gaussian tails increasing withdivergence time τC that are not observed in the data (Fig. S5D, bottom). We conclude that large peak shiftsare a subleading factor of expression changes in our data set.
Other modes of adaptation. Further evolutionary modes affecting gene expression include:
(a) Time-dependent stabilizing selection (Held et al., 2014). This type of selection can be modeled by afitness seascape of the form (11) with time-dependent stabilizing strength c(t), given by a generalizedOrnstein-Uhlenbeck process with constraint c(t) > cmin. The recurrent tightening of expression con-straint driven by increases of c(t) is a mode of adaptation that is independent of fitness peak changes.The rescaled divergence Ω does not trace this mode: as long as the expression optimum E∗ is time-independent, the function Ω(τ) reaches an asymptotic value Ωstab(cmin). This pattern is similar to evo-lutionary equilibrium in a single-peak fitness landscape and does not contain the term Ωad(τ) ' υτ/2characteristic of fitness peak displacements.
(b) Adaptive gene turnover, including sub- and neo-functionalization after gene duplication (Lynch andKatju, 2004; Lynch et al., 2001), regulatory sequence duplication (Nourmohammad and Lassig, 2011),and de novo formation of genes (Tautz and Domazet-Loso, 2011). This mode is suppressed in our dataset of conserved orthologous genes, but it is likely to be more prevalent in the complementary set ofDrosophila genes.
24
(c) Adaptation by large-effect loci. Our diffusive evolutionary model assumes that expression levels aredetermined by multiple eQTL, and changes at individual loci have only moderate effects. The evolutionof more general traits with few large-effect loci can be studied in simulations. We find that in diffusivefitness seascapes, mutations at large-effect loci are mostly deleterious. In this case, the population adaptsto the gradual changes of the expression optimum predominantly by fixation of small-effect mutations,while large-effect substitutions are suppressed. In punctuated fitness seascapes, large-effect mutationscan accelerate the adaptive response to large shifts of the fitness peak, but such shifts are a subleadingfactor of expression changes in our data set (see above).
A detailed investigation of these evolutionary modes is beyond the scope of this study. Importantly, however,they do not confound the inference of adaptation under directional selection reported here.
4. Analysis of specific gene classes
Codon usage. The effective number of codons, n, measures the redundancy of the genetic code withina given gene (Wright, 1990). This number takes values between 20 (each amino acid is determined bya specific codon) and 61 (all sense codons are used). Genes with specific codon usage (small n) tend tohave higher expression than genes with broad codon usage (Ikemura, 1985; Shields et al., 1988). Here wecompute the species-averaged effective number of codons, nα = 1
7
∑i n
αi for all genes in our data set. We
find a consistent dependence of expression adaptation on codon usage:
(a) Aggregate analysis by time-dependent divergence Ω signals strongly reduced adaptation for genes withspecific codon usage (n < 42) and an enhanced adaptation for genes with broad codon usage (n > 50),compared to the average over all genes (Fig. 4A and Table 1).
(b) The pattern of expression divergence also signals strongly reduced adaptation for genes with high aver-age expression level, Eα = 1
7
∑iE
αi > 0.9 (Table 1). Additionally, we compare the fitness flux of a
gene to its codon adaptation index (CAI), which measures the similarity between the codon usage in aspecific gene and the codon preference of highly expressed genes (Sharp and Li, 1987). Consistently,we find a reduced amount of fitness flux in genes with high codon adaptation index (CAI & 0.65); thesegenes are likely to be highly expressed.
(c) At the level of individual genes, there is a clear correlation between fitness flux Φα and effective codonnumber nα (Fig. 4B).
Inference of adaptive sequence evolution. For the genes in our data set, we estimate the fractions ofsynonymous and non-synonymous polymorphic nucleotides (Ps and Pn) from the database of Drosophilamelanogaster Genetic Reference Panel (DGRP) (Mackay et al., 2012). The corresponding nucleotide diver-gence measures (Ds and Dn) are obtained from sequence alignments between the D. melanogaster and D.simulans reference genomes (Drosophila 12 Genomes Consortium et al., 2007). The McDonald-Kreitmantest (McDonald and Kreitman, 1991; Smith and Eyre-Walker, 2002) signals adaptive evolution of aminoacids if αseq = (DnPs/DsPn) − 1 > 0. Fig. S6 shows the distribution of αseq values for classes of geneswith different amount of expression adaptation, measured by the fitness flux Φ (equation 32). We find nocorrelation between these statistics. In each class, about 30% of the genes have αseq > 0. This resultdoes not contradict the correlation of gene expression divergence and amino acid divergence reported inref. (Zhang et al., 2007), because an enhanced amino acid substitution rate measured by a Dn/Ds test (Li,1993) may be caused by adaptive changes or by relaxation of negative selection.
25
Analysis of functional gene classes. We use The Ontologizer (Bauer et al., 2008) for statistical analysis offunctional enrichment in our dataset. From a base set of all 6332 genes in our database, we identify enrichedfunctional categories in the query sets of adaptively regulated genes (2NΦα > 4) and genes with sex-specificadaption of expression (2NΦα
mf > 4.5, see below). We use the calculation method Parent-Child-Union withBonferroni correction and resampling steps of 1000. The enriched functional categories in these gene setsare reported in Tables S1 and S2 with a significance threshold P < 0.1 (multiple hypothesis test). We listthree main categories: biological processes, cellular components, and molecular functions. Each functionalcategory is assigned to a functional cluster (in bold letters) that is inferred by REVIGO (Supek et al., 2011),using the semantic similarity measure SimRel with threshold 0.5. This clustering facilitates the interpretationof functional gene classes associated with adaptation of gene expression.
Sex-specific evolution and sex bias of expression. To quantify differences of gene expression betweenmales and females, we we define the sex specificity trait of a given gene as the difference between itsexpression levels in males and in females (Zhang et al., 2007),
Eαmf,i = Eαm,i − Eαf,i. (40)
We analyze these traits by the same methods as the sex-averaged expression levels Eαi defined by equa-tion (5). Specifically, we define the rescaled time-dependent divergence Ωmf,C and the fitness flux 2NΦmf
in analogy to equations (10) and (15), and we infer gene-specific maximum-likelihood values 2NΦαmf in
analogy to equation (11). We define two conceptually distinct measures of male-female differentiation:
(a) Sex-specific adaptation. In accordance with ref. (Zhang et al., 2007), we find that most genes of ourdata set have well-conserved and often small sex specificity; these genes evolve their expression levelscoherently between males and females. We use the rescaled fitness flux 2NΦmf to delineate coherentevolution of expression levels (i.e., conservation of the specificity trait) from sex-specific adaptation(i.e., adaptive changes of the male-female expression difference), as illustrated in Fig. 5A. A set of 1155sex-specific adaptive genes is identified by the condition 2NΦα
mf > 4.5 (Table S2); we use a morestringent threshold than for Φα because the sex-specificity trait statistics has larger statistical errors.
(b) Sex bias. We identify genes with male- and female-biased expression in Drosophila using the resultsof Assis et al. (Assis et al., 2012), which are based on a number of statistical tests in the whole bodyand in gonads of males and females in D. melanogaster and D. pseudoobscura. A gene is classified asexpression sex-biased if flagged by at least three of these tests, which produces a list of 450 male-biasedand 499 female-biased genes. A related measure of bias within our data set is the species-averagedspecificity trait, Eαmf = 1
7
∑iE
αmf,i.
Our analysis establishes a relation between these two measures in our data set: strong sex-specific adaptationof expression occurs in male-biased, but not in female-biased genes. First, the aggregate rescaled divergenceΩmf in male-biased genes show evidence for adaptive evolution with a linear adaptive component υτ/2.Unbiased and female-biased genes have only a small average divergence in their sex-specificity trait that isof the order of the expression diversity (i.e., they within the error range of the observed expression levels),providing no evidence for adaptation (Fig. 5B). Second, the fitness flux Φα
mf is strongly enhanced for geneswith large Eαmf (Fig. 5C). Accordingly, 32% of male-biased genes are classified as sex-specific adaptive.Functional categories associated with sex-specific adaptation of expression are reported in Table S2.
26
5. Simulation tests
In-silico evolution of quantitative traits. We use a Fisher-Wright process for the evolution of popula-tions along the Drosophila phylogeny of Fig. 1. A population consists of N individuals with genomesa(1), . . . ,a(N). A genotype is an `-letter sequence a = (a1, . . . , a`) with alleles ak = 0, 1 (k = 1, . . . , `).It defines an expression level E(a) =
∑`k=1 Ekak with neutral variance D0 = 1
2
∑`k=1 E2
k . We use uniformsingle-locus effects Ei; our results are insensitive to the form of the effect distribution (Held et al., 2014).In each generation, the sequences undergo point mutations with a probability µτ0 per generation, where τ0
is the generation time. The sequences of next generation are then obtained by multinomial sampling with aprobability proportional to [1 + τ0f(E(a), t)], where the fitness function f(E, t) is given by equation (11).Simulations are performed with N = 100, π0 = 0.1 for traits with ` = 100, uniform effects Ei = 1, andaverage fitness optimum E = 70. We use three different types of selection (for details, see ref. (Held et al.,2014)):
(a) Minimal fitness seascape. Before each reproduction step, a new optimal trait value E∗(t+ τ0) is drawnfrom a Gaussian distribution with meanE∗(t)(1−µτ0υ/(2r
2))+E µτ0υ/(2r2) and variance µτ0υD0/2.
(b) Fitness landscape. The optimal trait value E∗ is time-independent (Fig. S4A,B). In the model of gene-specific relaxation of selection (see section 3), the stabilizing strength of individual genes switches to asmaller value, c→ 0.01c, with a small rate γ (Fig. S5C).
(c) Punctuated fitness seascape (see section 3). Before each reproduction step, a new, uncorrelated optimaltrait value is drawn with probability µτ0υ/(2r
2) from a Gaussian distribution with variance r2D0/2,where r2 is a constant of order 1 (Fig. S5D).
Validation of the probabilistic inference scheme. To test the performance of our inference scheme, wegenerate expression values Eα = (Eα1 , . . . , E
α7 ) for individual genes with trait scalesE2
0,α by Fisher-Wrightsimulations along the Drosophila phylogeny of Fig. 1. We use minimal fitness seascapes of the form (11)with input parameters (cin, υin) and a sequence diversity π0 = 4µN = 0.05. We then infer maximum-likelihood posterior values (cα, 2NΦα) by the probabilistic method described in section 2 (equation 32).In Fig. S7A, we plot the distribution of inferred fitness flux values 2NΦα against the input expectationvalue 2NΦin = 2cinυin τDros. (equation 16). The underlying simulations use a range of trait scales E2
0,α =0.25 − 4.0 appropriate for log expression levels; the inference of Φα does not require knowledge of thisscale (see section 2). Fig. S7B shows the corresponding distribution of inferred values cα as a function ofthe input stabilizing strength cin. These simulations use a uniform trait scale E2
0,α = 1 (inferring the actualscales requires sufficiently reliable gene-specific expression diversity data). The posterior values (Φα, cα)are seen to provide reasonable, on average conservative estimates of the input model parameters (cin,Φin).In particular, the inference of a significant fitness flux (Φ > 1/2N) is incompatible with evolution understatic stabilizing selection (υ = 0, c > 0) or near neutrality (c ' 1), independently of the underlying modelfor the adaptive evolution of a molecular trait.
Robustness of the inference to trait epistasis. The analytical theory underlying our inference method (Heldet al., 2014; Nourmohammad et al., 2013b) covers molecular quantitative traits with a linear genotype-phenotype map, E(a) =
∑`k=1 Ekak (see above). Here we extend this method to nonlinear traits of the
form E(a) =∑`
k=1 Ekak +∑
k<k′ Ekk′akak′ ; such nonlinearities are commonly referred to as trait epis-
27
tasis. The strength of epistasis can be defined as the ratio of nonlinear and linear neutral trait variance,ε2 = (
∑k<k′ E2
kk′)/(∑
k E2k ).
Trait epistasis introduces only minor changes to the quantitative genetics theory of refs. (Held et al.,2014; Nourmohammad et al., 2013b). In particular, the quasi-neutral growth of the trait divergence is stillgiven by equation (21), where ∆ is now the total genetic diversity of the trait.
To specifically test our inference method, we perform Fisher-Wright simulations as described above overa wide range of the parameter ε2; individual epistatic effects Ekk′ are drawn from a Gaussian distributionwith mean 0. In an ensemble of 6000 independently evolving traits, we record both the actual average fitnessflux (equation 15) and the inferred fitness flux determined from the aggregate divergence Ω (equation 24).Both quantities show no systematic dependence on ε2 (Fig. S7C), suggesting that our inference of adaptiveevolution is not confounded by trait epistasis.
28
Supplemental ReferencesAquadro, C., Bauer DuMont, V., and Reed, F. (2001). Genome-wide variation in the human and fruitfly: a comparison.
Curr Opin Genet Dev, 11(6):627–634.
Bauer, S., Grossmann, S., Vingron, M., and Robinson, P. (2008). Ontologizer 2.0–a multifunctional tool for GO termenrichment analysis and data exploration. Bioinformatics, 24(14):1650–1651.
Beaulieu, J., Jhwueng, D.-C., Boettiger, C., and O’Meara, B. (2012). Modeling stabilizing selection: expanding theOrnstein-Uhlenbeck model of adaptive evolution. Evolution, 66(8):2369–2383.
Bolstad, B., Irizarry, R., Astrand, M., and Speed, T. (2003). A comparison of normalization methods for high densityoligonucleotide array data based on variance and bias. Bioinformatics, 19(2):185–193.
Butler, M. and King, A. (2004). Phylogenetic comparative analysis: A modeling approach for adaptive evolution.American Naturalist, 164(6):683–695.
Chakraborty, R. and Nei, M. (1982). Genetic Differentiation of Quantitative Characters Between Populations orSpecies .1. Mutation and Random Genetic Drift. Genetical Research, 39(3):303–314.
de Vladar, H. and Barton, N. (2011). The statistical mechanics of a polygenic character under stabilizing selection,mutation and drift. J. R. Soc. Interface, 8(58):720–739.
Gilad, Y., Oshlack, A., and Rifkin, S. (2006a). Natural selection on gene expression. Trends Genet, 22(8):456–461.
Hansen, T., Pienaar, J., and Orzack, S. (2008). A comparative method for studying adaptation to a randomly evolvingenvironment. Evolution, 62(8):1965–1977.
Huber, W., von Heydebreck, A., Sultmann, H., Poustka, A., and Vingron, M. (2002). Variance stabilization applied tomicroarray data calibration and to the quantification of differential expression. Bioinformatics, 18 Suppl 1:S96–104.
Kalinka, A., Varga, K., Gerrard, D., Preibisch, S., Corcoran, D., Jarrells, J., Ohler, U., Bergman, C., and Tomancak, P.(2010). Gene expression divergence recapitulates the developmental hourglass model. Nature, 468(7325):811–814.
Lande, R. (1992). Neutral Theory of Quantitative Genetic Variance in an Island Model with Local Extinction andColonization. Evolution, 46(2):381.
Le Corre, V. and Kremer, A. (2012). The genetic differentiation at quantitative trait loci under local adaptation. Mol.Ecol., 21(7):1548–1566.
Li, W. (1993). Unbiased estimation of the rates of synonymous and nonsynonymous substitution. Journal of MolecularEvolution, 36(1):96–99.
Lynch, M. and Force, A. (2000). The probability of duplicate gene preservation by subfunctionalization. Genetics,154(1):459–473.
Lynch, M. and Katju, V. (2004). The altered evolutionary trajectories of gene duplicates. Trends Genet, 20(11):544–549.
Lynch, M. and Walsh, B. (1998). Genetics and analysis of quantitative traits. Sinauer Associates Inc.
Mackay, T., Richards, S., Stone, E., Barbadilla, A., Ayroles, J., Zhu, D., Casillas, S., Han, Y., Magwire, M., Cridland,J., et al. (2012). The Drosophila melanogaster Genetic Reference Panel. Nature, 482(7384):173–178.
Nourmohammad, A. and Lassig, M. (2011). Formation of regulatory modules by local sequence duplication. PLoSComput. Biol., 7(10):e1002167.
29
Perry, G., Melsted, P., Marioni, J., Wang, Y., Bainer, R., Pickrell, J., Michelini, K., Zehr, S., Yoder, A., Stephens, M.,et al. (2012). Comparative RNA sequencing reveals substantial genetic variation in endangered primates. GenomeRes., 22(4):602–610.
Rohlfs, R., Harrigan, P., and Nielsen, R. (2014). Modeling gene expression evolution with an extended Ornstein-Uhlenbeck process accounting for within-species variation. Mol. Biol. Evol., 31(1):201–211.
Sharp, P. and Li, W. (1987). The codon Adaptation Index–a measure of directional synonymous codon usage bias, andits potential applications. Nucleic Acids Res., 15(3):1281–1295.
Smith, N. and Eyre-Walker, A. (2002). Adaptive protein evolution in Drosophila. Nature, 415(6875):1022–1024.
Spitze, K. (1993). Population structure in Daphnia obtusa: quantitative genetic and allozymic variation. Genetics,135(2):367–374.
Supek, F., Bosnjak, M., Skunca, N., and Smuc, T. (2011). REVIGO summarizes and visualizes long lists of geneontology terms. PLoS ONE, 6(7):e21800.
Tautz, D. and Domazet-Loso, T. (2011). The evolutionary origin of orphan genes. Nature Rev. Genet., 12(10):692–702.
Tsankov, A., Thompson, D., Socha, A., Regev, A., and Rando, O. (2010). The role of nucleosome positioning in theevolution of gene regulation. PLoS Biol., 8(7):e1000414.
Wright, S. (1943). Isolation by distance. Genetics, 28:114–138.
Wright, S. (1950). Genetical structure of populations. Nature, 166(4215):247–249.
30