
Genetical genomics in humans and model organisms

Dirk-Jan de Koning and Chris S. Haley

The Roslin Institute, Roslin, Midlothian, UK, EH25 9PS

Genetical genomics has been proposed to map loci

controlling gene-expression differences (eQTLs) that

might underlie functional trait variation. We briefly

review the studies in model species and conclude that,

although they successfully demonstrate the utility of

genetical genomics, they are too limited to unlock the

full potential of this approach and some results should

be interpreted with caution. We subsequently elaborate

on two recent studies that use this approach in humans.

The many differences between these studies complicate

meaningful comparisons between them. A joint analysis

of the two experiments offers some scope for more powerful genetical genomics.

Introduction

Genetical genomics describes the combined study of

gene expression and marker genotypes in a segregating

population [1,2]. It aims to detect the genomic loci

that control gene-expression differences; these loci are

referred to as expression quantitative trait loci

(eQTLs; see Glossary).

To date, most of these studies have used model species

such as mice [3–5], maize [3], rats [6] and yeast [7,8]. The

experimental designs include recombinant inbred lines (RI; in rodents) [4–6], F2 or F3 crosses (in mice and maize)

[3] and haploid lines (in yeast) [7–9]. The common feature

of these designs is that, compared with ‘traditional’

phenotype-based QTL experiments, the sizes of the experiments

are modest to small. We have compared the

statistical power to detect different QTL effects among the

different eQTL studies to date and comment on potential

shortcomings (Box 1). The limited size of experiments can be

attributed to the expense of gene-expression analyses.

However, this should encourage collaborative efforts to

perform more powerful eQTL studies rather than multiple

studies that each lack sufficient power.

Cis and trans eQTL

eQTL can be classified as cis or trans acting based on the

location of the transcript compared with that of the eQTL

influencing the expression of that transcript. There is

 variation between studies in exactly how cis and trans are

defined, but generally the genome is divided into segments

(bins; to allow for inherent inaccuracy in the mapping of

eQTL) based on physical or mapping distance {e.g. 20 kb in

yeast [7], 5 Mb [4,5] or 2 cM (~3.6 Mb) in mice [3] and

20 Mb in rats [6]}. A QTL is cis-acting if it is located in the

same bin as the transcript it influences; otherwise it is

termed trans-acting.
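The bin-based rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the chromosome labels and the positions are hypothetical, bins are taken as fixed, non-overlapping windows of physical distance, and none of the cited studies necessarily implemented the rule this way.

```python
def classify_eqtl(transcript_chrom, transcript_pos, eqtl_chrom, eqtl_pos, bin_size):
    """Return 'cis' if the eQTL lies in the same genome bin as the transcript, else 'trans'.

    Positions and bin_size are in base pairs; bins are fixed windows starting at 0
    (an assumption made for this sketch).
    """
    if transcript_chrom != eqtl_chrom:
        return "trans"
    same_bin = (transcript_pos // bin_size) == (eqtl_pos // bin_size)
    return "cis" if same_bin else "trans"


# Physical bin sizes quoted in the text, in base pairs.
BIN_SIZES_BP = {"yeast": 20_000, "mouse": 5_000_000, "rat": 20_000_000}

# Hypothetical positions, for illustration only.
print(classify_eqtl("chr1", 12_300_000, "chr1", 14_900_000, BIN_SIZES_BP["mouse"]))
print(classify_eqtl("chr1", 12_300_000, "chr1", 15_100_000, BIN_SIZES_BP["mouse"]))
print(classify_eqtl("chr1", 12_300_000, "chr7", 12_300_000, BIN_SIZES_BP["mouse"]))
```

With fixed bins, a transcript and an eQTL that sit close together but on either side of a bin boundary are labelled trans, which is one reason why the exact definition matters when results are compared between studies.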

Differences in microarray platform and their effect on

eQTL studies

Differences in performance between microarray platforms

have been discussed in detail elsewhere [10]. Because

genetical genomics combines sequence polymorphisms

with variation in expression levels, it is important to

establish how robust the RNA measurement is against

sequence variation [e.g. single nucleotide polymorphisms

(SNPs)] in the transcript. The robustness of Affymetrix chips (http://www.affymetrix.com) against spurious cis-

effects resulting from SNPs in the transcripts has been

evaluated by re-sequencing some of the genes with cis-

effects in rats [6] and by using available SNP data in mice

[5]. Both studies concluded that the effect of SNP variation

on the detection of cis-acting eQTLs was limited. An

alternative approach for Affymetrix chips would be to

study probe–eQTL interactions for cis-acting eQTL

because Affymetrix chips use multiple probes to interrogate

each transcript (Ritsert Jansen, personal communication).

Agilent 60-mer oligonucleotide arrays were

shown to be robust against four or fewer SNPs in the probe

region [11].

Glossary

Bonferroni correction: a statistical adjustment for multiple comparisons. The Bonferroni correction is simple: if a number (n) of outcomes are being tested instead of a single outcome, the desired threshold level (P) is divided by n.

False discovery rate: the proportion of false-positive test results among all significant tests (note that the FDR is conceptually different to the significance level).

Haploid line: a line that is derived by crossing two strains and subsequently manipulating the F1 gametes to develop into fully homozygous individuals.

Heritability: a statistic that estimates the proportion of variation in a trait that is attributable to genetic factors.

Phenotypic standard deviations: a statistic that describes the dispersion of data about the mean.

Quantitative trait locus: genetic loci or chromosomal regions that contribute to variability in complex quantitative traits, as identified by statistical analysis. Quantitative traits are typically affected by several genes and by the environment.

Recombinant inbred lines: a strain that is formed by crossing two strains, followed by 20 or more consecutive generations of brother–sister mating or selfing. The resulting lines are homozygous (and therefore fixed) at each locus, enabling repeated replicates of genetically homogeneous lines to be assayed.

Statistical power: a statistic that describes how effective a given experiment is to detect a certain effect. Statistical power is expressed as the proportion of tests that are expected to be significant given a certain experiment and a certain effect.

Corresponding author: de Koning, D.-J. (DJ.deKoning@BBSRC.ac.uk).

Available online 23 May 2005

Update TRENDS in Genetics Vol.21 No.7 July 2005 377

www.sciencedirect.com

Major hubs of gene regulation: fact or artefact?

A common feature of eQTL studies is the detection of

‘hotspots’ or hubs of trans-acting eQTL: chromosomal

regions that affect the expression of a much larger number

of genes than expected by chance. These major hubs of

gene regulation are most prominent in yeast (eight) [7,8],

followed by mice (approximately seven) [3–5]. Clustering

of eQTL was not reported for maize [3]. The locations of 

the trans-acting eQTL show limited overlap among the

three mouse eQTL studies [3–5], which could be due to

tissue-specific trans regulation. Although the most significant

eQTL are cis-acting, the detection of trans-acting

regulatory hubs is plausible if cis-regulation provides

more direct (i.e. less variable) genetic control than trans

regulation, ensuring that cis-acting effects are larger and

more consistent. Alternatively, it could be that the

proportion of false positive eQTL is greater among trans-acting

effects.

The strong clustering in ‘hubs’ of eQTLs reflects the

highly correlated expression levels of many gene transcripts.

This is illustrated by a recent simulation study

using real expression data from human pedigrees with a

simulated SNP map that was independent of the

expression levels [12]. As a result, all eQTLs detected

were by default false positives. The eQTL analyses showed

strong clustering of (trans) eQTLs and the five most

populated bins contained 20% of the significant, but

spurious, eQTLs [12]. Thus, although both the high

correlation of expression levels among gene transcripts

and the detection of eQTL hotspots in experimental

studies can be interpreted to support the hypothesis of coordinated trans-regulation of multiple genes, a major

concern is whether the correlation could be due to some

technical or environmental factors that are currently

unaccounted for. For example, the clustering of eQTL for

multiple traits could simply represent the clustering of 

spurious QTL for highly correlated traits (i.e. with so

many traits we expect to see many false-positive QTL

effects, and if traits are highly correlated, for whatever

reason, these false-positive QTLs will often locate to the

same region). Because of the limited understanding of 

genetic and physiological control of gene expression and

the limited experimental sizes so far, any conclusions with

regard to hotspots for gene regulation should be interpreted with caution.
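One way to test for hotspots, listed in Table 1 for one of the human studies, is to divide the genome into bins and ask whether the number of eQTL in a bin deviates from a Poisson distribution. The sketch below illustrates that idea only; the function names, the per-bin counts and the Bonferroni adjustment across bins are assumptions made for this example, not a description of any published analysis.

```python
import math


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as one minus the lower tail."""
    lower = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - lower


def hotspot_bins(counts, alpha=0.05):
    """Indices of bins with more eQTL than expected under a uniform Poisson null.

    The null rate is the mean count per bin; alpha is divided by the number of
    bins (a Bonferroni adjustment, assumed here for illustration).
    """
    lam = sum(counts) / len(counts)
    threshold = alpha / len(counts)
    return [i for i, c in enumerate(counts) if poisson_tail(c, lam) < threshold]


# Hypothetical eQTL counts in ten genome bins.
counts = [2, 1, 3, 0, 2, 14, 1, 2, 3, 2]
print(hotspot_bins(counts))
```

The Poisson null treats every eQTL as an independent event. As argued above, eQTL for highly correlated transcripts are not independent, so a bin flagged by this kind of test is not, on its own, evidence for a genuine regulatory hub.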

eQTL studies in human cell lines

 Although the genetic complexity of most eQTL studies is

limited because of the use of inbred resources, two

recent studies report eQTL in analyses of cell lines

derived from human pedigrees [13,14]. These initial

studies both used lymphoblastoid cell lines from the

CEPH pedigrees (http://www.cephb.fr/cephdb/) but otherwise

have differences at almost every level of execution

(Table 1). Many of the differences between the two studies

are not unique to genetical genomics: discrepancies

Box 1. The power of eQTL studies to date

Table I summarizes the statistical power to detect QTL for some eQTL studies to date and compares these with hypothetical F2 designs that are commonly encountered in QTL detection. For example, an eQTL with a heritability of 0.03 (i.e. the eQTL explains 3% of the variation in RNA abundance among the F2 mice) would be detected in 7% of the experiments performed with 111 F2 mice [3] and 16% of the experiments with 86 haploid yeast lines [8].

Although the experiment using 112 haploid yeast lines [9] is the most powerful of all the studies, most studies have limited power to detect any QTL with an effect <0.5 phenotypic standard deviations (SD; equivalent to a QTL heritability of 0.13). As a result, the studies fail to detect many loci with moderate effects on gene regulation and are also expected to miss some loci with major effects. The statistical threshold that we have used for the power calculations is reasonably stringent for a single trait, but fairly liberal overall, considering that eQTL studies commonly examine the expression levels of thousands of genes. This is a major issue in genetical genomics because it uses multiple testing in two dimensions: hundreds of markers are tested for their putative effect on >10 000 gene transcripts. Traditional approaches, such as the Bonferroni correction, that limit the discovery of spurious effects by increasing the stringency on the statistical significance threshold are demanding as the thresholds become prohibitive for the detection of all but the most extreme effects. Alternatives such as the false discovery rate have been proposed for genome scans and gene-expression studies [15], and an overview of multiple testing issues and alternatives in genetics was recently presented by Manly et al. [16].
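The contrast between the two kinds of correction can be illustrated with a short Python sketch. It rests on several assumptions: the marker and transcript numbers are hypothetical, the P values are made up, and the false discovery rate is controlled with the Benjamini–Hochberg step-up procedure, which is a different (and simpler) method from the one in reference [15].

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values, fdr=0.05):
    """Indices of tests declared significant by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose P value is below its rank-specific threshold.
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical scan: 300 markers tested against 10 000 transcripts.
n_tests = 300 * 10_000
print(bonferroni_threshold(0.05, n_tests))

# Made-up P values for six tests.
p_values = [0.0001, 0.004, 0.019, 0.03, 0.2, 0.5]
bonferroni_hits = [i for i, p in enumerate(p_values)
                   if p <= bonferroni_threshold(0.05, len(p_values))]
print(bonferroni_hits)
print(benjamini_hochberg(p_values))
```

In this toy example the Bonferroni threshold admits fewer tests than the false-discovery-rate rule, which mirrors the trade-off described in the box: stricter control of spurious effects at the cost of missing all but the largest ones.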

Table I. A comparison of statistical power to detect QTL in eQTL studies

Refs                        Population                  Nᵃ    Statistical power for different QTL effectsᵇ
QTL effect (phenotypic SD)ᶜ                                   0.25   0.40   0.5    0.6    0.75
QTL heritability in F2 (variance explained)ᵈ                  0.03   0.08   0.13   0.18   0.28
Brem et al. [7]             Haploid yeast               40    0.05   0.2    0.51   0.73   0.99
Yvert et al. [8]            Haploid yeast               86    0.16   0.67   0.94   0.99   0.99
Brem and Kruglyak [9]       Haploid yeast               112   0.25   0.84   0.99   0.99   0.99
Schadt et al. [3]           F2 mice                     111   0.07   0.37   0.68   0.90   0.99
Schadt et al. [3]           F3 maize                    76    0.04   0.19   0.41   0.67   0.94
Chesler et al. (mice);
Bystrykh et al. (mice);
Hubner et al. (rats) [4–6]  Recombinant inbred linesᵉ   33    0.05   0.29   0.62   0.91   0.99
Hypothetical                F2                          200   0.21   0.77   0.96   0.99   0.99
Hypothetical                F2                          400   0.60   0.99   0.99   0.99   0.99

ᵃ Number of individuals with expression data.

ᵇ The probability of detecting as significant a QTL using a point-wise significance threshold of P < 0.001, which corresponds to a LOD score of 3.0 for an F2 design (slightly more stringent than the proposed threshold for suggestive linkage but much less stringent than the threshold for significant linkage [17]). The power calculations account for different experimental designs but not for different genome length between species (the greater number of independent tests performed in a larger genome requires a more stringent significance threshold).

ᶜ Additive effect of the QTL (half of the difference between homozygotes) expressed in units of the phenotypic standard deviation.

ᵈ The proportion of the total variation in the population explained by the QTL, assuming an F2 population where the QTL allele frequencies are both 0.5. In an RI or haploid system, the heritability of the QTL is twice the magnitude in an F2.

ᵉ Assuming a repeatability of 0.50 for gene transcripts and three replicates for every recombinant inbred (RI) line.
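The two header rows of Table I, and the LOD equivalence in footnote b, can be related with a little arithmetic. The sketch below is a reading of the footnotes rather than the authors' own calculation, and makes two assumptions: the QTL is purely additive (no dominance) with allele frequencies of 0.5, and the LOD score is converted using a likelihood-ratio test with 2 degrees of freedom, as would apply when additive and dominance effects are both fitted in an F2. The function names are hypothetical.

```python
def qtl_heritability(additive_effect_sd, design="F2"):
    """Proportion of phenotypic variance explained by an additive QTL.

    additive_effect_sd is half the difference between homozygotes, in
    phenotypic SD units. In an F2 with allele frequencies of 0.5 the
    variance explained is a^2 / 2; in RI or haploid lines it is twice that.
    """
    h2_f2 = additive_effect_sd ** 2 / 2
    if design == "F2":
        return h2_f2
    if design in ("RI", "haploid"):
        return 2 * h2_f2
    raise ValueError("design must be 'F2', 'RI' or 'haploid'")


def lod_to_pointwise_p(lod):
    """Point-wise P value for a LOD score, assuming a chi-square test with 2 df.

    For 2 df the chi-square tail is exp(-x / 2) with x = 2 * ln(10) * LOD,
    which simplifies to 10 ** (-LOD).
    """
    return 10 ** (-lod)


for effect in (0.25, 0.40, 0.5, 0.6, 0.75):
    print(effect, qtl_heritability(effect), qtl_heritability(effect, "RI"))

print(lod_to_pointwise_p(3.0))
```

Under these assumptions the F2 values agree with the second header row of Table I after rounding to two decimals, and a LOD score of 3.0 corresponds to the point-wise threshold of 0.001 quoted in footnote b.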


between gene-expression platforms, different statistical

methods and protocols are common obstacles when

comparing different microarray studies. Although the

studies overlap for about half (eight) of the CEPH families

studied, they use different genetic marker sets and

different methods for expression analysis and eQTL

analysis. Furthermore, they use different criteria for

including genes in their eQTL analysis and apply different

thresholds for QTL detection (Table 1). The results

of the two studies are also remarkably different:

Morley et al. take ~42% of the genes (n = 3554) on their arrays forward to eQTL analysis, whereas Monks et al.

use only ~10% (n = 2430; Table 1). At comparable

significance levels (3.7 × 10⁻⁵ and 5.0 × 10⁻⁵, respectively),

Morley et al. report eQTL for ~28% of the genes

that were taken forward to eQTL analysis compared with

~2% for Monks et al. (Table 1). Figure 1 shows the

theoretical power for detection of QTL for the two studies

using the two methods of QTL analysis. The QTL methods

are briefly explained in Box 2. For the sib-pair analyses,

both studies had similar power. The power calculations

confirm that variance component methods such as

sequential oligogenic linkage analysis routines (SOLAR)

are theoretically slightly more powerful than sib-pair

analyses, because they use all of the genetic relationships

within the pedigree. However, the power difference does

not explain the marked difference in numbers of QTL

detected by the two studies. The greater number of eQTLs

for the Morley et al. study could be due to several factors

including: (i) less technical noise in gene-expression

measurements, resulting in a larger proportion of the

 variance attributable to the QTL effect; (ii) environmental

conditions that promote greater genetically controlled

variation in expression; or (iii) less robust gene-expression

measurements or analyses, making the results more

prone to bias and false positive results. Given the low

power of both studies to detect eQTLs under the stringent

thresholds that they apply, the results of Monks et al. are

more consistent with prior expectation, unless eQTL

effects are much stronger than those of phenotypic QTL.

 Although the low theoretical power does not explain why

Morley et al. detect more QTL than Monks et al., it would

explain differences in genes for which eQTL are detected,

in addition to discrepancies in finding eQTL in different

locations for a particular transcript. When both studies

have limited power to detect a given QTL, they will each

only detect a small proportion of actual eQTL and are

hence unlikely to detect the same effects.

Table 1. A comparison between two eQTL analyses on human CEPH dataᵃ

CEPH families used
  Morley et al. [14]: 14 (eight in common)
  Monks et al. [13]: 15 (eight in common)

Gene expression: platform
  Morley et al. [14]: Affymetrix genome focus 25-mer oligonucleotide arrays
  Monks et al. [13]: Agilent 60-mer oligonucleotide array

Gene expression: genes on array
  Morley et al. [14]: ~8500
  Monks et al. [13]: 23 499

Gene expression: design and replicates
  Morley et al. [14]: Direct measurement with two array replicates per individual
  Monks et al. [13]: Reference design with at least two arrays per individual

Criterion for selecting genes for eQTL analysis
  Morley et al. [14]: Greater variation between individuals than within
  Monks et al. [13]: Differentially expressed in at least half of the offspring

Genes taken forward to eQTL analysis
  Morley et al. [14]: 3554
  Monks et al. [13]: 2430

Marker genotypes
  Morley et al. [14]: 2756 autosomal SNP markers from the SNP consortium database
  Monks et al. [13]: 346 autosomal markers, selected from CEPH genotype database

Data availability
  Morley et al. [14]: Genotypes available at http://www.ceph/fr/cephdb; expression data at http://www.ncbi.nlm.nih.gov/geo/ (GEO accession GSE1485)
  Monks et al. [13]: Genotypes available at http://www.ceph/fr/cephdb; expression data at http://www.ncbi.nlm.nih.gov/geo/ (GEO accession GSE1726)

eQTL analyses
  Morley et al. [14]: (i) Sib-pair analyses using S.A.G.E. for whole genome analysis; (ii) QTDT and association study for 17 genes with cis-acting eQTL
  Monks et al. [13]: Variance component analyses using SOLAR for both heritability of transcript level and eQTL

Test for hubs of gene regulation
  Morley et al. [14]: 5 Mb genome bins, testing for deviation from Poisson distribution
  Monks et al. [13]: At 4 cM (~3.2 Mb) intervals, comparing number of hits with those obtained by simulation

eQTL results
  Morley et al. [14]: 142 genes with at least one eQTL (P < 4.3 × 10⁻⁷); 984 genes with at least one eQTL (P < 3.7 × 10⁻⁵)
  Monks et al. [13]: 33 genes with at least one eQTL (P < 5.0 × 10⁻⁶); 50 genes with at least one eQTL (P < 5.0 × 10⁻⁵); 135 genes with at least one eQTL (P < 5.0 × 10⁻⁴)

Hubs of gene regulation
  Morley et al. [14]: Two hotspots on chromosomes 14 and 20 affecting seven and six genes, respectively (using P < 4.3 × 10⁻⁷) or 31 and 35 genes, respectively (using P = 3.7 × 10⁻⁵)ᵇ
  Monks et al. [13]: Six locations with five or six linkage hits on chromosome 6; according to the authors, these are attributable to allelic diversity and non-specificity of gene probes and were therefore dismissed

Other analyses
  Morley et al. [14]: Hierarchical clustering of genes within 5 Mb window on chromosome 14; RT–PCR for one gene with a large cis effect
  Monks et al. [13]: Test for enrichment of certain annotations among differentially expressed genes; 574 genes with non-zero heritability, which were subsequently clustered using genetic or phenotypic correlations

ᵃ Abbreviations: GEO, gene expression omnibus.

ᵇ A different number of genes are affected by the eQTL, depending on the P value used.

Both studies agree that the most significant QTL

appear to be cis-acting, whereas the proportion of cis-acting

eQTL is smaller in Morley et al. (~22%) than in

Monks et al. (~40%) for the most stringent significance

levels. However, although Morley et al. claim support for

two trans-acting hubs of regulation on chromosomes 14

and 20, Monks et al. claim ‘lack of evidence for linkage

hotspots’, although their permutations show that eQTL are significantly ‘unevenly distributed’. However, Monks

et al. make their statement based on the eQTL with P <

0.000005, whereas Morley et al. use P < 0.000037 (7.4

times larger) to claim the larger hubs. Therefore, the

difference in threshold, and the difference in genes that

were analysed, could explain this discrepancy.

 An interesting aspect of the Morley et al. article is the

follow-up analyses on cis-acting QTL: they perform a

within family association test [quantitative transmission

disequilibrium test (QTDT); Box 2] with additional SNP

markers for 17 transcripts. Furthermore, they re-estimate

the magnitude of these QTL effects by a regression

analysis on the grandparent data, giving a more realistic

estimate of the actual QTL effect. This provides a solution

to the problem that when QTLs are initially detected in a

study with low power, the effects of those that are detected

can be grossly overestimated. This overestimation of QTL

effects is apparent in the article by Monks et al., who

report genes with two or three eQTLs and even one gene with 15

eQTLs. Subsequently, they claim that ‘all detectable QTL

accounted for at least 50% of the trait variance with 75% of

the QTL having heritabilities >0.76’. This illustrates the level at which QTL effects are overestimated: it is

impossible to have 15 eQTL, each explaining 50% of the

trait variance. This phenomenon is not unique to eQTL,

but it illustrates the issue particularly well.
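The arithmetic behind this objection can be written out. As a sketch, assume that the 15 eQTL contribute independently to the trait, so that their heritabilities add up and the total cannot exceed the phenotypic variance; the symbol h_i^2 for the heritability of the i-th eQTL is notation introduced here for illustration.

```latex
\sum_{i=1}^{15} h_i^2 \le 1
\qquad\text{but}\qquad
\sum_{i=1}^{15} h_i^2 \ge 15 \times 0.50 = 7.5 > 1 .
```

Under that assumption the two conditions cannot both hold, so at least some of the 15 reported effects must be overestimates.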

Morley et al. confirm one of their cis-acting eQTL by

quantitative PCR, which would seem to allay concerns about

SNP variation in the probe. However, only a single gene was

confirmed; therefore, no general conclusion can be drawn

from this result. Monks et al. discuss the potential problem

of SNP variation in the probe sequence and subsequently

question their own results for the human leukocyte antigen

(HLA) area, which harbours substantial sequence variation.

[Figure 1 plot: ‘Power of eQTL studies in human pedigrees’. x-axis: QTL heritability (0 to 0.5); y-axis: power to detect eQTL (0 to 1.0); curves: Sib-pair, VCA (Morley et al.), VCA (Monks et al.).]

Figure 1. The statistical power to detect the eQTL of given heritability for the two studies using either a sib-pair analysis or a variance component analysis (VCA). Using sib-pair analyses (red), both studies had similar power; therefore, only a single line is shown. The statistical power is defined as the proportion of analyses in which a QTL with a given effect will be detected under a defined P value (in this case P < 0.0001, which is still less stringent than the proposed genome-wide threshold [17]). The power for the sib-pair analyses was assessed using the genetic power calculator [20] (http://statgen.iop.kcl.ac.uk/gpc/). The power for the VCA (pink and blue) was assessed using routines that were kindly provided by Xijiang Yu (University of Edinburgh) based on Williams and Blangero [21], using the CEPH pedigrees. For all power calculations, the background heritability was assumed to be 0.30. To restrict the pedigree from the original 210 members to the 167 that were used by Monks et al., 43 individuals were randomly deleted from the power calculations. For a brief explanation of QTL methods, see Box 2.

Box 2. QTL methods used in the eQTL analyses of human data

Sib-pair analysis

Morley et al. [14] applied a sib-pair analysis using the SIBPAL procedure from S.A.G.E. (http://darwin.cwru.edu/sage/index.php). A sib-pair analysis determines evidence for linkage between a marker and a quantitative trait by regressing the phenotypic difference between sibs on the proportion of alleles that are shared identical by descent (IBD) between the sibs.

Variance component QTL analysis

Monks et al. [13] applied a variance component QTL analysis using SOLAR (http://www.sfbr.org/solar/) [18]. In a variance component QTL analysis, the proportion of phenotypic variation attributable to a QTL is estimated across a population using the IBD proportions between all related individuals for a putative QTL location.

Quantitative transmission disequilibrium test (QTDT)

Morley et al. [14] used a family-based association test to confirm some of the cis-acting eQTL. Transmission disequilibrium tests (TDT) were initially proposed for studying mendelian disorders and provide a combined test for linkage and association by comparing the transmitted and non-transmitted marker alleles from the parents with those of the affected offspring. The quantitative TDT (QTDT), used by Morley et al., extended this methodology to complex traits where direct classification of offspring is not possible [19].
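The regression behind the sib-pair analysis described in this box can be sketched as follows. This is a minimal illustration, not the SIBPAL implementation: it assumes the classic form of the method, in which the squared trait difference of each sib pair is regressed on the pair's IBD proportion at a marker, and the function name and data are hypothetical.

```python
def sib_pair_slope(squared_diffs, ibd_props):
    """Least-squares slope of squared sib-pair trait differences on IBD proportion.

    A clearly negative slope (sibs sharing more alleles IBD are more alike)
    is taken as evidence for linkage between the marker and the trait.
    """
    n = len(squared_diffs)
    mean_x = sum(ibd_props) / n
    mean_y = sum(squared_diffs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ibd_props, squared_diffs))
    sxx = sum((x - mean_x) ** 2 for x in ibd_props)
    return sxy / sxx


# Made-up data for six sib pairs: IBD proportion at one marker and the
# squared difference in expression level of one transcript.
ibd_props = [0.0, 0.5, 0.5, 1.0, 0.0, 1.0]
squared_diffs = [2.1, 1.2, 0.9, 0.2, 1.8, 0.4]

slope = sib_pair_slope(squared_diffs, ibd_props)
print(slope)
```

In a real analysis the slope would be tested against zero with a one-sided test, and the test would be repeated for every marker and every transcript, which is where the multiple-testing burden discussed in Box 1 arises.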


Concluding remarks

Both articles present an interesting set of results but

appear to share only limited theoretical power to detect eQTL

of small to moderate sizes. A first step to compare both

studies would be to analyse each experiment

with the methods that were applied in the other

study (i.e. re-analyse the data from Morley et al. with

SOLAR and the data from Monks et al. with a sib-pair

analysis). Given that the pedigree details, genotype and gene-expression data for both studies are available online

(Table 1), ongoing exploration of these data sets is

expected to shed further light on the differences and

similarities between the two studies.

eQTL studies have been successfully linked to variation

in disease phenotype in mice [3] and rats [6]. Although the

current examples of eQTL mapping in humans lack this

important aspect (and motivation) of eQTL mapping, these

authors might have paved the way for future eQTL studies

that will address the complex nature of human disease.

Acknowledgements

We acknowledge financial support from the BBSRC. We are grateful to the two referees, and to John Gibson, Ritsert Jansen and Rob Williams for

constructive comments on an earlier draft of this article. We also thank Ritsert

Jansen and Rob Williams for sharing their manuscripts on BXD data.

References

1 Jansen, R.C. and Nap, J.P. (2001) Genetical genomics: the added value from segregation. Trends Genet. 17, 388–391

2 Jansen, R.C. (2003) Studying complex biological systems using multifactorial perturbation. Nat. Rev. Genet. 4, 145–151

3 Schadt, E.E. et al. (2003) Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297–302

4 Bystrykh, L. et al. (2005) Uncovering regulatory pathways that affect hematopoietic stem cell function using ‘genetical genomics’. Nat. Genet. 37, 225–232

5 Chesler, E.J. et al. (2005) Complex trait analysis of gene expression uncovers polygenic and pleiotropic networks that modulate nervous system function. Nat. Genet. 37, 233–242

6 Hubner, N. et al. (2005) Integrated transcriptional profiling and linkage analysis for identification of genes underlying disease. Nat. Genet. 37, 243–253

7 Brem, R.B. et al. (2002) Genetic dissection of transcriptional regulation in budding yeast. Science 296, 752–755

8 Yvert, G. et al. (2003) Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat. Genet. 35, 57–64

9 Brem, R.B. and Kruglyak, L. (2005) The landscape of genetic complexity across 5700 gene expression traits in yeast. Proc. Natl. Acad. Sci. U. S. A. 102, 1572–1577

10 Tan, P.K. et al. (2003) Evaluation of gene expression measurements from commercial microarray platforms. Nucleic Acids Res. 31, 5676–5684

11 Hughes, T.R. et al. (2001) Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat. Biotechnol. 19, 342–347

12 Perez-Enciso, M. (2004) In silico study of transcriptome genetic variation in outbred populations. Genetics 166, 547–554

13 Monks, S.A. et al. (2004) Genetic inheritance of gene expression in human cell lines. Am. J. Hum. Genet. 75, 1094–1105

14 Morley, M. et al. (2004) Genetic analysis of genome-wide variation in human gene expression. Nature 430, 743–747

15 Storey, J.D. and Tibshirani, R. (2003) Statistical significance for genomewide studies. Proc. Natl. Acad. Sci. U. S. A. 100, 9440–9445

16 Manly, K.F. et al. (2004) Genomics, prior probability, and statistical tests of multiple hypotheses. Genome Res. 14, 997–1001

17 Lander, E. and Kruglyak, L. (1995) Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat. Genet. 11, 241–247

18 Almasy, L. and Blangero, J. (1998) Multipoint quantitative-trait linkage analysis in general pedigrees. Am. J. Hum. Genet. 62, 1198–1211

19 Abecasis, G.R. et al. (2000) A general test of association for quantitative traits in nuclear families. Am. J. Hum. Genet. 66, 279–292

20 Purcell, S. et al. (2003) Genetic power calculator: design of linkage and association genetic mapping studies of complex traits. Bioinformatics 19, 149–150

21 Williams, J.T. and Blangero, J. (1999) Power of variance component linkage analysis to detect quantitative trait loci. Ann. Hum. Genet. 63, 545–563

0168-9525/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.

doi:10.1016/j.tig.2005.05.004
