Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS....
![Page 1: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/1.jpg)
families with >5 genes are more common in plants than in animals
adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65
[Figure: bar chart. x-axis: number of genes per family (1, 2, 3-5, >5); y-axis: percentage of genes (0.0 to 100.0); series: Human, Yeast, Fruit fly, Nematode, Rice, Arabidopsis]
![Page 2: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/2.jpg)
alternative splicing (AS) is more common in animals than in plants
Boue S, et al. 2003. BioEssays 25: 1031-1034; Iida K, et al. 2004. Nucleic Acids Res 32: 5096-5103; Kikuchi S, et al. 2003. Science 301: 376-379
Arabidopsis and rice AS
![Page 3: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/3.jpg)
duplications occur on any length scale, from individual genes (where tandem refers to a gene and its duplicate being adjacent), to multi-gene segments of the chromosome, to an entire genome
e.g. wild wheat is diploid (2n); domestication gave a tetraploid (4n, pasta) and a hexaploid (6n, bread)
synteny is when 2 or more genes are found in the same order/orientation on the chromosomes of related species
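The definition of synteny above can be illustrated with a minimal sketch. It assumes each species is given as a list of shared gene identifiers in chromosomal order; the gene names, the `syntenic_blocks` helper, and the `min_genes` parameter are hypothetical, and the sketch ignores strand orientation and inverted blocks.

```python
def syntenic_blocks(order_a, order_b, min_genes=2):
    """Return runs of genes that are adjacent and in the same order in both lists."""
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    blocks, run = [], []
    for gene in order_a:
        if gene not in pos_b:
            # No counterpart in species B: close the current run.
            if len(run) >= min_genes:
                blocks.append(run)
            run = []
            continue
        if run and pos_b[gene] == pos_b[run[-1]] + 1:
            # Gene directly follows the previous one in species B as well.
            run.append(gene)
        else:
            if len(run) >= min_genes:
                blocks.append(run)
            run = [gene]
    if len(run) >= min_genes:
        blocks.append(run)
    return blocks


# Hypothetical gene orders on one chromosome of each species.
species_a = ["g1", "g2", "g3", "x1", "g7", "g8"]
species_b = ["g7", "g8", "y1", "g1", "g2", "g3"]
blocks = syntenic_blocks(species_a, species_b)
print(blocks)
```

Real comparisons tolerate gaps and rearrangements inside a block; this strict adjacency rule is only the simplest reading of "same order".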
![Page 4: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/4.jpg)
polyploidy (whole genome duplication) events among plants
adapted from Blanc G, Wolfe KH. 2004. Plant Cell 16: 1667-1678; Paterson AH, et al. 2004. Proc Natl Acad Sci USA 101: 9903-9908
[Figure labels: monocot, dicot]
![Page 5: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/5.jpg)
phylogeny of the favored plants
there is extensive synteny among Gramineae, but between Gramineae and Arabidopsis there is essentially no synteny
[Figure: phylogenetic tree of sorghum, maize, barley, wheat, rice, and Arabidopsis; Gramineae divergence 55~70 Mya; monocot-dicot divergence 170~235 Mya]
![Page 6: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/6.jpg)
the duplication history of rice
every cDNA-defined gene is assigned a duplication category
using the methods of Yu J, et al. 2005. PLoS Biol 3: e38
1. analysis relies entirely on 19,079 full length cDNAs; had we used predicted genes instead many of the duplications would have been missed
2. a homolog pair refers to a cDNA and its TblastN match (i.e. comparisons done at amino acid level to genome translation in all 6 reading frames) at an expectation value of 1E-7 and requiring that >50% be aligned; note that the TblastN match is not necessarily expressed itself
3. if a gene has any homologs at all, the mean(median) number of homologs is 40(5)
4. multiple duplications are difficult to analyze; so consider the cDNAs with 1-and-only-1 homolog
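The filtering described in points 2 and 4 can be sketched as follows. This is an illustration only, not the pipeline of Yu J, et al.: the record layout (`cdna`, `match`, `evalue`, `aligned_len`, `query_len`) is hypothetical, self-matches are assumed to have been removed already, and treating the 1E-7 cutoff as "less than or equal" is an assumption.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-7        # expectation value stated in the text
MIN_ALIGNED_FRACTION = 0.5   # ">50% be aligned"


def homolog_pairs(hits):
    """Group TblastN matches that pass both thresholds by query cDNA."""
    kept = defaultdict(list)
    for hit in hits:
        aligned_fraction = hit["aligned_len"] / hit["query_len"]
        if hit["evalue"] <= E_VALUE_CUTOFF and aligned_fraction > MIN_ALIGNED_FRACTION:
            kept[hit["cdna"]].append(hit["match"])
    return dict(kept)


def single_homolog_cdnas(pairs):
    """Keep only cDNAs with one and only one homolog."""
    return {cdna: matches[0] for cdna, matches in pairs.items() if len(matches) == 1}


# Hypothetical hits; none of these identifiers come from the rice data.
hits = [
    {"cdna": "cA", "match": "locus1", "evalue": 1e-30, "aligned_len": 280, "query_len": 300},
    {"cdna": "cB", "match": "locus2", "evalue": 1e-12, "aligned_len": 200, "query_len": 300},
    {"cdna": "cB", "match": "locus3", "evalue": 1e-9, "aligned_len": 160, "query_len": 300},
    {"cdna": "cC", "match": "locus4", "evalue": 1e-3, "aligned_len": 290, "query_len": 300},
    {"cdna": "cD", "match": "locus5", "evalue": 1e-20, "aligned_len": 100, "query_len": 300},
]
pairs = homolog_pairs(hits)
singles = single_homolog_cdnas(pairs)
print(singles)
```

Here `cB` has two homologs and is set aside, while `cC` and `cD` fail the expectation-value and alignment thresholds respectively.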
![Page 7: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/7.jpg)
ONE whole genome duplication, a recent segmental duplication, and many individual gene duplications
[Figure: gene birth and death plotted against time; labeled events: whole genome, recent segmental, individual genes]
![Page 8: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/8.jpg)
18 pairs of duplicated segments covering 65.7% of the rice genome
higher order homologs used to backfill established trend lines
[Figure: Rice-Rice Comparison dot plot. x-axis: Rice Chr02 (Mb), 0 to 30; y-axis ticks: 0 to 40; series: rice Chr01 to Chr12; marked trend: segmental]
![Page 9: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/9.jpg)
ancient whole genome duplication (WGD) in rice
![Page 10: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/10.jpg)
the plot is uninterpretable if cDNAs with more than one homolog in rice are used
mean (median) number of homologs per duplicated gene is 40 (5)
[Figure: Rice-Rice Comparison dot plot. x-axis: Rice Chr02 (Mb), 0 to 30; y-axis ticks: 0 to 40; series: rice Chr01 to Chr12]
![Page 11: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/11.jpg)
unmarked trend along diagonal from tandem gene duplications
there were NO segmental duplications within a chromosome
[Figure: Rice-Rice Comparison dot plot. x-axis: Rice Chr01 (Mb), 0 to 40; y-axis ticks: 0 to 40; series: rice Chr01 to Chr12; marked features: tandem, background]
![Page 12: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/12.jpg)
computing molecular clocks and indicators of evolutionary selection
Ka = non-synonymous changes per available site
Ks = synonymous changes per available site
available site corrects for the fact that 76% of single-base substitutions, or 438 of 576, encode a different amino acid
Ka/Ks < 1 is evidence of purifying selection
Ka/Ks = 1 is evidence of no selection (pseudogene)
Ka/Ks > 1 is evidence of adaptive selection
mean Ka/Ks is 0.20 in primates and 0.14 in rodents
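A short sketch can make the "available site" correction and the Ka/Ks thresholds concrete. It counts every single-base change across the 64 codons of the standard genetic code (stop codons treated as their own class) and finds that 438 of 576 changes alter the encoded residue, which is about 76%. The `selection_regime` helper and its `tolerance` parameter are hypothetical; the text only gives the three regimes, not how close to 1 counts as "equal".

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def count_single_base_changes():
    """Return (changes that alter the encoded residue, all single-base changes)."""
    total = changed = 0
    for codon, aa in CODON_TABLE.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                if CODON_TABLE[mutant] != aa:
                    changed += 1
    return changed, total


def selection_regime(ka, ks, tolerance=0.05):
    """Label a Ka/Ks ratio; 'tolerance' around 1 is an illustrative assumption."""
    ratio = ka / ks
    if abs(ratio - 1.0) <= tolerance:
        return "no selection"
    return "purifying" if ratio < 1.0 else "adaptive"


changed, total = count_single_base_changes()
print(changed, total, round(100 * changed / total))
print(selection_regime(0.02, 0.10))
```

Because non-synonymous changes have roughly three times as many opportunities as synonymous ones, raw counts must be divided by the number of available sites before Ka and Ks can be compared.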
![Page 13: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/13.jpg)
from neutral substitution rate to time since divergence of species
neutral substitution rates vary with genes and evolutionary lineages, but on average they are 2.2×10⁻⁹ for mammals and 6.5×10⁻⁹ for Gramineae (substitutions per silent site per year)
Kumar S, Hedges SB. 1998. Nature 392: 917-920
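[Figure: species1 and species2 diverging from a common ancestor]
time since divergence equals the number of substitutions per silent site between species1 and species2, divided by (2 × neutral substitution rate); the factor of 2 accounts for substitutions accumulating along both lineages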
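As a worked example of the relation above, the sketch below converts a Ks value into a divergence time using the average Gramineae rate quoted on this slide. The function name and the Ks value of 0.65 are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def divergence_time_years(ks, rate_per_site_per_year):
    """Time since divergence: Ks divided by twice the neutral substitution rate."""
    return ks / (2.0 * rate_per_site_per_year)


GRAMINEAE_RATE = 6.5e-9  # average neutral rate for Gramineae quoted in the text

# Hypothetical Ks of 0.65 substitutions per silent site between two lineages.
t = divergence_time_years(0.65, GRAMINEAE_RATE)
print(f"{t / 1e6:.0f} million years")
```

With the slower mammalian rate the same Ks would imply a much older split, which is why the rate has to match the lineage being dated.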
![Page 14: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/14.jpg)
17 of 18 segments are attributable to a whole genome duplication just before the Gramineae divergence
[Figure: two Ks histograms, Ks from K-Estimator. Panel 1: Rice-Rice segmental duplication, higher order homologs; x-axis: subs per silent site, Ks (0 to 1.5); y-axis ticks: 0 to 90. Panel 2: Rice-Rice tandem duplication, two TblastN hits are allowed; x-axis: subs per silent site, Ks (0 to 0.6); y-axis ticks: 0 to 400]
timing of WGD relative to Gramineae divergence is based on observed syntenies and not Ks
![Page 15: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/15.jpg)
background duplications have Ks signature like tandem duplications except that they are more ancient
[Figure: two Ks histograms, Ks from K-Estimator. Panel 1: Rice-Rice tandem duplication, two TblastN hits are allowed; x-axis: subs per silent site, Ks (0 to 0.6); y-axis ticks: 0 to 400. Panel 2: Rice-Rice background duplication, one and only one homolog; x-axis: subs per silent site, Ks (0 to 3); y-axis ticks: 0 to 200]
a peak at zero Ks followed by exponential decay is indicative of an ongoing duplication process
![Page 16: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/16.jpg)
duplicated genes undergo periods of relaxed selection and are usually silenced within 4~17 million years
hypothesis introduced by Lynch M, Conery JS. 2000. Science 290: 1151; with details in Lynch M, Conery JS. 2003. J Struct Funct Genomics 3: 35
[Figure: fate of a duplicated progenitor gene. Labels: one copy left alone; one copy to modify; relaxed selection; reduced expression; eventual death; novel function; post-duplicative 'transient' of duration 4~17 million years]
![Page 17: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/17.jpg)
rice analysis succeeded only because duplication is not too old
when the duplication is old: an analysis from yeast comparing related genomes with and without the duplication
Kellis M, et al. 2004. Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature 428: 617-624
when the duplication is extremely new: an analysis from human
Bailey JA, et al. 2002. Recent segmental duplications in the human genome. Science 297: 1003-1007
![Page 18: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/18.jpg)
proof of whole genome duplication in Saccharomyces cerevisiae by comparison to sequence of Kluyveromyces waltii
[Figure labels: duplication, mutation, gene death]
interleaving genes from sister segments in comparison to K. waltii
![Page 19: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/19.jpg)
gene and regional correspondences with K. waltii
![Page 20: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/20.jpg)
ancient whole genome duplication in S. cerevisiae
![Page 21: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/21.jpg)
identifying recent segmental duplications in human assembly
whole genome shotgun (WGS) reads from Celera are aligned to the map-based genome from IHGSC; recent segmental duplications are detected as anomalies in similarity and read depth
![Page 22: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/22.jpg)
patterns of intra-chromosomal and inter-chromosomal duplication
recent segmental duplications of length >10 kb and identity >95%; intra-chromosomal (blue lines) and inter-chromosomal (red bars) duplication; unique regions surrounded by intra-chromosomal duplications (gold bars) are hot spots for genomic disorders
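The selection criteria on this slide (length >10 kb, identity >95%, split by whether both copies sit on the same chromosome) can be sketched as a simple filter. The `Alignment` record and the example values are hypothetical; this is not the detection method of Bailey JA, et al., only the thresholding step applied to alignments that are assumed to exist already.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 10_000   # "length >10 kb"
MIN_IDENTITY = 0.95      # "identity >95%"


@dataclass(frozen=True)
class Alignment:
    chrom_a: str       # chromosome of the first copy
    chrom_b: str       # chromosome of the second copy
    length_bp: int     # aligned length in base pairs
    identity: float    # fraction of identical bases, 0 to 1


def classify_recent_duplications(alignments):
    """Keep alignments passing both thresholds, split into intra and inter."""
    result = {"intra": [], "inter": []}
    for aln in alignments:
        if aln.length_bp > MIN_LENGTH_BP and aln.identity > MIN_IDENTITY:
            kind = "intra" if aln.chrom_a == aln.chrom_b else "inter"
            result[kind].append(aln)
    return result


# Hypothetical alignments for illustration only.
alignments = [
    Alignment("chr7", "chr7", 25_000, 0.985),   # intra-chromosomal, passes
    Alignment("chr1", "chr16", 14_000, 0.962),  # inter-chromosomal, passes
    Alignment("chr2", "chr2", 8_000, 0.990),    # too short
    Alignment("chr5", "chr9", 40_000, 0.910),   # identity too low
]
found = classify_recent_duplications(alignments)
print(len(found["intra"]), len(found["inter"]))
```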
![Page 23: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/23.jpg)
recent segmental duplications in IHGSC and Celera genomes
proportion of Celera aligned bases falls rapidly as identity exceeds 97% or length exceeds 15-kb, but the total sequence lost is still only 2%~3%
NB: search of the map-based rice genome revealed no segmental duplications of recent origin (Yu J, et al. 2006. Trends Plant Sci 11: 387-391)
![Page 24: Families with >5 genes are more common in plants than in animals adapted from Lockton S, Gaut BS. 2005. Trends Genet 21: 60-65.](https://reader034.fdocuments.us/reader034/viewer/2022051215/5697bfbd1a28abf838ca23f5/html5/thumbnails/24.jpg)
“Although it is clear that the detailed clone-ordered approach is superior in the resolution of segmental duplications, it would be unrealistic to propose that the sequencing community should abandon whole-genome-shotgun based approaches. These are the most efficient cost-effective means of capturing the bulk of the euchromatic sequence.”
Evan E. Eichler (21 October 2004)