Post on 19-Dec-2015
4: Genome evolution
Gene Duplication
Gene Duplication - History
1936: The first observation of a duplicated gene was in the Bar gene of Drosophila.
1950: Alpha and beta chains of hemoglobin are recognized to have been derived from gene duplication.
1970: Ohno developed a theoretical framework of gene duplication.
1995: Gene duplications are studied in fully sequenced genomes.
Types of Genomic Duplications
•Part of an exon or the entire exon is duplicated
•Complete gene duplication
•Partial chromosome duplication
•Complete chromosome duplication
•Polyploidy: full genome duplication
Mechanism of Gene Duplication
Genes are duplicated mainly due to unequal crossing over.
If the regions flanking a gene are similar in sequence, the chance of unequal crossing over increases. For example, both regions may be copies of the same repeated sequence (microsatellite, transposon, etc.).
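As a rough illustration of how a misaligned crossover produces a duplication, the sketch below models a chromatid as a list of labels. The labels ("up", "R", "G", "down"), the function name `unequal_crossover`, and the break positions are hypothetical and chosen only for this example; it is a toy model, not a description of the molecular mechanism.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Recombine two chromatids that are cut at (possibly different) positions.

    chrom_a, chrom_b: lists of labels (genes or repeats).
    break_a, break_b: index at which each chromatid is cut.
    Returns the two reciprocal recombinant products.
    """
    product_1 = chrom_a[:break_a] + chrom_b[break_b:]
    product_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return product_1, product_2


# Hypothetical chromatid: gene G flanked on both sides by the same repeat R.
chromatid = ["up", "R", "G", "R", "down"]

# Equal crossing over: both chromatids are cut at the same position,
# so both products look like the parents.
same_1, same_2 = unequal_crossover(chromatid, chromatid, 2, 2)

# Unequal crossing over: the right-hand R of one chromatid pairs with the
# left-hand R of the other, so the cuts fall at different positions.
duplicated, deleted = unequal_crossover(chromatid, chromatid, 4, 2)

print(duplicated)  # one product carries two copies of G
print(deleted)     # the reciprocal product carries none
```

The two products are reciprocal: whatever one chromatid gains, the other loses.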
After a Gene is Duplicated
Alternative fates:
1. It can die and become a pseudogene.
2. It can retain its original function, thus allowing the organism to produce double the amount of the derived protein.
3. The two copies can diverge and each one will specialize in a different function.
[Figure labels: identical copies; one copy dies; divergence]
Invariant repeats
If the duplicated genes are identical or nearly identical, they are called invariant repeats. Many times the effect is an increase in the quantity of the derived protein, and this is why these duplications are also called “dose repetitions”.
Classical examples are the genes encoding rRNAs and tRNAs needed for translation.
Variant repeats
Some classic examples:
Trypsin (a digestive enzyme) and thrombin (which cleaves fibrinogen during blood clotting) were derived from a complete gene duplication.
Lactalbumin (connected with lactose synthesis) and lysozyme (which degrades bacterial cell walls) are also the result of an ancient gene duplication.
Dose Repetition
Gene duplication in mosquito as a response to insecticides
Kingdom = Metazoa (humans are also Metazoa)
Phylum = Arthropoda (humans are Chordata)
Class = Insecta (humans are Mammalia)
Order = Diptera (humans are Primates)
Genus = Culex (humans are Homo)
Species = pipiens (humans are sapiens)
Organophosphorus insecticides
Organophosphorus insecticides (e.g., parathion and malathion) interact with many enzymes. In particular, they inhibit acetylcholinesterase (AChE) activity in the central nervous system, inducing lethal conditions.
Acetylcholine is a neurotransmitter that, upon release from neurons, stimulates the opening of Na+ and K+ channels. These channels regulate the function of the brain as well as the heart, lungs, and skeletal muscles.
Acetylcholinesterase catalyzes the hydrolysis of acetylcholine to form inactive acetate and choline.
[Figure: Acetylcholinesterase. Labels: cholinergic neuron, acetyl-CoA + choline, acetylcholine, acetylcholinesterase, postsynaptic tissue. A second panel adds the insecticide.]
Esterases
Esterases are detoxifying carboxylester hydrolases that are responsible for resistance to organophosphorus insecticides.
These enzymes are non-specific.
[Figure: Detoxifying esterases. Labels: cholinergic neuron, acetyl-CoA + choline, acetylcholine, postsynaptic tissue, insecticide, esterase.]
Culex pipiens typically has 2 genes encoding esterases: Est-3 and Est-2. These genes are separated by an intergenic DNA fragment varying between 2–6 kb.
Est-3 Est-2
Alignment of predicted estα2 and estβ2 amino acid sequences of Culex quinquefasciatus
~47% similarity between the two sequences
[Biochem.J.(1997) 325,359-365]
Resistance alleles correspond to esterase over-production (the esterase binds or metabolizes the insecticide) relative to the basal esterase production of susceptibility alleles. Several resistance alleles have been described.
Different alleles show 85-90% similarity.
Esterase starch gel
For most alleles, the over-production of esterase is the result of gene duplication. This concerns either one locus or both.
[Figure: duplication of the Est-3 (A) and Est-2 (B) loci. Labels: 47% similarity; ~100% similarity.]
Nomenclature for the various resistance genes and their products at the Ester resistance locus
Genetica 112–113: 287–296, 2001
The duplication of the two esterase loci explains the tight statistical association of some electromorphs, such as A2 and B2. Although A4, A2 and A1 are coded by alleles at the Est-3 locus, and B2 and B4 by alleles at the Est-2 locus, A1, A4-B4 and A2-B2 are considered alleles of a single superlocus (named Ester).
Independent amplifications have occurred only a few times.
The level of gene duplication varies between the different alleles:
• EsterB1 can easily reach 100 copies in the field.
• Ester4 has never been found above a few copies.
It also varies within and among populations for a given amplified allele.
Why the various amplified alleles have distinct limits of amplification is unknown.
[Figure: Frequency of resistance alleles in Montpellier (France). Panels labelled A1; x-axis: distance (km); the treatment area is marked.]
The resistance allele has a cost for the mosquito. In the absence of insecticide in the environment, non-resistant mosquitoes have the highest fitness.
Geographic distribution of resistance allele
Genetica 112–113: 287–296, 2001
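The idea that resistance is favoured under treatment but costly without it can be illustrated with a very simple haploid selection model. This is only a sketch: the fitness values, the starting frequency, and the number of generations are hypothetical and are not taken from the mosquito data.

```python
def next_frequency(p, w_resistant, w_susceptible):
    """One generation of haploid selection on the resistance allele frequency p."""
    mean_fitness = p * w_resistant + (1 - p) * w_susceptible
    return p * w_resistant / mean_fitness


def simulate(p0, w_resistant, w_susceptible, generations):
    """Iterate the selection recursion and return the final frequency."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, w_resistant, w_susceptible)
    return p


start = 0.1  # hypothetical initial frequency of the resistance allele

# Treated area (assumed values): insecticide halves the fitness of susceptibles.
treated = simulate(start, w_resistant=1.0, w_susceptible=0.5, generations=20)

# Untreated area (assumed values): resistance carries a 10% fitness cost.
untreated = simulate(start, w_resistant=0.9, w_susceptible=1.0, generations=20)

print(f"treated area:   {treated:.3f}")
print(f"untreated area: {untreated:.3f}")
```

Under these assumptions the allele rises towards fixation where insecticide is applied and declines where it is not, which matches the qualitative pattern of a resistance allele that is frequent inside the treatment area and rarer outside it.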
Gene Duplication in Aphids as a response to insecticides
Same story as in the mosquitoes.
Few Words About Aphids
Kingdom = Metazoa (humans are also Metazoa)
Phylum = Arthropoda (humans are Chordata)
Class = Insecta (humans are Mammalia)
Order = Hemiptera (humans are Primates)
Genus = Myzus (humans are Homo)
Species = persicae (humans are sapiens)
Around 4,000 species, ~250 are pests.
Myzus persicae likes… lettuce. In fact, it is the most important aphid pest on lettuce.
E4 & FE4
Myzus persicae has 2 genes encoding esterases, E4 and FE4, which are responsible for resistance to organophosphorus insecticides.
These genes show 99% identity in nucleotide sequence, and both have exactly the same exon-intron structure (same size and same positions).
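Figures such as "99% identity" are computed from an alignment of the two sequences. The sketch below shows one common way to do this, assuming the sequences are already aligned and that gapped columns are excluded from the count (other conventions exist). The function name and the short example sequences are hypothetical; they are not the E4 or FE4 sequences.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of identical positions between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption of
    this sketch; some tools count gaps as mismatches instead).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (made up for illustration).
copy_1 = "ATGCCGTA-A"
copy_2 = "ATGACGTACA"
print(f"{percent_identity(copy_1, copy_2):.1f}% identity")
```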
Many copies of E4 and FE4
Resistant strains of the aphid were found to contain multiple copies of E4 and FE4. The sequences of all copies are 100% identical.
It is believed that this duplication occurred within the last 50 years, with the introduction of the selective agent.
Take home message I:
Increase in gene number can occur quite rapidly under selection pressure.
Take home message II:
The occurrence of gene-duplication mutations is not the limiting step (in evolution). It is selection that counts most.
Duplications of RNA-specifying genes
Ribosome
The ribosome is a complex of proteins and RNA (called rRNA) on which proteins are built, based on the information in the mRNA.
Ribosomes are always composed of two subunits, a large one and a small one.
In prokaryotes the entire ribosome is 70S, and is composed of a 50S large subunit, and a 30S small subunit.
In eukaryotes the entire ribosome is 80S, and is composed of a 60S large subunit and a 40S small subunit.
Each subunit contains different rRNAs.
The S value is the sedimentation coefficient measured in an ultracentrifuge.
rRNA
There are also ribosomal genes encoded by the mitochondrial genome.
In fact, the mitochondrial ribosome is encoded by both nuclear and mitochondrial genes.
Comparison of ribosome structure in Bacteria, Eukaryotes, and Mitochondria
                     Bacterial (70S)    Eukaryotic (80S)    Mitochondrial (55S)
Large subunit        50S                60S                 39S
  rRNAs (1 of each)  23S (2904 nts)     28S (4700 nts)      16S (1560 nts)
                     5S (120 nts)       5S (120 nts)        -
                     -                  5.8S (160 nts)      -
  Proteins           33                 ~49                 48
Small subunit        30S                40S                 28S
  rRNA               16S (1542 nts)     18S (1900 nts)      12S (950 nts)
  Proteins           20                 ~33                 29
16S and 18S are the most commonly used genes in phylogenetic analysis.
Eukaryotic rRNA genes
• 28S, 5.8S, and 18S rRNAs are encoded by a single transcription unit (45S), in which they are separated by 2 internal transcribed spacers (ITS) and flanked by external transcribed spacers (ETS).
[Figure: 45S transcription unit, in order: ETS, 18S, ITS 1, 5.8S, ITS 2, 28S, ETS.]
Human rRNA genes
• In humans, the 45S rDNA is organized into 5 clusters (each with 30-40 repeats).
• These clusters are located on chromosomes 13, 14, 15, 21, and 22.
• These clusters are transcribed by RNA polymerase I.
[Figure: tandem repeats of the 18S-28S transcription unit.]
• 5S rRNA genes occur in tandem arrays; there are about 200-300 true 5S genes and many dispersed pseudogenes.
• In humans there are two gene clusters on chromosome 1 (in dogs there is a single gene cluster).
• 5S rRNA is transcribed by RNA polymerase III.
Numbers of rRNA and tRNA genes per haploid genome in various organisms
__________________________________________________________________________________
Genome source                    Number of     Number of         Approximate
                                 rRNA sets     tRNA genes (a)    genome size (bp)
__________________________________________________________________________________
Human mitochondrion              1             22                2 × 10^4
Nicotiana tabacum chloroplast    2             37                2 × 10^5
Escherichia coli                 7             ~100              4 × 10^6
Neurospora crassa                ~100          ~2,600            2 × 10^7
Saccharomyces cerevisiae         ~140          ~360              5 × 10^7
Caenorhabditis elegans           ~55           ~300              8 × 10^7
Tetrahymena thermophila          1             ~800 (c)          2 × 10^8
Drosophila melanogaster          120-240       590-900           2 × 10^8
Physarum polycephalum            80-280        ~1,050            5 × 10^8
Euglena gracilis                 800-1,000     ~740              2 × 10^9
Human                            ~300          ~1,300            3 × 10^9
Rattus norvegicus                150-170       ~6,500            3 × 10^9
Xenopus laevis                   500-760       6,500-7,800       8 × 10^9
__________________________________________________________________________________
Correlation between the number of rRNA genes and the genome size
Correlation between number of rRNA genes and genome size: an exception
The general pattern: bigger genomes have more genes to be transcribed, so more rRNA is needed.
Numbers of rRNA genes per haploid genome in various organisms
______________________________________________________________
Genome source                    Number of     Approximate
                                 rRNA sets     genome size (bp)
______________________________________________________________
Human mitochondrion              1             2 × 10^4
Nicotiana tabacum chloroplast    2             2 × 10^5
Escherichia coli                 7             4 × 10^6
Neurospora crassa                ~100          2 × 10^7
Saccharomyces cerevisiae         ~140          5 × 10^7
Caenorhabditis elegans           ~55           8 × 10^7
Tetrahymena thermophila          1             2 × 10^8
Drosophila melanogaster          120-240       2 × 10^8
Physarum polycephalum            80-280        5 × 10^8
Euglena gracilis                 800-1,000     2 × 10^9
Human                            ~300          3 × 10^9
Rattus norvegicus                150-170       3 × 10^9
Xenopus laevis                   500-760       8 × 10^9
______________________________________________________________
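The correlation claimed above can be checked directly from the table. The sketch below computes a Pearson correlation on log10-transformed values. Two choices here are assumptions of the example rather than part of the slides: ranges are replaced by their midpoints (e.g. 120-240 becomes 180), and Tetrahymena thermophila, which has a single rRNA set in a genome of about 2 × 10^8 bp, is treated as the exception.

```python
import math

# (genome source, number of rRNA sets, approximate genome size in bp)
# Ranges from the table are replaced by midpoints (an assumption).
data = [
    ("Human mitochondrion", 1, 2e4),
    ("Nicotiana tabacum chloroplast", 2, 2e5),
    ("Escherichia coli", 7, 4e6),
    ("Neurospora crassa", 100, 2e7),
    ("Saccharomyces cerevisiae", 140, 5e7),
    ("Caenorhabditis elegans", 55, 8e7),
    ("Tetrahymena thermophila", 1, 2e8),
    ("Drosophila melanogaster", 180, 2e8),
    ("Physarum polycephalum", 180, 5e8),
    ("Euglena gracilis", 900, 2e9),
    ("Human", 300, 3e9),
    ("Rattus norvegicus", 160, 3e9),
    ("Xenopus laevis", 630, 8e9),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def log_correlation(rows):
    """Correlation between log10(genome size) and log10(rRNA sets)."""
    sizes = [math.log10(size) for _, _, size in rows]
    sets = [math.log10(n_sets) for _, n_sets, _ in rows]
    return pearson(sizes, sets)


r_all = log_correlation(data)
r_without = log_correlation(
    [row for row in data if row[0] != "Tetrahymena thermophila"]
)

print(f"all genomes:         r = {r_all:.2f}")
print(f"without Tetrahymena: r = {r_without:.2f}")
```

The log transform is used because both quantities span several orders of magnitude. With only 13 data points this is a descriptive check, not a statistical test.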
Concerted Evolution
18S rRNA tree
[Figure: phylogenetic tree based on 18S rRNA (scale bar: 0.1). Labelled clades: Bilateria, Cnidaria, Demospongiae, Hexactinellida, Calcarea.]
Evolution of rRNA genes
• Although there are many copies of the same gene in the genome, and the duplication is an ancient phenomenon (all organisms have many copies), all copies present in one genome are almost identical.
Divergent (classical) evolution vs. concerted evolution
[Figure: two panels, "Divergent (classical) evolution" and "Concerted evolution", each showing duplication, mutation, and speciation along a time axis.]
Question?
• How is it possible that all the ribosomal copies remain identical?
(a) Stringent selection.
(b) Recent multiplication.
(c) Concerted evolution.
(a) Stringent selection.
Refuted by the fact that the ITS regions are as conserved as the functional rRNA sequences.
(b) Recent multiplication.
Refuted by the fact that the intraspecific homogeneity does not decrease with evolutionary time.
(c) Concerted evolution.
CONCERTED EVOLUTION
A member of a gene family does not evolve independently of the other members of the family.
It exchanges sequence information with other members reciprocally or non-reciprocally.
Through genetic interactions among its members, a multigene family evolves in concert as a unit.
Concerted evolution results in a homogenized set of nonallelic homologous sequences.
CONCERTED EVOLUTION REQUIRES:
(1) the horizontal transfer of mutations among the family members (homogenization).
(2) the spread of mutations in the population (fixation).
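The homogenization step can be illustrated with a toy simulation in which gene conversion repeatedly overwrites one copy of a gene family with the sequence of another copy. This is a sketch under simplifying assumptions: donor and acceptor are chosen uniformly at random, each event converts a whole copy, and only one genome is modelled (so fixation in the population is not covered). The function names, the family size, and the variant labels are hypothetical.

```python
import random


def gene_conversion_step(array, rng):
    """Overwrite one randomly chosen copy with another randomly chosen copy."""
    donor, acceptor = rng.sample(range(len(array)), 2)
    array[acceptor] = array[donor]


def homogenize(array, rng, max_steps=100_000):
    """Apply conversion events until all copies are identical.

    Returns the final array and the number of events that were applied.
    """
    array = list(array)  # work on a copy so the input is left unchanged
    steps = 0
    while len(set(array)) > 1 and steps < max_steps:
        gene_conversion_step(array, rng)
        steps += 1
    return array, steps


rng = random.Random(0)  # fixed seed so the run is repeatable

# Ten copies of a gene family: nine carry variant "a", one carries a new mutation "A".
family = ["a"] * 9 + ["A"]
final, steps = homogenize(family, rng)

print(final)
print(f"homogenized after {steps} conversion events")
```

Whichever variant wins, the family ends up uniform: the new mutation is either lost from every copy or spread to every copy, so the copies do not diverge independently.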
Mechanisms of concerted evolution
1. Unequal crossing-over
2. Gene conversion
3. Duplicative transposition
Mechanisms of concerted evolution: 1. Unequal crossing-over
[Figure: unequal crossing-over, shown in four numbered steps (1-4).]
Gene conversion (one possible origin)
(a) Heteroduplexes are formed by the resolution of a Holliday structure or by other mechanisms.
(b) The blue DNA uses the invaded segment (e') as a template to "correct" the mismatch, resulting in gene conversion.
(c) Both DNA molecules use their original sequences as templates to correct the mismatch. Gene conversion does not occur.
Gene conversion has been found in all species and at all loci that were examined in detail.
The rate of gene conversion varies with genomic location.
Concerted evolution: Advantages of Gene Conversion over Unequal Crossing-Over
1. Unequal crossing-over changes the number of repeats, and may cause a dosage imbalance. Gene conversion does not change repeat number.
2. Gene conversion can act on dispersed repeats. Unequal crossing-over is severely restricted when repeats are dispersed.
[Figure labels: deletion; duplication]
Concerted evolution: Advantages of Unequal Crossing-Over over Gene Conversion
1. Unequal crossing-over is faster and more efficient in bringing about concerted evolution. At the mutation level, unequal crossing-over (UCO) occurs more frequently than gene conversion (GC).
2. In a gene-conversion event, only a small region is involved. In yeast, an unequal crossing-over event involves on average ~20,000 bp. A gene-conversion tract cannot exceed 1,500 bp.