[HMG] 04 - Gene Evolution
Gene Evolution
Forces affecting genome evolution

[Gene Evolution]
Genome Evolution
Genome changes
• Mutation
• Recombination
• Transposition
• Gene transfer (e.g., between organelles and nuclear DNA)
• Deletion and duplication
  – the major mechanism for the expansion in the size of genomes as organisms evolved from simple to more complex is duplication of whole genomes as well as duplication of specific sequences

Gene Duplication
Early recognized
"A redundant duplicate of a gene may acquire divergent mutations and eventually emerge as a new gene"
J. B. S. Haldane (1932)

Early recognized
The Bar gene duplication
• first duplication mutation described in the literature (1936)

Early recognized
• before the advent of biochemical and molecular biology techniques, only a few examples of duplicate genes had been discovered
• late 1950s
  – the α- and β-chains of hemoglobin were recognized to have been derived from duplicate genes
• later
  – isozyme and cytological studies provided evidence for the frequent occurrence of gene duplication during evolution

The standard model of genome evolution
[Figure: flow diagram. Labels: Mutations; DNA sequence; Selection (purifying or positive); Random drift of neutral mutations; DNA sequence (altered); "The engine"; "The steering wheel"]

Redundancy creates
• Susumu Ohno (1970): gene duplication is the only means by which a new gene can arise; natural selection merely modified, while redundancy created
  – other means of creating new functions are now known, but
  – Ohno's view remains largely valid
A new function is created by
a. duplicating an old gene
b. modifying one of the copies

Kimura and Ohta's Laws of Molecular Evolution
1. For each protein, the rate of evolution in terms of amino acid substitutions is approximately constant per year per site for various lines, as long as the function and tertiary structure of the molecule remain essentially unaltered
   – rates of replacement substitutions are higher among functionally less important genes

DNA packaging protein
• highly constrained
• slow evolution
Clotting proteins
• few constraints
• rapid evolution

Gene           Length (bp)   Replacement (per 10^9 y)   Silent (per 10^9 y)
Histones
  Histone 3    135           0.00                       6.39
  Histone 4    101           0.00                       6.12
Interferons
  α1           166           1.41                       3.53
  β1           159           2.21                       5.88
  γ            136           2.79                       8.59

Histones: involved with fundamental processes (DNA transcription and synthesis)
Interferons: less important (one of many immune system genes)

Kimura and Ohta's Laws of Molecular Evolution
2. Functionally less important molecules or parts of molecules evolve faster than more important ones (in terms of mutant substitutions)

[Figure: preproinsulin and proinsulin. Segments: Signal, B chain, C peptide, A chain; "S S" labels. Rates shown: 1.2 × 10^-9, 0.2 × 10^-9 and 1.1 × 10^-9 subst/site/year]

[Figure: conservation in a typical gene, on the basis of 3,165 human-mouse pairs. Landmarks: start of transcription, start of translation, splice sites, polyadenylation site]

[Figure: substitution rates by region. Categories: pseudogenes, 5' flanking region, 5' UTR, nondegenerate sites, 2-fold degenerate sites, 4-fold degenerate sites, introns, 3' UTR, 3' flanking region. Axis: substitutions/site/10^9 years, ticks 1 to 4]

Kimura and Ohta's Laws of Molecular Evolution
3. Those mutant substitutions that are less disruptive to the existing structure and function of the molecule (conservative substitutions) occur more frequently in evolution than more disruptive ones
4. Gene duplication must always precede the emergence of a gene having a new function
5. Selective elimination of definitely deleterious mutants and random fixation of selectively neutral or very slightly deleterious mutants occur far more frequently in evolution than positive Darwinian selection of definitely advantageous mutants

The standard model of genome evolution
[Figure: flow diagram. Labels: Single nucleotide changes, short insertions or deletions, inversions, recombinations, domain duplication, gene duplication, cluster duplication, segment duplication, chromosome duplication, genome duplication, etc.; DNA sequence; Selection (purifying or positive); Random drift of neutral mutations; DNA sequence (altered); "The engine"; "The steering wheel"]

Detection of natural selection
Overwhelming support for neutralist prediction
1. synonymous vs. non-synonymous substitution rates
2. accelerated rate of pseudogene evolution
Synonymous substitutions do NOT change the encoded amino acid
Non-synonymous substitutions DO change the encoded amino acid
• if DNA divergence includes neutral mutations
  – the 3rd codon position should change more rapidly
  – synonymous mutations are more likely to be neutral
• if most DNA changes were due to adaptive evolution
  – most changes would occur in the 1st and 2nd codon positions
In most genes studied so far, synonymous sites change at a higher rate than non-synonymous sites
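
The claim that third-position changes are the ones most likely to be synonymous can be checked directly against the standard genetic code. The sketch below is only an illustration, not part of the original slides: it enumerates every single-base change from each sense codon and reports, per codon position, what fraction leaves the amino acid unchanged. The function name and the decision to start only from sense codons are assumptions made for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_fraction_by_position():
    """Fraction of single-base changes that are synonymous, per codon position."""
    synonymous = [0, 0, 0]
    total = [0, 0, 0]
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue  # assumption: only changes starting from sense codons are counted
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total[pos] += 1
                # A change to a stop codon counts as non-synonymous.
                synonymous[pos] += CODON_TABLE[mutant] == amino_acid
    return [s / t for s, t in zip(synonymous, total)]


for position, fraction in enumerate(synonymous_fraction_by_position(), start=1):
    print(f"codon position {position}: {fraction:.1%} of single-base changes are synonymous")
```

Under this counting, no second-position change is synonymous and only a few first-position changes are, while most third-position changes are. That is why faster change at the 3rd position is the pattern expected if divergence includes neutral mutations.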

Detection of natural selection
Neutral theory provides the null model for tests of selection
Types of natural selection
1. purifying (negative) selection: removal of deleterious variants
2. diversifying (positive) selection: fixation of adaptive variants
Types of substitution rates for protein-coding genes
1. synonymous substitution rate (KS or dS)
   number of synonymous substitutions per synonymous site (# of synonymous changes divided by the # of synonymous sites)
   – rate of substitution for DNA changes that do not change the encoded amino acids
2. non-synonymous substitution rate (KA or dN)
   number of non-synonymous substitutions per non-synonymous site (# of non-synonymous changes divided by the # of non-synonymous sites)
   – rate of substitution for DNA changes that do change the encoded amino acids
The relative levels of these rates indicate the mode of selection for a gene

Detection of natural selection
KA/KS (dN/dS, ω)
0.0 ← conserving | 1.0 | diversifying →
KA/KS ≈ 1: Neutral evolution (no selection)
  silent and amino-acid replacement substitutions have been preserved at equal rates per site since duplication
KA/KS « 1: Purifying selection
  more amino-acid replacement substitutions than silent substitutions have been eliminated since duplication
  • some amino-acid changes had deleterious effects
KA/KS » 1: Diversifying (positive) selection
  more amino-acid replacement substitutions than silent substitutions have been preserved since duplication
  • abundance of replacement substitutions conferring a selective advantage
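
The ratio test above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method used in the lecture: rates are taken as raw changes per site, as in the slide's definition, with no correction for multiple substitutions at the same site. The `tolerance` cutoff for "approximately 1" and the example counts are hypothetical.

```python
def substitution_rate(changes, sites):
    """Substitutions per site: number of changes divided by number of sites."""
    if sites <= 0:
        raise ValueError("number of sites must be positive")
    return changes / sites


def selection_mode(k_a, k_s, tolerance=0.1):
    """Return (omega, label), where omega = K_A / K_S.

    `tolerance` is a hypothetical cutoff for "omega is approximately 1";
    it is not a value taken from the source.
    """
    if k_s == 0:
        raise ValueError("K_S is zero, so omega is undefined")
    omega = k_a / k_s
    if abs(omega - 1.0) <= tolerance:
        return omega, "neutral"
    return omega, "purifying" if omega < 1.0 else "diversifying"


# Made-up counts for one gene pair, for illustration only.
k_a = substitution_rate(changes=12, sites=600)  # non-synonymous
k_s = substitution_rate(changes=30, sites=200)  # synonymous
omega, mode = selection_mode(k_a, k_s)
print(f"KA = {k_a:.3f}, KS = {k_s:.3f}, omega = {omega:.2f} -> {mode} selection")
```

A hard threshold is a simplification. In practice, whether ω differs from 1 is a statistical question, with the neutral expectation serving as the null model.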

Homology
Similarity due to inheritance from a common ancestor
Homologs: sequences that have common origins but may or may not have common activity
1. Orthologs: homologs produced by speciation
   • genes derived from a common ancestor that diverged due to divergence of the organisms they are associated with
   • tend to have similar function
2. Paralogs: homologs produced by gene duplication
   • genes derived from a common ancestral gene that duplicated within an organism and then subsequently diverged by accumulated mutation
   • tend to have slightly different functions

Homology
Similarity due to inheritance from a common ancestor
[Figure: gene tree with the steps duplication, evolution, speciation, evolution, giving genes A1, A2, B1 and B2]
Paralogs: A1 and A2; B1 and B2
Orthologs: A1 and B1; A2 and B2
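
One way to make the ortholog/paralog distinction operational is to ask which event created the most recent common ancestor of two genes. The sketch below assumes a hypothetical gene tree matching the figure: one duplication, followed by a speciation of each copy. The node names `root`, `copy1` and `copy2` are invented for the example. Note that this rule also labels A1 and B2 as paralogs, a pair the slide does not list.

```python
# Hypothetical gene tree, stored as child -> parent.
PARENT = {
    "A1": "copy1", "B1": "copy1",
    "A2": "copy2", "B2": "copy2",
    "copy1": "root", "copy2": "root",
}
# Event that gave rise to the two lineages below each internal node.
EVENT = {"root": "duplication", "copy1": "speciation", "copy2": "speciation"}


def ancestors(node):
    """Internal nodes on the path from `node` up to the root, nearest first."""
    path = []
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def relationship(gene_x, gene_y):
    """Classify two genes by the event at their most recent common ancestor."""
    seen = set(ancestors(gene_x))
    for node in ancestors(gene_y):
        if node in seen:
            return "orthologs" if EVENT[node] == "speciation" else "paralogs"
    raise ValueError("genes share no ancestor in this tree")


for pair in [("A1", "B1"), ("A2", "B2"), ("A1", "A2"), ("B1", "B2"), ("A1", "B2")]:
    print(pair, relationship(*pair))
```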

Homology
Similarity due to inheritance from a common ancestor
[Figure: paralogs]

Types of Gene Duplication
An increase in the number of copies of a DNA segment can be brought about by several types of gene duplication
• classified according to the extent of the genomic region involved
1. partial or internal gene duplication
2. complete gene duplication
3. partial chromosomal duplication
4. complete chromosomal duplication
5. polyploidy or genome duplication

Types of Gene Duplication

Extent of duplication   Mutational occurrence   Effects on fitness
partial gene            very frequent           deleterious if it affects reading frame
whole gene              frequent                deleterious only in organisms in which the genome is replicated as a unit
partial polysomy        rare                    almost invariably deleterious
polysomy                common                  almost invariably deleterious
polyploidy              common                  deleteriousness determined by the mode of reproduction and sex-determination

Equal & Unequal Crossing Over
[Figure: equal crossing over]
[Figure: unequal crossing over, with products labeled Deletion and Duplication]
[Figure: unequal crossing over]
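
The difference between the two cases can be illustrated with a toy model in which a chromosome is a list of gene labels and a crossover swaps everything to the right of a breakpoint. This is a sketch under simplifying assumptions (hypothetical gene labels, breakpoints between genes only), not a model taken from the slides.

```python
def crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the right-hand parts of two chromosomes.

    break_a and break_b are the list indices of the breakpoint on each
    chromosome. Equal crossing over uses the same index on both homologs;
    unequal crossing over uses misaligned breakpoints.
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


chromosome = ["A", "B", "C"]  # hypothetical gene order on both homologs

equal = crossover(chromosome, chromosome, 2, 2)
unequal = crossover(chromosome, chromosome, 2, 1)

print("equal:  ", equal)    # both products keep one copy of each gene
print("unequal:", unequal)  # one product carries B twice, the other lacks B
```

The misaligned exchange produces the duplication and the deletion together, as reciprocal products of a single event.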

Domains and exons
1. Functional domain
   well-defined region within a protein that performs a specific function, e.g. substrate binding
2. Structural domain or module
   well-defined region within a protein that constitutes a stable, independently folding, compact structural unit that can be distinguished from all the other parts
Defining the boundaries of a functional domain is often difficult
   functionality is in many cases conferred by amino-acid residues that are scattered throughout the polypeptide
Structural modules are collinear with the amino-acid sequence of a protein
   i.e., a module consists of a continuous stretch of amino acids
a. if functionality is conferred by a module, a duplication will increase the number of functional segments
b. if functionality is conferred by amino-acid residues scattered among different modules, a duplication may not be functionally desirable

Domains and exons
Possible relationships between the protein structural domains and the arrangement of the exons in the gene
• each exon corresponds exactly to a structural domain
• approximate correspondence
• an exon encodes 2 or more domains
• a single structural domain is encoded by 2 or more exons
• lack of correspondence between exons and domains

Domains and exons
α and β chains of vertebrate hemoglobin
• consist of 4 domains, whereas their genes consist of only 3 exons, the second of which encodes two adjacent domains
• it was postulated that a merger occurred between two exons as a result of the loss of a central intron
• homologous globin genes in plants (leghemoglobins) were found to contain an additional intron at or very near the position predicted by the domain structure of globins
• a similar intron was found in the globin genes of a nematode
During the evolution of the globin-gene family from a 4-exon ancestral gene, several lineages lost some or all of their three introns, thereby generating a panoply of exon/intron permutations
[Figure: internal organization of the human α1 and β genes]
[Figure: α- and β-chains of vertebrate hemoglobin]

Domains and exons
• domain duplications at the protein level generally indicate that an exon duplication has occurred at the DNA level
  – it has been suggested that exon duplication is one of the most important types of internal gene duplication
• proteins show internal repeats of amino acid sequences
• these repeats often correspond to functional or structural domains
These findings suggest
1. the genes for many proteins were formed by internal gene duplication
2. the function of these proteins was improved by increasing their stability or the number of active sites
3. internal duplications can also provide redundant DNA segments for a gene to develop new functions
   – many complex genes might have evolved from small, simple primordial genes via internal duplication and subsequent modification

Internal duplications
Proteins with internal domain duplications taking up 50% or more of the total length of the protein

Protein                                 Protein length   Repeat length   # repeats   % repetition
α1β-glycoprotein                        474              91              5           96
Calcium-dependent regulator protein     148              74              2           100
Hemopexin                               439              207             2           94
Hexokinase                              917              447             2           97
Immunoglobulin γ chain C region         329              108             3           98
Immunoglobulin ε chain C region         423              108             4           100
Interleukin-2 receptor                  251              68              2           54
Lactase-phlorizin hydrolase             1927             480             3           79
Multidrug resistance-1 P-glycoprotein   1280             609             2           95
Plasminogen                             790              79              5           50
Pre-pro-von Willebrand factor           3817             586             3           85
                                                         30              2
                                                         117             2
                                                         354             5
Ribonuclease/angiogenin inhibitor       461              57              8           99
Serum albumin                           584              195             3           100
Tropomyosin α chain                     284              42              7           100
Villin                                  826              360             2           87
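
The slide does not say how the "% repetition" column was computed. A plausible reading, assumed here, is the share of the protein covered by repeats: repeat length times number of repeats, divided by protein length, capped at 100%. The sketch below applies that assumed formula to a few rows with a single repeat family. It matches most of them to within rounding, but not all: lactase-phlorizin hydrolase is listed as 79%, whereas this formula gives about 75%.

```python
def percent_repetition(protein_length, repeat_length, n_repeats):
    """Share of the protein covered by repeats, in percent, capped at 100.

    Assumed formula; the source table does not state how the column was derived.
    """
    covered = repeat_length * n_repeats
    return min(100.0, 100.0 * covered / protein_length)


# (protein, protein length, repeat length, number of repeats) as listed in the table
rows = [
    ("Villin", 826, 360, 2),
    ("Plasminogen", 790, 79, 5),
    ("Hexokinase", 917, 447, 2),
    ("Tropomyosin alpha chain", 284, 42, 7),
]

for name, length, repeat_length, n_repeats in rows:
    print(f"{name}: {percent_repetition(length, repeat_length, n_repeats):.0f}%")
```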

Domain duplications
Variable and constant regions of immunoglobulin genes
• probably derived from a common primordial domain, but have since acquired distinct properties
• despite common molecular ancestry
  – the variable region of immunoglobulins binds antigens
  – the constant region mediates non-antigenic functions
[Figure: variable and constant regions of immunoglobulin genes]
Creation of new functionCreation of new functionCreation of new function
3 pathways can lead to the creation of a new function3 pathways can lead to the creation of a new function1.1. de novode novo appearanceappearance from nonfunctional sequencefrom nonfunctional sequence due to accumulation of due to accumulation of
mutationsmutations
2.2. replacementreplacement due to due to change of one function into anotherchange of one function into another
3.3. creation of a creation of a novel function from a redundant copynovel function from a redundant copy of an old function of an old function following duplicationfollowing duplication
Gene DuplicationGene DuplicationGene Duplication
D I C ED I C E
D I R ED I R E
D A R ED A R E
C A R EC A R E
C A R DC A R D
D I C ED I C E
Following gene duplication 3 things may happen to the copies
a. all copies may retain the same function
b. some copies may die
c. some copies may evolve into new functions
Prevalence of gene duplication
• gene duplications arise spontaneously at high rates in bacteria, bacteriophages, insects and mammals, and are generally viable
• mutation is not the rate-limiting step in the evolutionary process of gene duplication
• only a small fraction of all duplicated genes are retained
• an even smaller fraction evolves new functions
• the probability of nonfunctionalization is much higher than that of evolving a new function
• an increase in gene number can occur quite rapidly under selection pressure for increased amounts of a gene product
Gene redundancy
Duplicated genes can be divided into 2 types
1. Invariant repeats
• identical or nearly identical in sequence to one another
  – the repetition of identical sequences is correlated with the synthesis of increased quantities of the gene product that is required for the normal function of the organism (dose repetitions)
  – common whenever a metabolic need for producing large quantities of specific RNAs or proteins arises
2. Variant repeats
• copies of a gene that differ in their sequence to a lesser or greater extent
[Figure: invariant repeats; histone genes and rDNAs]
Invariant repeats: numbers of rRNA and tRNA genes per haploid genome in various organisms

Genome source                     # rRNA sets   # tRNA genes   Genome size (bp)
Human mitochondrion               1             22             2 × 10^4
Nicotiana tabacum chloroplast     2             37             2 × 10^5
Escherichia coli                  7             ~100           4 × 10^6
Neurospora crassa                 ~100          ~2,600         2 × 10^7
Saccharomyces cerevisiae          ~140          ~360           5 × 10^7
Caenorhabditis elegans            ~55           ~300           8 × 10^7
Tetrahymena thermophila           1             ~800           2 × 10^8
Drosophila melanogaster           120-240       590-900        2 × 10^8
Physarum polycephalum             80-280        ~1,050         5 × 10^8
Human                             ~300          ~1,300         3 × 10^9
Rattus norvegicus                 150-170       ~6,500         3 × 10^9
Xenopus laevis                    500-760       6,500-7,800    8 × 10^9
2. Variant repeats
“As long as there are other copies of a gene that function normally, a duplicate gene may accumulate deleterious mutations and become nonfunctional without adversely affecting the fitness of the organism”
J. B. S. Haldane (1933)
“Because deleterious mutations occur far more often than advantageous ones, a redundant duplicate gene is more likely to become nonfunctional than to evolve into a new gene”
Susumu Ohno (1972)
some copies may evolve into new functions
– can perform markedly different functions
a. thrombin and trypsin
• thrombin cleaves fibrinogen during the process of blood clotting
• trypsin is a digestive enzyme
b. lactalbumin and lysozyme
• lactalbumin is a subunit of the enzyme that catalyzes the synthesis of the sugar lactose
• lysozyme dissolves certain bacteria by cleaving the polysaccharide component of their cell walls
• in some cases, a novel function may be achieved through few substitutions
  – lactate dehydrogenase can be changed into a malate dehydrogenase by replacing just 1 out of its 317 amino acids
The origin of new gene function after gene duplication
• more complex species evolve by adding new gene functions
Young duplicate genes in the human genome
Young duplicate genes: duplicate genes with Ks < 0.3
• 250 pairs of young human duplicates studied
  – 145 showed significant evidence that one copy had evolved faster than the other at the amino-acid level
• Ka/Ks ratio
  – index of functional constraints
  – the smaller the Ka/Ks ratio is, the stronger the functional constraints are
• 65 pairs showed significantly different Ka/Ks ratios
  – after gene duplication, 26% of the duplicate pairs have experienced different functional constraints
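The arithmetic behind these figures can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the only values taken from the slide are the Ks < 0.3 cutoff and the counts 65 and 250.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Return Ka/Ks; a smaller value indicates stronger functional constraint."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return ka / ks


def is_young_duplicate(ks: float, cutoff: float = 0.3) -> bool:
    """Apply the slide's definition of a young duplicate pair (Ks < 0.3)."""
    return ks < cutoff


def percent(part: int, whole: int) -> float:
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


if __name__ == "__main__":
    # 65 of the 250 pairs showed significantly different Ka/Ks ratios.
    print(percent(65, 250))
```

With the slide's counts, `percent(65, 250)` reproduces the quoted 26%.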
[Figure: young duplicate genes in the human genome; Ka/Ks > 1 in 113 genes]
[Figure: young duplicate genes in the human genome; fast-evolving genes and slow-evolving genes]
Gene Families
result from complete gene duplication
the genes that belong to a group of repeated sequences in a genome
• functional or nonfunctional members of a gene family may
  a. reside in close proximity to one another on the same chromosome
  b. be located on different chromosomes
Superfamilies
• term coined by Dayhoff (1978) in order to distinguish distantly related proteins from closely related ones
• similarity <50% at the amino-acid level
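As a rough sketch of how the 50% criterion might be applied, the snippet below computes percent identity between two pre-aligned protein sequences and labels the pair. Several things here are assumptions rather than anything from the slide: identity is used as a simple stand-in for similarity, gapped columns are skipped, and the function names and example strings are invented for illustration.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical residues over ungapped columns of an alignment."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip alignment gaps
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def classify(identity: float, threshold: float = 50.0) -> str:
    """Below the threshold: distantly related (superfamily level)."""
    return "superfamily" if identity < threshold else "family"


if __name__ == "__main__":
    pid = percent_identity("ACDE", "ACFG")  # toy sequences
    print(pid, classify(pid))
```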
Functionally similar genes are occasionally clustered, but usually dispersed throughout the genome
[Figure: histone genes]
Duplicated regions on human chromosomes
[Figure: paralogons on human chromosome 17]
[Figure: segmental duplications on human chromosomes; blue: intrachromosomal, red: interchromosomal]
• 24 of these regions correspond to known genomic disorders
[Figure: evolutionarily related genes in the Bacillus subtilis genome]
Large genomes contain more paralogs than smaller genomes
Isozymes
• enzymes that catalyse the same biochemical reaction but may differ in
  a. tissue specificity
  b. developmental regulation
  c. electrophoretic mobility
  d. biochemical properties
• encoded by different loci, usually duplicated genes
Allozymes
– distinct forms of an enzyme
– encoded by different alleles at a single locus
Lactate dehydrogenase (LDH): developmental speciation
• 2 genes encode the α and β subunits of LDH in mammals
• these 2 subunits form 5 tetrameric isozymes, all of which catalyze either
  a. the conversion lactate → pyruvate in the presence of the oxidized coenzyme nicotinamide adenine dinucleotide (NAD+)
  b. pyruvate → lactate in the presence of the reduced coenzyme (NADH)
Isozymes rich in β subunits
– have a high affinity for NAD+
– function as true lactate dehydrogenase in aerobically metabolizing tissues, e.g. heart
Isozymes rich in α subunits
– have a high affinity for NADH
– are especially geared to serve as pyruvate reductases in anaerobically metabolizing tissues, e.g. skeletal muscle
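The count of 5 isozymes follows from simple combinatorics: a tetramer built from 2 subunit types, with the order of subunits ignored, has 5 possible compositions. A minimal sketch (the function name is hypothetical, and subunit arrangement within the tetramer is assumed not to matter):

```python
from itertools import combinations_with_replacement


def ldh_isozymes(size: int = 4) -> list:
    """List (alpha_count, beta_count) for every subunit composition of a tetramer."""
    compositions = []
    for combo in combinations_with_replacement(("alpha", "beta"), size):
        n_alpha = combo.count("alpha")
        compositions.append((n_alpha, size - n_alpha))
    return compositions


if __name__ == "__main__":
    for n_alpha, n_beta in ldh_isozymes():
        print(f"alpha x{n_alpha}, beta x{n_beta}")
```

The list runs from the all-α tetramer to the all-β tetramer.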
Developmental sequence of LDH production in the heart
[Figure: developmental sequence of LDH production in the heart, with α-subunit and β-subunit labels]
• the more anaerobic the heart is (i.e., early stages of gestation), the higher the proportion of LDH isozymes rich in α subunits will be
• the two duplicate genes have become specialized to different tissues and to different developmental stages
Opsins: the color vision among Primates
– color vision in humans, apes, and Old World monkeys is mediated in the eye by 3 types of photoreceptor cells (cones), which transduce photic energy into electrical potentials
– each type of color-sensitive cone is maximally sensitive to a certain wavelength, depending on the kind of color-sensitive pigment (photopigment) present in the cone
• in humans, the red, green, and blue cones are maximally sensitive at approximately 560, 530, and 430 nm, respectively
• each color stimulates one or more kinds of cones
  – blue light stimulates blue cones
  – yellow light stimulates red and green cones equally
  – white light stimulates all three types of cones equally
• each color-sensitive photopigment consists of 2 parts
  1. a protein (opsin)
  2. a lipid vitamin-A1 derivative (retinal)
• the color specificity is determined by the opsins
  – members of a superfamily of G-protein-coupled receptors
Genes coding for opsins
a. the blue opsin is encoded by an autosomal gene
b. the red and green opsins are encoded by X-linked genes
• each X chromosome contains only one red-opsin gene, but may contain more than one green-opsin gene
• the amino-acid sequences of the red and green opsins are 96% similar, but they share only 43% amino-acid similarity with the blue opsin
• the blue opsin gene and the ancestor of the green and red opsin genes diverged about 500 Mya
• the close linkage and high similarity between the red and green opsin genes point to a very recent gene duplication
[Figure: opsin genes on the autosome and the X chromosome; 96% amino-acid similarity between the X-linked genes, 43% amino-acid similarity with the autosomal gene (500 Mya)]
Color Deficiency (Color-Blindness)
– inability to distinguish one or more of the primary colors
– color-blind persons may be blind to one, two or all of the three colors
[Figure panel: Normal]
1. tritanopia: blindness to blue
• cannot distinguish between blue and yellow
2. deuteranopia: blindness to green
• unable to see the green part of the visible spectrum
3. protanopia: blindness to red
• unable to distinguish between red and green
4. Monochromatism or total color-blindness
• all hues are perceived as variations of gray
Dichromatism
– most common form of color-blindness
– affects ~12% of ♂ and ~0.2% of ♀
– many dichromatic persons are unaware that they are color-blind
[Figure: Ishihara plates]
Genes coding for opsins
a. New-World monkeys
   – only one X-linked pigment gene
b. Old-World monkeys, including apes and humans
   – have two or more
• a duplication occurred about 25-35 Mya in the ancestor of Old-World monkeys after their divergence from the New-World monkeys
• as a consequence of this duplication, Old-World monkeys are trichromatic
[Figure: primate phylogenetic tree, marking the duplication of the X-linked pigment gene]
New-World monkeys possess only 2 loci for the opsins: one autosomal and one X-linked
– the exception is the howler monkeys (Alouatta), which have one autosomal and two X-linked opsin genes
• however, in many New-World monkeys (e.g., squirrel monkeys and tamarins), the X-linked opsin locus is highly polymorphic
  – 2 of these alleles have maximal-sensitivity peaks similar to those of human red and green opsin
  – the third allele has an intermediate maximal-sensitivity peak
• heterozygous ♀ are trichromatic
• ♂ and homozygous ♀ are dichromatic
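The rule for New-World monkeys with a single polymorphic X-linked opsin locus can be written as a small genotype-to-phenotype sketch. The function name, the "F"/"M" sex codes and the allele labels are hypothetical; the logic assumes one autosomal (blue) opsin plus whatever distinct X-linked alleles the animal carries, as described above.

```python
def color_vision(sex: str, x_alleles: tuple) -> str:
    """Predict color vision from the alleles at a single X-linked opsin locus."""
    if sex not in ("F", "M"):
        raise ValueError("sex must be 'F' or 'M'")
    expected = 2 if sex == "F" else 1  # females carry two X chromosomes
    if len(x_alleles) != expected:
        raise ValueError("wrong number of X-linked alleles for this sex")
    # One autosomal pigment plus each distinct X-linked pigment.
    pigments = 1 + len(set(x_alleles))
    return "trichromatic" if pigments == 3 else "dichromatic"


if __name__ == "__main__":
    print(color_vision("F", ("red-like", "green-like")))   # heterozygous female
    print(color_vision("F", ("red-like", "red-like")))     # homozygous female
    print(color_vision("M", ("intermediate",)))            # male
```

Only a female carrying two different alleles ends up with three pigments.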
[Figure: opsin genes and color vision]
♀ woman: trichromatic
♂ man: trichromatic
♂ color-blind man: dichromatic
♀ homozygous New-World monkey: dichromatic
♂ New-World monkey: dichromatic
♀ heterozygous New-World monkey: trichromatic
Humans, apes and African monkeys
– achieved trichromatic vision by a mechanism akin to isozymes: distinct proteins encoded by different loci
Heterozygous ♀ squirrel monkeys
– achieve trichromacy through the use of two allozymes: distinct proteins encoded by different allelic forms at a single locus
Gene Loss
~7,000 genetic diseases documented in the medical literature
– mutations can easily destroy the function of a protein-coding gene
The vast majority of mutations are deleterious
1. eliminated quickly from the population, or
2. maintained at very low frequencies due to
   a. overdominant selection
   b. genetic drift
“Because deleterious mutations occur far more often than advantageous ones, a redundant duplicate gene is more likely to become nonfunctional than to evolve into a new gene”
Susumu Ohno (1972)
Unprocessed pseudogenes
– result from the nonfunctionalization or silencing of a gene due to deleterious mutations
– generally derived via the silencing of a duplicate functional gene
– contain multiple defects
  • frameshifts
  • premature stop codons
  • obliteration of splicing sites or regulatory elements
– it is difficult to identify the mutation that was the direct cause of gene silencing
Why are pseudogenes interesting?
– provide information on how the genomic DNA has changed without evolutionary pressure
– can be used as a model for determining the rate of nucleotide substitution, insertion, and deletion
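Two of the defects listed above can be looked for with a very simple scan of a coding sequence. This is a toy sketch under stated assumptions, not a pseudogene-detection method: it assumes the input is a DNA coding sequence read in frame from its first base, uses the standard-code stop codons, and flags only a length that is not a multiple of 3 and in-frame stop codons before the final codon. The function name and example sequences are invented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def find_defects(cds: str) -> list:
    """Report simple signs of gene disablement in an in-frame coding sequence."""
    seq = cds.upper()
    defects = []
    if len(seq) % 3 != 0:
        defects.append("length not a multiple of 3 (possible frameshift)")
    # Split into complete codons, ignoring any trailing partial codon.
    complete = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, complete, 3)]
    # A stop codon anywhere before the last complete codon is premature.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            defects.append(f"premature stop codon {codon} at codon {index + 1}")
    return defects


if __name__ == "__main__":
    print(find_defects("ATGGCCTAA"))      # intact toy gene
    print(find_defects("ATGTAAGCCTAA"))   # in-frame stop at codon 2
```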
Unprocessed pseudogene formation
[Figure: after duplication one copy stays functional; as time goes by, mutations occur (stop codon, frameshift, etc.) in the copy under no functional constraints, which becomes a pseudogene (ψ)]
Globin Evolution
The globin superfamily has experienced all the possible evolutionary pathways that can occur in families of repeated sequences
1. retention of original function
2. acquisition of new function
3. loss of function
In humans, the globin superfamily consists of 5 families
1. α-globin family on chromosome 16
2. β-globin family on chromosome 11
3. myoglobin, single member on chromosome 22
4. neuroglobin, single member on chromosome 14
5. cytoglobin, single member on chromosome 17
4 types of functional proteins
– myoglobin
– hemoglobin
– neuroglobin
– cytoglobin
[Figure: the globin superfamily; genes on chromosomes 16, 11, 22, 14 and 17]
The globin superfamily
• the globins are very ancient in origin
  – globin-like proteins exist in all life forms studied
• neuroglobin is the first to have branched off
• myoglobin and hemoglobin diverged 800 Mya
  – before the emergence of annelid worms
Myoglobin
– remained monomeric
– evolved a higher affinity for oxygen than hemoglobin
– became the oxygen-storage protein in muscles
Hemoglobin
– acquired a tetrameric structure
– became the oxygen carrier in blood
– much more refined and regulated
Mammalian hemoglobin
• has acquired several capabilities that are absent in myoglobin
  1. binding of 4 oxygen molecules cooperatively
  2. responding to the acidity and carbon-dioxide concentration inside red blood cells
  3. regulating its oxygen affinity through the level of organic phosphate in the blood
The heteromeric structure of hemoglobin has facilitated these refinements of its function
• made up of 2 types of chains
  – one encoded by an α family member
  – the other by a member of the β family
• the α and β families diverged following a gene duplication about 450-500 Mya
  – tandem duplication resulting in 2 linked genes on the same chromosome
• chromosomal linkage is preserved in ray-finned fishes and amphibians
Organization of the α and β globin families
  – the internal organization of the α1 and β genes is shown
• the α and β families have diverged in both physiological properties and ontogenetic regulation
α family: 4 functional genes
  – embryonic gene, adult genes, fetal gene, 3 unprocessed pseudogenes
β family: 5 functional genes
  – embryonic gene, 2 fetal genes, 2 adult genes, unprocessed pseudogene
  1. distinct hemoglobins appear at different developmental stages
     – ζ₂ε₂ and α₂ε₂ in the embryo
     – α₂γ₂ in the fetus
     – α₂β₂ and α₂δ₂ in adults
     – the θ1 gene is mainly transcribed 5-8 weeks after conception, at very low levels (the protein has not yet been detected in vivo)
  2. differences in oxygen-binding affinity have evolved
     – embryonic and fetal hemoglobins (ζ₂ε₂, α₂ε₂ and α₂γ₂) have a higher oxygen affinity than adult hemoglobins (α₂β₂ and α₂δ₂)
     – better function in the relatively hypoxic environment in which the embryo and the fetus reside
Gene duplication resulted in evolutionary refinements of physiological systems
α1 and α2
• produce identical polypeptides
• present in humans and all the apes
• arose about 20 million years ago
Gγ and Aγ
• after the separation of the simian lineage from the prosimians
Exon Shuffling & Mosaic Proteins
Types of exon shuffling
1. exon duplication
   – duplication of one or more exons in a gene (a type of internal duplication)
2. exon insertion
   – structural or functional domains are exchanged between proteins
3. exon deletion
   – removal of a segment of amino acids from the protein
Mosaic protein
   – protein encoded by a gene that contains regions that are also found in other genes
   – evidence of exon shuffling during the evolution of their genes
Tissue plasminogen activator (TPA)
• activated by blood-clotting Factor XIIa
• TPA converts plasminogen into its active form, plasmin, which dissolves fibrin, an insoluble fibrous protein in blood clots
• conversion of plasminogen to plasmin is accelerated by the presence of fibrin, the substrate of plasmin
  – fibrin polymers bind both plasminogen and TPA, aligning them for catalysis
  – plasmin is produced only in the proximity of fibrin (fibrin specificity)
Prourokinase
• precursor of the urinary plasminogen activator
• lacks fibrin specificity
Diagram of blood coagulation and fibrinolysis
• several mosaic proteins are involved
• the arrangement of the structural modules in the mosaic proteins is shown as boxes
Arrangement of TPA structural modules
Modules shown in the figure
  – fibronectin type-1 module (F1)
  – epidermal growth-factor module (EG)
  – kringles (KR)
  – serine proteinase (protease) region homologous to that of trypsin
TPA contains a 43-residue sequence at its N-terminal end that is absent in prourokinase
• this segment is homologous to one of the 3 finger domains of fibronectin, a large glycoprotein present in the plasma and on cell surfaces which promotes cellular adhesion
• this segment is responsible for the fibrin affinity
• a deletion of this segment leads to a loss of the fibrin affinity of TPA
Exon shuffling must have been responsible for the acquisition of this domain by TPA from either fibronectin or a similar protein
TPA also contains
• a segment homologous to portions of the epidermal growth factor precursor and the growth-factor-like regions of other proteins (Factors VII, IX, X, and XII)
• two structures similar to the kringles of plasminogen
• in the C-terminal part, regions homologous to the protease parts of trypsin and other trypsin-like serine proteinases (e.g. plasminogen)
  – serine proteinases are enzymes that hydrolyze proteins into peptide fragments
TPA acquired at least 5 DNA segments from at least 4 other genes
  – plasminogen
  – epidermal growth factor
  – fibronectin
  – trypsin
The junctions of these acquired units coincide precisely with the borders between exons and introns
  – further evidence of exon shuffling from one gene to another
Phase limitations on exon shuffling
Phase limitations of the exonic structure
  – must be respected for an exon to be inserted, deleted or duplicated without causing a frameshift in the reading frame
Introns are classified into 3 types according to the way in which the coding region is interrupted
  – phase 0 if it lies between two codons
  – phase 1 if it lies between the 1st and 2nd nucleotides of a codon
  – phase 2 if it lies between the 2nd and 3rd nucleotides of a codon
Exons
  – grouped into classes according to the phases of their flanking introns
    class 0-0: exon flanked by a phase-0 intron at its 5' end and by a phase-0 intron at its 3' end
    class 0-1: phase-0 intron at 5' and phase-1 intron at 3'
    class 1-2: phase-1 intron at 5' and phase-2 intron at 3'
    etc.
Intron phases illustrated on one sequence (exon in upper case, intron in lower case)
Phase 0: GGC AAG gtaagt ................ (Py)n ncag GTC AAC
         Gly Lys [intron] Val Asn
Phase 1: GG CAA G gtaagt ................ (Py)n ncag GT CAA C
         Gln, Gly (codon G|GT split by the intron), Gln
Phase 2: G GCA AG gtaagt ................ (Py)n ncag G TCA AC
         Ala, Arg (codon AG|G split by the intron), Ser
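The three cases can be reproduced with a short sketch. It assumes the slide's 12-nucleotide example (GGCAAG upstream of the intron, GTCAAC downstream). The function names, the `frame_start` parameter and the partial codon table are illustrative choices, not part of the original slides.

```python
# Only the codons that occur in the slide's example (standard genetic code).
CODONS = {
    "GGC": "Gly", "AAG": "Lys", "GTC": "Val", "AAC": "Asn",
    "CAA": "Gln", "GGT": "Gly", "GCA": "Ala", "AGG": "Arg", "TCA": "Ser",
}

def intron_phase(upstream_exon: str, frame_start: int) -> int:
    """Number of nucleotides of the interrupted codon lying upstream of the intron.

    frame_start (hypothetical parameter) is the index, within upstream_exon,
    of the first nucleotide of the first complete codon.
    """
    return (len(upstream_exon) - frame_start) % 3

def translate_spliced(upstream_exon: str, downstream_exon: str, frame_start: int):
    """Join the two exons and translate every complete codon in the given frame."""
    mrna = upstream_exon + downstream_exon
    codons = [mrna[i:i + 3] for i in range(frame_start, len(mrna) - 2, 3)]
    return [CODONS[c] for c in codons]

UP, DOWN = "GGCAAG", "GTCAAC"
for start in (0, 2, 1):
    print(intron_phase(UP, start), translate_spliced(UP, DOWN, start))
```

The same intron position gives phase 0, 1 or 2 depending only on where the reading frame starts, which is why the phase is a property of the intron relative to the coding frame and not of the intron sequence itself.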
Acceptance of mutants created by intronic recombination
Several levels of selection determine whether an intronic recombination mutant will be fixed or rejected
1. the chimeric intron must be spliced correctly
   – otherwise translation will probably run into a stop codon in the mRNA/intron region and form a truncated protein
2. the two non-orthologous introns must be in the same phase
   a. they must split the reading frame in the same phase
   b. the downstream exon must be translated in its original phase to prevent frameshift mutations
   c. symmetrical exons
3. the new protein must be able to adopt a stable conformation
4. selective advantage of having a new functional domain
   – the impact of exon insertion may initially be mitigated by alternative splicing
Phase limitations of the exonic structure
  – only symmetrical exons can be duplicated in tandem, inserted or deleted without affecting the reading frame
    • symmetrical exons are flanked by introns of the same phase at both ends
  – duplication or deletion of asymmetrical exons would disrupt the reading frame downstream
  – the length of a symmetrical exon is always a multiple of 3 nucleotides
  – insertion of symmetrical exons is also restricted
    • 0-0 exons can only be inserted into phase-0 introns
    • 1-1 exons can only be inserted into phase-1 introns
    • 2-2 exons can only be inserted into phase-2 introns
All the exons coding for the modules of mosaic proteins are symmetrical
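These rules follow from simple arithmetic on the phases, which the sketch below makes explicit. An exon flanked by a 5' intron of phase p5 and a 3' intron of phase p3 has a length congruent to (p3 - p5) modulo 3, so symmetrical exons have lengths that are multiples of 3. The function names and the tuple representation of exon classes are assumptions made for illustration only.

```python
def exon_length_mod3(phase_5p: int, phase_3p: int) -> int:
    """Exon length modulo 3 implied by the phases of its flanking introns."""
    return (phase_3p - phase_5p) % 3

def is_symmetrical(exon_class: tuple) -> bool:
    """A symmetrical exon has introns of the same phase at both ends."""
    phase_5p, phase_3p = exon_class
    return phase_5p == phase_3p

def can_delete_or_duplicate(exon_class: tuple) -> bool:
    """Deleting or tandemly duplicating an exon keeps the frame only if it is symmetrical."""
    return is_symmetrical(exon_class)

def can_insert(exon_class: tuple, recipient_intron_phase: int) -> bool:
    """Insertion keeps the frame only if both flanking phases match the recipient intron."""
    phase_5p, phase_3p = exon_class
    return phase_5p == phase_3p == recipient_intron_phase

for cls in [(0, 0), (1, 1), (2, 2), (0, 1), (1, 2)]:
    print(cls, exon_length_mod3(*cls), can_delete_or_duplicate(cls),
          [can_insert(cls, p) for p in (0, 1, 2)])
```

A class 1-1 exon, for example, passes the check only for a phase-1 intron, matching the restriction listed above.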
Figure: deletion, insertion and duplication of symmetrical versus asymmetrical exons (intron phases shown as numbers)
Exon Shuffling
Evolutionary role
• exon shuffling probably did not play a role in the formation of genes in the early stages of evolution
• it came into full bloom with the evolution of spliceosomal introns, which do not play a role in their own excision
  – these introns contain mainly nonessential parts
  – they could accommodate quantities of foreign DNA
Factors favouring intronic recombination
• middle repetitive sequences flanking an exon may facilitate looping out or insertion of modules by intronic recombination
• Apolipoprotein(a)
  – the number of tandem kringle domains ranges from 12 to 51 copies
    • in one variant, 24 of the 37 kringle domains have identical nucleotide sequences, suggesting very recent duplication
  – isoforms containing different numbers of kringle domains do not follow simple Mendelian patterns of inheritance
    • offspring often have apolipoprotein isoforms that differ from those of their parents
Alternative Pathways For Producing New Functions
Overlapping Genes
• a DNA segment can code for more than one gene product by using different reading frames or different initiation codons
  – widespread phenomenon in DNA and RNA viruses, as well as in organelles and bacteria
  – also known in nuclear eukaryotic genomes
• can also arise by the use of the complementary strand of a gene
  – e.g. the genes specifying tRNA(Ile) and tRNA(Gln) in the human mitochondrial genome
    • located on different strands
    • 3-nucleotide overlap between them
      5'-CTA-3' in tRNA(Ile)
      5'-TAG-3' in tRNA(Gln)
• also, the ND6 coding sequence corresponds to the complementary strand of cytB
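The two triplets quoted for the tRNA genes are reverse complements of each other, which is what an overlap between genes on opposite strands implies: both genes read the same three base pairs, each in its own 5' to 3' direction. A minimal check, with a helper name chosen for illustration:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

print(reverse_complement("CTA"))  # TAG
```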
• ORFs are abundant throughout the genome
  – even a random DNA sequence might contain ORFs hundreds of bp long
• potential CDSs of considerable length exist
  1. in a different reading frame of an existing gene
  2. on the complementary strand
• an additional mRNA will be transcribed and translated into a new protein if
  a. by chance such an ORF contains an initiation codon and a transcription-initiation site
  b. such sites are created by mutation
• the rate of evolution is expected to be slower in DNA encoding overlapping genes than in similar DNA sequences that use only one reading frame
  – higher proportion of nondegenerate sites
  – reduced proportion of synonymous mutations
  – the 3rd codon position on a given strand is the 1st codon position on its complementary strand
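The claim about random sequences can be explored with a small simulation. This is a sketch under stated assumptions: an "open frame" is taken here to be any run of codons without a stop codon (no start codon is required), bases are drawn uniformly, and the sequence length and seed are arbitrary choices.

```python
import random

STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def longest_open_frame(seq: str) -> int:
    """Length in nucleotides of the longest stop-free codon run over all six frames."""
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            run = 0
            for i in range(frame, len(strand) - 2, 3):
                if strand[i:i + 3] in STOPS:
                    run = 0          # a stop codon closes the current open frame
                else:
                    run += 3
                    best = max(best, run)
    return best

rng = random.Random(0)               # fixed seed so the run is repeatable
random_dna = "".join(rng.choice("ACGT") for _ in range(10_000))
print(longest_open_frame(random_dna))
```

With 3 stop codons out of 64, a codon is a stop with probability of about 0.047, so stop-free runs of well over a hundred codons are expected somewhere in a sequence of this size.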
Alternative splicing
• production of different mRNAs from the same DNA segment
  – translated into different polypeptides
• the distinction between exons and introns is no longer absolute but depends on the mRNA of reference
2 types of exons
1. constitutive
   – included within all the mRNAs transcribed from a gene
2. facultative
   – exons that are sometimes spliced in and sometimes spliced out
2 types of alternative splicing
1. unconditional
   – two or more mRNA variants are produced in all tissues expressing the gene
2. conditional
   – tissue specific
   – developmental-stage specific
   – physiological-state specific
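The distinction between constitutive and facultative exons can be expressed as set operations over the mRNA variants of a gene. In this sketch each variant is given as a collection of exon identifiers; the function name and the example exon numbers are hypothetical.

```python
def classify_exons(isoforms):
    """Split exons into constitutive (in every variant) and facultative (in some only).

    isoforms: non-empty list of exon-ID collections, one per mRNA variant.
    """
    exon_sets = [set(variant) for variant in isoforms]
    constitutive = set.intersection(*exon_sets)
    facultative = set.union(*exon_sets) - constitutive
    return constitutive, facultative

# Hypothetical gene with two splice variants; exon 3 is skipped in the second one.
constitutive, facultative = classify_exons([[1, 2, 3, 4], [1, 2, 4]])
print(sorted(constitutive), sorted(facultative))  # [1, 2, 4] [3]
```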
RNA splicing
  – removal of introns and ligation of adjacent exons
Figure: splicing and alternative splicing of a gene with constitutive exons and a facultative exon, giving splice variant I and splice variant II
Intron retention
• an unspliced intron can result in the addition of a peptide segment
  – the ORF must be maintained
• more commonly, intron retention results in the premature termination of translation due to frameshifts
Alternative internal donor or acceptor sites
• excision of introns of different lengths, with complementary variation in the size of neighboring exons
Figure labels: alternative internal donor site; alternative internal acceptor site
Alternative transcription initiation and polyadenylation sites
• different mRNAs that are produced from the same gene differ from one another only at their 5' or 3' ends
• alternative polyadenylation sites are common in eukaryotic nuclear genes
Figure labels: alternative transcription initiation site (TATA); alternative polyadenylation site (AATAAA)
Mutually exclusive exons
• 2 exons are never spliced out together, nor are both retained in the same mRNA
• M1 and M2 forms of pyruvate kinase
  – mutually exclusive use of exons 9 and 10 of a single gene
Cassette exons
• special case of mutual exclusivity
• a cassette is either spliced in or spliced out in the alternative mRNA
• usually the reading frame is maintained whether such an exon is in or out
Troponin-T gene
• 5 cassette exons in conjunction with 2 mutually exclusive exons
• production of 64 different proteins from a single gene
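The figure of 64 follows if each of the 5 cassette exons is independently spliced in or out and exactly one of the 2 mutually exclusive exons is used: 2^5 × 2 = 64. The enumeration below is only an illustration of that count; the exon labels are hypothetical placeholders, not the real troponin-T exon names.

```python
from itertools import product

CASSETTE_EXONS = ["c1", "c2", "c3", "c4", "c5"]   # hypothetical labels
MUTUALLY_EXCLUSIVE = ["mx_a", "mx_b"]             # hypothetical labels

def enumerate_isoforms(cassettes, exclusive):
    """All combinations: each cassette in or out, exactly one exclusive exon used."""
    isoforms = set()
    for choices in product([False, True], repeat=len(cassettes)):
        included = tuple(e for e, keep in zip(cassettes, choices) if keep)
        for mx in exclusive:
            isoforms.add(included + (mx,))
    return isoforms

isoforms = enumerate_isoforms(CASSETTE_EXONS, MUTUALLY_EXCLUSIVE)
print(len(isoforms))  # 64
```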
Alternative splicing as a means of developmental regulation
Drosophila melanogaster
• at least 3 genes are involved in the process of sex determination
  – doublesex (dsx)
  – Sex-lethal (Sxl)
  – transformer (tra)
• these genes are spliced differently in males and females
Figure: exon structure of dsx, tra and Sxl, with the male (♂) and female (♀) mRNAs produced from each gene; stop codons are marked
Alternative splicing
Evolution of alternative splicing
• requires that an alternative splice junction site be created de novo
• such sites are created with an appreciable frequency by mutation
  – splicing signals are usually 5-10 nucleotides long
• many such examples are known
β+ thalassemia
• synonymous nucleotide substitution in the gene (GGT → GGA; both codons = glycine)
• however, the substitution is not silent
• it activates a new splice site in the β-globin gene
  – the new site is stronger than the old splice site
• result: production of a frameshifted protein
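That GGT and GGA both encode glycine can be checked against the standard genetic code. The sketch below hard-codes only the four glycine codons, and the function names are illustrative. It confirms that the change is synonymous at the protein level, which is why its effect on splicing is easy to overlook.

```python
# The four glycine codons of the standard genetic code (DNA alphabet).
GLYCINE_CODONS = {"GGT", "GGC", "GGA", "GGG"}


def both_encode_glycine(codon_before, codon_after):
    """True if both codons are glycine codons, i.e. the change is synonymous."""
    return codon_before in GLYCINE_CODONS and codon_after in GLYCINE_CODONS


def changed_positions(codon_before, codon_after):
    """1-based codon positions at which the two codons differ."""
    pairs = zip(codon_before, codon_after)
    return [i + 1 for i, (x, y) in enumerate(pairs) if x != y]


print(both_encode_glycine("GGT", "GGA"))  # True
print(changed_positions("GGT", "GGA"))    # [3]
```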
Concerted Evolution
An unexpected evolutionary phenomenon
• members of a repeated-sequence family are generally very similar to each other within one species
• members of the family from closely related species may differ greatly
• if each duplicate sequence evolves independently
  – the similarity between any two randomly chosen sequences within a species is expected to be the same as that between two sequences chosen from different species
• observed patterns instead reveal a high degree of within-species homogeneity among duplicated sequences
An unexpected evolutionary phenomenon
Xenopus and most other vertebrates
• the genes specifying the 18S and 28S rRNA are present in hundreds of copies and are arranged in one or a few tandem arrays
• each repeated unit consists of a transcribed and a nontranscribed segment (NTS)
rRNA genes in X. laevis and X. borealis
1. the 18S and 28S genes of the two species were virtually identical
2. the NTS regions differed greatly between the two species
  – yet they were very similar within each individual and among individuals within a species
• NTS regions in each species have evolved in concert but have diverged rapidly between species
[Figure: rRNA repeat units of Xenopus laevis and Xenopus borealis]
An unexpected evolutionary phenomenon
Evolutionary scenarios for the homogenization of a tandemly repeated array
a. Stringent Selection
• the function of the repeats depends on their specific nucleotide sequence
• beneficial mutations are fixed by positive selection (+)
• deleterious mutations are eliminated by purifying selection (–)
However
• NTS regions have no known function
  – they do not appear to be subject to stringent selective constraints
[Figure: repeat array with mutations marked + (fixed by positive selection) or – (eliminated by purifying selection)]
An unexpected evolutionary phenomenon
Evolutionary scenarios for the homogenization of a tandemly repeated array
a. Stringent Selection (NO)
b. Recent Multiplication
• the repeated family arises through the amplification of a single unit
• the homogeneity reflects the fact that there has not been enough time for the members of the multigene family to diverge from each other
• the homogeneity of the family is expected to decrease gradually
  – mutations would accumulate in the family members through genetic drift, particularly in regions that are subject to no stringent structural constraints
However
• intraspecific homogeneity of NTS regions in Xenopus does not decrease with evolutionary time
An unexpected evolutionary phenomenon
Evolutionary scenarios for the homogenization of a tandemly repeated array
a. Stringent Selection (NO)
b. Recent Multiplication (NO)
c. Concerted Evolution
• an individual member of a gene family does not evolve independently of the other members of the family
• it exchanges sequence information with other members, reciprocally or non-reciprocally
• through genetic interactions among its members, a multigene family evolves in concert as a unit
• the result is a homogenized set of nonallelic homologous sequences
Concerted evolution requires
1. horizontal transfer of mutations among the family members (homogenization)
2. spread of mutations to all individuals in the population (fixation)
Mechanisms of concerted evolution
• Gene Conversion
• Unequal Crossing-Over
• Slipped-Strand Mispairing (Replication Slippage)
• Duplicative Transposition
All of these result in a homogenized set of nonallelic homologous sequences
Gene Conversion
• a nonreciprocal recombination process in which two sequences interact in such a way that one is converted by the other
• according to the chromatids involved in the process, gene conversion can be divided into several types
1. intrachromatid conversion: exchange between paralogous sequences on the same chromatid
2. sister-chromatid conversion: exchange between paralogous sequences from complementary chromatids
3. classical conversion: exchange between alleles at the same locus
4. semiclassical conversion: exchange between paralogous genes from two homologous chromosomes
5. ectopic conversion: exchange between paralogous sequences located on nonhomologous chromosomes
[Figure: the five types of gene conversion (intrachromatid, sister-chromatid, classical, semiclassical, ectopic) drawn on a homologous chromosome pair and a nonhomologous chromosome]
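The nonreciprocal character of gene conversion can be shown with a toy string operation: a tract of the donor overwrites the matching tract of the acceptor, and the donor stays as it was. This is a sketch under simplifying assumptions (two aligned copies of equal length, made-up sequences, hypothetical function names), not a model of the molecular mechanism.

```python
def convert_tract(donor, acceptor, start, end):
    """Return the acceptor after positions start..end-1 are overwritten by the donor.

    The donor itself is left unchanged, which is what makes the event
    nonreciprocal.
    """
    if len(donor) != len(acceptor):
        raise ValueError("this sketch assumes two aligned copies of equal length")
    return acceptor[:start] + donor[start:end] + acceptor[end:]


def mismatches(seq_a, seq_b):
    """0-based positions at which two aligned sequences differ."""
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


donor = "ACGTACGTAC"     # toy paralog 1 (made-up sequence)
acceptor = "ACGAACGTTC"  # toy paralog 2 (made-up sequence)

converted = convert_tract(donor, acceptor, start=2, end=6)
print(mismatches(donor, acceptor))   # [3, 8]
print(mismatches(donor, converted))  # [8]
```

After the event the two copies differ at fewer sites, which is the homogenizing effect the slides attribute to gene conversion.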
Gene Conversion
• gene conversion has been found in all species and at all loci that were examined in detail
  – the most important types of gene conversion (GC) are the nonallelic conversions
• the rate of gene conversion varies with genomic location
Unbiased gene conversion
  – sequence A has as much chance of converting sequence B as sequence B has of converting sequence A
Biased gene conversion
  – the probabilities of gene conversion between two sequences in the two possible directions are unequal
  – one sequence is then the master and the other is the slave
  – more common than unbiased gene conversion
Unequal Crossing-Over
• a reciprocal recombination process that creates
  – a sequence duplication in one chromatid or chromosome and
  – a corresponding deletion in the other
• may occur either
  – between the 2 sister chromatids of a chromosome during mitosis in a germ-line cell, or
  – between two homologous chromosomes at meiosis
• although the number of repeats either increases or decreases
  – both daughter chromosomes have a more homogeneous repeat make-up than the parental chromosome
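A single unequal crossing-over event can be sketched with lists of repeat units. The two sister chromatids are misaligned by one unit and exchange arms, so one daughter gains a repeat and the other loses one. The array, the breakpoints and the function names below are illustrative assumptions chosen to make the outcome easy to follow.

```python
from collections import Counter


def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the arms of two repeat arrays at misaligned breakpoints.

    break_a and break_b count whole repeat units from the left end.
    """
    daughter_1 = chrom_a[:break_a] + chrom_b[break_b:]
    daughter_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return daughter_1, daughter_2


def major_fraction(chrom):
    """Fraction of the array occupied by its most common repeat variant."""
    return Counter(chrom).most_common(1)[0][1] / len(chrom)


parent = ["a", "a", "b", "b"]  # toy array with two variants of one repeat
long_d, short_d = unequal_crossover(parent, parent, break_a=2, break_b=1)

print(long_d)   # ['a', 'a', 'a', 'b', 'b']  (duplication)
print(short_d)  # ['a', 'b', 'b']            (deletion)
print(major_fraction(parent), major_fraction(long_d), major_fraction(short_d))
```

In this toy case the most common variant makes up half of the parental array but a larger share of each daughter, matching the point about increased homogeneity. Other breakpoints or arrays can behave differently.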
Unequal Crossing-Over
• if this process is repeated
  – the numbers of each variant repeat on a chromosome will fluctuate with time
  – eventually one type will become dominant in the family
• one type of repeat may thus spread throughout a gene family through repeated rounds of unequal crossing-over
• unequal crossing-over has also been suggested to have played a much more important role than gene conversion in the concerted evolution of the immunoglobulin VH gene family in the mouse
Relative roles of gene conversion and unequal crossing-over
Advantages of Gene Conversion (GC) over Unequal Crossing-Over (UCO)
1. GC causes no change in gene number
  – UCO generates changes in the number of repeated genes within a family
  – such changes sometimes cause a significant dosage imbalance, e.g. deletion of one of the two α-globin genes following unequal crossing-over gives rise to the mild form of α-thalassemia in homozygotes
2. GC can act as a correction mechanism not only on tandem repeats but also on dispersed repeats
  – within a chromosome, between homologous chromosomes, or between nonhomologous chromosomes
  – UCO is severely restricted when repeats dispersed on nonhomologous chromosomes are involved
3. GC can be biased, i.e., have a preferred direction
  – experimental data from fungi have shown that bias in the direction of gene conversion is common and often strong
  – theoretical studies have shown that even a small bias can have a large effect on the probability of fixation of repeated mutants
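The claim that a small bias can have a large effect can be illustrated with a deliberately simple model that is not taken from the slides. Assume a family of n copies in which each conversion event between a variant and a non-variant copy either adds one variant copy (probability `bias`) or removes one (probability 1 - `bias`). This is the classical gambler's-ruin random walk, whose absorption probability has a closed form. The model, the function name and the parameter values are all assumptions made for illustration.

```python
def fixation_probability(n_copies, bias):
    """Chance that a variant starting in 1 of n_copies spreads to all copies.

    bias is the probability that the variant copy is the donor in a single
    conversion event (0.5 means unbiased conversion).
    """
    if not 0.0 < bias < 1.0:
        raise ValueError("bias must lie strictly between 0 and 1")
    if bias == 0.5:
        return 1.0 / n_copies
    ratio = (1.0 - bias) / bias  # odds of losing versus gaining a copy
    return (1.0 - ratio) / (1.0 - ratio ** n_copies)


unbiased = fixation_probability(100, 0.5)
biased = fixation_probability(100, 0.55)
print(unbiased)           # 0.01
print(round(biased, 3))   # 0.182
```

Under these assumptions, shifting the donor probability from 0.50 to 0.55 raises the chance of spreading through a 100-copy family roughly 18-fold. The sketch covers homogenization within one array only and says nothing about fixation in a population.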
Relative roles of gene conversion and unequal crossing-over
Advantages of Unequal Crossing-Over over Gene Conversion
1. UCO is faster and more efficient in bringing about concerted evolution
  – at the mutation level, UCO occurs more frequently than GC
2. in a GC event, only a small region is involved
  – in yeast, an unequal crossing-over event involves on average ~20,000 bp
  – a GC tract cannot exceed 1,500 bp
Factors affecting the rate of concerted evolution
1. the number of repeats
  – the number of UCO events required for the fixation of a variant repeat increases roughly with n² (n = the number of repeats)
2. the arrangement of the repeats
  – a dispersed arrangement
    a. causes UCO to lead to disastrous genetic consequences
    b. reduces the frequency of gene conversion
3. the relative sizes of slowly and rapidly evolving regions within the repeat unit
  – both UCO and GC depend on sequence similarity for the misalignment of repeats
  – the more coding (slowly evolving) regions there are, the higher the rate of concerted evolution will be
4. constraints on homogeneity
  – if the function requires large amounts of an invariable gene product (e.g. rRNA and histone genes), there is selection against variation
  – if the function requires a large amount of diversity (e.g. immunoglobulin and histocompatibility genes), there is selection against homogeneity
5. the mechanism of concerted evolution
  – concerted evolution under UCO is quicker than that under GC
6. population size
  – the time required for a variant to become fixed in a population depends on population size
Evolutionary implications of concerted evolution
1. spread of advantageous mutations
  – spread of deleterious mutations is avoided by selection
2. retardation of divergence of duplicate genes
3. obliteration of evolutionary history
4. generation of allelic variation
[Figure: gene conversion turns 2 loci carrying 4 alleles into 2 loci carrying 3 alleles, i.e. it decreases variation]
Detecting Concerted Evolution
• concerted evolution erases the record of molecular divergence during the evolution of paralogous sequences
  – when similar paralogous sequences are observed in a species, it is usually impossible to distinguish between two alternatives
    a. the sequences have only recently diverged from one another by duplication
    b. the sequences have evolved in concert
• the phylogenetic approach can be a good choice for telling the two apart
Detecting Concerted Evolution
The two α-globin genes in humans are almost identical
• they were thought to have duplicated quite recently
  – insufficient time for them to diverge in sequence
• duplicated α-globin genes were also discovered in distantly related species
• 2 possibilities
  a. multiple gene-duplication events occurred independently in many evolutionary lineages
  b. the two genes are quite ancient (duplicated once in the common ancestor) but their antiquity was subsequently obscured by concerted evolution
Detecting Concerted Evolution
The Aγ- and Gγ-globin genes in the great apes
• arose by a duplication that occurred approximately 55 Mya, after the divergence between prosimians and simians
• since the African apes diverged much later (speciation 5-7 Mya), we would expect the Gγ orthologous genes from apes to be much more similar to each other than to any of the Aγ paralogs
• in other words, the orthologs should be closer to one another than the paralogs
[Figure: expected gene tree, in which an ancestral γ gene duplicates 55 Mya into Aγ and Gγ and each copy then diverges among the African apes at speciation 5-7 Mya]
Detecting Concerted Evolution
The Aγ- and Gγ-globin genes in the great apes: in humans
• 5' part: Aγ ≠ Gγ at 7 out of 1,550 nucleotide positions (0.5%)
• 3' part: Aγ ≠ Gγ at 145 out of 1,550 nucleotide positions (9.4%)
  – assuming that the 5' and the 3' parts are subject to similar functional constraints, the 5' end of the gene underwent gene conversion
  – intron 2 in both genes in all apes contains a stretch of (TG)n that can serve as a hotspot for the recombination events involved in gene conversion
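The two percentages follow directly from the counts on the slide. A small sketch of the arithmetic; the constant and function names are illustrative.

```python
ALIGNED_POSITIONS = 1550
DIFFERENCES = {"5' part": 7, "3' part": 145}  # counts quoted on the slide


def percent_divergence(differences, positions):
    """Percentage of aligned positions at which the two genes differ."""
    return 100.0 * differences / positions


for region, diffs in DIFFERENCES.items():
    print(region, round(percent_divergence(diffs, ALIGNED_POSITIONS), 1))
# 5' part 0.5
# 3' part 9.4
```

The 3' part is about 20 times more divergent than the 5' part, which is the contrast that points to gene conversion at the 5' end.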
Detecting Concerted Evolution
The Aγ- and Gγ-globin genes in the great apes
• orthologous genes from apes are more similar to each other if their 3' parts (i.e. the 3rd exons) are considered
[Figure: gene tree for exon 3 with the Aγ and Gγ groups labelled]
Detecting Concerted Evolution
The Aγ- and Gγ-globin genes in the great apes
The 5' parts (exons 1 and 2)
• show a different phylogenetic pattern
• paralogous exons within each species resemble each other more than they resemble their orthologous counterparts in other apes
• this tree contains an additional anomaly: it clusters chimpanzee and gorilla as a clade
[Figure: gene trees for exon 3 and for exons 1 and 2, with the Aγ and Gγ sequences labelled]
[Figure: expected tree for exons 1 and 2 if GC had only occurred in the human lineage]
[Figure: observed trees alongside the expected tree if GC had only occurred in the human lineage]
Each of the 3 lineages has experienced multiple independent GC events
Assuming that all parts of the genes evolve at equal rates
• it is possible to date the last gene conversion event by using
  a. the degrees of similarity between the two sequences, and
  b. the date for the gene duplication event
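The dating argument can be written out as a short calculation. This is a sketch under the equal-rate assumption stated above, not the original authors' procedure. It additionally assumes that an unconverted region is available for calibration: if that region has accumulated divergence K_u since the duplication at time T_d, the rate is r = K_u / (2 T_d), and a converted region with divergence K_c was last homogenized at T_c = K_c / (2 r) = T_d × K_c / K_u. The function name and the numbers are hypothetical.

```python
def last_conversion_time(k_converted, k_unconverted, t_duplication):
    """Estimate the time since the last gene conversion, in the units of t_duplication.

    Assumes equal substitution rates in both regions, so divergence scales
    linearly with the time since the two copies were last identical.
    """
    if k_unconverted <= 0:
        raise ValueError("the calibration region must show some divergence")
    return t_duplication * k_converted / k_unconverted

# Hypothetical inputs, NOT taken from the slides:
# divergences in substitutions per site, duplication time in million years.
t = last_conversion_time(k_converted=0.01, k_unconverted=0.10, t_duplication=30.0)
print(f"estimated last conversion: {t:.1f} Mya")
```

The estimate is only as good as the equal-rate assumption. Regions under different selective constraint would need separate rates.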
1-2 Mya: last conversion event in the human lineage
– i.e., after the divergence between human and chimpanzee
GC in the chimp and gorilla lineages occurred independently
Detecting Concerted Evolution
GC in the evolution of X-linked color vision genes
Red and green opsin genes in humans
– intron 4 and intron 2 sequences of these two genes are identical (or nearly identical)
– unexpectedly low divergences, because the duplication event producing the two genes occurred before the separation of the human and Old World monkey lineages (<35 Mya)
[Figure: gene structure with exons 1-6 and introns 2 and 4, annotated with divergences between the two genes. Values in the order extracted: Ks = 5.6, Ka = 1.5; Ks = 1.9, Ka = 3.5; Ks = 7.6, Ka = 3.9; Ks = 0.0, Ka = 0.0; Ks = 0, Ka = 0; Ks = 2.0, Ka = 1.5; introns: K = 0.3, K = 0.0]
– divergences in the introns are significantly lower than both the synonymous and the nonsynonymous divergence in the coding sequences of exons 2-5
– high similarities in the intron sequences might be due to very recent GC, probably during evolution of the human lineage
– GC can occur in exons as well as in introns
– GC events in exons may be disadvantageous
  • reduce the differences between the red and green pigment genes
  • reduce the ability to distinguish between red and green colors
  • the resultant changes may be eliminated from the population
High frequency of red-green or green-red fusion genes in human populations
– ~16% of Caucasian ♂
– ~21% of African-American ♂
suggests that during meiosis the red and green pigment genes may mispair frequently because of their high sequence similarity
• mispairing during meiosis increases the probability of GC
• high levels of similarity in introns facilitate mispairing and recombination, leading to production of hybrid genes
GC in the evolution of X-linked color vision genes
Red and green opsin genes in humans, apes and Old World monkeys (OWM)
– intron 4 sequences between the two genes have been strongly or completely homogenized in all 3 species
[Figure: divergence between the two genes in each of the 3 species, values in the order extracted]
intron 4: K = 0.0; exon 4+5: Ks = 8.1, Ka = 5.8
intron 4: K = 0.3; exon 4+5: Ks = 6.6, Ka = 5.1
intron 4: K = 0.9; exon 4+5: Ks = 11.5, Ka = 5.1
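The contrast between intron and exon divergence can be summarized as a ratio. This is a rough Python sketch, not an analysis from the slides. It assumes that the extracted values pair up in the order listed (each intron 4 K with the exon 4+5 Ks that follows it) and that, without gene conversion, intron divergence would be of the same order as synonymous divergence in the neighboring exons. The function name and the 0.25 cutoff are hypothetical and purely illustrative.

```python
# (intron 4 K, exon 4+5 Ks) pairs in the order listed on the slide;
# the pairing and the species assignment are assumptions.
pairs = [(0.0, 8.1), (0.3, 6.6), (0.9, 11.5)]

def homogenized(k_intron, ks_exon, cutoff=0.25):
    """True when intron divergence is far below the synonymous exon divergence."""
    return ks_exon > 0 and (k_intron / ks_exon) < cutoff

for k_intron, ks_exon in pairs:
    ratio = k_intron / ks_exon
    print(f"K = {k_intron:.1f}, Ks = {ks_exon:.1f}, ratio = {ratio:.3f}, "
          f"homogenized = {homogenized(k_intron, ks_exon)}")
```

A ratio near zero is the signature described on the slide: introns homogenized by recent GC while the flanking exons keep their divergence.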
– two or more conversion events may have occurred at different times in introns 4 of the two pigment genes in baboons
[Figure: divergences for exons 4+5 and intron 4 of the pigment genes. Values in the order extracted: exon 4+5 Ks = 4.2, Ka = 0.0; intron 4 K = 1.1; exon 4+5 Ks = 5.8, Ka = 1.1; intron 4 K = 7.3; exon 4+5 Ks = 0.0, Ka = 0.7; intron 4 K = 1.0; exon 4+5 Ks = 9.0, Ka = 1.3; intron 4 K = 7.1]
– strong natural selection for maintaining the distinct functions of exons 4 and 5 of the red and green pigment genes has acted against sequence homogenization of these exons
Concerted evolution of genes and pseudogenes
Pseudogenes may represent reservoirs of genetic information that participate in the evolution of new genes, rather than relics of inactivated genes whose fate is genomic extinction
– the proximity of a gene to a pseudogene, however, may not only spell rebirth for the pseudogene, but also death for the gene
21-hydroxylase (cytochrome P21) gene
– example of gene death by concerted evolution
– involved in Congenital Adrenal Hyperplasia
– 10-exon gene located on chromosome 6 in a region in which many MHC and complement genes are interspersed with each other
– there is a paralogous unprocessed pseudogene in the vicinity
– in many organisms one of the genes became nonfunctional
– the nonfunctionalization event occurred independently in many lineages
  • the ortholog of the human functional gene is a pseudogene in mouse
  • the ortholog of the human pseudogene is a functional gene in mouse
21-hydroxylase (cytochrome P21) gene
– Congenital Adrenal Hyperplasia (21-Hydroxylase deficiency)
– hundreds of mutations in the 21-hydroxylase gene have been described
– 75% of them are due to gene conversion