Conservation of sequence and function in the Pax6 regulatory elements

5
resulted in the generation of two ‘false’ CAPS. This can be bypassed by developing markers from higher quality sequence data (i.e. with higher levels of redundancy) while avoiding single-read sequences. In practice, how- ever, an 80% marker success rate is more than acceptable because of the low cost of primer synthesis and primer testing and the abundance of snip-SNPs between the genomes of different accessions. Finally, common sense must be followed when using BlastDigester. The sequence used for the input Blast should, if possible, be unique and not align with paralogous sequences in the genome. This will avoid problems during PCR amplification, namely the gener- ation of multiple PCR products. Concluding remarks With its all-in-one design, this program will have immediate application in positional cloning in plant and animal species for which sequence information is avail- able. Although most useful for gene mapping, BlastDige- ster will also be a useful tool in a broad range of genetic studies [e.g. genotyping, phylogenetics, quantitative trait loci (QTL) mapping and RT–PCR-based expression studies of multigene families]. Considering the large number of genome sequencing projects that are currently underway or in the planning stages, BlastDigester will have immediate appeal for diverse groups of bench scientists in the genetic and genomic communities. Acknowledgements We thank Katrien Devos and Sean Cutler for critically reading and commenting on this manuscript. This work was funded by NSERC and Genome Canada. References 1 Wicks, S.R. et al. (2001) Rapid gene mapping in Caenorhabditis elegans using a high density polymorphism map. Nat. Genet. 28, 160–164 2 Hoskins, R.A. et al. (2001) Single nucleotide polymorphism markers for genetic mapping in Drosophila melanogaster . Genome Res. 11, 1100–1113 3 Torjek, O. et al. (2003) Establishment of a high-efficiency SNP-based framework marker set for Arabidopsis. Plant J. 36, 122–140 4 Konieczny, A. and Ausubel, F.M. (1993) A procedure for mapping Arabidopsis mutations using co-dominant ecotype-specific PCR-based markers. Plant J. 4, 403–410 5 Glazebrook, J. et al. (1998) Use of cleaved amplified polymorphic sequences (CAPS) as genetic markers in Arabidopsis thaliana. Methods Mol. Biol. 82, 173–182 6 Lukowitz, W. et al. (2000) Positional cloning in Arabidopsis. Why it feels good to have a genome initiative working for you. Plant Physiol. 123, 795–805 7 Drenkard, E. et al. (2000) A simple procedure for the analysis of single nucleotide polymorphisms facilitates map-based cloning in Arabidop- sis. Plant Physiol. 124, 1483–1492 8 Barker, G. et al. (2003) Redundancy based detection of sequence polymorphisms in expressed sequence tag data using autoSNP. Bioinformatics 19, 421–422 9 Thiel, T. et al. (2004) SNP2CAPS: a SNP and INDEL analysis tool for CAPS marker development. Nucleic Acids Res. 32, e5 10 Neff, M.M. et al. (1998) dCAPS, a simple technique for the genetic analysis of single nucleotide polymorphisms: experimental appli- cations in Arabidopsis thaliana genetics. Plant J. 14, 387–392 11 Neff, M.M. et al. (2002) Web-based primer design for single nucleotide polymorphism analysis. Trends Genet. 18, 613–615 12 Altschul, S.F. et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 13 Tatusova, T.A. and Madden, T.L. (1999) BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiol. Lett. 174, 247–250 14 Stajich, J.E. et al. (2002) The bioperl toolkit: perl modules for the life sciences. Genome Res. 12, 1611–1618 15 Arabidopsis Genome Initiative, (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 16 Jander, G. et al. (2002) Arabidopsis map-based cloning in the post- genome era. Plant Physiol. 129, 440–450 17 Rhee, S.Y. et al. (2003) The Arabidopsis information resource (TAIR). A model organism database providing a centralized, curated gateway to Arabidopsis biology, research materials and community. Nucleic Acids Res. 31, 224–228 0168-9525/$ - see front matter q 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.tig.2004.04.012 | Genome Analysis Conservation of sequence and function in the Pax6 regulatory elements Richard Morgan Department of Basic Medical Sciences, St George’s Hospital Medical School, Cranmer Terrace, London SW17 0RE, UK The Pax6 transcription factor is a highly conserved regu- lator of eye and brain development. Its complex mode of regulation is shared between Drosophila melanoga- ster and the vertebrates, despite there being no signifi- cant sequence identity between their enhancer and promoter regions. In this article, a more detailed examination of the Pax6 genomic sequence reveals a common set of putative Pax6 and basic helix–loop– helix transcription factor binding sites. Several proteins that are involved in early developmental events are highly conserved throughout the Metazoans, a particularly striking example is paired box gene 6 (Pax6). This encodes a transcription factor with both a paired box and a homeodomain that is expressed in several different Corresponding author: Richard Morgan ([email protected]). Available online 6 May 2004 Update TRENDS in Genetics Vol.20 No.7 July 2004 283 www.sciencedirect.com

Transcript of Conservation of sequence and function in the Pax6 regulatory elements

resulted in the generation of two ‘false’ CAPS. This can bebypassed by developing markers from higher qualitysequence data (i.e. with higher levels of redundancy)while avoiding single-read sequences. In practice, how-ever, an 80% marker success rate is more than acceptablebecause of the low cost of primer synthesis and primertesting and the abundance of snip-SNPs between thegenomes of different accessions.

Finally, common sense must be followed when usingBlastDigester. The sequence used for the input Blastshould, if possible, be unique and not align withparalogous sequences in the genome. This will avoidproblems during PCR amplification, namely the gener-ation of multiple PCR products.

Concluding remarks

With its all-in-one design, this program will haveimmediate application in positional cloning in plant andanimal species for which sequence information is avail-able. Although most useful for gene mapping, BlastDige-ster will also be a useful tool in a broad range of geneticstudies [e.g. genotyping, phylogenetics, quantitative traitloci (QTL) mapping and RT–PCR-based expressionstudies of multigene families]. Considering the largenumber of genome sequencing projects that are currentlyunderway or in the planning stages, BlastDigester willhave immediate appeal for diverse groups of benchscientists in the genetic and genomic communities.

AcknowledgementsWe thank Katrien Devos and Sean Cutler for critically reading andcommenting on this manuscript. This work was funded by NSERC andGenome Canada.

References

1 Wicks, S.R. et al. (2001) Rapid gene mapping inCaenorhabditis elegansusing a high density polymorphism map. Nat. Genet. 28, 160–164

2 Hoskins, R.A. et al. (2001) Single nucleotide polymorphism markers for

genetic mapping in Drosophila melanogaster. Genome Res. 11,1100–1113

3 Torjek, O. et al. (2003) Establishment of a high-efficiency SNP-basedframework marker set for Arabidopsis. Plant J. 36, 122–140

4 Konieczny, A. and Ausubel, F.M. (1993) A procedure for mappingArabidopsis mutations using co-dominant ecotype-specific PCR-basedmarkers. Plant J. 4, 403–410

5 Glazebrook, J. et al. (1998) Use of cleaved amplified polymorphicsequences (CAPS) as genetic markers in Arabidopsis thaliana.Methods Mol. Biol. 82, 173–182

6 Lukowitz, W. et al. (2000) Positional cloning in Arabidopsis. Why itfeels good to have a genome initiative working for you. Plant Physiol.123, 795–805

7 Drenkard, E. et al. (2000) A simple procedure for the analysis of singlenucleotide polymorphisms facilitates map-based cloning in Arabidop-

sis. Plant Physiol. 124, 1483–14928 Barker, G. et al. (2003) Redundancy based detection of sequence

polymorphisms in expressed sequence tag data using autoSNP.Bioinformatics 19, 421–422

9 Thiel, T. et al. (2004) SNP2CAPS: a SNP and INDEL analysis tool forCAPS marker development. Nucleic Acids Res. 32, e5

10 Neff, M.M. et al. (1998) dCAPS, a simple technique for the geneticanalysis of single nucleotide polymorphisms: experimental appli-cations in Arabidopsis thaliana genetics. Plant J. 14, 387–392

11 Neff, M.M. et al. (2002) Web-based primer design for single nucleotidepolymorphism analysis. Trends Genet. 18, 613–615

12 Altschul, S.F. et al. (1997) Gapped BLAST and PSI-BLAST: a newgeneration of protein database search programs.Nucleic Acids Res. 25,3389–3402

13 Tatusova, T.A. and Madden, T.L. (1999) BLAST 2 Sequences, a newtool for comparing protein and nucleotide sequences. FEMSMicrobiol.Lett. 174, 247–250

14 Stajich, J.E. et al. (2002) The bioperl toolkit: perl modules for the lifesciences. Genome Res. 12, 1611–1618

15 Arabidopsis Genome Initiative, (2000) Analysis of the genome sequenceof the flowering plant Arabidopsis thaliana. Nature 408, 796–815

16 Jander, G. et al. (2002) Arabidopsis map-based cloning in the post-genome era. Plant Physiol. 129, 440–450

17 Rhee, S.Y. et al. (2003) TheArabidopsis information resource (TAIR). Amodel organism database providing a centralized, curated gateway toArabidopsis biology, research materials and community. Nucleic AcidsRes. 31, 224–228

0168-9525/$ - see front matter q 2004 Elsevier Ltd. All rights reserved.doi:10.1016/j.tig.2004.04.012

|Genome Analysis

Conservation of sequence and function in the Pax6regulatory elements

Richard Morgan

Department of Basic Medical Sciences, St George’s Hospital Medical School, Cranmer Terrace, London SW17 0RE, UK

The Pax6 transcription factor is a highly conserved regu-

lator of eye and brain development. Its complex mode

of regulation is shared between Drosophila melanoga-

ster and the vertebrates, despite there being no signifi-

cant sequence identity between their enhancer and

promoter regions. In this article, a more detailed

examination of the Pax6 genomic sequence reveals a

common set of putative Pax6 and basic helix–loop–

helix transcription factor binding sites.

Several proteins that are involved in early developmentalevents are highly conserved throughout the Metazoans, aparticularly striking example is paired box gene 6 (Pax6).This encodes a transcription factor with both a paired boxand a homeodomain that is expressed in several different

Corresponding author: Richard Morgan ([email protected]).

Available online 6 May 2004

Update TRENDS in Genetics Vol.20 No.7 July 2004 283

www.sciencedirect.com

embryonic tissues including the eye, brain and pancreas[1]. Studies of mouse Pax6 mutants have established thatit has a role in the development of each of these organs.Most notably, Pax6 mutations lead to smaller eyes in mice[2], a phenomenon that is mirrored in humans wheremutations in PAX6 lead to eye defects including aniridia[3]. Pax6 (also known as ey) is also required for eyedevelopment in Drosophila melanogaster [4] and ectopicexpression of Pax6 in the ectoderm of Xenopus laevisembryos results in the formation of additional eyestructures [5].

Pax6 homologues have also been cloned from theprotochordate Ciona intestinalis (AB07983) and thenematode Caenorhabditis elegans [6,7]. In C. elegans,variable abnormal-3 (Vab-3), an orthologue of Pax6, isrequired early in development for the specification ofepidermal cell fate and, subsequently, for the correctspecification of some neuronal subtypes [7]. A shorterversion male abnormal-18 (Mab-18) is transcribed from aninternal promoter and lacks the paired-box bindingdomain. Mab-18 is required for the specification of senseorgan identity [8].

Transcriptional regulation of Pax6

The highly conserved developmental role of Pax6 hasprompted several studies on its transcriptional regulationin Drosophila and mice. In Drosophila, two intronicenhancers have been identified that drive Pax6 expression

Figure 1. Regulatory sites in the Drosophila melanogaster paired box gene 6 (Pax6)

gene. The genomic map of the 50 transcribed region of the gene is shown. Enhan-

cer regions identified to date are shown as green ovals and the regions in which

each enhancer directs expression of a reporter construct is indicated above. Exons

are shown as vertical bars (light blue). The blue circles denote sequences that are

.90% identical to the previously defined Pax6 consensus binding sequence. The

level of sequence identity with the equivalent regions of the Drosophila hydei,

Drosophila pseudoobscura and Anopheles Pax6 genes is shown. The solid red

lines indicate an identity of .60%, black lines indicate an identity of 30%–60% and

the grey lines indicate an identity of 10%–30%. The black triangle indicates the

position of the ATG start codon and the vertical black bar with an asterisk indicates

the location of the highly conserved ‘E-box containing’ region that is reported here

(see main text).

TRENDS in Genetics

Mushroom bodiesVentral nerve cord

Eye

2 3

Drosophila hydei

Drosophila pseudopbscura

Anopheles

1 2 3

Figure 2. Regulatory sites in the mouse paired box gene 6 (Pax6) gene. The genomic map of the Pax6 gene is shown. Enhancer regions identified to date are shown as red

ovals and the regions in which each enhancer directs expression of a reporter construct is indicated above. (For more details see Refs [11,12,14,15]). Exons are shown as

vertical bars (light blue). The three promoter regions where the transcription of P0, P1 and Pa start are shown in the lower panel (an enlargement of this region is shown)

together with the exons that are present in each transcript and the sites at which these transcripts are expressed in the mouse embryo. The vertical black bar with an aster-

isk indicates the location of the highly conserved region that is reported here (see main text). The black triangle indicates the position of the ATG start codon. Abbreviations:

CNS, central nervous system; NTDP, non-terminally differentiated neurones.

TRENDS in Genetics

0234567 1234567 α567

Pancreas

LensCorneaLacrimalglandConjunctiva

DorsaltelencephalonHindbrainSpinal cord

Post mitoticNTDP

Amacrine cells, irisCiliary bodyNeural retinaPigmented layer of retina

Olfactory regionPretectumNeural retina

Transcript

Expression

∗0 2010 60 kb4030 500 1 2 34 5 6 7α

5 1510 30 kb2520

Lens placodeCorneal and conjunctivalepithelium

Lens placodeOptic vesicleCNSCorneal and conjunctivalepithelium

CNS

P0 P1 Pα

0 1 2 3 4 5 6 7α

Update TRENDS in Genetics Vol.20 No.7 July 2004284

www.sciencedirect.com

in the nervous system and eye, respectively (Figure 1)[9,10]. Both of these are found between the second andthird exons and share a significant degree of sequenceidentity with other Drosophila species and with theAnopheles mosquito. The extent of this sequence identitybetween D.melanogaster and Drosophila hydei is reflectedin the ability of the D. hydei enhancer sequence torecapitulate the Pax6 expression pattern in D. melano-gaster [10]. Both of these enhancers contain putative15–17-bp binding-site motifs for Pax6 and, indeed, theeye enhancer contains five of these sites that togetherdominate this region (Figure 1) [10]. These binding

sites are probably required to mediate the autoregula-tory control of Pax6, whereby it activates its owntranscription [10].

There is no statistically significant sequence identitybetween Drosophila and vertebrate species in the non-coding regions of Pax6. Despite this, the Drosophilaenhancers that are found between exons two and threecan recapitulate the major features of mouse Pax6expression during development when placed in an appro-priate reporter construct [11]. Similar to its Drosophilacounterpart, mousePax6 has been the subject of numerousstudies and the most significant findings of these aresummarized in Figure 2. To describe its regulation ascomplex would be something of an understatement. Inmice, Pax6 transcription can start from three sites that areassociated with the P0, P1 and Pa promoters [12,13].These are under the control of at least six differentenhancers, located upstream and downstream of eachpromoter. One of these enhancers directs Pax6 expressionin the pancreas [11,12], whereas another, located betweenP0 and P1, drives expression in the brain [12]. The otherfour enhancers regulate different aspects of Pax6expression in the developing eye [11,12,14,15].

Although the majority of studies on vertebrate Pax6regulation have focused on the mouse gene, some groupshave also studied the enhancer elements of other speciesincluding quail, Fugu rubripes (pufferfish) and Xenopus.In general, there seems to be a high degree of functionalconservation, for example, Fugu enhancer elements can

Figure 3. Alignment of the mouse paired box gene 6 (Pax6) genomic region with the Pax6 gene from other species. For the purposes of this alignment, the first exon of

each species was taken as a common reference point; the areas identified as functional enhancers in vertebrates are unlikely to correspond precisely to locations in non-

vertebrate genomes. The solid red lines indicate a sequence identity of .60%, black lines indicate an identity of 30%–60% and the grey lines indicate an identity of 10%–30%.

The blue circles represent putative Pax6 binding sites. The black triangle indicates the position of the ATG start codon. The black line marked by an asterisk shows the position

of the putative E-box type binding site.

TRENDS in Genetics

1kb

Human

Fugu rubripes

Zebra fish

Quail

Drosophila melanogaster

Anopheles

Caenorhabdites elegans

Ciona

1 2 3 4 5 6α∗0 7

Pancreas

LensCorneaLacrimalglandConjunctiva

DorsaltelencephalonHindbrainSpinal cord

Post mitoticNTDP

Amacrine cells, irisCiliary bodyNeural retinaPigmented layer of retina

Figure 4. Sequence identity of putative Pax6 binding sites compared with the

previously described Pax6 consensus binding site. The different shades of blue

indicate the percentage of sites that correspond to the consensus at any given

position.

TRENDS in Genetics

A--TTCACGC T A-T ATA GC GTCA

CT

Drosophila melanogaster

Mouse

Fugu rubripes

Ciona

Caenorhabditis elegans

Pax6 consensus

0 20 40 60 80 100%

Update TRENDS in Genetics Vol.20 No.7 July 2004 285

www.sciencedirect.com

substitute for their murine equivalents [12]. This iscoupled with a high level of sequence identity betweenthese enhancer regions and the putative enhancer regionsof different vertebrate species (Figure 3) [12]. As isapparent from Figure 3, there are comparatively longstretches of sequence identity of.30% between most of thevertebrate enhancers. There are no significant stretches ofsequence identity between the non-coding regions of thevertebrate Pax6 genes and those of the non-vertebratespecies (Figure 3), although it should be noted that it is notalways obvious how to align these genes given the differentpromoter and enhance locations in different species. Oneapproach is to align sequences by using the first exon as acommon point of reference (Figure 3). The second approachis to compare enhancers that are known to be functionallyanalogous between vertebrates and invertebrates. Butneither of these approaches revealed any significantsequence identity in this study. There are, however,several short sequences that closely match the Pax6consensus binding site that are present in all of thegenes studied, often in an equivalent position comparedwith the first exon (Figure 3). In addition, there areputative Pax6 binding sites in the fly (Figure 1) andvertebrate enhancers that drive expression in the nervoussystem and the eye (Figure 3). These sites mediate theauto-activation of Pax6 in Drosophila [10] and there issome evidence that they serve the same function invertebrates [16]. Comparing these putative Pax6 bindingsites with representatives of different phyla indicates thatthey are highly conserved (Figure 4), although it should benoted that experimental evidence for these sites beingfunctional is limited to just a few examples. The apparentconservation of these Pax6 binding sites might imply thatthe key conserved function of the Pax6 enhancers is toenable the gene to respond to its own product.

In addition to the putative Pax6 binding sites, there is asecond short area of sequence identity that is located in thesame orientation and has a similar position relative to thefirst exon in each of the species considered here (Figure 3).Intriguingly, this sequence contains an ‘E-box’ (Figure 5), aconsensus binding site for a large group of genes thatencode transcription factors that are known to beimportant in development. These genes are characterizedby their basic helix–loop–helix (bHLH) DNA-interactingdomain and include basic helix–loop–helix domaincontaining class B5 (BHLHB5), neurogenic factor 1

(Neurod1), Neurogenin 2 (Neurog2), hairy and enhancerof split 1 (Hes1), murine atonal homologue-2 (MATH-2)and inhibitor of DNA binding 1 (Id1), all of which areknown to have a key role in the development of the brainand sensory organs [17–22]. CLOCK, which encodesanother bHLH transcription factor that forms part of thecircadian loop, is also required for the correct specificationof the forebrain and eyes [23]. Of these genes, Neurog2 andNeurod1 are known to activate Pax6 expression [18,24],although the E-box binding site described for Neurod1 isnot the same as that identified here and does not seem to beconserved between species. Another E-box binding gene,BHLHB5, has been shown to repress transcription fromthe human Pax6 promoter in vitro [22].

Concluding remarks

What, if any, is the significance of these isolated sequenceidentities? Given the lack of any significant stretches ofsequence identity between the non-coding regions of thedifferent Pax6 genes, it is tempting to speculate that theseisolated transcription factor binding sites actually repre-sent a common ancestral control mechanism. Subse-quently, more specific ‘fine-tuning’ mechanisms havegradually been imposed. The extent to which these aretrue will no doubt become clear with our increasingunderstanding of Pax6 regulation.

AcknowledgementsThe author is grateful to St George’s Hospital Medical School and TheRoyal Society for their support, and to A. Morgan for her careful proofreading of the manuscript.

References

1 Simpson, T.I. and Price, D.J. (2002) Pax6; a pleiotropic player indevelopment. BioEssays 24, 1041–1051

2 Hill, R.E. et al. (1991) Mouse small eye results from mutations in apaired-like homeobox-containing gene. Nature 354, 522–525

3 Ton, C.C. et al. (1991) Positional cloning and characterization of apaired box- and homeobox-containing gene from the aniridia region.Cell 67, 1059–1074

4 Quiring, R. et al. (1994) Homology of the eyeless gene of Drosophila tothe small eye gene in mice and Aniridia in humans. Science 265,785–789

5 Onuma, Y. et al. (2002) Conservation of Pax 6 function and upstreamactivation by Notch signaling in eye development of frogs and flies.Proc. Natl. Acad. Sci. U. S. A. 99, 2020–2025

6 Chamberlin, H.M. and Sternberg, P.W. (1995) Mutations in theCaenorhabditis elegans gene vab-3 reveal distinct roles in fatespecification and unequal cytokinesis in an asymmetric cell division.Dev. Biol. 170, 679–689

Figure 5. Alignment of the genomic region 50 to the first exon (as shown in Figure 3). The grey boxed area indicates the putative E-box consensus sequence within the

aligned sequences. ‘Distance 50 to exon 10, is the distance in base pairs that the first ‘C’ in the ‘CACGTG’ core is from the first base pair of exon 1, as defined in Figures 1–3.

TRENDS in Genetics

TCAAA--------CAAATGG--CACGTGGGGAG 877

TCAATATTT-TTCCACA-----CACGTGACGCG 1011

TCAACGGTGTTGCCACATGG--CACGTGGCACA 1701

TCAATGATT----CACTTGAAACACGTGACTGG 2051TCAATGATT----CACATGAAACACGTGACATT 1312

TCAA---------CAAATGG--CACGTGGGAGA 908

TCAACGCTGTTCCCACATGG--CACGTGGCACA 1741

TCAA---------CAAATGG--CACGTGGGAGA 837

TCAAA--------CAAATGG--CACGTGGGGAG 838

Species

XenopusZebrafishMouseHumanCionaDrosophila melanogasterDrosophila pseudopbscuraCaenorhabditis elegansCaenorhabditis briggsae

Accession

Z49937

AY265960

AF098640U63833

AY048575

Distance 5′ to exon 1Conserved sequence region

Update TRENDS in Genetics Vol.20 No.7 July 2004286

www.sciencedirect.com

7 Chisholm, A.D. and Horvitz, H.R. (1995) Patterning of the Caenor-habditis elegans head region by the Pax-6 family member vab-3.Nature 377, 52–55

8 Zhang, Y. and Emmons, S.W. (1995) Specification of sense-organ identityby a Caenorhabditis elegans Pax-6 homologue. Nature 377, 55–59

9 Adachi, Y. et al. (2003) Conserved cis-regulatory modules mediatecomplex neural expression patterns of the eyeless gene in theDrosophila brain. Mech. Dev. 120, 1113–1126

10 Hauck, B. et al. (1999) Functional analysis of an eye specific enhancerof the eyeless gene in Drosophila. Proc. Natl. Acad. Sci. U. S. A. 96,564–569

11 Xu, P.X. et al. (1999) Regulation of Pax6 expression is conservedbetween mice and flies. Development 126, 383–395

12 Kammandel, B. et al. (1999) Distinct cis-essential modules direct thetime-space pattern of the Pax6 gene activity. Dev. Biol. 205, 79–97

13 Anderson, T.R. et al. (2002) Differential Pax6 promoter activity andtranscript expression during forebrain development. Mech. Dev. 114,171–175

14 Griffin, C. et al. (2002) New 30 elements control Pax6 expression in thedeveloping pretectum, neural retina and olfactory region. Mech. Dev.112, 89–100

15 Williams, S.C. (1998) A highly conserved lens transcriptional controlelement from the Pax-6 gene. Mech. Dev. 73, 225–229

16 Chow, R.L. et al. (1999) Pax6 induces ectopic eyes in a vertebrate.Development 126, 4213–4222

17 Cai, L. et al. (2000) Misexpression of basic helix–loop–helix genes inthe murine cerebral cortex affects cell fate choices and neuronalsurvival. Development 127, 3021–3030

18 Scardigli, R. et al. (2001) Crossregulation between Neurogenin2 andpathways specifying neuronal identity in the spinal cord. Neuron 31,203–217

19 Kageyama, R. et al. (1997) bHLH transcription factors andmammalian neuronal differentiation. Int. J. Biochem. Cell Biol.29, 1389–1399

20 Kageyama, R. et al. (2000) The bHLH gene Hes1 regulates differen-tiation of multiple cell types. Mol. Cells 10, 1–7

21 Ross, S.E. et al. (2003) Basic helix–loop–helix factors in corticaldevelopment. Neuron 39, 13–25

22 Xu, Z.P. et al. (2002) Functional and structural characterization of thehuman gene BHLHB5, encoding a basic helix–loop–helix transcrip-tion factor. Genomics 80, 311–318

23 Morgan, R. (2002) The circadian gene clock is required for the correctearly expression of the head specific gene Otx2. Int. J. Dev. Biol. 46,999–1004

24 Marsich, E. et al. (2003) The PAX6 gene is activated by the basic helix–loop–helix transcription factor NeuroD/BETA2. Biochem. J. 376,707–715

0168-9525/$ - see front matter q 2004 Elsevier Ltd. All rights reserved.doi:10.1016/j.tig.2004.04.009

A common framework for understanding the origin ofgenetic dominance and evolutionary fates of geneduplications

Fyodor A. Kondrashov1 and Eugene V. Koonin2

1Section of Evolution and Ecology, University of California at Davis, Davis, CA 95616, USA2National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20892, USA

The dominance of wild-type alleles and the concomi-

tant recessivity of deleterious mutant alleles might

have evolved by natural selection or could be a by-

product of the molecular and physiological mechanisms

of gene action. We compared the properties of human

haplosufficient genes, whose wild-type alleles are

dominant over loss-of-function alleles, with haplo-

insufficient (recessive wild-type) genes, which produce

an abnormal phenotype when heterozygous for a loss-

of-function allele. The fraction of haplosufficient genes

is the highest among the genes that encode enzymes,

which is best compatible with the physiological theory.

Haploinsufficient genes, on average, have more para-

logs than haplosufficient genes, supporting the idea

that gene dosage could be important for the initial fix-

ation of duplications. Thus, haplo(in)sufficiency of a

gene and its propensity for duplication might have a

common evolutionary basis.

The contributions of the individual alleles to the phenotypeare often non-additive. Ever since Mendel’s experiments,

it has been recognized that, at many loci, wild-type allelesare dominant and mutant alleles are recessive. Fisherargued that dominance of wild-type alleles evolved bynatural selection because dominance shields heterozygousorganisms from the adverse effects of deleterious alleles[1,2]. This concept has been criticized by Wright [3] whonoted that selection favoring modifiers of dominancewould be weak and unable to overcome genetic drift.Wright suggested that dominance of wild-type alleles isnot an adaptation but rather a by-product of the ‘physiol-ogy of the organism’ and that, in dosage-sensitive genes,wild-type alleles should be recessive.

A key prediction of Fisher’s theory, which states thatdominance of wild-type alleles should be rare in a haploidorganism artificially induced to be diploid, failed to cometrue [4]. By contrast, the physiological theory of dominancewas supported by a theoretical analysis of metabolicpathways that showed that dominance of wild-type allelesof enzymes could be a simple consequence of flux functions[5]. Consequently, the physiological theory [3,5] is cur-rently the preferred explanation for dominance. Accordingto this theory, if a gene is part of a multi-step pathway, thephenotype associated with mutation of this gene should beinsensitive to the gene dosage [3]. Enzymes are thought to

Corresponding author: Fyodor A. Kondrashov ([email protected]).

Available online 19 May 2004

Update TRENDS in Genetics Vol.20 No.7 July 2004 287

www.sciencedirect.com