Phylogeny and Molecular Evolution of the Voltage-Gated Sodium Channel Gene scn4aa in the Electric Fish Genus
Gymnotus
by
Dawn Dong-yi Xiao
A thesis submitted in conformity with the requirements for the degree of Masters of Science
Cell and Systems Biology University of Toronto
© Copyright by Dawn Dong-yi Xiao 2014
Phylogeny and Molecular Evolution of the Voltage-Gated Sodium
Channel Gene scn4aa in the Electric Fish Genus Gymnotus
Dawn Dong-yi Xiao
Masters of Science
Cell and Systems Biology
University of Toronto
2014
Abstract
Analyses of the evolution and function of voltage-gated sodium channel proteins (Navs) have
largely been limited to mutations from individual people with diagnosed neuromuscular disease.
This project investigates the carboxyl-terminus of the Nav paralog (locus scn4aa 3’) that is
preferentially expressed in electric organs of Neotropical weakly-electric fishes (Order
Gymnotiformes). As a model system, I used the genus Gymnotus, a diverse clade of fishes that
produce species-specific electric organ discharges (EODs). I clarified evolutionary relationships
among Gymnotus species using mitochondrial (cytochrome b and 16S ribosomal RNA) and nuclear
(rag2 and scn4aa) gene sequences (3739 nucleotide positions from 28 Gymnotus species). I
analyzed the molecular evolution of scn4aa 3’, and detected evidence for positive selection at
eight amino acid sites in seven Gymnotus lineages. These eight amino acid sites are located in
motifs that may be important for modulation of EOD frequencies.
Acknowledgments
This project would not have been possible were it not for my supervisor, Dr. Nathan
Lovejoy, who provided me with the opportunity to work on this project and gave me the
freedom to take initiative. I am thankful for the support of my supervisory committee members,
Dr. Asher Cutter, and Dr. Mark Fitzpatrick. I am also indebted to Ian Buglass, for sharing a
positive outlook and encouragement.
I am grateful for the role that several people played in enhancing the content of my thesis,
and the role my supervisor played in facilitating these opportunities. Thanks to Hermina Ghenu,
for taking me through my first RNA extraction and cDNA amplification. I might still have your
“1 free PCR” coupon among my lab notes somewhere! Thanks to Dr. Belinda Chang, for
introducing me to the world of molecular evolution. Thanks to Mu-Quing Huang, not only for
providing those gene sequences that I obtained from lab records, but more importantly, for
providing additional perspectives on data formatting during the time we worked together.
Special thanks to Dr. Ari Chow, who not only shared tips on primer design, but inspired
me to cultivate perseverance and uphold scientific integrity. Special thanks to Dr. Shelley Brunt,
who not only provided me prompt advice on high throughput PCR techniques, but helped instill
critical thinking skills in me and countless other students. I also wish to thank my family,
friends, and colleagues for their continued support, encouragement, and especially for sharing
advice from their graduate school experiences.
This project was funded by grants awarded to me from Sigma Xi, The Scientific
Research Society (Grant-in-Aid of Research, in spring 2009) and the Society of Systematic
Biologists (Graduate Student Research Award, in summer 2009). Thank you for taking a chance
on me! Funding was also provided through an NSERC discovery grant to Dr. Nathan Lovejoy,
and various grants & TA-ships from the University of Toronto.
Table of Contents
Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
List of Appendices
Chapter 1: Introduction
1.1 Overview
1.2 Clades of Electric Fish
1.3 Phylogeny, Biogeography, and Morphology of Gymnotiformes
1.4 Phylogeny, Biogeography, and Morphology of Gymnotus
1.5 Evolutionary Adaptations of Electric Organ Discharges in Neotropical American Knifefishes
1.6 Anatomy and Neuronal Control of Electric Organs
1.7 Cellular Features of Electrocytes and Molecular Basis of Membrane Excitability
1.8 Genetic Evolution and Protein Expression of Voltage-Gated Sodium Channels
1.9 Molecular Features and Mechanisms of Voltage-Gated Sodium Channels
1.10 Significance and Objectives
Chapter 2: Materials and Methods
2.1 Taxon Sampling
2.2 Locus Selection
2.3 Primer Design
2.3.1 Amplification Primers for scn4aa 3’
2.3.2 Sequencing Primers
2.4 DNA and RNA Extraction
2.5 Nucleotide Amplification and Sequencing
2.6 Nucleotide Sequence Verification and Alignment
2.7 Phylogenetic Reconstruction
2.8 Molecular Evolution Analyses
Chapter 3: Results
3.1 Differences Between DNA and cDNA Sequences for the scn4aa 3’
3.2 Nucleotide Sequence Data
3.3 Phylogenetic Reconstruction
3.4 Patterns of Gymnotus scn4aa C-terminus Nucleotide Sequence Variation
3.5 Positively Selected Sites on the Gymnotus Nav1.4a C-terminus Amino Acid Alignment
Chapter 4: Discussion
4.1 Evolutionary Relationships Among Gymnotus
4.2 Utility of the scn4aa 3’ for Phylogenetic Reconstruction
4.3 Natural Selection at the Nav1.4a C-terminus Among Gymnotus Lineages
4.4 Natural Selection at Specific Sites of the Nav1.4a C-terminus Among Gymnotus
4.5 Summary and Future Directions
References
List of Tables
Table 1. Primer Sequences
Table 2. cDNA Sequences Used for scn4aa 3’ Primer Design
Table 3. Specimens and Nucleotide Sequences Used for Gymnotus Analysis
Table 4. Models of Evolution Analyzed for the Gymnotus Nav1.4a C-terminus
Table 5. Patterns of Gymnotus Nav1.4a C-terminus Nucleotide Sequence Variation
Table 6. Nav1.4a C-terminus ω ratios for Gymnotus from the branch-site model A
Table 7. Amino Acid Alignment of Nav1.4a C-terminus Showing Positively Selected Sites Relative to Motifs of Functional Significance
Table 8. Amino Acid Identities of Positively Selected Sites on the Nav1.4a C-terminus for Various Gymnotus Species
List of Figures
Figure 1. Evolutionary Relationships Among Electrogenic Fishes and their Voltage-Gated Sodium Channel Paralogs
Figure 2. Published Phylogenies of Gymnotiformes
Figure 3. Published Phylogenies of Gymnotus
Figure 4. Examples of Electric Organ Discharges from Gymnotiformes
Figure 5. Schematic of Voltage-Gated Sodium Channel Motifs
Figure 6. Molecular Phylogeny of Gymnotus Based on Various Alignments Using Maximum Parsimony
Figure 7. Molecular Phylogeny of Gymnotus Based on Various Alignments Using Bayesian Inference
Figure 8. Molecular Phylogeny of Gymnotus Based on the Total Evidence Alignment
Figure 9. Molecular Phylogeny of Gymnotus and Positively Selected Lineages
List of Appendices
Appendix A.0: Phylogeny and Molecular Evolution of the Voltage-Gated Sodium Channel Gene scn4aa in the Electric Fish Order Gymnotiformes
A.0.1 Abstract
Appendix A.1: Introduction
A.1.1 Significance and Objectives
Appendix A.2: Materials and Methods
A.2.1 Taxon Sampling
A.2.2 Locus and Primer Selection
Appendix A Table 1. Primer Sequences
A.2.3 DNA Extraction, Nucleotide Amplification, and Sequencing
A.2.4 Nucleotide Sequence Verification and Alignment
A.2.5 Phylogenetic Reconstruction
Appendix A Table 2. Specimens and Nucleotide Sequences Used for Gymnotiformes Analysis
Appendix A.3: Results
A.3.1 Nucleotide Sequence Data
A.3.2 Phylogenetic Reconstruction
Appendix A Figure 1. Molecular Phylogeny of Gymnotiformes Based on the Cytb Nucleotide Alignment Using Maximum Parsimony
Appendix A Figure 2. Molecular Phylogeny of Gymnotiformes Based on the Rag2 Nucleotide Alignment Using Maximum Parsimony
Appendix A Figure 3. Molecular Phylogeny of Gymnotiformes Based on the scn4aa 3’ Nucleotide Alignment Using Maximum Parsimony
Appendix A Figure 4. Molecular Phylogeny of Gymnotiformes Based on the Total Evidence Alignment
A.3.3 Variation in the Nav1.4a C-terminus
Appendix A.4: Discussion
A.4.1 Gymnotiform Phylogeny
A.4.2 Utility of the scn4aa 3’ for Phylogenetic Reconstruction
A.4.3 Variation at the Nav1.4a C-terminus
A.4.4 Summary and Future Directions
Appendix A.5: References
Chapter 1 Introduction
1.1 Overview
Fishes are among the most diverse of vertebrates. Among the 60,000 described species of
vertebrates, half are fishes (Froese and Pauly 2012). Many clades of fishes are able to detect
electric fields in the water (electroreception). Some of them are also able to produce electric
fields (electrogenesis; Moller 1995; Maddison and Schulz 2007; Alves-Gomes 2001).
Electroreception can be used by aquatic organisms to detect electrical fields that are
produced as a byproduct of the muscle movement of predators and prey (Bedore and Kajiura
2013). It may also be used in conjunction with the production of weak electric discharges (< 10
V) for electrolocation and communication with conspecifics. Strong electric discharges (up to
600 V) may be used to stun prey (Crampton and Albert 2006). In many electrogenic clades,
patterns of electric discharges are species-specific, and vary based on variations in the
environment, anatomy, and molecular features.
Electrogenic fishes, especially Electrophorus electricus, are a classical model system for
studying the highly conserved mechanisms of membrane excitability (Gotter et al. 1998; Keesey
2005; Albert et al. 2008). The electrogenic cells (electrocytes) in these fishes do not need to
serve additional functions such as contraction or secretion, unlike other electrically excitable
cells (myocytes, neuroendocrine cells, etc). However, they do share key features with other
electrically excitable cells.
Voltage-gated sodium channels (Navs) are one of the main proteins supporting action
potentials. When they were discovered, the first homolog to be sequenced was from E. electricus
(Agnew 1984; Catterall 1984; Noda et al. 1984). Navs are the targets of various naturally
occurring toxins and synthetic drugs (Catterall et al. 2005). Navs are associated with mutations
that cause several human skeletal, cardiac, and neuronal diseases, which severely impact the
quality of life (Lehmann-Horn and Jurkat-Rott 1999). An estimated 1 in 3500 people worldwide
will be affected by a neuromuscular disorder at some point in their lives (Emery 1991).
Conserved genes involved in electrogenicity, such as the Navs, may contribute towards
reconstruction of phylogenetic relationships among electrogenic fishes. These fishes are a natural
source of variations in electric signals. Analyses of the evolutionary history of these molecules in
electrogenic fishes may contribute towards further understanding of molecular mechanisms for
membrane excitability (Zakon et al. 2006; Arnegard et al. 2010). These analyses may also
contribute towards understanding the evolutionary history of electrogenicity in these fishes.
1.2 Clades of Electrogenic Fish
Some clades of electrogenic fishes inhabit saltwater (Figure 1). The weakly electrogenic skates
(genus Raja) and strongly electrogenic electric rays (order Torpediniformes) belong to the class
of cartilaginous fishes (class Chondrichthyes). The strongly electrogenic stargazers (family
Uranoscopidae) belong to the class of ray-finned fishes (class Actinopterygii).
Several clades of electrogenic fishes inhabit freshwater (Figure 1), all of which belong to
the class of ray-finned fishes (class Actinopterygii). The weakly electrogenic African knifefishes
(Gymnarchus niloticus) and weakly electrogenic African elephantfishes (family Mormyridae)
belong to the order of bony-tongued fishes (order Osteoglossiformes). The weakly electrogenic
African catfishes (genera Auchenoglanis, Clarias, and Synodontis) and strongly electrogenic
African catfishes (genus Malapterurus) belong to the superorder of Ostariophysi fishes. The
weakly electrogenic Neotropical American knifefishes (order Gymnotiformes), and strongly
electrogenic South American electric eel (E. electricus in order Gymnotiformes) also belong to
the superorder of Ostariophysi fishes.
Figure 1. Evolutionary Relationships Among Electrogenic Fishes and their Voltage-Gated Sodium Channel Genes
Phylogenetic topology of Gnathostomata (jawed vertebrates), illustrating evolutionary relationships among electrogenic fishes and other
key clades (Moller 1995; Maddison and Schulz 2007). Branch length is not to scale. Electrogenic fishes are coloured as follows: saltwater
(green); freshwater (blue). Voltage-gated sodium channel genes associated with major clades (Gnathostomata, Teleostei, and Tetrapoda) are
listed in boxes (Goldin et al. 2000; Widmark et al. 2011).
1.3 Phylogeny, Biogeography, and Morphology of Gymnotiformes
Among the clades of electrogenic fish, Gymnotiformes was selected as the focus of this project,
for the following reasons: 1) the order Gymnotiformes is one of the most diverse clades of
electrogenic fish, with approximately 200 described species (Froese and Pauly 2012); 2)
phylogenetic analyses of Gymnotiformes are relatively well developed (Alves-Gomes 1999); 3)
Gymnotiformes are phylogenetically close to a model species for which the genome has been
sequenced (the zebrafish Danio rerio); 4) a classic model species for electrogenic properties is a
gymnotiform (the electric eel E. electricus; Keesey 2005; Albert et al. 2008); and 5) there is
active research on variations in electric field pattern among gymnotiform species, and
mechanisms of their electric signal production (Crampton et al. 2011).
Within the superorder Ostariophysi, the most basal extant order is the monophyletic,
saltwater-inhabiting Gonorhynchiformes (Figure 1). The other extant orders are monophyletic,
and represent over 90% of the earth's freshwater fishes (Saitoh et al. 2003). These include the
Characiformes (includes tetras and piranhas), Cypriniformes (includes minnows, such as D.
rerio), Gymnotiformes (electrogenic fishes, such as E. electricus), and Siluriformes (catfishes).
Based on morphological data, Siluriformes is the sister order of Gymnotiformes (Fink and Fink
1981). However, based on nucleotide data, Siluriformes is the sister order of Gymnotiformes and
Characiformes (Saitoh et al. 2003).
Within the order Gymnotiformes, there are two main pairs of sister families (Figure 2;
Froese and Pauly 2012): Electrophoridae (1 described species – E. electricus) and Gymnotidae
(38 described species); and Hypopomidae (25 described species) and Rhamphichthyidae (16
described species). Other families include Apteronotidae (85 described species) and
Sternopygidae (30 described species). It is not clear which family is the most basal (Figure 2).
Gymnotiforms are adapted to the lowland freshwaters of the Neotropics (Central and
South America), with wide geographical distributions (Crampton and Albert 2006). They occur
in various zones of the water column (benthic to epipelagic). Their ecological habitats (forest
streams, floodplains, deep fast-flowing rivers) are often turbid (Lissman 1958).
Panel: based on morphological data (Ellis 1913)
Panel: based on morphological data (Triques 1993; Gayet et al. 1994)
Panel: based on morphological data (Mago-Leccia 1994)
Panel: based on strict consensus from mitochondrial nucleotide data (Alves-Gomes 1995)
Panel: based on morphological data (Albert 2001)
Figure 2. Published phylogenies for Gymnotiformes
Phylogenetic topologies for Gymnotiformes based on morphological and nucleotide data from published sources. The families are
coloured as follows: Apteronotidae (light blue); Electrophoridae (dark blue); Gymnotidae (violet); Hypopomidae (red);
Rhamphichthyidae (yellow); Sternopygidae (green).
Gymnotiformes generally have subcutaneous eyes (often with poor sight), short bodies,
and lengthy tapering tails (Albert and Lundberg 1995). They have no pelvic, dorsal, or adipose
fins. However, they do have lengthy anal fins that undulate for locomotion, while their tail stays
rigid to facilitate their electroreceptive and electrogenic capabilities.
1.4 Phylogeny, Biogeography, and Morphology of Gymnotus
Among the families of Gymnotiformes, Gymnotidae was selected as the focus of this project, for
the following reasons: 1) among the families with myogenic electric organs in adulthood,
Gymnotidae is the most diverse; 2) phylogenetic studies of Gymnotidae based on
morphology and nucleotide sequences exist for comparison (Albert et al. 2005; Lovejoy et al.
2010); and 3) there is active research on variations in electric field pattern among gymnotid
species, and mechanisms of their electric signal production (Crampton et al. 2011).
In some classifications, the family Gymnotidae only contains the genus Gymnotus. In
other classifications, the family Gymnotidae also includes the sole species from the sister family
Electrophoridae (E. electricus). For clarity, this project will focus on the genus Gymnotus within
family Gymnotidae.
Within the genus Gymnotus, the Gymnotus carapo group is a diverse monophyletic clade
(Figure 3). Phylogenetic topology among the remaining Gymnotus species based on nucleotide
data differs from that based on morphological data.
Gymnotus fishes are the most geographically widespread of gymnotiforms (Albert et al.
2005). Their range extends from as far north as southeastern Chiapas, Mexico (18° N), to as far
south as Rio Salado in the Pampas plains of Argentina (36° S). They occur in all the major river
systems of the Neotropics except for the estuarine Maracaibo Basin.
Gymnotus fishes are sometimes referred to as banded knifefishes, since many of the
species have obliquely arranged dark and light coloured bands along their body (Albert et al.
2005). Gymnotus fishes have superior mouths with protruding lower jaws, and gapes that are
large for gymnotiforms (Albert and Lundberg 1995; Albert et al. 2005).
Panel: based on morphological data (Albert et al. 2005)
Panel: based on strict consensus of mitochondrial and nuclear nucleotide data (after Figure 4 from Lovejoy et al. 2010)
Figure 3. Published Phylogenies for Gymnotus
Phylogenetic topologies for Gymnotus based on morphological and nucleotide data were obtained from published sources. The clades
are coloured as follows: G. carapo group (green); G. pantherinus group (violet); G. cylindricus group (red); G2 group (light blue); and
G1 group (dark blue).
1.5 Evolutionary Adaptations of Electric Organ Discharges in Neotropical Knifefishes
Gymnotiformes produce electric discharges using myogenic electric organs (EOs) derived from
hypaxial muscles (Albert and Lundberg 1995; Zakon and Unguez 1999; Crampton and Albert
2006). In most genera (within the families Electrophoridae, Hypopomidae, Rhamphichthyidae,
and Sternopygidae), there are species with additional accessory electric organs. In one family
(Apteronotidae), myogenic electric organs are replaced by neurogenic ones (derived from motor
neurons) during the first two months of life.
There are interspecific variations in electric organ discharge (EOD) frequencies and
waveform complexities (Crampton and Albert 2006; Figure 4). Species-specific EODs may be
produced in short pulses with frequencies up to 150 Hz (as short as ~ 7 ms between pulses in
families Electrophoridae, Gymnotidae, Rhamphichthyidae, and Hypopomidae) or continuous
waves with frequencies up to 2500 Hz (families Apteronotidae and Sternopygidae). In
Gymnotus, EODs are typically produced in pulses lasting 1-3 ms each, with frequencies of 15-70
Hz (equivalent to ~ 14-67 ms between pulses). The number of phases within each pulse is also
species-specific, with 3-4 being the most common (Crampton and Albert 2006). Other examples
of EOD variations include the low frequency (~ 10 Hz) monophasic pulses of E. electricus, low
frequency (~ 10-100 Hz) multiphasic pulses among Brachyhypopomus, low frequency (~ 30-150
Hz) monophasic waves among Sternopygus, and high frequency (900-1100 Hz) multiphasic
waves among Apteronotus.
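The correspondence between EOD frequency and the interval between pulses quoted above follows from the reciprocal relation between frequency and period. A minimal sketch of that conversion is given below. The function name is illustrative only and is not part of any analysis used in this project, and the interval is taken here as the time from the start of one pulse to the start of the next.

```python
def inter_pulse_interval_ms(frequency_hz: float) -> float:
    """Return the start-to-start interval between pulses, in milliseconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    # Period (s) = 1 / frequency (Hz); multiply by 1000 to express it in ms.
    return 1000.0 / frequency_hz


# Frequencies quoted in the text for pulse-type gymnotiforms and for Gymnotus.
examples = [
    ("pulse-type upper bound", 150.0),
    ("Gymnotus upper bound", 70.0),
    ("Gymnotus lower bound", 15.0),
]

for label, hz in examples:
    print(f"{label}: {hz:.0f} Hz -> {inter_pulse_interval_ms(hz):.1f} ms")
```

Under this convention, 150 Hz corresponds to about 6.7 ms, and the 15-70 Hz range of Gymnotus corresponds to about 66.7-14.3 ms, in agreement with the rounded values given above.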
Gymnotiforms are electroreceptive using two types of morphologically distinct
electroreceptors (Bullock 1982; Alves-Gomes 2001): ampullary electroreceptors for low-
frequency direct current (DC) signals (0.1 – 50 Hz); and tuberous electroreceptors for high-
frequency alternating current (AC) signals (50-2000 Hz). They are also electrogenic using a
variety of species-specific EOD patterns.
Figure 4. Examples of Electric Organ Discharges from Gymnotiformes
Traces of electric organ discharges scaled to the same peak-to-peak amplitude and plotted head-positive-up on the same time scale;
dotted line represents 0 voltage baseline (from Arnegard et al. 2010).
Gymnotiform ampullary electroreceptors are used for passive electrolocation. The
electroreceptors are likely tuned to the inadvertent electric signals from their prey's movements
(Collin and Whitehead 2004). Gymnotiforms are nocturnally active predators of aquatic
invertebrates (Winemiller and Adite 1997; Crampton and Albert 2006). Some gymnotiforms also
feed on terrestrial arthropods, shrimp, and small fish. Gymnotus fishes are also aggressive
predators of fishes and other aquatic animals (Albert et al. 2005).
Gymnotiform tuberous electroreceptors are used for active electrolocation. The
electroreceptors are tuned to the self-generated EODs, and interpret disturbances of the electric
field to navigate their habitat (Hopkins 1988).
Abiotic environmental conditions seem to constrain and correlate with certain EOD
characteristics (Stoddard 2002). Capacitive elements such as dense vegetation attenuate lower
frequencies, and may favour higher frequency EODs (von der Emde 1990). Energy constraints
limit the anatomy of the electric organs in terms of the number of columns of electrocytes and
numbers of electrocytes per column, for optimal impedance-matching – species that inhabit
waters with higher conductivity tend to have EOs with more columns with fewer electrocytes
each (Hopkins 1999). Energy constraints in low oxygen habitats may have favoured pulse-type
EODs and other adaptations, such as aerial respiration among Gymnotus fishes (Crampton 1998;
Crampton and Albert 2006). Temperature fluctuations may also have favoured pulse-type EODs,
while fast-flowing water may have favoured wave-type EODs (Stoddard 2002).
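The impedance-matching argument can be illustrated with a simple circuit sketch. This is an idealized model assumed here for illustration, not one taken from Hopkins (1999): each electrocyte is treated as a voltage source with a fixed internal resistance, electrocytes within a column add in series, columns add in parallel, and the surrounding water is a single load resistance. All parameter values and function names are hypothetical.

```python
def load_power(columns: int, cells_per_column: int, load_ohm: float,
               cell_volt: float = 0.1, cell_ohm: float = 1.0) -> float:
    """Power delivered to the water (load) by an idealized electric organ."""
    # Series electrocytes add their voltages and their resistances.
    source_volt = cells_per_column * cell_volt
    # Parallel columns divide the total internal resistance.
    source_ohm = cells_per_column * cell_ohm / columns
    current = source_volt / (source_ohm + load_ohm)
    return current ** 2 * load_ohm


def best_layout(total_cells: int, load_ohm: float) -> tuple[int, int]:
    """Return (columns, cells per column) that maximizes power into the load."""
    # Consider only layouts that use every electrocyte in equal-length columns.
    layouts = [(c, total_cells // c)
               for c in range(1, total_cells + 1) if total_cells % c == 0]
    return max(layouts, key=lambda lay: load_power(lay[0], lay[1], load_ohm))


# Hypothetical budget of 1024 electrocytes in two kinds of water.
high_conductivity = best_layout(1024, load_ohm=1.0)    # low load resistance
low_conductivity = best_layout(1024, load_ohm=64.0)    # high load resistance
print(high_conductivity, low_conductivity)
```

Under these assumptions, a low load resistance (high-conductivity water) favours 32 columns of 32 electrocytes, whereas a load resistance 64 times higher favours 4 columns of 256. In both cases the optimum is the layout whose internal resistance equals the load resistance, which is the direction of the trend described above.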
Biotic evolutionary pressures also correlate with certain EOD characteristics.
Gymnotiforms coexist in polyphyletic species assemblages with piscivorous siluriforms and
other gymnotiforms (Crampton and Albert 2006). Gymnotiforms' predators include siluriforms
and Potamotrygonidae (river stingray family within order Rajiformes) with ampullary
electroreceptors, as well as some other gymnotiforms such as E. electricus (Szabo et al. 1972;
Szamier and Bennett 1980; Lovejoy 1996; Stoddard 1999; Alves-Gomes 2001; Stoddard 2002).
Siluriforms generally do not have tuberous electroreceptors, with the possible exception of the
family Cetopsidae (Alves-Gomes 2001). Predation avoidance may have favoured higher
frequency, lower magnitude, and more complex EOD waveforms. It may have favoured lack of
DC content among wave-type EODs, and existence of occasional silence among pulse-type
EODs (Stoddard 1999; Alves-Gomes 2001; Stoddard 2002). It may also have favoured
androgen-induced handicaps in males through sexual selection (Hopkins 1988; Hopkins et al.
1990; Stoddard 1999; Stoddard 2003; Stoddard 2006; Zahavi 2003).
1.6 Anatomy and Neuronal Control of Electric Organs
Electric organs are typically located immediately above and along the anal fin musculature.
Within electric organs, tubes of connective tissue are arranged one above the other in the dorsal-
ventral plane. Within these tubes, electrocytes are arranged midway within stacked
compartments divided by connective tissue septa (Bennett and Grundfest 1959). Variation in the
number, size, and shape of electrocytes is associated with variation in electric organ
discharge (EOD) amplitude (Caputi 1999).
A lattice hierarchy of neurons innervates electrocytes (Lorenzo et al. 1993; Caputi 1999).
The EOD frequency is synchronized by pacemaker cells in the medulla, providing input to a
group of 70-90 relay neurons at the ventral surface of the medulla. Relay neuron processes
extend along the bulbospinal tract to provide input to electromotor neurons, which provide input
to electrocytes. In Gymnotidae, axons of the relay neurons vary in length and conduction
velocity. Relay neurons with slower fibres primarily project onto rostral electromotor neurons,
while those with faster fibres primarily project onto caudal ones.
In monophasic fish, electromotor neurons only innervate the rostral or caudal face. In
multiphasic fish, there are two morphologically distinct types of electromotor neurons. Small
(25-40 μm) round neurons with fine dendrites lacking spines innervate the rostral face of rostral
electrocytes. Large (45-60 μm) oval neurons with thick dendrites up to 200 μm long innervate
the caudal face of caudal electrocytes. Both small and large electromotor neurons innervate
electrocytes in the mid-section of the electric organ, on the electrocytes' rostral and caudal faces,
respectively. The earlier portions of the EOD waveform are produced by smaller neurons
recruiting a small number of electrocytes, according to Henneman's size principle (Henneman
1957). Variation in innervation patterns of electrocytes is associated with variations in the EOD
amplitude and waveform.
1.7 Cellular Features of Electrocytes and Molecular Basis of Membrane Excitability
Electrocytes are multinucleated cells with similar cellular and molecular features to myocytes in
striated muscle (Machado et al. 1976, Machado et al. 1980; Yablonka-Reuveni 2011). In
Electrophorus electricus, both the innervated plasma membrane and the opposing non-innervated
membrane are undulated, the latter more so. This provides an increased surface area
on which the abundant macromolecules associated with electroexcitability are anchored. The
majority of organelles and glycogen granules are located near these undulating plasma
membranes to support and provide energy for electric organ discharge (EOD) production (Gotter
et al. 1998; Machado et al. 1976; Williamson et al. 1967). Binding sites for calcium (a
ubiquitous signalling molecule) are also located near these undulating plasma membranes (de
Arujo Jorge et al. 1979). A loose filamentous network consisting mostly of microtubules, actin,
and desmin (which is characteristic of myocytes) maintains the cell morphology and
macromolecule localization (Benchimol et al. 1978; Gotter et al. 1998; Mermelstein et al. 2000).
The cells' resting potential is mainly due to K+ ions (Lester 1978), though contribution
from Cl- has not been ruled out (Ferrari and Zakon 1993). The potential of approximately -85
mV across each face (Keynes and Martins-Ferreira 1953) is within 10-15 mV of that in neurons
and myocytes (Hopkins 2006), and is similarly maintained by an abundance of Na+/K+ ATPase
ion pumps moving 3 Na+ out for every 2 K+ in (Morth et al. 2011). These ion pumps are also
concentrated at the undulating plasma membranes, especially the non-innervated membrane
(Solmó et al. 1977; Ariyasu et al. 1987).
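The dependence of the resting potential on K+ can be illustrated with the Nernst equation. The sketch below is not taken from the source: the function names, the temperature, and the ion concentrations are illustrative assumptions, and only the approximately -85 mV figure comes from the text.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol

def nernst_mv(z, conc_out_mm, conc_in_mm, temp_k=298.15):
    """Equilibrium potential in mV (inside relative to outside) for one ion species."""
    return 1000.0 * (R * temp_k) / (z * F) * math.log(conc_out_mm / conc_in_mm)

def conc_ratio_for(potential_mv, z=1, temp_k=298.15):
    """Inside/outside concentration ratio that would give the stated potential."""
    return math.exp(-potential_mv * z * F / (1000.0 * R * temp_k))

# Hypothetical K+ concentrations (mM), chosen only for illustration.
e_k = nernst_mv(z=1, conc_out_mm=10.0, conc_in_mm=100.0)
ratio = conc_ratio_for(-85.0)
print(f"E_K for a 10-fold K+ gradient: {e_k:.1f} mV")
print(f"[K+]in/[K+]out needed for -85 mV: {ratio:.1f}")
```

A single-ion Nernst potential is only a first approximation; a contribution from Cl-, which the text notes has not been ruled out, would require a multi-ion treatment.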
The cells' action potentials are mainly due to cholinergic synapses and voltage-gated ion
channels at the innervated plasma membranes (Gotter et al. 1998). These action potentials occur
in a series of events similar to that in neurons and myocytes (Hodgkin et al. 1952, Gotter et al.
1998, Ruff 2003). When acetylcholine from the innervating motor neurons binds to the nicotinic
acetylcholine receptor ion channels in the innervated plasma membrane, the ion channels change
conformation, allowing Na+ and K+ to flow down their electrochemical gradients across the membrane
(Heidmann and Changeux 1978). If the cells' membrane potential depolarizes (becomes more
neutral or positive) by approximately 10-15 mV (Hodgkin et al. 1952; Keynes and Martins-
Ferreira 1953), an action potential will be triggered.
The voltage-gated ion channels that contribute to an action potential are mainly sodium
channels (Navs) that facilitate Na+ influx, though potassium channels that facilitate K+ efflux are
also thought to contribute to repolarization of the cells (Nakamura et al. 1965; Ferrari and Zakon
1993), and calcium channels have been hypothesized to facilitate influx of Ca2+ (Gotter et al.
1998). The Navs are concentrated at the innervated undulating plasma membranes (Ellisman and
Levinson 1982; Fritz and Brockes 1983).
When an action potential is triggered, Navs undergo fast activation (typically < 1 ms), and
allow Na+ to flow down its chemical gradient into the cells (Hodgkin et al. 1952; Ulbricht 2005).
When the peak potential is almost reached, Navs undergo fast inactivation (typically < 1 ms), and
prevent more Na+ from flowing in. The peak potential of the innervated plasma membrane of E.
electricus electrocytes is approximately 65 mV (Keynes and Martins-Ferreira 1953), compared
with 45 mV and 30 mV in the squid giant axon (Hodgkin et al. 1952) and skeletal muscle
(Hopkins 2006), respectively. After the peak potential has been reached, the cells typically enter
a refractory period, during which the cells' potentials repolarize and Navs recover to their
resting states (Hodgkin et al. 1952).
Recovery typically proceeds on the order of milliseconds (Ulbricht 2005). If there is
prolonged or repeated depolarization, slow inactivation may occur, where recovery proceeds on
the order of seconds to minutes (Ulbricht 2005). If during recovery from inactivation (when the
cell is almost repolarized), there is a brief resurgent current of Na+ flowing in, then recovery may
proceed faster (Rose 2007; Cannon and Bean 2010).
Action potentials are a highly conserved feature across cell types and taxa, though their
specific characteristics vary. Variation in amplitude and time-course of
synchronously triggered action potentials is associated with variations in the EOD waveform,
amplitude, and frequency (Bennett 1961; Mills and Zakon 1987).
1.8 Genetic Evolution and Protein Expression of Voltage-Gated Sodium Channels
Voltage-gated ion channels are one of the largest superfamilies of signal transduction proteins,
and among the most common drug targets. They are encoded by homologous genes, and are
structurally conserved (Yu et al. 2005; Charalambous and Wallace 2011). Functional elements of
this superfamily are ion conductance, pore gating, and regulation. Members of this superfamily
include voltage-gated potassium, calcium, and sodium channels.
Among jawed vertebrates (infraphylum Gnathostomata), voltage-gated sodium channels
(Navs) are encoded by a family of paralogous genes belonging to four monophyletic lineages
(Lopreato et al. 2001; Figure 1). After the divergence of terrestrial vertebrates (superclass
Tetrapoda) and most living ray-finned fishes (infraclass Teleostei), tandem duplications in
Tetrapoda increased the number of paralogs in two of the four lineages to a total of ten, while
whole genome duplication in Teleostei doubled the number of paralogs to eight (Lopreato et al.
2001; Goldin 2002; Novak et al. 2006; Widmark et al. 2011; Figure 1). The protein structure and
functional elements are conserved among paralogs, especially within each of the four lineages of
Navs. However, they are even more conserved among orthologs across species (Catterall et al.
2005).
Navs in myogenic tissue are generally encoded by a single gene (scn4a) in Tetrapoda,
while there are two paralogs (scn4aa and scn4ab) in Actinopterygii (Goldin et al. 2000; Goldin
2002; Novak et al. 2006; Widmark et al. 2011). Gene duplication has allowed for the evolution
of gene-specific expression patterns and electrical characteristics (Lynch et al. 2001; Goldin et
al. 2002; Novak et al. 2006; Widmark et al. 2011). Non-electrogenic Actinopterygii express both
scn4aa and scn4ab in myocytes. In electrogenic fish with myogenic electric organs, scn4aa
expression is absent from myocytes, and the gene is instead preferentially expressed in
electrocytes (Noda et al. 1984; Agnew et al. 1978; Zakon et
al. 2006; Arnegard et al. 2010). Nav paralogs (α subunits) are often associated with auxiliary β
subunits, which are involved in channel localization and functional modulation. However, α
subunits are sufficient for functional expression (Catterall et al. 2005).
1.9 Molecular Features and Mechanisms of Voltage-Gated Sodium Channels
Voltage-gated sodium channels (Navs) consist of approximately 2000 amino acids, with a
molecular weight of ~ 230 kDa prior to post-translational modification (Noda 1984; Cohen and
Levitt 1993). The channels are structured into four homologous domains DI-IV, each with six
transmembrane segments S1-6, and oriented with the amino-terminus (N-terminus) and
carboxyl-terminus (C-terminus) on the intracellular side (Noda 1984; Gordon et al. 1987;
Gordon et al. 1988; Catterall et al. 2005; Figure 5). The extracellular loops and transmembrane
segments are highly conserved among the Nav family, with > 50% amino acid sequence
similarity (Catterall et al. 2005). The voltage-sensing domain (VSD) includes transmembrane
segments S1-4 from each of DI-IV. The pore module (PM) includes transmembrane segments
S5-6 from each of DI-IV, and forms an extracellular funnel, selectivity filter, central cavity, and
activation gate (Payandeh et al. 2011; Zhang et al. 2012). The C-terminus consists of almost 300
amino acids (Noda et al. 1984), and includes several motifs: a flexible linker joining DIVS6; an
EF-hand; an IQ; and a PY (Cormier et al. 2002; Chagot et al. 2009).
The N-terminus may include conserved amino acid sequences for membrane localization
(Catterall et al. 2005; Eijkelkamp et al. 2012). The extracellular loops include specific amino
acids for modulation of surface charge by glycosylation, which increases the total molecular
mass by 13 kDa to 60 kDa (Levinson et al. 1986; Schmidt and Catterall 1987; Cohen and Levitt
1993; Liu et al. 2012). The N-terminus, intracellular inter-domain linkers (especially DI-II), and
C-terminus include paralog-specific phosphorylation sites for functional modulation (Emerick et
al. 1993; Cantrell and Catterall 2001; Scheuer 2010). The DII-III linker (Fache et al. 2004) and
the paralog-specific PY motif at the C-terminus (Fotia et al. 2004; Rougier et al. 2005) are
associated with protein internalization, which modulates the current magnitude.
Figure 5. Schematic of Voltage-Gated Sodium Channel Motifs
a) Schematic of the whole voltage-gated sodium channel (Gotter et al. 1998). Domains I-IV are
identified, with a rectangle representing each transmembrane segment. Phosphorylation sites of
the Electrophorus electricus Nav1.4a are identified by P (Emerick et al. 1993).
b) Schematic of motifs that change conformation during fast inactivation (Potet et al. 2009).
Domains III and IV are identified, with a big grey cylinder representing each transmembrane
segment. Helices of the fast inactivation occlusion particle (DIII-IV linker) are identified by the
small green and violet cylinders. Helices of the carboxyl-terminus (C-terminus) are identified by
the small blue and grey cylinders. The EF-hand and IQ motifs bind each other. Calmodulin binds
the IQ motif and a helix of the DIII-IV linker (violet cylinder).
Fast activation is triggered by changes in plasma membrane voltage being relayed by
regularly spaced positively charged amino acids on the S4 segments of DI-III, to change
conformations of the S3-4 and S4-5 linkers as well as the VSD (Payandeh et al. 2011; Zhang et
al. 2012; Payandeh et al. 2012; Ahern 2013). Changes in VSD conformation are relayed to the
PM by the S4-5 linkers, opening the activation gate, allowing sodium ions to pass the highly
conserved amino acids of the selectivity filter (Favre et al. 1996).
Slow inactivation is likely conferred by conformational changes near the selectivity filter
at the S5-6 linker and S6 segments lining the pore (Ulbricht 2005; Payandeh et al. 2012).
Fast inactivation is triggered by changes in voltage being relayed by regularly spaced
positively charged amino acids on the S4 segment of DIV (Ahern 2013). Changes in
conformation result in occlusion of the activation gate by the DIII-IV linker.
Fast inactivation is modulated by Ca2+ binding on the C-terminus being relayed to the
activation gate by calmodulin (Wingo et al. 2004; Young and Caldwell 2005; Sarhan et al.
2012). Calmodulin is a highly conserved calcium sensing protein that has been found in many
eukaryote cells, including electrocytes (Baba et al. 1984; Munjaal et al. 1986). It has 2 lobes,
each with a Ca2+-binding EF-hand motif consisting of two pairs of α helices (Chin and Means
2000). The C-terminus EF-hand motif is structurally analogous to one lobe of calmodulin, but
with a lower affinity for Ca2+ (Miloushev et al. 2009). The IQ motif is found in many
Ca2+-dependent calmodulin binding proteins (Bahler and Rhoads 2001). In the absence of Ca2+, the
EF-hand binds loosely to the IQ motif, which binds tightly to the C-lobe of calmodulin, leaving
the N-lobe free. Since the N-lobe of calmodulin does not bind the DIII-IV linker, the linker is
free to occlude the activation gate (Wingo et al. 2004; Shah et al. 2006; Chagot et al. 2009;
Chagot et al. 2011; Sarhan et al. 2012). With increased levels of Ca2+, the EF-hand binds tightly
to the IQ motif, which binds loosely to either the N-lobe or C-lobe of calmodulin. When the
C-lobe of calmodulin binds to the DIII-IV linker, the linker is less likely to occlude the activation gate.
Resurgent current is associated with an alternate time course of fast inactivation. It may
result from a parallel process competing with the typical fast inactivation mechanism at the
activation gate (Cruz et al. 2011). It may result from specific amino acids on the S4 segment of
DIV (Jarecki et al. 2010). It may also result from specific amino acids on the C-terminus EF-
hand – a drug that binds to the C-terminus EF-hand has been shown to prolong occlusion of the
activation gate and decrease resurgent current (Hebert et al. 1994; Theiss et al. 2007; Bello et al.
2012).
1.10 Significance and Objectives
Many advances have been made in recent years in identifying the roles of various motifs in
voltage-gated sodium channel (Nav) function and modulation (Chagot et al. 2009;
Miloushev et al. 2009; Payandeh et al. 2011; Sarhan et al. 2012; Zhang et al. 2012). However,
analyses of the roles of specific amino acid sites have largely been limited to the sites that are
known to be mutated in people with diagnosed neuromuscular disease (Lehmann-Horn and
Jurkat-Rott 1999). In this project, I will use the gymnotiform genus Gymnotus as a model system
to investigate the evolution and function of amino acid sites on the Nav1.4a.
Fishes of the genus Gymnotus produce species-specific electric organ discharges (EODs)
for electrolocation (foraging, navigation) and communication (Crampton and Albert 2006).
EODs are the summation of action potentials produced at the electric organ(s) (EO) by
electrogenic cells (Bennett 1961; Mills and Zakon 1987). Navs at the plasma membranes of those
cells have a key role in supporting action potentials (Agnew 1984; Catterall 1984; Noda et al.
1984). Upon neuronally triggered changes in voltage, Navs activate to allow specific ions to
discharge through their pores, across the membranes. Those same changes in voltage also trigger
Navs to inactivate, to allow the membrane voltage gradient to recover, in preparation for the next
discharge.
Navs are encoded by a family of paralogous genes that translate to highly conserved
amino acid sequences and motifs (Catterall et al. 2005). Gene duplication among Teleostei and
preferential expression in various tissues (Lopreato et al. 2001; Lynch et al. 2001; Goldin 2002;
Novak et al. 2006; Widmark et al. 2011) have been predicted to allow paralogs to evolve
independently without compromising functions of Navs in other tissues. Analyses of nucleotide
sequences encoding various motifs of the EO paralog (scn4aa) from limited sampling of
gymnotiform fishes resulted in identification of positive, neutral, and purifying selection of the
protein (Nav1.4a) among certain lineages (Zakon et al. 2006; Arnegard et al. 2010). However,
positively selected amino acid sites were not identified.
One component of the scn4aa gene that has not been previously analyzed for patterns of
selection among gymnotiforms includes the nucleotides encoding the protein’s carboxyl-
terminus (scn4aa 3’). This portion includes key motifs that are involved in regulation of protein
internalization, fast inactivation, and possibly also resurgent current. Modulation of these
Nav1.4a activities affects the amplitude and frequency of action potentials at the EO, which may
in turn affect those components of the EODs. Variations in EOD amplitude may be associated
with variations in multiple anatomical, cellular, and molecular characteristics (Gotter 1998;
Caputi 1999). However, variations in EOD frequency among gymnotiforms with myogenic
electric organs are likely limited to those associated with variations in Nav1.4a function.
Since species-specific characteristics of EODs among gymnotiforms (especially variation
in frequency) are the result of adaptations to abiotic and biotic selective pressures in their varied
habitats (Stoddard 2002), I predict that amino acid sites of the Nav1.4a C-terminus that contribute
to variation in (but do not abolish) protein function will show evidence of positive selection in
Gymnotus fishes. I also predict that the Nav1.4a C-terminus will only show evidence of positive
selection in some lineages of Gymnotus, as has been observed for other portions of Nav1.4a
sequences from a limited sample of gymnotiform fishes (Zakon et al. 2006; Arnegard et al.
2010). To assess patterns of selection on the Nav1.4a C-terminus among Gymnotus, I will
analyze the corresponding nucleotide sequences based on the phylogenetic relationships among
these fishes (Yang 2007).
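Codon-based tests of selection of this kind are conventionally summarized by the ratio of nonsynonymous to synonymous substitution rates. The notation below follows that general convention and is an assumption here, since this section does not define its own symbols.

```latex
\[
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega > 1 & \text{positive selection} \\
\omega = 1 & \text{neutral evolution} \\
\omega < 1 & \text{purifying selection}
\end{cases}
\]
```

Under this convention, the prediction stated above corresponds to expecting sites with an estimated ratio above 1 in only some Gymnotus lineages.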
Since existing phylogenetic relationships among Gymnotus based on morphology and
nucleotide sequences are not entirely consistent with each other, I will conduct phylogenetic
analyses with additional taxa and molecular characters to contribute towards resolving remaining
inconsistencies (Wiens 1998). The additional characters that I will use are the Gymnotus scn4aa
nucleotide sequences that encode the protein's C-terminus. Since other portions of scn4aa have
been used for successful reconstruction of phylogenies from limited sampling of gymnotiform
fishes (Zakon et al. 2006; Arnegard et al. 2010), I predict that this portion of the gene will also
contribute towards clarification of phylogenetic relationships among a large sample of Gymnotus
species.
The objectives of this project can be summarized as follows:
1) To clarify evolutionary relationships among known and newly discovered species of
Gymnotus fishes using orthologous genetic loci, including the scn4aa 3’;
2) To determine the utility of the scn4aa 3’ locus for reconstruction of phylogenetic
relationships; and
3) To assess patterns of selection on the Nav1.4a C-terminus, thereby contributing towards
understanding the evolutionary history of Gymnotus fishes, and molecular mechanisms of
the protein.
Chapter 2 Materials and Methods
2.1 Taxon Sampling
Efforts were made to comprehensively sample Gymnotus species from all three clades described
in Albert et al. (2005). Outgroup species were sampled from other gymnotiform families:
Electrophoridae; Hypopomidae; and Sternopygidae. More than one individual was sampled per
species whenever possible, as a control for variation within species.
Tissues for DNA extraction were stored in either 95-100% ethanol or salt saturated buffer
(20% DMSO, 0.25 M EDTA pH 8, saturated with NaCl). Tissues for RNA extraction were
stored in RNALater. Tissue samples were from the collections of Nathan Lovejoy, William
Crampton, James Albert, and Javier Maldonado.
2.2 Locus Selection
The loci selected were: mitochondrial genes cytochrome b (cytb) and 16S ribosome (16S); and
nuclear genes recombination activating gene 2 (rag2) and the portion of the voltage-gated
sodium channel gene scn4aa encoding the Nav1.4a protein’s carboxyl-terminus (this region is
herein referred to as scn4aa 3’).
Cytb and 16S sequences have been successfully used for phylogenetic classification of
fish (Lovejoy and Collette 2001, Lavoué and Sullivan 2004). These are housekeeping genes
which have key conserved functions for the maintenance of every cell among various cell types
and across taxa (Warrington et al. 2000), which would decrease the chances of inaccurate
phylogeny due to variations in patterns of natural selection among clades (Kullberg et al. 1996).
Mitochondrial genes do not have introns, which simplifies the sequence alignment process.
Rag2 sequences have also been successfully used for phylogenetic classification of fish
(Lovejoy and Collette 2001, Lavoué and Sullivan 2004). Rag2 is essential to the inducible
immune response in jawed vertebrates (Rast and Litman 1998). In fish, it is a conserved single
copy gene (Willett et al. 1997) that seems to not have introns (Hansen and Kaattari 1996;
Willett et al. 1997). Phylogenetic reconstruction based on single copy genes decreases the
chances of inaccurate phylogeny due to mistaken orthology (Li et al. 2007).
Scn4aa encodes the voltage-gated sodium channel protein Nav1.4a, and is part of a
sodium channel gene family that has been conserved among vertebrates (Goldin et al. 2000;
Novak et al. 2006; Widmark et al. 2011). Nucleotide sequences from scn4aa, scn4ab, and other
members of the scn gene family were available from GenBank for comparison, which helped to avoid
mistaken orthology. The Nav1.4a protein's carboxyl-terminus is approximately 300 amino acids
long.
Mitochondrial genes tend to evolve rapidly compared to nuclear ones (Brown 1979). A
combination of nucleotide sequences from these different sources could complement each other
when resolving phylogenetic relationships.
2.3 Primer Design
Table 1 lists the primer sequences used for DNA amplification and sequencing. Amplification
primers for cytb, 16S, and rag2 have been previously published. Amplification primers for
scn4aa 3’ were designed as part of this project (see below). Sequencing primers for all loci
selected were designed as necessary.
For both amplification and sequencing primers, annealing characteristics such as melting
temperature, % GC content, and secondary structures were analyzed using NetPrimer
(http://www.premierbiosoft.com/netprimer/index.html).
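The annealing characteristics mentioned above can be approximated by hand for a quick sanity check. The sketch below is an assumption-laden stand-in, not NetPrimer's method: it uses the simple Wallace rule of thumb for melting temperature, does not handle degenerate bases, and the function names are hypothetical. The two sequences are the scn4aa 3’ amplification primers listed in Table 1.

```python
def gc_percent(primer):
    """Percentage of G and C bases in an unambiguous primer sequence."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(primer):
    """Rule-of-thumb melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

# scn4aa 3' amplification primers from Table 1 (both written 5' to 3').
primers = {
    "(6)1F": "TCCTCCTGACTGTGACCCTG",
    "(6)1R": "CATTTTTACACTTCATCACTCTCCAC",
}
for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, {gc_percent(seq):.1f}% GC, Tm ~ {wallace_tm(seq)} C")
```

The Wallace rule is intended for short oligonucleotides and tends to overestimate for longer primers, so these values are only a rough guide next to a nearest-neighbour calculation.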
2.3.1 Amplification Primers for scn4aa 3’
To predict intron/exon boundaries for the portion of the scn4aa gene encoding the carboxyl-
terminus (scn4aa 3’), published fish scn4aa, scn4ab, and other scn cDNA sequences were
obtained from GenBank (Table 2), and aligned using ClustalX version 1.83 (Thompson et al.
1997). The scn4aa 3’ locus was predicted to be contained within one exon by comparing the fish
scn alignment with annotated genome sequences of Danio rerio (GenBank Accession #s
DQ221253 and NW_001510719).
Table 1. Primer Sequences
Primers used for nucleotide sequence amplification and sequencing are identified by their target loci, name, annealing direction,
sequence, and source.
Target Locus | Primer Name | Amplification/Sequencing Direction (1) | Nucleotide Sequence (listed as 5' → 3') | Source of Sequence
scn4aa 3’ | (6)1F | 5' → 3' | TCCTCCTGACTGTGACCCTG | This study
scn4aa 3’ | (6)1R | 3' ← 5' | CATTTTTACACTTCATCACTCTCCAC | This study
cytochrome b | GLU-L-CARP (AKA CytbF) | 5' → 3' | TGACTTGAAGAACCACCGTTG | Palumbi et al. 1991
cytochrome b | GLUDG-L | 5' → 3' | CGAAGCTTGACTTGAARAACCAYCGTTG | Palumbi et al. 1991
cytochrome b | HA-danio (AKA CytbR) | 3' ← 5' | CTCCGATCTTCGGATTACAAG | Mayden et al. 2007
cytochrome b | (C)Seq1F | - | CAATGAGTCTGAGGAGGNTT | This study
cytochrome b | (C)Seq3F | - | CAATGAGTTTGAGGGGGNTT | This study
cytochrome b | (C)Seq5F | - | CAATGAGTCTGAGGGGGNTT | This study
cytochrome b | (C)Seq8F | - | CAATGAGTTTGAGGCGGNTT | This study
recombination activating gene 2 | Rag2GyF | 5' → 3' | ACAGGCRTCTTTGGKRTTCG | Lovejoy et al. 2010
recombination activating gene 2 | Rag2GyR | 3' ← 5' | TCATCCTCCTCATCTTCCTC | Lovejoy et al. 2010
recombination activating gene 2 | (R)Seq1F | - | AGAACCACAGAGAACTGGAACAC | This study
recombination activating gene 2 | (R)Seq1R | - | CTCTACACGCAGCCTGAACA | This study
recombination activating gene 2 | (R)Seq2R | - | TGCATTCGCTTYTGGGA | This study
16S mitochondrial ribosomal subunit | 16sar-L | 5' → 3' | CGCCTGTTTATCAAAAACAT | Palumbi et al. 1991
16S mitochondrial ribosomal subunit | 16sbr-H | 3' ← 5' | CCGGTCTGAACTCAGATCACGT | Palumbi et al. 1991
1 Amplification/sequencing direction is only identified for primers used for both amplification and sequencing, since sequencing-only
primers may have been used to sequence nucleotides in different directions.
Table 2. cDNA Sequences Used for scn4aa 3’ Primer Design
Voltage-gated sodium channel nucleotide sequences of fish were downloaded from GenBank for the design of primers specific to the
carboxyl-terminus. For clarity, genes and proteins were all named using the protein naming convention from Novak et al. (2006).
Superorder (left to right): Acanthopterygii; Ostariophysi; Osteoglossomorpha
Order (left to right): Tetraodontiformes; Cypriniformes; Gymnotiformes; Siluriformes; Osteoglossiformes
Species (column headers, left to right): Takifugu pardalis; Tetraodon nigroviridis; Danio rerio; Apteronotus leptorhynchus; Brachyhypopomus pinnicaudatus; Electrophorus electricus; Sternopygus macrurus; Ictalurus punctatus; Chitala chitala; Gnathonemus petersii; Osteoglossum bicirrhosum
Rows: Member of the Nav Gene Family; and GenBank Accession #s
Nav1.1La BC044197 AF378142 AY204535 DQ275140
BC133130
BC150220
DQ149503
NM_200132
Nav1.1Lb NW_001513569 AF378141 AY204534 DQ275139
DQ149504
NM_001044895
Nav1.4 AB030482 1
Nav1.4a DQ221251 DQ351532 DQ351533 DQ351534 M22252 AF378144 AY204537 DQ336344 DQ275142 DQ336343
DQ149506
NW_001510719
NM_001039825
Nav1.4b DQ221252 DQ221254 AF378139 AY204532 DQ275137
DQ149505
NM_001045065
Nav1.5La DQ149507 AF378140 AY204533 DQ275138
NW_001512993
NM_001044922
Nav1.5Lb NW_001512737 AY183895
DQ149508
NM_001045123
Nav1.6a NW_001512571 DQ286578 DQ385608
NM_131628
Nav1.6b NW_001513595 AF378143 AY204536 DQ275141
NM_001045183
1 GenBank identifies this as a sequence from skeletal muscle, but it is unclear whether it is Nav1.4a or Nav1.4b.
Primer sequences were designed to amplify the scn4aa 3’ exon, but not scn4ab or any other scn
sequence. To accomplish this, potential primer
sequences were compared using BLAST (http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) against scn4aa
and non-scn4aa portions of the fish scn alignment. Only sequences specific to scn4aa, and not to
any other scn genes, were selected. The resulting primer sequences were experimentally tested to
verify amplification of scn4aa 3’. The absence of introns was confirmed by comparing
corresponding scn4aa DNA and cDNA sequences from gymnotiforms (see below).
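The selection rule described above (keep a primer only if it matches scn4aa and no other scn sequence) can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the procedure used in the study: it uses exact substring matching, which is far stricter than BLAST, and the template sequences and function names are hypothetical toy examples.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (no degenerate bases)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def anneals_to(primer, template):
    """True if the primer matches either strand of the template exactly."""
    primer, template = primer.upper(), template.upper()
    return primer in template or reverse_complement(primer) in template

def is_specific(primer, target, off_targets):
    """Keep a primer only if it matches the target and none of the off-targets."""
    return anneals_to(primer, target) and not any(
        anneals_to(primer, seq) for seq in off_targets
    )

# Toy templates (hypothetical): the off-target differs from the target by one base.
TARGET = "AAATCCTCCTGACTGTGACCCTGAAA"
OFF_TARGETS = ["AAATCCTCCTGACAGTGACCCTGAAA"]

print(is_specific("TCCTCCTGACTGTGACCCTG", TARGET, OFF_TARGETS))
print(is_specific("AAATCC", TARGET, OFF_TARGETS))
```

A real specificity screen would tolerate mismatches and weight the 3' end of the primer, which is why an alignment tool is used in practice.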
2.3.2 Sequencing Primers
Sequencing primers were designed for amplicons that did not produce clear nucleotide sequences
using amplification primers for sequencing. Existing nucleotide sequences for those loci were
aligned using Sequencher™ (Gene Codes Corporation, Ann Arbor, MI). Primers were designed
to anneal with conserved regions within the loci to obtain the remaining nucleotide sequences.
2.4 DNA and RNA Extraction
To obtain genomic DNA for amplifying scn4aa sequences, excised muscle tissue was processed
using the DNeasy Blood and Tissue Spin-Column Kit (Qiagen).
RNA was obtained from electric organ tissue of a Gymnotus tigre specimen so that
scn4aa cDNA encoding the protein's carboxyl-terminus could be synthesized. To obtain RNA, fresh
tissue was homogenized (ground with a mortar and pestle at -80°C, and vortexed with at least 1 mL
Trizol /100 mg tissue). Nucleic acids, amino acids, and lipids were separated by adding a denser
chloroform organic layer (200 µL /1 mL Trizol), and further homogenizing and breaking of large
pieces of DNA (vortex for 15 s). The solution was placed at room temperature to allow contents
to drift into their phases (2-3 mins), and centrifuged to obtain clear phase separation (12000 g for
15 min at 4°C). Nucleic acids were allowed to precipitate by adding isopropanol (500 µL /1 mL
Trizol) to the aqueous phase, and incubating at room temperature (10 mins). The nucleic acids
were pelleted (centrifugation at 12000 g for 10 min at 4°C, and removal of supernatant), and
washed (80% ethanol /1 mL Trizol, centrifugation at 7500 g for 5 mins at 4°C, and removal of
supernatant ethanol) to increase purity of the sample. To prevent nucleic acid degradation,
samples were heated to denature nucleases (70°C for 2-3 mins), and resuspended in diethyl
pyrocarbonate treated water (81 µL; any nucleases including RNase were inactivated in DEPC
water). DNA was selectively degraded by adding DNase I (8 µL of 10X DNase I buffer, 2 µL of
DNase I enzyme), mixing (vortex, quick spin), and incubating to activate the enzyme (42°C for
25 mins). RNA was purified using the RNA Cleanup protocol in the RNeasy Mini Kit (Qiagen).
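Because the extraction reagents are specified relative to tissue mass and Trizol volume, they scale linearly with the sample. The sketch below applies the ratios stated above (at least 1 mL Trizol per 100 mg tissue, 200 µL chloroform and 500 µL isopropanol per 1 mL Trizol); the function name and the example tissue mass are illustrative assumptions.

```python
def trizol_reagent_volumes(tissue_mg):
    """Minimum reagent volumes in microlitres, scaled from the ratios in the text."""
    trizol_ul = 1000.0 * tissue_mg / 100.0  # at least 1 mL Trizol per 100 mg tissue
    return {
        "trizol_ul": trizol_ul,
        "chloroform_ul": 200.0 * trizol_ul / 1000.0,   # 200 uL per 1 mL Trizol
        "isopropanol_ul": 500.0 * trizol_ul / 1000.0,  # 500 uL per 1 mL Trizol
    }

# Hypothetical 250 mg sample of electric organ tissue.
for reagent, volume in trizol_reagent_volumes(250).items():
    print(f"{reagent}: {volume:.0f}")
```

The ethanol wash is left out because the text gives its concentration but not its volume.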
2.5 Nucleotide Amplification and Sequencing
Nucleotide sequences from previous studies were obtained from GenBank. This includes most of
the cytb, 16S, and rag2 data. All of the scn4aa sequences encoding the protein’s carboxyl-
terminus were experimentally obtained as part of this study. See Table 3 for the source of each
sequence.
Polymerase Chain Reaction (PCR) was used to amplify nucleic acids from target loci (1x
+(NH4)2SO4 PCR Buffer (Fermentas), 0.8 mM dNTPs, 0.2 µM of each primer, 0.02 U/µL Taq
DNA Polymerase (Fermentas), 0.5-4 mM MgCl2). Both standard thermal cycling profiles
(denaturation at 95°C for 2.5 min; 32 cycles of denaturation at 95°C for 30 s, annealing at 53-54°C
for 1 min, extension at 72°C for 1 min 30 s; and extension at 72°C for 5 min) and touchdown
protocols (Don et al. 1991) were used. Concentrations of MgCl2 and annealing temperatures
were optimized for each primer pair.
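The final concentrations and cycling profile above translate into pipetting volumes and run time by simple arithmetic. The sketch below is illustrative only: the 25 µL reaction volume, all stock concentrations, and the 2 mM MgCl2 value (one point in the stated 0.5-4 mM range) are assumptions that do not appear in the text, and the function names are hypothetical.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_ul):
    """C1*V1 = C2*V2: stock volume needed to reach final_conc in reaction_ul."""
    return final_conc * reaction_ul / stock_conc

def cycling_minutes(initial, cycles, steps, final):
    """Total programmed hold time in minutes (ramp times are ignored)."""
    return initial + cycles * sum(steps) + final

REACTION_UL = 25.0  # assumed reaction volume; not stated in the text

# name: (final concentration from the text, ASSUMED stock concentration)
components = {
    "buffer (x)": (1.0, 10.0),
    "dNTPs (mM)": (0.8, 10.0),
    "forward primer (uM)": (0.2, 10.0),
    "reverse primer (uM)": (0.2, 10.0),
    "Taq (U/uL)": (0.02, 5.0),
    "MgCl2 (mM)": (2.0, 25.0),
}
volumes = {
    name: stock_volume_ul(final, stock, REACTION_UL)
    for name, (final, stock) in components.items()
}
# Remaining volume is shared between water and template DNA.
volumes["water + template"] = REACTION_UL - sum(volumes.values())
for name, vol in volumes.items():
    print(f"{name:22s}{vol:6.2f} uL")

# Standard profile from the text: 2.5 min; 32 x (30 s, 1 min, 1 min 30 s); 5 min.
print(cycling_minutes(2.5, 32, [0.5, 1.0, 1.5], 5.0), "min of programmed holds")
```

Keeping the final concentrations separate from the stock concentrations makes it easy to re-run the calculation when MgCl2 is optimized for a given primer pair.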
Amplified products were assessed by gel electrophoresis (1% agarose in 1x TAE buffer;
50x TAE: 242 g Tris base, 57.1 mL glacial acetic acid, 100 mL 0.5 M EDTA pH 8.0, H2O to
1 L) and staining with SYBR Safe (Invitrogen). PCR products showing one distinct amplicon
were purified using the QIAquick PCR Purification Kit (Qiagen). They were sequenced by
capillary electrophoresis and dye-terminator cycle sequencing (3730xl DNA Analyzer with KB
Basecaller software, Applied Biosystems) at the Centre for Applied Genomics (TCAG, The
Hospital for Sick Children, Toronto, Canada).
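The buffer and gel quantities above follow from simple dilution arithmetic. In the sketch below, the function names and the 50 mL gel volume are illustrative assumptions; the 50x recipe and the 1% agarose figure come from the text, and 121.14 g/mol is the molar mass of Tris base.

```python
TRIS_BASE_G_PER_MOL = 121.14  # molar mass of Tris base

def stock_ml_for(working_ml, stock_x=50.0, working_x=1.0):
    """Volume of concentrated stock needed to make working_ml of working buffer."""
    return working_ml * working_x / stock_x

def tris_molarity(grams_per_litre):
    """Molar concentration of Tris base for a given mass per litre."""
    return grams_per_litre / TRIS_BASE_G_PER_MOL

def agarose_g(gel_ml, percent_w_v=1.0):
    """Mass of agarose for a gel of the given volume and percentage (w/v)."""
    return gel_ml * percent_w_v / 100.0

print(f"50x TAE per litre of 1x buffer: {stock_ml_for(1000.0):.0f} mL")
print(f"Tris in 50x TAE: {tris_molarity(242.0):.2f} M")
print(f"Agarose for a 50 mL gel: {agarose_g(50.0):.2f} g")
```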
Table 3. Specimens and Nucleotide Sequences Used for Gymnotus Analyses
Specimens used for analysis are identified by their scientific names, tissue sample numbers, museum catalogue numbers, collection
localities, and applicable GenBank Accession numbers. Drainage basins are classified according to Albert et al. (2005): MA - Middle
America, NW - Northwestern South America, PS - Pacific Slope, GO - Guyanas-Orinoco, WA - Western Amazon, EA - East
Amazon, NE - Northeast Brazil, PA - Paraguay-Paraná basin of Argentina, SE - Southeast Brazil. Sequences obtained by the author
for this project are identified with “**”. Sequences obtained from lab records are identified with “*” or their GenBank Accession
Number, if applicable.
Columns: Genus Species | Tissue sample number | Museum catalog number | Collection locality; <drainage basin is listed in brackets whenever possible> | Nucleotide sequences: scn4aa 3' | cytochrome b | recombination activating gene 2 | 16S ribosome
Family Gymnotidae
Gymnotus arapaima 2002 MZUSP 75179 Lago Mamirauá, Tefé, Amazonas, Brazil; <WA> ** GQ862595 GQ862543 GQ862647
Gymnotus arapaima 2003 MZUSP 103219 Lago Mamirauá, Tefé, Amazonas, Brazil; <WA> ** GQ862596 GQ862544 GQ862648
Gymnotus carapo 2004 MZUSP 76066 Lago Secretaria, Brazil; <WA near EA> ** GQ862599 GQ862547 GQ862651
Gymnotus carapo 2006 UF 131129 Rio Amazonas, Peru; <WA> ** GQ862601 GQ862549 GQ862653
Gymnotus carapo 2007 UF 131129 Rio Amazonas, Peru; <WA> ** GQ862602 GQ862550 GQ862654
Gymnotus carapo 2030 MZUSP 76066 Lago Secretaria, Brazil; <WA near EA> ** GQ862600 GQ862548 GQ862652
Gymnotus carapo 2040 UF 174335 Rio Guaratico, Venezuela; <GO> ** GQ862597 GQ862545 GQ862649
Gymnotus carapo 2041 UF 174335 Rio Guaratico, Venezuela; <GO> ** GQ862598 GQ862546 GQ862650
Gymnotus cataniapo 2062 UF 174330 Rio Atabapo, Venezuela; <GO> ** GQ862603 GQ862552 GQ862656
Gymnotus cataniapo 2063 UF 174332 Rio Cataniapo, Venezuela; <GO> ** GQ862604 GQ862579 GQ862683
Gymnotus coatesi 2042 MCP 34471 Lago Tefé, Brazil; <WA near EA> ** GQ862605 GQ862553 GQ862657
Gymnotus coatesi 2043 MCP 34472 Rio Tefé, Brazil; <WA near EA> ** GQ862605 GQ862554 GQ862658
Gymnotus coropinae 2010 MZUSP 75188 Lago Tefé, Brazil; <WA near EA> ** GQ862611 GQ862559 GQ862663
Gymnotus coropinae 2025 MZUSP 60611 Lago Tefé, Brazil; <WA near EA> ** GQ862612 GQ862560 GQ862664
Gymnotus coropinae 2035 ANSP 179126 Sauriwau River, Guyana; <GO> * GQ862607 GQ862555 GQ862659
Gymnotus coropinae 2036 AUM 35848 Sauriwau River, Guyana; <GO> ** GQ862608 GQ862556 GQ862660
Gymnotus coropinae 2037 ANSP 179127 Mazaruni River, Guyana; <WA near EA> ** GQ862609 GQ862557 GQ862661
Gymnotus coropinae 2038 ANSP 179127 Mazaruni River, Guyana; <WA near EA> ** GQ862610 GQ862558 GQ862662
Gymnotus curupira 2009 MZUSP 75148 Lago Tefé, Brazil; <WA near EA> ** GQ862613 GQ862561 GQ862665
Gymnotus curupira 2021 MZUSP 75146 Lago Tefé, Brazil; <WA near EA> ** GQ862614 GQ862562 GQ862666
Gymnotus cylindricus 2092 ROM 84772 Rio Tortuguero, Costa Rica; <MA> ** GQ862615 GQ862563 GQ862667
Gymnotus cylindricus 2093 ROM 84772 Rio Tortuguero, Costa Rica; <MA> ** GQ862616 GQ862564 GQ862668
Gymnotus cylindricus 2094 ROM 84772 Rio Tortuguero, Costa Rica; <MA> ** GQ862617 GQ862565 GQ862669
Gymnotus javari 2020 UF 122824 Iquitos, Brazil; <WA> ** GQ862618 GQ862566 GQ862670
Gymnotus jonasi 2016 MZUSP 103220 Rio Solimões, Tefé, Amazonas, Brazil; <WA near EA> ** GQ862619 GQ862567 GQ862671
Gymnotus jonasi 2471 UF 131410 Rio Ucayali, Pacaya Samiria Reserve, Peru; <PS> ** GQ862620 GQ862568 GQ862672
Gymnotus mamiraua 2012 MZUSP 103221 Rio Solimões, Tefé, Amazonas, Brazil; <WA near EA> ** GQ862621 GQ862569 GQ862673
Gymnotus mamiraua 2013 MCP 29805 Rio Solimões, Tefé, Amazonas, Brazil; <WA near EA> ** GQ862622 GQ862570 GQ862674
Gymnotus obscurus 2017 MZUSP 75155 Lago Mamirauá, Tefé, Amazonas, Brazil; <WA near EA> ** GQ862623 GQ862571 GQ862675
Gymnotus obscurus 2018 MZUSP 75157 Lago Mamirauá, Tefé, Amazonas, Brazil; <WA near EA> ** GQ862624 GQ862572 GQ862676
Gymnotus omarorum 7092 AMNH 239656 Laguna del Cisne, Uruguay; <PA near SE> ** ** ** **
Gymnotus omarorum 7093 AMNH 239656 Laguna del Cisne, Uruguay; <PA near SE> ** ** ** **
Gymnotus pantanal 7076 (not catalogued) Rio Parana, Corrientes, Chaco Region, Argentina; <PA> ** ** * *
Gymnotus pantherinus 2039 (no voucher) Rio Perequê-Açu, Brazil; <NE> ** GQ862625 GQ862573 GQ862677
Gymnotus pantherinus 2945 MZUSP 87564 Rio Vermelho, Sao Paulo, Brazil; <SE> ** * * *
Gymnotus stenoleucus 2060 UF 174329 Rio Atabapo, Venezuela; <GO> ** GQ862628 GQ862576 GQ862680
Gymnotus stenoleucus 2061 UF 174331 Rio Cataniapo, Venezuela; <GO> ** GQ862629 GQ862577 GQ862681
Gymnotus stenoleucus 2064 UF 174329 Rio Atabapo, Venezuela; <GO> ** GQ862630 GQ862578 GQ862682
Gymnotus sylvius 7240 MZUSP 100267 Rio Ribeira de Iguape-Rio Juqueia-Rio São Lourenço, Miracatú, São Paulo, Brazil; <SE> ** ** ** **
Gymnotus tigre 7090 (not catalogued) (aquarium specimen) ** ** ** **
Gymnotus tigre 7090_804pe3_1 (aliquot from 7090) (not catalogued) (aquarium specimen) **
Gymnotus tigre 7349 (not catalogued) (aquarium specimen) ** ** ** **
Gymnotus ucamara 1927 UF 126184 Rio Ucayali, Peru; <WA> ** * * *
Gymnotus ucamara 1950 UF 126184 Rio Ucayali, Peru; <WA> ** * * *
Gymnotus varzea 2014 MZUSP 75163 Rio Solimões, Tefé, Amazonas, Brazil; <WA near EA> ** * * *
Gymnotus varzea 2015 MZUSP 75164 Rio Solimões, Tefé, Amazonas, Brazil; <WA near EA> ** * * *
Gymnotus n. sp. 2956 (not catalogued) Rio São João, Rio de Janeiro, Brazil; <SE> ** ** ** **
Gymnotus n. sp. 2957 (not catalogued) Rio São João, Rio de Janeiro, Brazil; <SE> ** ** ** **
Gymnotus aff. anguillaris 2091 AUM 36616 Rio Aponwao, Guyana; <GO> ** GQ862594 GQ862542 GQ862646
Gymnotus n. sp. chaviro 7357 (unknown) (unknown) ** ** ** **
Gymnotus n. sp. chaviro 7358 (unknown) (unknown) ** ** ** **
Gymnotus n. sp. fritzi 7109 (not catalogued) Tefé, Amazonas, Brazil; <WA near EA> ** ** ** **
Gymnotus n. sp. itapua 2559 MZUSP 85947 Southern Brazil ** * ** *
Gymnotus n. sp. itapua 7071 (not catalogued) Rio Parana, Corrientes, Chaco Region, Argentina; <PA> ** * * *
Gymnotus n. sp. itapua 7072 (not catalogued) Rio Parana, Corrientes, Chaco Region, Argentina; <PA> ** * * *
Gymnotus n. sp. RS1 2558 MZUSP 85943 Southern Brazil ** * ** *
Gymnotus n. sp. RS1 7088 MNRJ 31520 Lagoa dos Tropeiros, Piumhi, Minas Gerais Region, Brazil ** ** ** **
Gymnotus cf. tigre 2019 UF 122823 Rio Amazonas, Peru; <WA> ** GQ862631 GQ862579 GQ862683
Gymnotus cf. tigre 2024 UF 122821 Rio Amazonas, Peru; <WA> ** GQ862632 GQ862580 GQ862684
Gymnotus sp. xingu 7305 MNRJ 33642 Xingú-Tapajós, Brazil; <PA> ** ** ** **
Family Electrophoridae
Electrophorus electricus M22252
Electrophorus electricus 2026 MZUSP 103218 Lago Secretaria, Tefé, Amazonas, Brazil; <WA near EA> ** GQ862593 GQ862541 GQ862645
Electrophorus electricus 2619 UF 116585 Rio Nanay, Peru; <WA> ** GQ862592 GQ862540 GQ862644
Family Hypopomidae
Brachyhypopomus diazi 305 UF 174334 Rio Las Marias, Venezuela; <GO> ** GQ862589 GQ862537 GQ862641
Brachyhypopomus diazi 2408 UF 174334 Rio Alpargatón, Venezuela; <GO> ** GQ862590 GQ862538 GQ862642
Brachyhypopomus n. sp. PAL 2432 UF 148572 Rio Palenque, Ecuador; <PS> ** GQ862591 GQ862539 GQ862643
Hypopomus artedi 2232 ANSP 179505 Rio Mazaruni, Guyana; <GO> ** GQ862637 GQ862585 GQ862689
Family Sternopygidae
Sternopygus astrabes 2203 (unknown) Lago Tefé, Igarapé Repartimento, Brazil; <WA near EA> ** * ** **
Sternopygus macrurus 2639 UF 117121 Rio Nanay, Peru; <WA> ** GQ862639 GQ862587 GQ862691
2.6 Nucleotide Sequence Verification and Alignment
All sequences experimentally obtained for this study were visually inspected for misreads, and
edited using Sequencher™ (Gene Codes Corporation, Ann Arbor, MI). Ambiguous base calls
were treated as potentially any nucleotide.
For scn4aa sequences, amplification and sequencing of the exon encoding the protein’s
carboxyl-terminus (scn4aa 3’) from the desired member of the gene family was verified as
follows. Each sequence was blasted as a translated nucleotide against the translated nucleotide
database in GenBank’s Nucleotide Collection (tblastx:
http://www.ncbi.nlm.nih.gov/blast/Blast.cgi). All sequences were found to have higher alignment
scores with scn4aa sequences than with scn4ab, any other scn, or any other nucleotide sequence.
To verify that the expected exon had been amplified, each scn4aa sequence was blasted
(http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) against the Electrophorus electricus
scn4aa mRNA sequence (Accession # M22252).
Directions and applicable codon positions of the nucleotide sequences were determined by
comparison with published Danio rerio (rag2 Accession # NM_131385, cytb and 16S Accession
# NC_002333), E. electricus (scn4aa 3’ Accession # M22252), and Pygocentrus nattereri (16S
Accession # U33591) sequences. Nucleotides from protein coding loci (cytb, rag2, and scn4aa
3’) were aligned based on their amino acid alignments using a combination of software
(Mesquite; ClustalX1.83; RevTrans http://www.cbs.dtu.dk/services/RevTrans/14). The 16S
nucleotide sequences were aligned under various gap cost settings in ClustalX 1.83 (Thompson
et al. 1997). Gap opening / gap extension values used were: 15/6.66; 7/5; 10/5; 20/5; and 10/10.
16S nucleotide positions which did not align consistently under all those settings were removed
from the analysis.
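The filtering of 16S positions can be illustrated with a minimal Python sketch. It is not the procedure actually used: the function names and the toy alignments are hypothetical, and it assumes that a column counts as "consistently aligned" only when the same residues (identified by their position in each ungapped sequence) are placed together in every alignment.

```python
def column_signatures(alignment):
    """Describe each column by the ungapped residue index of every sequence.

    `alignment` is a list of equal-length gapped strings, one per sequence,
    in the same sequence order for every alignment. Gaps are recorded as None.
    """
    counters = [0] * len(alignment)
    signatures = []
    for col in range(len(alignment[0])):
        sig = []
        for row, seq in enumerate(alignment):
            if seq[col] == "-":
                sig.append(None)
            else:
                sig.append(counters[row])
                counters[row] += 1
        signatures.append(tuple(sig))
    return signatures


def consistent_columns(reference, others):
    """Indices of columns in `reference` that also occur in every other alignment."""
    other_sets = [set(column_signatures(aln)) for aln in others]
    return [
        i
        for i, sig in enumerate(column_signatures(reference))
        if all(sig in s for s in other_sets)
    ]


# Hypothetical toy data: the same two sequences aligned under two gap-cost settings.
aln_a = ["ACG-T", "ACGGT"]
aln_b = ["AC-GT", "ACGGT"]
kept = consistent_columns(aln_a, [aln_b])  # columns of aln_a to retain
```

In practice the comparison would be run across all five gap opening / gap extension settings, with one alignment chosen as the reference.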
2.7 Phylogenetic Reconstruction
Phylogenetic reconstruction was conducted using the total evidence nucleotide alignment, and
compared with separate analyses of the following alignments: mitochondrial (cytb and 16S);
rag2; and scn4aa 3’. The cDNA sequences (Electrophorus electricus Accession # M22252; and
Gymnotus tigre sequence from tissue # 7090_804pe3_1) were not used for phylogenetic
reconstruction.
Parsimony-based phylogenetic reconstruction was implemented in PAUP* (Swofford
2002) using the stepwise heuristic search algorithm with the following parameters for 2000
search replicates: tree bisection-reconnection (TBR) branch swapping; and holding 10 trees at each
step. Bootstrapping was also conducted for 2000 search replicates with the same parameters
(Müller 2005).
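The resampling step behind bootstrapping can be sketched in Python as follows. This is only an illustration of how pseudo-replicate alignments are generated: PAUP* performs this step internally and then repeats the tree search on each pseudo-replicate, which is not shown. The function name, taxon labels, and sequences are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-replicate).

    `alignment` maps a taxon name to its aligned sequence; all sequences
    must have the same length.
    """
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        taxon: "".join(seq[c] for c in columns)
        for taxon, seq in alignment.items()
    }


# Hypothetical toy alignment.
toy = {"taxon_A": "ACGTAC", "taxon_B": "ACGTTC", "taxon_C": "ATGTTC"}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(2000)]
```

The bootstrap value of a clade is then the percentage of pseudo-replicate trees in which that clade appears.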
Bayesian phylogenetic reconstruction was implemented in MrBayes 3.1.2 (Huelsenbeck
and Ronquist 2001), using the model of molecular evolution that best fit the data as determined
using MrModeltest 2.3 (Nylander 2004). The same model was selected for the total evidence and
individual locus alignments: the general time-reversible model, with a proportion of invariant
nucleotide sites, and with substitution rates across the variable nucleotide sites
estimated from a gamma distribution (GTR + I + G; Brinkman and Leipe 2001).
The total evidence and mitochondrial alignments were partitioned into their four and two loci,
respectively. The total evidence alignment was analyzed with temp = 0.2. The mitochondrial,
rag2, and scn4aa 3’ alignments were analyzed with temp = 0.2 and nperts = 2. Each of these
four analyses was run with four chains for up to 5.5 million generations,
until the average standard deviation of split frequencies was 0.01 or less, and the first 25% of
samples were discarded as burnin. All other parameters were program defaults.
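The convergence criterion can be sketched in Python as follows. This is a minimal illustration, not MrBayes code: the function names and toy data are hypothetical, and the use of the sample standard deviation and a minimum split frequency of 0.1 are assumptions about how the diagnostic is computed.

```python
from statistics import stdev


def discard_burnin(samples, fraction=0.25):
    """Drop the first `fraction` of sampled trees (25% burnin in the text)."""
    return samples[int(len(samples) * fraction):]


def split_frequencies(samples):
    """Frequency of each split (a frozenset of taxa) among the sampled trees."""
    counts = {}
    for tree in samples:
        for split in tree:
            counts[split] = counts.get(split, 0) + 1
    return {split: n / len(samples) for split, n in counts.items()}


def average_sd_of_split_frequencies(runs, min_freq=0.1):
    """Average, over splits, of the between-run standard deviation of frequency."""
    freqs = [split_frequencies(discard_burnin(run)) for run in runs]
    splits = set().union(*freqs)
    sds = [
        stdev([f.get(s, 0.0) for f in freqs])
        for s in splits
        if max(f.get(s, 0.0) for f in freqs) >= min_freq
    ]
    return sum(sds) / len(sds)


# Hypothetical toy data: each sampled tree is reduced to a set of splits.
AB, CD = frozenset({"A", "B"}), frozenset({"C", "D"})
run1 = [{CD}] * 2 + [{AB}] * 6
run2 = [{CD}] * 2 + [{AB}] * 3 + [{CD}] * 3
asdsf = average_sd_of_split_frequencies([run1, run2])
```

A value at or below 0.01 would indicate that independent runs sample the same splits at nearly the same frequencies; the toy runs above disagree strongly and would not meet that threshold.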
2.8 Molecular Evolution Analyses
Molecular evolution analyses were conducted to determine patterns and test hypotheses of
nucleotide sequence variation at the Gymnotus scn4aa 3’. The outgroup species were not used for
molecular evolution analyses, so that patterns of amino acid evolution within the genus Gymnotus
could be examined in isolation from outgroup taxa.
For protein coding nucleotides, every three nucleotides are considered as one codon,
which encodes the amino acid identity at one amino acid site. There are multiple possible
nucleotide combinations in one codon that encode the same amino acid. Thus, some nucleotide
mutations within a codon would change the identity of the amino acid (non-synonymous
mutations, or dN), while other mutations would not (synonymous mutations, or dS). For
neutrally evolving amino acid sites, the ratio of non-synonymous to synonymous nucleotide
mutations (dN/dS, or ω) is expected to be 1. Amino acid sites evolving under purifying and
positive selection are expected to have ω < 1 and ω > 1, respectively. In other words, amino acid
sites evolving under purifying selection retain very few nucleotide mutations that change the
identity of the amino acid, while amino acid sites evolving under positive selection retain many
of those mutations.
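The distinction between synonymous and non-synonymous changes can be illustrated with a short Python sketch. It assumes the standard nuclear genetic code and only classifies a single codon change; the actual dN and dS values are rates estimated by codeml, which this sketch does not attempt to reproduce. The function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code; '*' marks a stop codon.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(codon_before, codon_after):
    """Label a codon change by its effect on the encoded amino acid."""
    if codon_before == codon_after:
        return "identical"
    if CODON_TABLE[codon_before] == CODON_TABLE[codon_after]:
        return "synonymous"
    return "non-synonymous"


def selection_regime(omega):
    """Interpret a dN/dS ratio as described in the text."""
    if omega < 1:
        return "purifying selection"
    if omega > 1:
        return "positive selection"
    return "neutral evolution"


leucine_to_leucine = classify_change("CTT", "CTC")      # both codons encode L
aspartate_to_glutamate = classify_change("GAT", "GAA")  # D changes to E
```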
For the molecular evolution analyses conducted, the parameters of the null models
prevent the hypotheses from being true, while those of the alternative models allow the
hypotheses to be true (Table 4). To determine whether the null hypotheses could be rejected, the
likelihoods (lnL values) of nested null and alternative models of evolution were compared using
the Likelihood Ratio Test (OpenOffice Spreadsheet version 3.3.0).
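The likelihood ratio test can be sketched in Python as follows. The sketch assumes that twice the difference in lnL is compared against a chi-square distribution whose degrees of freedom equal the difference in the number of parameters, as listed in Table 5. The function names are hypothetical, and the chi-square tail probability is computed from closed-form expressions for integer degrees of freedom rather than with a spreadsheet.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # exp(-y) * sum_{i=0}^{df/2 - 1} y^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= y / i
            total += term
        return math.exp(-y) * total
    # Odd df: start from df = 1 and add one term per extra pair of df.
    total = math.erfc(math.sqrt(y))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y + (i - 0.5) * math.log(y) - math.lgamma(i + 0.5))
    return total


def likelihood_ratio_test(lnl_alt, lnl_null, n_params_alt, n_params_null):
    """Return the test statistic, degrees of freedom, and p-value."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    df = n_params_alt - n_params_null
    return statistic, df, chi2_sf(statistic, df)


# lnL values and parameter counts for M2aII-f vs M2aII, taken from Table 5.
statistic, df, p_value = likelihood_ratio_test(-1976.626775, -1982.112158, 74, 73)
```

For this model pair, Table 5 reports a test value of 10.97 with 1 degree of freedom and a p-value of 9.3 x 10-4, so the null model is rejected at the 0.05 level.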
Maximum likelihood models of codon substitution were implemented using the codeml
program of PAML to test various hypotheses (version 4.5; Yang 2007). Given a nucleotide
alignment and phylogenetic tree, the program provides likelihoods of various models of
evolution and sites of possible positive selection. Ambiguous sites and gaps in the nucleotide
alignment were treated as the consensus identities (same nucleotide identity as in other
nucleotide sequences) and non-consensus identities (any nucleotide identity), respectively. The
phylogenetic tree for molecular evolution analyses was a strict consensus between 2 topologies:
50% majority consensus of those from parsimony analysis of the total evidence nucleotide
alignment; and 50% majority consensus of those from Bayesian analysis of the total evidence
nucleotide alignment. The phylogenetic tree and nucleotide alignment were pruned to remove
duplicate individuals of the same species whose scn4aa 3’ nucleotide sequences were identical.
The individual retained was the one with the fewest nucleotide ambiguities
at the scn4aa 3’ locus; ties were broken by the number of nucleotide ambiguities
across all loci.
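The pruning rule can be sketched in Python as follows. This is an illustration only: the function names, dictionary keys, and sequences are hypothetical, and any character other than A, C, G, T, or a gap is assumed to count as an ambiguity.

```python
UNAMBIGUOUS = set("ACGT")


def count_ambiguities(sequence):
    """Count characters that are neither an unambiguous base nor a gap."""
    return sum(
        1 for base in sequence.upper()
        if base not in UNAMBIGUOUS and base != "-"
    )


def choose_representative(individuals):
    """Pick one individual among conspecifics with matching scn4aa 3' sequences.

    The primary criterion is the number of ambiguities at scn4aa 3';
    the tie-breaker is the number of ambiguities across all loci.
    """
    return min(
        individuals,
        key=lambda ind: (
            count_ambiguities(ind["scn4aa"]),
            count_ambiguities(ind["all_loci"]),
        ),
    )


# Hypothetical individuals of one species (tissue labels are made up).
candidates = [
    {"tissue": "t1", "scn4aa": "ACGTNACG", "all_loci": "ACGTNACGAACC"},
    {"tissue": "t2", "scn4aa": "ACGTNACG", "all_loci": "ACGTNACGNNCC"},
]
kept = choose_representative(candidates)
```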
Table 4. Models of Evolution Analyzed for the Gymnotus Nav1.4a C-terminus
Various null and alternative models of evolution were used to test hypotheses of codon evolution (Yang 2007). Models are categorized
by specific hypothesis tested, and their fixed and free parameters are identified.
Hypothesis tested | Alternative model: name | Alternative model: parameters (fixed parameters are underlined) | Alternative model: # of free parameters | Null model: name | Null model: parameters (fixed parameters are underlined) | Null model: # of free parameters
Variation in ω among lineages | M0f Free ratio | ω1-x, where x = # of lineages | # of lineages minus 1 | M0 One ratio | ω | 1
Variation in ω among lineages | M2aII-f Branch-site model A | background: p1-2; ω1 < 1; ω2 ~ 1; foreground: p1-2; ω1 < 1; ω2 ~ 1, ω3 > 1 | 2 | M0 One ratio | ω | 1
Variation in ω among lineages | M0f Free ratio | ω1-x, where x = # of lineages | # of lineages minus 1 | M2aII-f Branch-site model A | background: p1-2; ω1 < 1; ω2 ~ 1; foreground: p1-2; ω1 < 1; ω2 ~ 1, ω3 > 1 | 2
Variation in ω among sites | M3 Discrete | p1-3; ω1; ω2; ω3 | 5 | M0 One ratio | ω | 1
Positive selection (ω > 1) at some sites in all lineages | M2a Positive selection | p1-3; ω1 < 1; ω2 ~ 1, ω3 > 1 | 4 | M1a Nearly neutral | p1-2; ω1 < 1; ω2 ~ 1 | 2
Positive selection (ω > 1) at some sites in all lineages | M8 Beta & ω | p1-(x+1); q1-x ≤ 1; ωx+1 > 1; x = 10 categories in a beta distribution | 4 | M7 Beta | p1-x; q1-x ≤ 1; x = 10 categories in a beta distribution | 2
Positive selection (ω > 1) at some sites in all lineages | M8 Beta & ω | p1-(x+1); q1-x ≤ 1; ωx+1 > 1; x = 10 categories in a beta distribution | 4 | M8a Beta & (ω=1) | p1-(x+1); ω1-x ≤ 1; ωx+1 ~ 1; x = 10 categories in a beta distribution | 3
Positive selection at some sites in some lineages | M2aII-f Branch-site model A | background: p1-2; ω1 < 1; ω2 ~ 1; foreground: p1-2; ω1 < 1; ω2 ~ 1, ω3 > 1 | 2 | M2aII Branch-site model A, where ω2 = 1 | background: p1-2; ω1 < 1; ω2 ~ 1; foreground: p1-2; ω1 < 1; ω2 ~ 1, ω3 ~ 1 | 2
The hypothesis “variation in ω among lineages” was tested using alternative/null model
pairs for various groups of lineages: each lineage having a different dN/dS ratio (M0f) vs all
lineages having similar dN/dS ratios (M0); one dN/dS ratio for lineages identified as having strong
positive selection (ω > 100) from the M0f analysis and another dN/dS ratio for the other lineages
(M2aII-f) vs all lineages having similar dN/dS ratios (M0); and each lineage having a different
dN/dS ratio (M0f) vs one dN/dS ratio for lineages identified as having strong positive selection
(ω > 100) from the M0f analysis and another dN/dS ratio for the other lineages (M2aII-f). The
alternative hypothesis for variation in ω among all lineages (M0f) was analyzed with three
technical replicates, due to the large number of free parameters.
The hypotheses “variation in ω among some sites” (M3 vs M0) and “positive selection (ω > 1)
among some sites in all lineages” (M2a vs M1a; M8 vs M7; and M8 vs M8a) were tested assuming
all lineages had similar dN/dS ratios.
The hypothesis “positive selection (ω > 1) at some sites in some lineages” was tested with
one dN/dS ratio for lineages that were identified as having strong positive selection (ω > 100)
from the M0f analysis (M2aII-f vs M2aII).
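The selection of foreground lineages from the free-ratio (M0f) replicates can be sketched in Python as follows. The branch labels and ω values are hypothetical, and using the median across the three technical replicates as the criterion is an assumption made for illustration.

```python
from statistics import median


def foreground_branches(omega_replicates, threshold=100.0):
    """Return branches whose median omega across replicates exceeds the threshold."""
    medians = {branch: median(values) for branch, values in omega_replicates.items()}
    return {branch: m for branch, m in medians.items() if m > threshold}


# Hypothetical omega estimates from three technical replicates of M0f.
omega_by_branch = {
    "branch_1": [999.0, 999.0, 532.1],
    "branch_2": [0.12, 0.10, 0.15],
    "branch_3": [150.0, 0.5, 0.4],
}
foreground = foreground_branches(omega_by_branch)
```

Taking the median rather than the maximum means that a branch with a single extreme estimate, such as branch_3 above, is not treated as a foreground lineage.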
Positively selected sites on the voltage-gated sodium channel protein carboxyl-terminus
(Nav1.4a C-terminus) amino acid alignment were identified from statistically significant models
of evolution that resulted in at least 1 site class having ω > 1. Posterior probabilities of the
positively selected sites were calculated using both naïve empirical Bayes (NEB) and Bayes
empirical Bayes (BEB) approaches. The positively selected sites and posterior probabilities were
identified relative to other Nav1.4a C-terminus amino acid alignments for comparison. These other
amino acid alignments were translated from nucleotides that had been aligned in the same way as
for the Gymnotus and outgroup dataset.
Chapter 3 Results
3.1 Differences Between DNA and cDNA Sequences for the scn4aa 3’
The primers amplified the portion of the voltage-gated sodium channel gene scn4aa that encodes
the protein’s carboxyl-terminus (scn4aa 3’), and there was no evidence of
introns. Scn4aa DNA/cDNA sequence pairs were compared for both Electrophorus electricus
(DNA sequence obtained for this study from tissue #s 2026 and 2619; cDNA sequence from
GenBank Accession # M22252) and Gymnotus tigre (DNA and cDNA sequences obtained for
this study from tissue #s 7090 and 7090_804pe3_1, respectively). There were no alignment gaps
for either DNA/cDNA sequence pair. All nucleotides were identical for each DNA/cDNA pair,
with the exception of a few ambiguous base calls from the experimental process.
3.2 Nucleotide Sequence Data
Nucleotide sequences were obtained from 59 Gymnotus individuals: 45 of which represent 19
recognized species, and 14 of which represent up to 9 undescribed species. Sequences were also
obtained from 9 outgroup individuals, which represent 6 species from other gymnotiform
families (Electrophoridae, Hypopomidae, and Sternopygidae). Table 3 identifies the specimens
used for analysis by their scientific names, tissue sample numbers, museum catalogue numbers,
and collection localities.
A total of 272 nucleotide sequences were obtained for phylogenetic analyses (excluding
cDNA from tissue # 7090_804pe3_1 and Accession # M22252). For each of cytochrome b
(cytb), 16S ribosome (16S), and recombination activating gene 2 (rag2), 23 sequences were
collected for this study, and 44 were obtained from GenBank. For the portion of the voltage-gated
sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus (scn4aa 3’), all
68 sequences were collected for this study.
The total evidence nucleotide alignment consisted of 3739 nucleotide positions, 1258 of
which were parsimony informative, and another 173 were variable but parsimony uninformative.
The alignment consisted of nucleotide positions from the following loci: 1139 from cytb; 555
from 16S; 1250 from rag2; and 795 from scn4aa 3’. Nucleotides from the housekeeping
mitochondrial loci (cytb + 16S) included 752 variable positions, of which 686 were parsimony
informative. Nucleotides from rag2 included 332 variable positions, of which 263 were
parsimony informative. Nucleotides from scn4aa 3’ included 347 variable positions, of which
309 were parsimony informative.
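The per-locus counts above can be cross-checked against the reported totals with a few lines of Python. All numbers are taken from this paragraph; only the variable names are introduced here.

```python
# Alignment length contributed by each locus.
positions = {"cytb": 1139, "16S": 555, "rag2": 1250, "scn4aa 3'": 795}

# Variable and parsimony informative positions per partition.
variable = {"cytb + 16S": 752, "rag2": 332, "scn4aa 3'": 347}
informative = {"cytb + 16S": 686, "rag2": 263, "scn4aa 3'": 309}

total_positions = sum(positions.values())
total_informative = sum(informative.values())
total_uninformative = sum(variable.values()) - total_informative

# Share of variable positions that are parsimony informative, per partition.
informative_share = {
    name: informative[name] / variable[name] for name in variable
}
```

The three totals correspond to the 3739 positions, 1258 parsimony informative positions, and 173 variable but uninformative positions reported for the total evidence alignment.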
Among the nucleotide sequences obtained, only 4.90% of nucleotides were ambiguous
(proportion of ambiguous sites among nucleotides: 1506/76313 cytb nucleotides; 1513/36515
16S nucleotides; 7694/83750 rag2 nucleotides; and 1615/54855 scn4aa 3’ nucleotides). The
ambiguous sites have chromatograms that do not clearly show a single nucleotide identity.
Although it is possible some are polymorphic sites, it was assumed that they were due to
experimental error for the purposes of phylogenetic analyses.
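The overall percentage of ambiguous nucleotides follows from pooling the per-locus counts, as in this short Python check. The counts are those given in the paragraph above; the variable names are introduced for illustration.

```python
# (ambiguous nucleotides, total nucleotides) for each locus.
ambiguity_counts = {
    "cytb": (1506, 76313),
    "16S": (1513, 36515),
    "rag2": (7694, 83750),
    "scn4aa 3'": (1615, 54855),
}

total_ambiguous = sum(amb for amb, _ in ambiguity_counts.values())
total_nucleotides = sum(total for _, total in ambiguity_counts.values())
percent_ambiguous = 100.0 * total_ambiguous / total_nucleotides

# Per-locus percentages, useful for seeing which locus contributes most.
percent_by_locus = {
    locus: 100.0 * amb / total
    for locus, (amb, total) in ambiguity_counts.items()
}
```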
This dataset represents the most complete sampling of Gymnotus nucleotide sequence
data to date. Compared to the most recent molecular phylogenetic reconstruction of Gymnotus (Lovejoy
et al. 2010), this dataset includes 10 additional Gymnotus species (2 additional recognized
species, and up to 8 additional undescribed species) as well as an additional locus.
3.3 Phylogenetic Reconstruction
Molecular phylogenetic analyses were conducted using nucleotide alignments of various loci
(cytb, 16S, rag2, and scn4aa 3’) and the total evidence alignment, using both maximum
parsimony (MP) and Bayesian inference (BI) algorithms. The 50% majority-rule consensus
topologies are shown in Figures 6-8. The strict consensus topology for Gymnotus from the total
evidence nucleotide alignments using MP and BI algorithms is shown in Figure 9.
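A strict consensus retains only the clades found in every input topology. The Python sketch below illustrates this with trees represented as sets of clades; the representation, function name, and single-letter taxon labels are hypothetical and are not tied to any particular phylogenetics package.

```python
def strict_consensus(*trees):
    """Return the clades present in every input tree.

    Each tree is given as a collection of clades, and each clade is a
    frozenset of taxon names.
    """
    return set.intersection(*(set(tree) for tree in trees))


# Hypothetical majority-rule topologies for five taxa, A to E.
tree_mp = {frozenset("AB"), frozenset("ABC"), frozenset("DE")}
tree_bi = {frozenset("AB"), frozenset("DE"), frozenset("CDE")}
consensus = strict_consensus(tree_mp, tree_bi)
```

Clades on which the two topologies disagree, here the placement of taxon C, collapse into a polytomy in the consensus.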
The MP consensus topologies were produced from the most parsimonious trees based on
analyses of various loci: housekeeping mitochondrial (3148 trees), rag2 (3931 trees), scn4aa 3’
(1622 trees); and the total evidence nucleotide alignment (742 trees). The BI consensus
topologies all resulted from analyses where the standard deviation of split frequencies was ≤
0.01. The potential scale reduction factors (psrf) of the topologies from various loci were within
0.05 of the convergence diagnostic value of 1.00, and the burnin cutoff was placed after the
log probability plateaued. Although the psrf for the total evidence nucleotide alignment was
3.586, the burnin cutoff was likewise placed after the log probability plateaued.
The genus Gymnotus was resolved as a monophyletic group based on phylogenetic
reconstruction of each locus using an MP algorithm (Figure 6), the scn4aa 3’ locus using a BI
algorithm (Figure 7), and the total evidence nucleotide alignment using either algorithm (Figure
8).
The outgroup consisted of gymnotiform species belonging to families outside of the
family Gymnotidae (Electrophoridae, Hypopomidae, and Sternopygidae). The closest outgroup
family to Gymnotus was identified as Electrophoridae (E. electricus) based on the housekeeping
mitochondrial and total evidence nucleotide alignments using an MP algorithm (Figures 6 and 8).
However, the closest outgroup was identified as Sternopygidae (Sternopygus astrabes and
Sternopygus macrurus) based on the scn4aa 3’ and total evidence alignments using a BI
algorithm (Figures 7 and 8). Sternopygidae was also identified as the closest outgroup based on
the rag2 and scn4aa 3’ loci using an MP algorithm, although either the bootstrap values were lower
or there was no corresponding node in the bootstrap phylogeny (Figure 6).
Three major monophyletic clades were consistently resolved within Gymnotus (Figures
6-8; clade names as per Lovejoy et al. 2010): Gymnotus carapo group; G2 group; and G1 group.
The G1 group was identified as the sister clade of a group composed of the other two major
clades based on the housekeeping mitochondrial and rag2 loci using an MP algorithm (Figure 6),
as well as from the total evidence nucleotide alignment using either algorithm (Figure 8).
Although the G2 group was identified as the sister clade of the other two major clades based on
the scn4aa 3’ alignment using either algorithm, it was not well supported.
Within the G. carapo group, there were five lineages for which phylogenetic topology
varied among all three loci, whether an MP or BI algorithm was used for analysis (Figures 6 and
7): Gymnotus n. sp. (tissue # 2956); G. carapo (tissue #s 2040 and 2041); Gymnotus omarorum
(tissue #s 7092 and 7093); Gymnotus obscurus (tissue #s 2017 and 2018); and the Gymnotus
pantanal and Gymnotus sp. xingu lineage (tissue #s 7076 and 7305, respectively).
[Figure 6 panels, each based on the nucleotide alignment from: cytochrome b & 16S ribosome; recombination activating gene 2; the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus]
Figure 6. Molecular Phylogeny for Gymnotus Based on Various Alignments Using Maximum Parsimony
Phylogenetic reconstruction was conducted based on the nucleotide alignments of various loci using maximum parsimony. The 50%
majority-rule consensus topologies are shown. Numbers above the branches indicate bootstrap values. The clades are coloured as
follows: G. carapo group (green); G2 group (light blue); and G1 group (dark blue).
[Figure 7 panels, each based on the nucleotide alignment from: cytochrome b & 16S ribosome; recombination activating gene 2; the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus]
Figure 7. Molecular Phylogeny for Gymnotus Based on Various Alignments Using Bayesian Inference
Phylogenetic reconstruction was conducted based on the nucleotide alignments of various loci using Bayesian inference. The 50%
majority-rule consensus topologies are shown. Numbers above the branches indicate posterior probabilities. The clades are coloured as
follows: G. carapo group (green); G2 group (light blue); and G1 group (dark blue).
[Figure 8 panels: based on maximum parsimony; based on Bayesian inference]
Figure 8. Molecular Phylogeny for Gymnotus Based on the Total Evidence Alignment
Phylogenetic reconstruction was conducted using the total evidence nucleotide alignment from
Gymnotus, consisting of nucleotide sequences from cytochrome b, 16S ribosome, recombination
activating gene 2, and the portion of the voltage-gated sodium channel gene scn4aa that encodes
the protein’s carboxyl-terminus. The 50% majority-rule consensus topologies are shown.
Numbers above the branches indicate bootstrap values and posterior probabilities, respectively. The clades are
coloured as follows: G. carapo group (green); G2 group (light blue); and G1 group (dark blue).
There were three Gymnotidae lineages in addition to the 3 major monophyletic clades:
the Gymnotus cylindricus lineage, the Gymnotus tigre lineage, and the Gymnotus pantherinus
lineage. Gymnotus cylindricus was identified as the sister lineage to the G. carapo group based
on the scn4aa 3’ alignment using an MP algorithm, the housekeeping mitochondrial and rag2
nucleotide alignments using a BI algorithm, and the total evidence nucleotide alignment using
either algorithm (Figures 6-8). G. tigre was identified as basal to the G. carapo group + G.
cylindricus lineage based on those same nucleotide alignment + algorithm combinations, as well
as the scn4aa 3’ alignment using a BI algorithm (Figure 7). G. pantherinus was identified as
basal to the G2 clade based on the housekeeping mitochondrial alignment using an MP
algorithm, and the scn4aa 3’ and total evidence alignments using either algorithm.
The topology based on the total evidence nucleotide alignment using a BI algorithm
seemed to be slightly better resolved than that using an MP algorithm (Figure 8). Some G. carapo
lineages (tissue #s 2040 and 2041) were topologically variable within the G. carapo group, and
better supported using a BI algorithm. One Gymnotus varzea lineage (tissue # 2014) was
identified as being derived from the other (tissue # 2015) based on the total evidence nucleotide
alignment using a BI algorithm and not any of the other phylogenetic analyses, but this
relationship was not well supported.
3.4 Patterns of Gymnotus scn4aa C-terminus Nucleotide Sequence Variation
There were 43 variable sites on the Gymnotus Nav1.4a C-terminus amino acid alignment.
Various hypotheses for patterns of nucleotide sequence variation were tested on the scn4aa 3’
nucleotide alignment using codon-based analyses (Table 4).
Variation in the ratio of non-synonymous to synonymous substitutions (dN/dS = ω)
among lineages was supported for certain lineages (Table 5). Estimating a separate ω for each
Gymnotus lineage (M0f) was not a significantly better fit for the data than the null model of one
ω for all lineages (M0). However, the separate estimations of ω for each Gymnotus lineage (M0f)
seemed to consistently identify seven lineages as having very high ω values (ω > 100; three
technical replicates). The seven lineages and their median ω values are identified on Figure 9.
The seven lineages were confirmed as positively selected, since estimating ω values for those
seven lineages separately from the other lineages (M2aII-f) was a significantly better fit for the
data than the null model (same ω for all lineages, M0). In addition, estimating a separate ω for
each Gymnotus lineage (M0f) was not a significantly better fit for the data than the null model
(an ω for the seven lineages with very different ω values and another ω for the rest of the
lineages, M2aII-f).
Variation in ω among amino acid sites was supported (Table 5). Estimating more than
one ω for all amino acid sites (M3) was a significantly better fit for the data than the null model
(one ω for all amino acid sites, M0). The ω values were estimated for three site classes using the
M3 model: 0.00 (70.9% of sites), 0.90 (0.001% of sites), and 0.93 (29.1% of sites).
Positive selection (ω > 1) at some amino acid sites across all lineages was not supported
(Table 5). None of the three alternative models of positive selection was a significantly better fit
for the data than its null model: M2a vs M1a; M8 vs M7; and M8 vs M8a. In addition, none
of the estimated ω values were > 1. The ω values were estimated for three site classes using the
M2a model: 0.00 (71.7% of sites), and 1.0 (28.3% of sites) for the other two classes. The ω
values were estimated for 11 site classes using the M8 model: 0.00 (8.89% of sites) for the first
seven classes, 0.00016 (8.89% of sites), 0.84 (8.89% of sites), and 1.0 (19.99% of sites) for the
last two classes.
Positive selection (ω > 1) at some amino acid sites in the seven lineages with
very different ω values was supported (Table 6). Estimating ω values for those seven lineages
separately from the other lineages (M2aII-f) was a significantly better fit for the data than the null
model (limiting some of the ω values to 1, M2aII). The ω values of the seven lineages were
estimated for two site classes using the M2aII-f model (Table 6): 999 for both site classes (14.2%
of sites for both of the site classes combined). The ω values for the other site classes were fixed:
0.000 (65.0% of sites), and 1.00 (20.7% of sites).
Table 5. Results of PAML Analyses of Gymnotus Nav1.4a C-terminus Codon Evolution
Hypotheses regarding patterns of codon evolution were tested using models of molecular evolution implemented in the codeml
program of PAML version 4.5 (Yang 2007). See Table 4 for a summary of hypotheses tested. To determine whether the alternative
models of evolution were significantly better at describing the data than the null models (p-value < 0.05), the likelihood values were
compared using the likelihood ratio test.
Hypothesis tested | Alternative model: name | Alternative model: lnL (L = likelihood value) | Alternative model: # of parameters | Null model: name | Null model: lnL (L = likelihood value) | Null model: # of parameters | Likelihood ratio test value | Degrees of freedom | p-value
Variation in ω among lineages | M0f¹ Free ratio | -1975.238063 | 139 | M0 One ratio | -2000.427564 | 71 | 50.37900200 | 68 | 0.95
Variation in ω among lineages | M2aII-f Branch-site model A | -1976.626775 | 74 | M0 One ratio | -2000.427564 | 71 | 47.60157800 | 3 | 2.6 x 10^-10
Variation in ω among lineages | M0f¹ Free ratio | -1975.238063 | 139 | M2aII-f Branch-site model A | -1976.626775 | 74 | 2.777424000 | 65 | 1.0
Variation in ω among all sites | M3 Discrete | -1984.459630 | 75 | M0 One ratio | -2000.427564 | 71 | 31.93586800 | 4 | 2.0 x 10^-6
Positive selection (ω > 1) at some sites in all lineages | M2a Positive selection | -1984.50491 | 74 | M1a Nearly neutral | -1984.50491 | 72 | 0.0000000000 | 2 | 1.0
Positive selection (ω > 1) at some sites in all lineages | M8 Beta & ω | -1984.472282 | 74 | M7 Beta | -1984.488361 | 72 | 0.0321580000 | 2 | 0.98
Positive selection (ω > 1) at some sites in all lineages | M8 Beta & ω | -1984.472282 | 74 | M8a Beta & (ω=1) | -1984.504914 | 73 | 0.0652640000 | 1 | 0.80
Positive selection (ω > 1) at some sites in some lineages | M2aII-f Branch-site model A | -1976.626775 | 74 | M2aII Branch-site model A, where ω2 = 1 | -1982.112158 | 73 | 10.97076600 | 1 | 9.3 x 10^-4
¹ Analysis results from the first technical replicate are listed.
Figure 9. Molecular Phylogeny for Gymnotus and Positively Selected Lineages
A strict consensus topology was determined from the 50% majority-rule consensus topologies
from maximum parsimony and Bayesian inference based reconstruction of the total evidence
nucleotide alignment. The median ratio of non-synonymous to synonymous substitutions (dN/dS =
ω) from 3 technical replicates of the alternative hypothesis for variation in ω among all lineages
(M0f) is shown above each branch. Estimates of ω were not obtained for the 2 most basal
branches, because the analytical methods require a basal polytomy for the phylogenetic topology.
Branches with ω > 100 are coloured grey. The clades are coloured as follows: G. carapo group
(green); G2 group (light blue); and G1 group (dark blue).
Table 6. Nav1.4a C-terminus ω ratios for Gymnotus from the branch-site model A
The alternative model of codon evolution branch-site model A (M2aII-f) was tested using the codeml program of PAML version 4.5
(Yang 2007) using seven defined foreground lineages. See Table 5 for comparative results with the null model of evolution,
branch-site model A where ω2 = 1 (M2aII). Ratios of non-synonymous to synonymous substitutions (dN/dS = ω) are listed for each site class, with
those from site classes of fixed ω values highlighted in grey.
Site class                         0        1        2a       2b
% of sites                         65.0     20.7     10.8     3.4
ω values for the 7 lineages        0.000    1.00     999      999
ω values for the other lineages    0.000    1.00     0.000    1.00
3.5 Positively Selected Sites on the Gymnotus Nav1.4a C-terminus Amino Acid Alignment
Sites of possible positive selection in the seven lineages with very high ω values were identified
on the voltage-gated sodium channel protein Nav1.4a carboxyl-terminus (Nav1.4a C-terminus)
amino acid alignment using the M2aII-f model of evolution (Table 7). Posterior probabilities were
calculated using both naïve empirical Bayes (NEB) and Bayes empirical Bayes (BEB)
approaches. The NEB implementation for M2aII-f resulted in eight sites identified as positively
selected, with posterior probabilities ≥ 95%. The BEB implementation resulted in all sites being
identified as positively selected, with posterior probabilities ≥ 79.0%. This included sites with no
amino acid variation. Eight sites had posterior probabilities ≥ 95%, and they were at the same
locations as those identified using NEB. Amino acid identities at the positively selected sites
vary among the seven lineages and other Gymnotus species sampled (Table 8).
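The NEB calculation can be illustrated with Bayes' rule: the posterior probability that a site belongs to a given site class is the class proportion (the prior, taken here from Table 6) multiplied by the likelihood of that site's data under the class, normalized over all classes. BEB differs in that it also accounts for uncertainty in the estimated proportions and ω values instead of fixing them at their maximum likelihood estimates. The sketch below is only a minimal illustration: neb_posterior and prob_positive are hypothetical helpers (not part of PAML), and the per-class site likelihoods are made-up numbers, not values from this analysis.

```python
# Site-class proportions from Table 6, expressed as fractions.
PRIOR = {"0": 0.650, "1": 0.207, "2a": 0.108, "2b": 0.034}

def neb_posterior(site_likelihoods, prior=PRIOR):
    """Posterior probability of each site class for a single codon site."""
    joint = {k: prior[k] * site_likelihoods[k] for k in prior}
    total = sum(joint.values())
    return {k: v / total for k, v in joint.items()}

def prob_positive(posterior):
    """Classes 2a and 2b are the classes with omega > 1 on the foreground lineages."""
    return posterior["2a"] + posterior["2b"]

# Hypothetical likelihoods of one site's data under each class (illustration only).
example = {"0": 1e-9, "1": 2e-8, "2a": 6e-6, "2b": 6e-6}
post = neb_posterior(example)
print(f"P(site is in class 2a or 2b) = {prob_positive(post):.3f}")
print("flagged at the 95% threshold:", prob_positive(post) >= 0.95)
```

A site whose data are equally likely under every class simply returns the prior, so its probability of belonging to class 2a or 2b stays near the 14.2% of sites assigned to those classes in Table 6.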
Locations of the positively selected sites were identified relative to amino acid sequences
from the Nav1.4a C-terminus of other Gymnotus and gymnotiform fishes, homologs from an
ostariophysan model species for which the genome has been sequenced (Nav1.4a and Nav1.4b of
Danio rerio), and homologs for which there is more research on protein function (Nav1.4,
Nav1.5, and Nav1.2 of Homo sapiens).
Table 7. Amino Acid Alignment for the Nav1.4a C-terminus Showing Positively Selected Sites Relative to Motifs of Functional
Significance
The voltage-gated sodium channel carboxyl-terminus (Nav1.4a C-terminus) amino acid sites evolving under positive selection in the 7
Gymnotus lineages with positively selected Nav1.4a C-terminus sequences are identified relative to motifs of functional significance.
The Danio rerio Nav1.4a C-terminus amino acid sequence was used as the reference sequence during the alignment process. The
Nav1.4a consensus sequence from other gymnotiforms and other paralogs of the Nav from commonly used model species are included
here for comparison. The significance of motifs highlighted in grey or identified by * corresponds to the legend on the left of that row.
The PY motif is identified on the Homo sapiens and rat Nav1.2 and Nav1.5 sequences by a green background (Cormier et al. 2002;
Rougier et al. 2005). Phosphorylation sites on the Electrophorus electricus Nav1.4a are identified by a red background (Emerick et al.
1993). Phosphorylation sites on the rat Nav1.2 are identified by a magenta background (Berendt et al. 2010). Models of evolution used
to determine sites evolving under positive selection were implemented in the codeml program of PAML version 4.5 (Yang 2007).
Positively selected sites were calculated using naïve empirical Bayes (NEB) and Bayes empirical Bayes (BEB) approaches, and
identified on the Nav1.4a C-terminus amino acid alignment by “+”. Those sites are coloured by posterior probabilities as follows: 100
% (dark blue); > 99 % (blue); and > 95 % (light blue).
Site positions of the C-terminus amino acids: 1-93
a     Helices I-IV of the EF-hand, and helix V: I II III IV
b, c  Ca2+ binding (b) & interaction sites with CaM (*) (c)
d     IQ & its interaction sites with the EF-hand (*): * *** ** * * * * * * ** **
H. sapiens Nav1.2 (6)        ...SV.T...AE..S....E..Y.V.....P......EFAK.S..A...DP..L.....KVQ..A..L.M.S..R..CL...F.F.KR...ES
H. sapiens Nav1.5 (5)        ...SV.T...TE..S.......Y.I.....PE.....EYSV.S..A...S.........QIS..N..L.M.S..R..CM...F.F.KR...ES
H. sapiens Nav1.4 (4)        ....V.T...SE..G....E..Y.......P......A.SR.S....T...........KI...TL.L.M.P.....CL...F.L.K.....S
D. rerio Nav1.4b (3)         ....V.T...S........E..Y.......PT.S.....NR.SE.C.T.KD....P...T....T....M.T.....CL.L...L.G....GS
D. rerio Nav1.4a (2)         ENFNNAQEESGDPLCEDDFDMFDETWEKFDVDATQFIDYDRLFDFVDALQEPLRIAKPNRLKLISMDIPIVNGDKIHSQDILLAVTREVLGDT
gymnotiforms Nav1.4a (1)     ....L..............L....................D...............KP.....AKTNLSVSAE....CL.L..G..Q......
                             GN H SEA M I DRY FEG FYLE DRVPR IES L MHVPN HQ VN MY TE PFV V I K
                             SV Y T L N H IHS Y K S N M Q R I SY Q
                             S Q L LNT V M
                             V Q Q T
Gymnotus Nav1.4a             ...GV.....S........C......L.....G...L..NQV....AALE..M..PKPN.HR.AKMDLNV.M....PYL......TQ......
                             D L Q N S CM K V S I I
                             V T
Model of evolution M2aII-f; approach for calculating P and sites on the Nav1.4a C-terminus:
NEB    + + + +    (sites 20, 55, 69, and 85; see Table 8)
BEB    + + + +    (sites 20, 55, 69, and 85; see Table 8)
Site positions of the C-terminus amino acids: 94-186
a     Helices I-IV of the EF-hand, and helix V: V
b, c  Ca2+ binding (b) & interaction sites with CaM (*) (c): ********* **** * * *
d     IQ & its interaction sites with the EF-hand (*): IQ-motif
H. sapiens Nav1.2 (6)        G....LRIQM.ER.MAS..SKVSY..IT...K..Q..VS.III..A..RY..KQKVKKVSSIYKKDKGKECD-QGT-.IK.DTLID.L.EN-S
H. sapiens Nav1.5 (5)        G....L.IQM.E..MAA..SKISY..IT...K..H..VS.MVI..AF.R.....SLKH.S.LFRQQAGSGL-SEEDA..R.....YV.SENFS
H. sapiens Nav1.4 (4)        G....L.QTM.E..MAA..SKVSY..IT...K..H..VC.IKI..A..R...Q.SMKQ.SYMYRHSHDGS---GDDA..K...L.NT.SKM.G
D. rerio Nav1.4b (3)         DQ..G..ATM.E..MAN..SK.SY..ITS..K..Q..V..STI..A..S.I...CVKQ.SYMYRD.TGSK-KPTG.A..KV.M..EN.RS..G
D. rerio Nav1.4a (2)         IEMDAMKESIEAKFIMNNPTSASFEPIITTLRRKEEERAAIAVQRIYRRHLLKRAIRYACFMRQSKRKVRNPNDNEPPETEGLIARKMNTLYG
Gymnotiformes Nav1.4a (1)    ...A...Q..Q.....D..IFE...............H...II.KA..Q......L...A..H...-.KHE.-...A..-.....H.......
                             A E GL VM IKKLHSNHLF VV M WR AK SKM MF Y FM VVHH SLLQCC-Q-RNM-D-DDIADDDS VEQ SA FR
                             P S R N VLLSSTRITT DN LV R Q V SRIE E QRGHEGGMLPE T I S
                             T T S S QTT SLV LQ M S TM D VSKKNTVVVSG V
                             V V T P VW V V R G TTQQ K
                             V Y R V M
                             S N
                             Q
Gymnotus Nav1.4a             ...E...K......LLD..GP.FC..V..........A..KVI..A...Y.....MEH.S.LSR..D.KL-EEQDDAVLE.....Q..SVLYD
                             T R KT ST Q T V Q VQ L ER MEMQ M K V G
                             V QT V S
Model of evolution M2aII-f; approach for calculating P and sites on the Nav1.4a C-terminus:
NEB    + + + +    (sites 94, 113, 134, and 154; see Table 8)
BEB    + + + +    (sites 94, 113, 134, and 154; see Table 8)
Site positions of the C-terminus amino acids: 187-271
a     Helices I-IV of the EF-hand, and helix V
b, c  Ca2+ binding (b) & interaction sites with CaM (*) (c)
d     IQ & its interaction sites with the EF-hand (*)
H. sapiens Nav1.2 (6)        -T..KT-DMTP------STTS..SYDS..K--PEK---EKFEKDKSEKEDKGKDI---------....----------.-R..KK
H. sapiens Nav1.5 (5)        RPLGPPSSSS-----ISST.F..SYDS..R--.--T-SD-NL-..-Q-.R-G---SDY.HSEDLADFPPSPDRD----.-R....
H. sapiens Nav1.4 (4)        HENGNSSSPSP.EKGEAGDAG.TMGLMPIS--P.D-TAW.PAPPPGQT.RPGVKES--------....---------L.V-----
D. rerio Nav1.4b (3)         DQAVED-DHPVG----CSF..HG.TQFGAKRPPVKVQSDVVLHSA--.F-PVP.SST-A.--D-....--.------L.-R....
D. rerio Nav1.4a (2)         SNPELAMALELETRPMRPNSQPPKPSQVTQTRASVTFPRPQGQ--LIPVELTSEVILRSAPTTH----SFNSSENATT-IKESIV
Gymnotiformes Nav1.4a (1)    ...........QA...LA..RM.-DFK..----A..D..---..I.....D.N....H.....I....--...----...R....
                             FGS PS PMDE IEALPDHP TS SIRE-SNQIPESHS-TLA-PVP I V AV TNEIRLHS H---FSEAIVD TI
                             GK T TQ G PKSTVTK LSIP L Y Q PD S Q K I V MV QNCFH GEL V V
                             M Q Q V V TGTQ VGM
                             N V S R Y
                             Y
Gymnotus Nav1.4a             I.A........QAK.ILAQTRMPS-LK-----.P..Y------PN...I.V.N....H...MVR....-Q...FSRAL.VR....
                             P R M R V G S K
                             T
Model of evolution M2aII-f; approach for calculating P and sites on the Nav1.4a C-terminus:
NEB    (no sites in this range)
BEB    (no sites in this range)
a, d From Chagot et al. 2009.
b From Miloushev et al. 2009.
c From Chagot and Chazin 2011.
1 Except Gymnotidae and Apteronotidae sequences.
2 From Accession # DQ149506.
3 From Accession # DQ149505.
4 From Accession # BC172375.
5 From Chagot et al. 2009; and Accession # BC172375.
6 From Miloushev et al. 2009; and Accession # NG_008143.
Table 8. Amino Acid Identities of Positively Selected Sites on the Nav1.4a C-terminus for
Various Gymnotus Species
Models of evolution testing for positive selection were implemented in the codeml program of
PAML version 4.5 (Yang 2007). Amino acid identities are identified for Gymnotus species based
on the translated nucleotide alignment of Gymnotus Nav1.4a carboxyl-terminus. Properties of
amino acids were determined from the CRC Handbook of Chemistry and Physics (91st Edition)
and Kyte and Doolittle (1982). The larger the hydropathy number, the more hydrophobic the
amino acid is.
Amino acid site # of the Nav1.4a C-terminus (see Table 7) | Amino acid identity | Properties that differ among various amino acid identities at the same site | Gymnotus species/lineages with the specified amino acid identity (tissue #s are identified in brackets if applicable)
20 | Leucine (L) | Non-polar; bigger | (consensus identity among Gymnotus)
20 | Cysteine (C) | Polar; smaller | All members of a (positively selected) lineage, consisting of: G. cataniapo, G. n. sp. FRITZI, G. aff. anguillaris, and G. pantherinus.
55 | Isoleucine (I) | Hydropathy 4.5 | (consensus identity among Gymnotus)
55 | Methionine (M) | Hydropathy 1.9 | All members of a (positively selected) lineage, consisting of: G. carapo (2004, 2006, and 2007), G. ucamara, and G. arapaima.
69 | Threonine (T) | Hydropathy -0.7; smaller | (consensus identity among Gymnotus)
69 | Asparagine (N) | Hydropathy -3.5; bigger | All members of a (positively selected) lineage, consisting of: G. xingu, and G. pantanal.
69 | Serine (S) | Hydropathy -0.8; bigger | All members of the lineage, consisting of: G. chaviro, and G. varzea.
85 | Valine (V) | Smaller | (consensus identity among Gymnotus)
85 | Isoleucine (I) | Bigger | All members of a (positively selected) lineage, consisting of: G. xingu, and G. pantanal.
94 | Isoleucine (I) | Neutral | (consensus identity among Gymnotus)
94 | Threonine (T) | Polar | The (positively selected) lineage: G. curupira.
113 | Glycine (G) | Smaller | (consensus identity among Gymnotus)
113 | Serine (S) | Bigger; in Electrophorus electricus, this site has been determined to be a serine phosphorylation site (Emerick et al. 1993) | All members of a (positively selected) lineage, consisting of: G. cataniapo, G. n. sp. FRITZI, G. aff. anguillaris, and G. pantherinus.
134 | Lysine (K) | Basic; bigger | (consensus identity among Gymnotus)
134 | Glutamine (Q) | Polar; smaller | All members of a (positively selected) lineage, consisting of: G. coropinae (2025, 2036, and 2037).
154 | Phenylalanine (F) | Hydropathy 2.8; bigger | (consensus identity among Gymnotus)
154 | Valine (V) | Hydropathy 4.2; smaller | The (positively selected) lineage, consisting of: G. curupira. All members of a (positively selected) lineage, consisting of: G. xingu, and G. pantanal. All other members of the same monophyletic lineage, including: G. cf. tigre, G. obscurus, G. chaviro, and G. varzea.
154 | Leucine (L) | Hydropathy 3.8; smaller | The (positively selected) lineage: G. coropinae 2025. The lineage: G. jonasi.
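For the three sites where Table 8 reports hydropathy, the size and direction of the change from the consensus residue to each variant can be made explicit. The sketch below is a small illustration that uses only the Kyte and Doolittle (1982) values listed in the table. The helper name hydropathy_shift is hypothetical, and a negative value means that the variant is less hydrophobic than the consensus residue.

```python
# Kyte and Doolittle (1982) hydropathy values as listed in Table 8.
HYDROPATHY = {
    "I": 4.5, "M": 1.9, "T": -0.7, "N": -3.5,
    "S": -0.8, "F": 2.8, "V": 4.2, "L": 3.8,
}

# (site, consensus residue, variant residue) for the sites with hydropathy entries.
SUBSTITUTIONS = [
    (55, "I", "M"),
    (69, "T", "N"),
    (69, "T", "S"),
    (154, "F", "V"),
    (154, "F", "L"),
]

def hydropathy_shift(consensus, variant):
    """Hydropathy of the variant minus that of the consensus, to one decimal place."""
    return round(HYDROPATHY[variant] - HYDROPATHY[consensus], 1)

for site, consensus, variant in SUBSTITUTIONS:
    shift = hydropathy_shift(consensus, variant)
    print(f"site {site}: {consensus} -> {variant}, hydropathy change = {shift:+.1f}")
```

On these values, the substitutions at sites 55 and 69 lower hydropathy, while both variants at site 154 raise it.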
Chapter 4 Discussion
4.1 Evolutionary Relationships Among Gymnotus
There is currently a comprehensive phylogeny of genus Gymnotus based on morphology (Albert
et al. 2004), as well as one based on both morphology and nucleotide sequences (Lovejoy et al.
2010). Some of the proposed phylogenetic relationships from morphological and nucleotide data
are consistent with each other, while some are unclear (Figure 3). This project used additional
taxa and nucleotide sequences to provide further evidence towards clarifying phylogenetic
relationships among Gymnotus.
The genus Gymnotus and the Gymnotus carapo group were both well supported as
monophyletic, consistent with both existing phylogenies (Figures 3, 6-8). Within the G. carapo
group, the G. carapo complex (Albert et al. 2004; a subset of the G. carapo group that includes G.
carapo, Gymnotus arapaima, and Gymnotus choco) was resolved as monophyletic, consistent
with both existing phylogenies. However, it was weakly supported unless Gymnotus mamiraua
and some of the new taxa were included. The most basal G. carapo variant was the same as that
identified in the existing nucleotide-based phylogeny (Lovejoy et al. 2010), and its position was
well supported. Within the G. carapo complex + G. mamiraua clade, the positions of the new taxa
Gymnotus omarorum and Gymnotus n. sp. were not well resolved. Within the G. carapo group, the
positions of Gymnotus obscurus and some of the other new taxa (Gymnotus pantanal and
Gymnotus sp. xingu) were not well resolved either.
The G1 and G2 groups were both well supported as monophyletic, but not as a single
monophyletic clade that includes Gymnotus pantherinus. This is consistent with the existing
nucleotide-based phylogeny (Lovejoy et al. 2010). As expected, the G1 group was resolved and
well supported as the most basal Gymnotus clade, when reconstructed with the same
housekeeping mitochondrial and nuclear loci as the existing nucleotide phylogeny. The G.
pantherinus taxon was well supported as basal to the G2 group, which had not been clear from
existing phylogenies.
The Gymnotus cylindricus taxon was well supported as the sister to the G. carapo group,
which confirms a suggestion from Lovejoy et al. (2010). The G. tigre taxon was well supported as
basal to the G. carapo + G. cylindricus clade. The position of Gymnotus tigre may seem
inconsistent with the existing nucleotide-based phylogeny (Lovejoy et al. 2010). However, this
difference reflects specimen re-identification. The G. tigre specimens used for this project
were adult fish, whose morphological features are more easily identified (James Albert and
Nathan Lovejoy, personal communication). The position of the juvenile Gymnotus cf. tigre
specimens remained consistent with the existing nucleotide-based phylogeny, and these specimens
likely represent a species other than G. tigre (Lovejoy et al. 2010).
4.2 Utility of the scn4aa 3’ for Phylogenetic Reconstruction
The scn4aa 3’ locus, the portion of the voltage-gated sodium channel gene scn4aa that encodes the
protein’s carboxyl terminus, was one of several loci used to reconstruct the Gymnotus phylogeny.
This locus is approximately 800 nucleotides long (Noda et al. 1984), and nucleotide sequences
were obtained from 28 Gymnotus species. Analyses of these sequences showed that the scn4aa 3’
locus contributes towards a meaningful and accurate phylogenetic topology, with a reasonable
amount of resolution.
The aligned nucleotides were from an orthologous locus, which contributed towards
meaningful reconstruction of the phylogeny among Gymnotus species (Fitch 2000). The scn4aa
gene is one of two paralogs expressed in actinopterygian myogenic tissue, and one of eight
paralogs encoded in the actinopterygian genome (Novak et al. 2006; Widmark et al. 2011). The
scn4aa 3’ amplification primers were designed to be specific for those orthologous sequences,
and they amplified only those sequences rather than sequences from other paralogs.
The nucleotide alignment had a large proportion of parsimony-informative characters,
and the proportion of ambiguous characters was low. This contributed towards accurate
reconstruction of the phylogeny among Gymnotus species (Wiens 1998; Hall 2011). Absence of
alignment gaps for Electrophorus electricus and Gymnotus tigre scn4aa 3’ DNA/cDNA
sequence pairs confirms the absence of introns at this locus in gymnotiforms (Widmark et al.
2011). This increases the chance of an accurate alignment, since introns tend to be more variable
in length (Hughes and Yeager 1997). Also, amino acids are more conserved than nucleotides,
and alignments of those sequences can be used to mitigate mis-alignment of exon indels among
species (Wernersson and Pedersen 2003). The scn4aa 3’ sequences contained the highest
proportion of parsimony-informative characters per total characters among the loci in the dataset.
In addition, the scn4aa 3’ sequences only contained 2.94% ambiguous characters, compared with
4.90% from the whole dataset.
The nucleotide characters of scn4aa 3’ seemed to be reasonably variable, which
contributed towards resolution of the phylogeny among Gymnotus species (Brown et al. 1979).
Voltage-gated sodium channels are highly conserved in nucleotide sequence and function across
species (Goldin 2002). However, scn4aa in Actinopterygii had been predicted to vary in
nucleotide sequence (Novak et al. 2006). This variability was confirmed within the
actinopterygian order Gymnotiformes (Zakon et al. 2006; Arnegard et al. 2010), and within the
genus Gymnotus (in this project). When scn4aa 3’ sequences are included for phylogenetic
reconstruction, the proposed evolutionary relationships among Gymnotus are consistent with
the existing morphology-based and nucleotide-based phylogenies wherever those two agree
with each other. Inclusion of the scn4aa 3’ sequences increased the phylogenetic resolution,
since some evolutionary relationships are proposed where they had previously been unresolved
(e.g., clarifying the positions of G. pantherinus and G. cylindricus).
When characters at a locus vary at similar rates among lineages, the resulting phylogeny
may be used as a primary means for estimation of species divergence timing (Schwartz 2007).
However, characters are unlikely to vary at similar rates among lineages if they were subjected to
selective pressures that resulted in divergence of those species. The voltage-gated sodium
channel protein Nav1.4a in gymnotiform fishes may be an example of the latter case, since the
protein has an important role in characteristics that may be under selective pressure among some
lineages.
4.3 Natural Selection at the Nav1.4a C-terminus Among Gymnotus Lineages
Zakon et al. (2006) and Arnegard et al. (2010) presented analyses of patterns of selection at the
voltage-gated sodium channel protein Nav1.4a among gymnotiforms and non-electric fish. These
authors focused on motifs at and between the homologous domains of the protein. Purifying
selection was detected among lineages of non-electric fish, and neutral (or relaxed) selection was
detected among basal lineages of gymnotiforms. Positive selection was also detected among
gymnotiform lineages, but the analysis only included four species representing four gymnotiform
families (Zakon et al. 2006). In contrast, the project described here focused on motifs of the
Nav1.4a carboxyl-terminus (C-terminus) that may be involved in varying amplitudes and
frequencies of electric organ discharges (EODs). While only one of the gymnotiform families
was represented, the species sample was seven times larger (28 species compared with four).
Variation in selection among lineages of Gymnotus was detected, including statistically
significant positive selection in seven lineages.
For most Gymnotus lineages, the amino acids of the Nav1.4a C-terminus seem to be
evolving under purifying selection (Figure 9). This is consistent with purifying selection being
identified for other motifs of the Nav1.4a in the Gymnotus cylindricus taxon (Arnegard et al.
2010). Purifying selection on the Nav1.4a suggests that the EODs of most Gymnotus species are
generally adapted to their habitats, with little benefit to novel variation. The order that includes
the genus Gymnotus (order Gymnotiformes) diverged from other ostariophysan orders
approximately 100 million years ago (Alves-Gomes 1999), and the genus Gymnotus diverged
from other gymnotiform families approximately 56.6 million years ago (Lovejoy et al. 2010).
Since then, Gymnotus species have adapted to a large variety of ecological habitats (Lissmann
1958), among various distinct hydrogeographic regions (Albert et al. 2005). Observations of
these fishes indicate that their species-specific EOD characteristics are already fairly constrained
by their abiotic environment and biotic evolutionary pressures (Stoddard 1999; Alves-Gomes
2001; Stoddard 2002).
For two Gymnotus lineages, the amino acids seem to be evolving under neutral selection
(Figure 9). This is consistent with neutral selection being identified for other motifs of the
Nav1.4a among basal lineages of gymnotiforms (Arnegard et al. 2010). Neutral selection on the
Nav1.4a indicates that the EODs of those lineages are less constrained by abiotic and/or biotic
pressures. The habitat of one of the lineages evolving under neutral selection (the Gymnotus
cylindricus lineage) is geographically isolated relative to other Gymnotus species (Lovejoy et al.
2010), and is devoid of most predators with ampullary electroreceptors,
including siluriforms (catfishes) and Potamotrygonidae (river stingrays), as well as the electric
eel Electrophorus electricus (Szabo et al. 1972; Szamier and Bennett 1980; Lovejoy 1996;
Stoddard 1999; Alves-Gomes 2001; Stoddard 2002). The G. cylindricus lineage may be less
constrained by biotic pressures, since previous analyses have suggested predation as an
important evolutionary pressure for increased EOD complexity (Stoddard 1999).
For seven Gymnotus lineages, the amino acids are evolving under positive selection
(Table 5; Figure 9). This is the first time positively selected gymnotiform lineages have been
detected using a large sample of species. Positive selection on the Nav1.4a indicates that the
EODs of those lineages are likely under novel environmental constraints and/or biotic
evolutionary pressures. From the limited collection locality information in this project, a few
examples can be identified, where positively selected lineages are geographically isolated
relative to closely related lineages (Figure 9; Table 3; Albert et al. 2005). 1) The positively
selected lineage from which Gymnotus arapaima, Gymnotus ucamara, and some Gymnotus
carapo species are derived, only includes species from the highly diverse Western Amazon
region. However, the G. carapo lineage under purifying selection (tissue # 2040) is from the
Guyanas-Orinoco basin. 2) The positively selected Gymnotus coropinae lineage is from the
highly diverse Western Amazon region. However, the G. coropinae lineages under purifying
selection (tissue #s 2036 and 2037) are from the Guyanas-Orinoco basin. 3) The positively
selected lineage from which Gymnotus sp. xingu and Gymnotus pantanal are derived, only
includes species from the Paraguay-Paraná basin of Argentina. However, this lineage's sister
lineage and basal lineages mostly include species from the highly diverse Western Amazon
region. Specific environmental constraints and biotic evolutionary pressures may be identified in
future comparisons of EODs between positively selected lineages and closely related lineages
that are not under positive selection.
4.4 Natural Selection at Specific Sites of the Nav1.4a C-terminus Among Gymnotus
The existing analyses of patterns of selection at the voltage-gated sodium channel protein
Nav1.4a among gymnotiforms and non-electric fish focused on motifs associated with protein
internalization (DII-III linker), the voltage-sensing component of fast activation (DIIS2-4,
DIIS4-5 linker, DIIIS2-4, DIIIS4-5 linker, and DIVS1-2), pore module (DIIS5-6 and DIIIS5-6),
and the fast inactivation occlusion particle (DIII-IV linker) (Zakon et al. 2006; Arnegard et al.
2010). Statistically significant evidence of positive selection at specific sites among those motifs
was not identified. This project focused on motifs of the Nav1.4a carboxyl-terminus (C-terminus)
that are involved in regulation of protein internalization, fast inactivation, and possibly also
resurgent current. Statistically significant evidence of amino acid sites under purifying, neutral
(or relaxed), and positive selection was identified among these motifs.
When all the Gymnotus species were included in the analysis, there was statistically
significant evidence for variation in the level of selection (between purifying and neutral
selection) among amino acid sites of the Nav1.4a C-terminus (Table 5). This is consistent with
purifying and neutral selection being identified for amino acid sites of other Nav1.4a motifs
(Zakon et al. 2006; Arnegard et al. 2010). Most amino acid sites of the Nav1.4a C-terminus are
evolving under purifying selection (70.9% of sites), which is consistent with the Nav1.4a protein
structure and functional elements being highly conserved among orthologs across species
(Catterall et al. 2005). The finding that some amino acid sites are evolving under neutral
selection (29.1% of sites) is consistent with the Nav1.4a protein being the paralog that is
preferentially expressed in the electric organ, since it is unlikely that evolution of this paralog
would adversely affect other organs (Lopreato et al. 2001; Goldin 2002; Novak et al. 2006;
Widmark et al. 2011).
When all the Gymnotus species were included in the analysis, there was no statistically
significant positive selection found among amino acid sites of the Nav1.4a C-terminus (Table 5).
This is consistent with lack of such evidence for amino acid sites of other Nav1.4a motifs (Zakon
et al. 2006; Arnegard et al. 2010). Since most lineages of Gymnotus fishes are not evolving
under positive selection, it is not surprising that there were no positively selected amino acid
sites detected across all lineages of Gymnotus fishes.
In the seven Gymnotus lineages that are evolving under positive selection, there was
statistically significant positive selection at specific amino acid sites of the Nav1.4a C-terminus
(Table 5). This novel finding may be a result of greatly increased taxonomic representation,
compared with analyses of other Nav1.4a motifs among gymnotiforms (Zakon et al. 2006;
Arnegard et al. 2010). Most amino acid sites of the Nav1.4a C-terminus among the seven
positively selected lineages are under purifying selection (65.0% of sites), while a smaller
proportion are under neutral selection (20.7% of sites), and an even smaller proportion are under
positive selection (14.2% of sites; Table 6). Statistically significant positively selected sites of
the Nav1.4a C-terminus identified using both naïve empirical Bayes (NEB) and the more
sensitive Bayes empirical Bayes (BEB) approaches were identical (Table 7).
As predicted, amino acid sites of the Nav1.4a C-terminus that are positively selected, and
likely contribute to altered (but not abolished) protein function were identified in Gymnotus
fishes. Amino acid variations associated with the neutrally and positively selected sites are
unlikely to abolish Nav1.4a protein function, since EODs are essential for Gymnotus fishes'
survival, and field collection of electrogenic fish for tissue samples relies on detection of the
fishes' EODs. Amino acid variations at the eight positively selected sites likely result in altered
protein function that affects the EOD frequency. The positively selected sites are at motifs
involved in fast inactivation, resurgent current, and phosphorylation (Table 7). The typical time
course for Nav activation and subsequent fast inactivation (~ 1 ms for each step; Hodgkin et al.
1952; Ulbricht 2005) is comparable to the duration of one Gymnotus EOD pulse (1-3 ms;
Crampton and Albert 2006). The typical time course for Nav recovery back to its resting state (on
the order of milliseconds; Ulbricht 2005) is compatible with the range of intervals between
Gymnotus EOD pulses (~ 14-67 ms; Crampton and Albert 2006). There was no evidence for selective
pressures on amplitudes of EODs, since no positively selected sites were found at the PY motif
(Table 7). However, this does not preclude the possibility of natural selection on other
characteristics that vary with EOD amplitude, such as anatomical and cellular characteristics.
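Because the EOD range quoted above is given as a time between pulses, it can help to convert it to a pulse rate before comparing it with channel kinetics. The sketch below shows that conversion for the 14-67 ms range (Crampton and Albert 2006). It assumes that the quoted values are inter-pulse intervals, and the function name interval_ms_to_hz is hypothetical.

```python
def interval_ms_to_hz(interval_ms):
    """Convert an inter-pulse interval in milliseconds to a pulse rate in Hz."""
    return 1000.0 / interval_ms

# Approximate range of intervals between Gymnotus EOD pulses, in milliseconds.
shortest_ms, longest_ms = 14.0, 67.0

low_hz = interval_ms_to_hz(longest_ms)    # longest interval gives the lowest rate
high_hz = interval_ms_to_hz(shortest_ms)  # shortest interval gives the highest rate
print(f"EOD pulse rate: about {low_hz:.1f}-{high_hz:.1f} Hz")
```

Under that assumption the range corresponds to roughly 15-71 pulses per second, so even the shortest interval is several times longer than the ~ 1 ms activation and inactivation steps.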
At the eight positively selected sites, the amino acid identities of Gymnotus species in
positively selected lineages (as well as a few other lineages), are different from the identities in
the majority of Gymnotus species (Table 8). The Gymnotus species in positively selected
lineages had a different amino acid identity at as few as one of the eight positively selected sites.
However, even single mutations can have significant effects on physiological characteristics of
the tissue if they are at key amino acid sites of the Navs (Lehmann-Horn and Jurkat-Rott 1999).
Since amino acid variants are present at very few sites for each positively selected lineage, this
provides a unique opportunity for future assessments of specific functions of those sites.
Predictions can be made from comparisons of EOD frequencies between species with different
amino acid identities at a particular site. These predictions can then be verified by site-directed
mutagenesis and patch clamp recordings. The presence of amino acid variations at very few sites
for each positively selected lineage also provides a unique opportunity to contribute towards
future assessments of selective pressures in various habitats of Central and South America. If the
EOD frequencies of the positively selected lineages are higher than the EOD frequencies of
comparative lineages with the consensus amino acid identity, then the habitat of species in the
positively selected lineages can be predicted to have higher predation pressure from predators
that are sensitive to lower EOD frequencies (e.g. more predatory fishes with ampullary
electroreceptors).
4.5 Summary and Future Directions
Evolutionary relationships among Gymnotus were clarified using additional taxa and nucleotide
sequences. The resultant topologies were generally consistent with previously proposed
phylogenetic relationships. This project is the first to use the portion of the voltage-gated sodium
channel gene scn4aa that encodes the protein’s carboxyl-terminus (scn4aa 3’) for phylogenetic
reconstruction. The locus contributed towards a meaningful and accurate species-level
phylogenetic topology, with a reasonable amount of resolution. This project was the first to find
evidence of purifying, neutral (relaxed), and positive selection on the scn4aa 3’ among specific
lineages of the Gymnotus genus of the order Gymnotiformes. This finding is generally consistent
with those from previous analyses of other motifs of this scn paralog where a small sample of
gymnotiform species was used (Zakon et al. 2006; Arnegard et al. 2010). This project was also
the first to find evidence of positive selection at specific sites on the scn4aa gene, in addition to
purifying and neutral selection at specific sites. The amino acid sites under positive selection in
the seven positively selected lineages were likely under selective pressure to alter their EOD
frequencies, since amino acid sites under positive selection are part of motifs associated with
voltage-gated sodium channel protein Nav1.4a fast inactivation and possibly resurgent current.
The eight positively selected sites on the scn4aa 3’ among Gymnotus species in the seven
positively selected lineages represent amino acids that likely contribute to altered protein
function.
Future analyses of Gymnotus EOD frequencies among lineages experiencing neutral and
positive selection may contribute to the identification of selective pressures in particular habitats
of the Neotropics (Central and South America). Comparisons of EOD frequencies between
species with different amino acid identities at particular positively selected sites can provide
predictions of protein function that may be verified by site-directed mutagenesis and patch clamp
recordings.
The methods for determining positive selection from this project may be used in similar
projects focused on other clades of electric fish. Within the genus Gymnotus, eight of the
43 Nav1.4a C-terminus sites that are variable in amino acid identity were positively
selected (Table 7). Among taxa within the order Gymnotiformes, more positively selected
sites may be identified, since the number of sites variable in amino acid identity is more than
four times larger (at least 177 sites). The methods from this project may also be applied to other
motifs of Nav1.4a involved in fast inactivation (DIVS4, DIII-IV linker, S5-6, intracellular
linkers). Future analyses of Nav1.4a may identify additional amino acid sites and identities that
contribute to knowledge of protein function.
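The site counts quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the variable names are illustrative and are not taken from the analysis pipeline, and 177 is treated as a lower bound, as stated in the text.

```python
# Site counts quoted in the text (Table 7 and the Gymnotiformes alignment).
gymnotus_variable_sites = 43        # sites variable in amino acid identity within Gymnotus
gymnotus_selected_sites = 8         # positively selected sites within Gymnotus
gymnotiformes_variable_sites = 177  # lower bound quoted for the order Gymnotiformes

# Fraction of the variable Gymnotus sites that are positively selected.
selected_fraction = gymnotus_selected_sites / gymnotus_variable_sites

# How many times more variable sites the order has than the genus.
fold_increase = gymnotiformes_variable_sites / gymnotus_variable_sites

print(f"{selected_fraction:.1%} of variable Gymnotus sites are positively selected")
print(f"{fold_increase:.2f}-fold more variable sites in Gymnotiformes")
```

Under these counts, roughly one in five variable sites within Gymnotus is positively selected, and the order-level alignment has just over four times as many variable sites.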
References
Agnew, W. S. (1984). Voltage-regulated sodium channel molecules. Annu Rev Physiol. 46, 517-
30.
Agnew, W. S., Levinson, S. R., Brabson, J. S. and Raftery, M. A. (1978). Purification of the
tetrodotoxin-binding component associated with the voltage-sensitive sodium channel
from Electrophorus electricus electroplax membranes. Proc Natl Acad Sci. 75(6), 2606-
2610.
Ahern, C. A. (2013). What activates inactivation? J Gen Physiol. 142(2), 97.
Albert, J. S. and Lundberg, J. G. (1995). Gymnotiformes. The Neotropical electric eels and
knifefishes. Version 01 January 1995 (under construction).
<http://tolweb.org/Gymnotiformes/15064/1995.01.01>.
Albert, J. S. (2001). Species diversity and phylogenetic systematics of American knifefishes
(Gymnotiformes, Teleostei). Misc Publ Mus Zool. University of Michigan. 190, 1-129.
Albert, J. S., Crampton, W. G. R., Thorsen, D. H. and Lovejoy, N. R. (2005). Phylogenetic
systematics and historical biogeography of the Neotropical electric fish Gymnotus
(Teleostei: Gymnotidae). Syst Biodiv. 2(4), 375-417.
Alves-Gomes, J. A., Orti, G., Haygood, M., Heiligenberg, W. and Meyer, A. (1995).
Phylogenetic analysis of the South American electric fishes (Order Gymnotiformes) and
the evolution of their electrogenic system: a synthesis based on morphology,
electrophysiology, and mitochondrial sequence data. Mol Biol Evol. 12(2), 298-318.
Alves-Gomes, J. (1999). Systematic biology of Gymnotiform and Mormyriform electric fishes:
phylogenetic relationships, molecular clocks, and rates of evolution in the mitochondrial
rRNA genes. J Exp Biol. 202, 1167-1183.
Alves-Gomes, J. A. (2001). The evolution of electroreception and bioelectrogenesis in teleost
fish: a phylogenetic perspective. J Fish Biol. 58, 1489-1511.
Albert, J. S., Zakon, H. H., Stoddard, P. K., Unguez, G. A., Holmberg-Albert, S. K. S. and
Sussman, M. R. (2008). The case for sequencing the genome of the electric eel
Electrophorus electricus. J Fish Biol. 72, 331-354.
Ariyasu, R. G., Deerinck, T. J., Levinson, S. R. and Ellisman, M. H. (1987). Distribution of (Na+
+ K+)ATPase and sodium channels in skeletal muscle and electroplax. Journal of
Neurocytology. 16, 511-522.
Arnegard, M. E., Zwickl, D. J., Lu, Y. and Zakon, H. H. (2010). Old gene duplication facilitates
origin and diversification of an innovative communication system – twice. Proc Natl
Acad Sci. 107(51), 22172-22177.
Baba, M. L., Goodman, M., Berger-Cohn, J., Demaille, J. G. and Matsuda, G. (1984). The early
adaptive evolution of calmodulin. Mol Biol Evol. 1(6), 442-455.
Bedore, C. N. and Kajiura, S. M. (2013). Bioelectric fields of marine organisms: voltage and
frequency contributions to detectability by electroreceptive predators. Physiol Biochem
Zool. 86(3), 298–311.
Bahler, M. and Rhoads, A. (2002). Calmodulin signaling via the IQ motif. FEBS Lett. 513, 107-
113.
Bello, O. S., Gonzalez, J., Capani, F. and Barreto, G. E. (2012). In silico docking reveals
possible riluzole binding sites on Nav1.6 sodium channel: implications for amyotrophic
lateral sclerosis therapy. J Theor Biol. 315, 53-63.
Benchimol, M., Machado, R. D. and de Souza, W. (1978). Staining of microtubules of the
electrocyte of Electrophorus electricus L. by alcian blue and lanthanum. Experientia. 35
(5), 670-671.
Bennett, M. V. L. and Grundfest, H. (1959). Electrophysiology of electric organ in Gymnotus
carapo. J Gen Physiol. 42(5), 1067-1103.
Bennett, M. V. L. (1961). Modes of operation of electric organs. Ann N Y Acad Sci. 94, 458-
509.
Berendt, F. J., Park, K. S. and Trimmer, J. S. (2010). Multisite phosphorylation of voltage-gated
sodium channel alpha subunits from rat brain. J Proteome Res. 9(4), 1976-1984.
Brinkman, F. S. L. and Leipe, D. D. (2001). Chapter 14: Phylogenetic analysis (In: Baxevanis,
A. D. and Ouellette, B. F. F. Eds.), Bioinformatics: A practical guide to the analysis of
genes and proteins, Second Edition. John Wiley & Sons Inc. (Electronic), pp. 323-358.
ISBN 0-471-22392-1.
Brown, W. M., George, M. and Wilson, A. C. (1979). Rapid evolution of animal mitochondrial
DNA. Proc Natl Acad Sci. 76(4), 1967-1971.
Bullock, T. H. (1982). Electroreception. Annu Rev Neurosci. 5, 121–170.
Cannon, S. C. and Bean, B. P. (2010). Sodium channels gone wild: resurgent current from
neuronal and muscle channelopathies. J Clin Invest. 120(1), 80-83.
Catterall, W. A. (1984). The molecular basis of neuronal excitability. Science. 223(4637), 653-
661.
Cantrell, A. R. and Catterall, W. A. (2001). Neuromodulation of Na+ channels: an unexpected
form of cellular plasticity. Nat Rev Neurosci. 2, 397-407.
Catterall, W. A., Goldin, A. and Waxman, S. G. (2005). International Union of Pharmacology.
XLVII. Nomenclature and structure-function relationships of voltage-gated sodium
channels. Pharmacol Rev. 57(4), 397-409.
Caputi, A. A. (1999). The electric organ discharge of pulse Gymnotiforms: the transformation of
simple impulse into a complex spatio-temporal electromotor pattern. J Exp Biol. 202,
1229-1241.
Chagot, B., Potet, F., Balser, J. R. and Chazin, W. J. (2009). Solution NMR structure of the C-
terminal EF-hand domain of human cardiac sodium channel Nav1.5. J Biol Chem. 284
(10), 6436-6445.
Chagot, B. and Chazin, W. J. (2011). Solution NMR structure of apo-calmodulin in complex
with the IQ motif of human cardiac sodium channel Nav1.5. J Mol Biol. 406(1), 106-119.
Charalambous, K. and Wallace, B. A. (2011). NaChBac: the long lost sodium channel ancestor.
Biochemistry. 50(32), 6742-6752.
Chin, D. and Means, A. R. (2000). Calmodulin: a prototypical calcium sensor. Trends Cell Biol.
10(8), 322-328.
Cohen, S. A. and Levitt, L. K. (1993). Partial characterization of the rH1 sodium channel protein
from rat heart using subtype-specific antibodies. Circ Res. 73, 735-742.
Collin, S. P. and Whitehead, D. (2004). The functional roles of passive electroreception in
non-electric fishes. Animal Biology. 54(1), 1-25.
Cormier, J. W., Rivolta, I., Tateyama, M., Yang, A.-S. and Kass, R. S. (2002). Secondary
structure of the human cardiac Na+ channel C terminus. J Biol Chem. 277(11), 9233-
9241.
Crampton, W. G. R. (1998). Effects of anoxia on the distribution, respiratory strategies and
electric signal diversity of Gymnotiform fishes. J Fish Biol. 53(A), 307-330.
Crampton, W. G. R. and Albert, J. S. (2006). Evolution of electric signal diversity in
Gymnotiform fishes (In: Ladich, F., Collin, S. P., Moller, P. and Kapoor, B. G. Eds.),
Communication in fishes. Science Publishers, Enfield, New Hampshire, pp. 657-731.
Crampton, W. G. R., Lovejoy, N. R. and Waddell, J. C. (2011). Reproductive character
displacement and signal ontogeny in a sympatric assemblage of electric fish. Evolution.
65(6), 1650-1666.
Cruz, J. S., Silva, D. F., Ribeiro, L. A., Araújo, I. G. A., Magalhães, N., Medeiros, A., Freitas,
C., Araujo, I. C. and Oliveira, F. A. (2011). Resurgent Na+ current: A new avenue to
neuronal excitability control. Life Sci. 89, 564-569.
de Araujo Jorge, T. C., de Souza, W. and Machado, R. D. (1979). Ultrastructural localization of
calcium-binding sites in the electrocyte of the Electrophorus electricus (L.). J Cell Sci.
38, 97-104.
Don, R. H., Cox, P. T., Wainwright, B. J., Baker, K. and Mattick, J. S. (1991). 'Touchdown' PCR
to circumvent spurious priming during gene amplification. Nucl Acids Res. 19(14), 4008.
Eijkelkamp, N., Linley, J. E., Baker, M. D., Minett, M. S., Cregg, R., Werdehausen, R., Rugiero,
F. and Wood, J. N. (2012). Neurological perspectives on voltage-gated sodium channels.
Brain. 135(9), 2585-2612.
Ellis, M. M. (1913). The gymnotid eels of tropical America. Mem Carneg Mus. 6(3), 109-195.
Ellisman, M. H. and Levinson, S. R. (1982). Immunocytochemical localization of sodium
channel distributions in the excitable membranes of Electrophorus electricus. Proc Natl
Acad Sci. 79, 6707-6711.
Emerick, M. C., Shenkel, S. and Agnew, W. S. (1993). Regulation of the eel electroplax Na
channel and phosphorylation of residues on amino- and carboxyl-terminal domains by
cAMP-dependent protein kinase. Biochemistry. 32(36), 9435-9444.
Emery, A. E. H. (1991). Population frequencies of inherited neuromuscular disease – a world
survey. Neuromuscular Disorders. 1(1), 19-29.
Favre, I., Moczydlowski, E. and Schild, L. (1996). On the structural basis for ionic selectivity
among Na+, K+, and Ca2+ in the voltage-gated sodium channel. Biophys J. 71, 3110-3125.
Ferrari, M. B. and Zakon, H. H. (1993). Conductances contributing to the action potential of
Sternopygus electrocytes. J Comp Physiol A. 173, 281-292.
Fink, S. V. and Fink, W. L. (1981). Interrelationships of the ostariophysan fishes (Teleostei).
Zool J Linn Soc. 72, 297-353.
Fitch, W. M. (2000). Homology: a personal view on some of the problems. Trends Genet. 16(5),
227-31.
Fotia, A. B., Ekberg, J., Adams, D. J., Cook, D. I., Poronnik, P. and Kumar, S. (2004).
Regulation of neuronal voltage-gated sodium channels by the ubiquitin-protein ligases
nedd4 and nedd4-2. J Biol Chem. 279(28), 28930-28935.
Fritz, L. C. and Brockes, J. P. (1983). Immunochemical properties and cytochemical localization
of the voltage-sensitive sodium channel from the electroplax of the eel (Electrophorus
electricus). J Neurosci. 3(11), 2300-2309.
Froese, R. and Pauly, D. Editors. (2012). FishBase. <http://www.fishbase.org>.
Gayet, M., Meunier, F. J. and Kirschbaum, F. (1994). Gymnotiforme fossile de Bolivie et ses
relations phylogénétiques au sein des formes actuelles. Cybium. 18(3), 273-306.
Goldin, A. L. (2002). Evolution of voltage-gated Na+ channels. J Exp Biol. 205, 575-584.
Goldin, A. L., Barchi, R. L., Caldwell, J. H., Hofmann, F., Howe, J. R., Hunter, J. C., Kallen, R.
G., Mandel, G., Meisler, M. H., Netter, Y. B., Noda, M., Tamkun, M. M., Waxman, S.
G., Wood, J. N. and Catterall, W. A. (2000). Nomenclature of voltage-gated sodium
channels. Neuron. 28, 365-368.
Gordon, R. D., Fieles, W. E., Schotland, D. L., Hogue-Angeletti, R. and Barchi, R. L. (1987).
Topographical localization of the C-terminal region of the voltage-dependent sodium
channel from Electrophorus electricus using antibodies raised against a synthetic peptide.
Proc Natl Acad Sci. 84, 308-312.
Gordon, R. D., Li, Y., Fieles, W. E., Schotland, D. L. and Barchi, R. L. (1988). Topographical
localization of a segment of the eel voltage-dependent sodium channel primary sequence
(aa 927-938) that discriminates between modes of tertiary structure.
Gotter, A. L., Kaetzel, M. A. and Dedman, J. R. (1998). Electrophorus electricus as a Model
System for the Study of Membrane Excitability. Comp. Biochem. Physiol. 119A (1),
225-241.
Hall, B. G. (2011). Phylogenetic trees made easy: a how-to manual, 4th edition. Sinauer
Associates, Inc., Sunderland, MA.
Hansen, J. D. and Kaattari, S. L. (1996). The recombination activating gene 2 (Rag2) of the
rainbow trout Oncorhynchus mykiss. Immunogenetics. 44, 203-211.
Haynes, W. and Lide, D. (2010). CRC handbook of chemistry and physics (91st Edition): a
ready-reference book of chemical and physical data. Boca Raton, Fla. London: CRC
Taylor & Francis distributor.
Hebert, T., Drapeau, P., Pradier, L. and Dunn, R. J. (1994). Block of the rat brain IIA sodium
channel alpha subunit by the neuroprotective drug riluzole. Mol Pharmacol. 45(5), 1055-
60.
Heidmann, T. and Changeux, J.-P. (1978). Structural and functional properties of the
acetylcholine receptor protein in its purified and membrane-bound states. Ann Rev
Biochem. 47, 317-57.
Hennemann, E. (1957). Relation between size of neurons and their susceptibility to discharge.
Science. 126, 1345-1346.
Hodgkin, A. L., Huxley, A. F. and Katz, B. (1952). Measurement of current-voltage relations in
the membrane of the giant axon of Loligo. J Physiol. 116, 424-448.
Hopkins, C. D. (1988). Neuroethology of electric communication. Ann Rev Neurosci. 11, 497-
535.
Hopkins, C. D. (1999). Design features for electric communication. J Exp Biol. 202, 1217-1228.
Hopkins, C.D., Comfort, N. C., Bastian, J. and Bass, A. H. (1990). Functional analysis of sexual
dimorphism in an electric fish, Hypopomus pinnicaudatus, order Gymnotiformes. Brain
Behav Evol. 35(6), 350-367.
Hopkins, P. M. (2006). Skeletal muscle physiology. Contin Educ Anaesth Crit Care Pain. 6 (1),
1-6.
Huelsenbeck, J.P. and Ronquist, F. (2001). MrBayes: bayesian inference of phylogenetic trees.
Bioinformatics 17, 754–755.
Hughes, A. L. and Yeager, M. (1997). Comparative evolutionary rates of introns and exons in
murine rodents. J Mol Evol. 45(2), 125-130.
Jarecki, B. W., Piekarz, A. D., Jackson II, J. O. and Cummins, T. R. (2010). Human voltage-
gated sodium channel mutations that cause inherited neuronal and muscle
channelopathies increase resurgent sodium currents. J Clin Invest. 120, 369-378.
Kaetzel, M. A. and Dedman, J. R. (1987). Identification of a 55-kDa high-affinity calmodulin-
binding protein from Electrophorus electricus. J Biol Chem. 262(4), 1818-1822.
Keesey, J. (2005). How electric fish became sources of acetylcholine receptor. J Hist Neurosci.
14 (2), 149-164.
Keynes, R. D. and Martins-Ferreira, H. (1953). Membrane Potentials in the Electroplates of the
Electric Eel. J. Physiol. 119, 315-351.
Kullberg, M., Nilsson, M., Arnason, U., Harley, E. H. and Janke, A. (2006). Housekeeping genes
for phylogenetic analysis of eutherian relationships. Mol Biol Evol. 23(8), 1493-1503.
Kyte, J. and Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of
a protein. J Mol Biol. 157(1), 105-32.
Lavoué, S. and Sullivan, J. P. (2004). Simultaneous analysis of five molecular markers provides
a well-supported phylogenetic hypothesis for the living bony-tongue fishes
(Osteoglossomorpha: Teleostei). Mol Phylogenet Evol. 33(1), 171-185.
Lehmann-Horn, F. and Jurkat-Rott, K. (1999). Voltage-gated ion channels and hereditary
disease. Physiol Rev. 79, 1317-1372.
Lester, H. (1978). Analysis of sodium and potassium redistribution during sustained permeability
increases at the innervated face of Electrophorus Electroplaques. J Gen Physiol. 72, 847-
862.
Levinson, S. R., Duch, D. S., Urban, B. W. and Recio-Pinto, E. (1986). The sodium channel
from Electrophorus electricus. Ann N Y Acad Sci. 479(1), 162-178.
Li, C., Ortí, G., Zhang, G. and Lu, G. (2007). A practical approach to phylogenomics: the
phylogeny of ray-finned fish (Actinopterygii) as a case study. BMC Evol. Biol. 7(44), 1-
11.
Lissman, H. W. (1958). On the function and evolution of electric organs in fish. J Exp Biol. 35,
156-191.
Liu, Z., Tao, J., Ye, P. and Ji, Y. (2012). Mining the virgin land of neurotoxicology: a novel
paradigm of neurotoxic peptides action on glycosylated voltage-gated sodium channels. J
Toxicol. 2012(843787).
Lopreato, G. F., Lu, Y., Southwell, A., Atkinson, A. S., Hillis, D. M., Wilcox, T. P. and Zakon,
H. H. (2001). Evolution and divergence of sodium channel genes in vertebrates. Proc
Natl Acad Sci. 98(13), 7588-7592.
Lorenzo, D., Sierra, F., Silva, A. and Macadar, O. (1990). Spinal mechanisms of electric organ
discharge synchronization in Gymnotus carapo. J Comp Physiol A. 167, 447-452.
Lorenzo, D., Sierra, F., Silva, A. and Macadar, O. (1993). Spatial distribution of the medullary
command signal within the electric organ of Gymnotus carapo. J Comp Physiol A. 173,
221-226.
Lovejoy, N. R. (1996). Systematics of myliobatoid elasmobranchs: with emphasis on the
phylogeny and historical biogeography of neotropical freshwater stingrays
(Potamotrygonidae: Rajiformes). Zool J Linn Soc. 117, 207-257.
Lovejoy, N. R. and Collette, B. (2001). Phylogenetic relationships of new world needlefishes
(Teleostei: Belonidae) and the biogeography of transitions between marine and
freshwater habitats. Copeia. 2, 324-338.
Lovejoy, N. R., Lester, K., Crampton, W. G. R., Marques, F. P. L. and Albert, J. S. (2010).
Phylogeny, biogeography, and electric signal evolution of Neotropical knifefishes of the
genus Gymnotus (Osteichthyes: Gymnotidae). Mol Phylogenet Evol. 54, 278-290.
Lynch, M., O'Hely, M., Walsh, B. and Force, A. (2001). The probability of preservation of a
newly arisen gene duplicate. Genetics. 159, 1789-1804.
Machado, R. D., de Souza, W., Cotta-Pereira, G. C. and de Oliveira Castro, G. (1976). On the
fine structure of the electrocyte of Electrophorus electricus L. Cell Tiss Res. 174, 355-
366.
Machado, R. D., de Souza, W., Benchimol, M., Attias, M. and Porter, K. R. (1980). Observations
on the innervated face of the electrocyte of the main organ of the electric eel
(Electrophorus electricus L.). Cell Tissue Res. 213, 69-80.
Maddison, D. R. and K.-S. Schulz (eds.) (2007). The tree of life web project.
<http://tolweb.org>.
Mago-Leccia, F. (1994). Electric fishes of the continental waters of America: classification and
catalogue of the electric fishes of the order Gymnotiformes (Teleostei: Ostariophysi) with
descriptions of new genera and species. Volume 29: Biblioteca de la Academia de
Ciencias Físicas, Matemáticas y Naturales. Fundacion para el Desarrollo de las Ciencias
Fisicas, Matematicas y Naturales (FUDECI), Clemente, Caracas, Venezuela.
Mayden, R. L., Tang, K. L., Conway, K. W., Freyhof, J., Chamberlain, S., Haskins, M.,
Schneider, L., Sudkamp, M., Wood, R. M., Agnew, M., Bufalino, A., Sulaiman, Z.,
Miya, M., Saitoh, K. and He, S. P. (2007). Phylogenetic relationships of Danio within the
order Cypriniformes: a framework for comparative and evolutionary studies of a model
species. J Exp Zool B Mol Dev Evol. 308B, 642-654.
Mermelstein, C. D. S., Costa, M. L. and Neto, V. M. (2000). The cytoskeleton of the electric
tissue of Electrophorus electricus L. An Acad Bras Ci. 72(3).
Mills, A. and Zakon, H. H. (1987). Coordination of EOD frequency and pulse duration in a
weakly electric wave fish: the influence of androgens. J Comp Physiol A. 161, 417-430.
Miloushev, V. Z., Levine, J. A., Arbing, M. A., Hunt, J. F., Pitt, G. S. and Palmer III, A. G.
(2009). Solution structure of the Nav1.2 C-terminal EF-hand domain. J Biol Chem.
284(10), 6446-6454.
Moller, P. (1995). Electric fishes: history and behavior. Chapman & Hall, pp. 583.
Morth, J. P., Pedersen, B. P., Buch-Pedersen, M. J., Andersen, J. P., Vilsen, B., Palmgren, M. G.
and Nissen, P. (2011). A structural overview of the plasma membrane Na+, K+-ATPase
and H+-ATPase ion pumps. Nat Rev Mol Cell Biol. 12(1), 60-70.
Müller, K. F. (2005). The efficiency of different search strategies in estimating parsimony
jackknife, bootstrap, and Bremer support. BMC Evol Biol. 5(58).
Munjaal, R. P., Connor, C. G., Turner, R. and Dedman, J. (1986). Eel electric organ:
hyperexpressing calmodulin system. Mol Cell Biol. 6(3), 950-954.
Nakamura, Y., Nakajima, S. and Grundfest, H. (1965). Analysis of spike electrogenesis and
depolarizing K inactivation in electroplaques of Electrophorus electricus, L. J Gen
Physiol. 49(2), 321-49.
Noda, M., Shimizu, S, Tanabe, T., Takai, T., Kayano, T., Ikeda, T., Takahashi, H., Nakayama,
H., Kanaoka, Y., Minamino, N., Kangawa, K., Matsuo, H., Raftery, M. A., Hirose, T.,
Inayama, S., Hayashida, H., Miyata, T. and Numa, S. (1984). Primary structure of
Electrophorus electricus sodium channel deduced from cDNA sequence. Nature. 312,
121-127.
Novak, A. E., Jost, M. C., Lu, Y., Taylor, A. D., Zakon, H. H. and Ribera, A. B. (2006). Gene
duplications and evolution of vertebrate voltage-gated sodium channels. J Mol Evol. 63,
208-221.
Nylander, J.A.A. (2004). MrModeltest. Technical report. Evolutionary Biology Centre, Uppsala
University, Uppsala.
Palumbi, S., Martin, A., Romano, S., McMillan, W.O., Stice, L. and Grabowski, G. (1991). The
simple fool’s guide to PCR, version 2.0. Honolulu: Department of Zoology and Kewalo
Marine Laboratory, University of Hawaii.
Payandeh, J., El-Din, T. M. G., Scheuer, T., Zheng, N. and Catterall, W. A. (2012). Crystal
structure of a voltage-gated sodium channel in two potentially inactivated states. Nature.
486, 135-140.
Payandeh, J., Scheuer, T., Zheng, N. and Catterall, W. A. (2011). The crystal structure of a
voltage-gated sodium channel. Nature. 475, 353-359.
Potet, F., Chagot, B., Anghelescu, M., Viswanathan, P. C., Stepanovic, S. Z., Kupershmidt, S.,
Chazin, W. J. and Balser, J. R. (2009). Functional interactions between distinct sodium
channel cytoplasmic domains through the action of calmodulin. J Biol Chem. 284(13),
8846-8854.
Rast, J. P. and Litman, G. W. (1998). Towards understanding the evolutionary origins and early
diversification of rearranging antigen receptors. Immunol Rev. 166, 79-86.
Rose, P. K. (2007). Persistence has its own reward: repetitive firing of action potentials in
neurons. J Physiol. 580 (2), 357.
Rougier, J.-S., van Bemmelen, M. X., Bruce, C., Jespersen, T., Gavillet, B., Apothéloz, F.,
Cordonier, S., Staub, O., Rotin, D. and Abriel, H. (2005). Molecular determinants of
voltage-gated sodium channel regulation by the Nedd4/Nedd4-like proteins. Am J
Physiol Cell Physiol. 288(3), C692-701.
Ruff, R. L. (2003). Neurophysiology of the neuromuscular junction: overview. Ann N Y Acad
Sci. 998, 1-10.
Saitoh, K., Miya, M., Inoue, J. G., Ishiguro, N. B. and Nishida, M. (2003). Mitochondrial
genomics of Ostariophysan fishes: perspectives on phylogeny and biogeography. J Mol
Evol. 56, 464-472.
Sarhan, M. F., Tung, C.-C., Petegem, F. V. and Ahern, C. A. (2012). Crystallographic basis for
calcium regulation of sodium channels. Proc Natl Acad Sci. 109(9), 3558-3563.
Scheuer, T. (2010). Regulation of sodium channel activity by phosphorylation. Semin Cell Dev
Biol. 22, 160-165.
Schmidt, J. W. and Catterall, W. A. (1987). Palmitylation, sulfation, and glycosylation of the α
subunit of the sodium channel. J Biol Chem. 262(28), 13713-13723.
Schwartz, J. H. (2007). Do molecular clocks run at all? A critique of molecular systematics. Biol
Theory. 1(4), 357-371.
Shah, V. N., Wingo, T. L., Weiss, K. L., Williams, C. K., Balser, J. R. and Chazin, W. J. (2006).
Calcium-dependent regulation of the voltage-gated sodium channel hH1: intrinsic and
extrinsic sensors use a common molecular switch. Proc Natl Acad Sci. 103(10), 3592-
3597.
Solmó, C., de Souza, W., Machado, R. D. and Hassón-Voloch, A. (1977). Biochemical and
cytochemical localization of ATPases on the membranes of the electrocyte of
Electrophorus electricus. Cell Tiss Res. 185, 115-128.
Stoddard, P. K. (1999). Predation enhances complexity in the evolution of electric fish signals.
Nature. 400, 254-256.
Stoddard, P. K. (2002). Electric signals: predation, sex, and environmental constraints. Advances
in the Study of Behaviour. 31, 201-242.
Stoddard, P.K. (2006). Plasticity of the electric organ discharge waveform: contexts,
mechanisms, and implications for electrocommunication. In: Communication in Fishes.
ch. 22, pp 623-646. F. Ladich, S.P. Collin, P. Moller, B.G. Kapoor, eds. Science
Publisher, Inc., Enfield, NH, USA
Swofford, D.L. (2002). PAUP* 4:40: Phylogenetic analysis using parsimony *and other
methods. Sinauer Associates, Sunderland, MA.
Szabo, T., Kalmijn, A. J., Enger, P. S. and Bullock, T. H. (1972). Microampullary organs and a
submandibular sense organ in the fresh water ray, Potamotrygon. J Comp Physiol. 79(1),
15-27.
Szamier, R.B. and Bennett, M.V.L. (1980). Ampullary electroreceptors in the fresh water ray,
Potamotrygon. J Comp Physiol. 138(3), 225-230.
Theiss, R. D., Kuo, J. J. and Heckman, C. J. (2007). Persistent inward currents in rat ventral horn
neurones. J Physiol. 580(2), 507-522.
Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F. and Higgins, D. G. (1997). The
CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment
aided by quality analysis tools. Nucleic Acids Res. 25, 4876-4882.
Triques, M. L. (1993). Filogenia dos gêneros de Gymnotiformes (Actinopterygii, Ostariophysi),
com base em caracteres esqueléticos. Comun Mus Ciênc. PUCRS, série zool. 6(8), 85-130.
Ulbricht, W. (2005). Sodium channel inactivation: molecular determinants and modulation.
Physiol Rev. 85, 1271-1301.
von der Emde, G. (1990). Discrimination of objects through electrolocation in the weakly
electric fish, Gnathonemus petersii. J Comp Physiol A. 167(3), 413–421.
Warrington, J. A., Nair, A., Mahadevappa, M. and Tsyganskaya, M. (2000). Comparison of
human adult and fetal expression and identification of 535 housekeeping/maintenance
genes. Physiol Genomics. 2, 143-147.
Wernersson, R. and Pedersen, A. G. (2003). RevTrans: Multiple alignment of coding DNA from
aligned amino acid sequences. Nucleic Acids Res. 31(13), 3537-3539.
Widmark, J., Sundström, G., Daza, D. O. and Larhammar, D. (2011). Differential evolution of
voltage-gated sodium channels in tetrapods and teleost fishes. Mol Biol Evol. 28(1), 859-
871.
Wiens, J. J. (1998). Does adding characters with missing data increase or decrease phylogenetic
accuracy? Syst Biol. 47(4), 625-640.
Willett, C. E., Cherry, J. J. and Steiner, L. A. (1997). Characterization and expression of the
recombination activating genes (Rag1 and Rag2) of zebrafish. Immunogenetics. 45, 394-
404.
Williamson, J. R., Cheung, W. Y., Coles, H. S. and Herczeg, B. E. (1967). Glycolytic control
mechanisms IV. kinetics of glycolytic intermediate changes during electric discharge and
recovery in the main organ of Electrophorus electricus. J Biol Chem. 242, 5112-5118.
Winemiller, K. O. and Adite, A. (1997). Convergent evolution of weakly electric fishes from
floodplain habitats in Africa and South America. Environmental Biology of Fishes. 49,
175-186.
Wingo, T. L., Shah, V. N., Anderson, M. E., Lybrand, T. P., Chazin, W. J. and Balser, J. R.
(2004). An EF-hand in the sodium channel couples intracellular calcium to cardiac
excitability. Nat Struct Mol Biol. 11(3), 219-225.
Yablonka-Reuveni, Z. (2011). The skeletal muscle satellite cell: still young and fascinating at 50.
J Histochem Cytochem. 59(12), 1041-1059.
Yang, Z. (2007). PAML 4: a program package for phylogenetic analysis by maximum
likelihood. Mol. Biol. Evol. 24, 1586-1591.
<http://abacus.gene.ucl.ac.uk/software/paml.html>.
Young, K. A. and Caldwell, J. H. (2005). Modulation of skeletal and cardiac voltage-gated
sodium channels by calmodulin. J Physiol. 565(2), 349-370.
Yu, F. H., Yarov-Yarovoy, V., Gutman, G. A. and Catterall, W. A. (2005). Overview of
molecular relationships in the voltage-gated ion channel superfamily. Pharmacol Rev. 57,
387-395.
Zahavi, A. (2003). Indirect selection and individual selection in sociobiology: my personal views
on theories of social behaviour. Anim Behav. 65, 859-863.
Zakon, H. H., Lu, Y., Zwickl, D. J. and Hillis, D. M. (2006). Sodium channel genes and the
evolution of diversity in communication signals of electric fishes: convergent molecular
evolution. Proc Natl Acad Sci. 103, 3675-3680.
Zakon, H. H. and Unguez, G. A. (1999). Development and regeneration of the electric organ. J
Exp Biol. 202, 1427-1434.
Zhang, X., Ren, W., DeCaen, P., Yan, C., Tao, X., Tang, L., Wang, J., Hasegawa, K., Kumasaka,
T., He, J., Wang, J., Clapham, D. E. and Yan, N. (2012). Crystal structure of an
orthologue of the NaChBac voltage-gated sodium channel. Nature. 486, 130-135.
Appendix A.0 Abstract
A.0.1 Phylogeny and Molecular Evolution of the Voltage-Gated Sodium Channel Gene scn4aa in the Electric Fish Order Gymnotiformes
Many advances have been made in recent years in identifying the roles of various motifs in
voltage-gated sodium channel protein (Nav) function and modulation. However, analyses of the
roles of specific amino acid sites have largely been limited to mutations from individual people
with diagnosed neuromuscular disease. In this project, I used the order Gymnotiformes as a
model system to investigate the evolution and function of amino acid sites on the Nav that is
specifically adapted to the production of electric fields. Gymnotiformes is a diverse clade of ray-
finned fishes (class Actinopterygii) that are adapted to the lowland freshwaters of Central and
South America, with wide geographical distributions. They produce species-specific electric
organ discharges (EODs) from electric organs (EOs) for electrolocation (foraging, navigation)
and communication.
To clarify evolutionary relationships among gymnotiform species, I reconstructed the
phylogeny using an alignment of 3570 nucleotide positions from 57 gymnotiform species. This
alignment included loci that were used for previous phylogenies of a gymnotiform genus (cytb
and rag2), as well as nucleotides encoding the carboxyl-terminus of the Nav that is preferentially
expressed in EOs (scn4aa 3’). Unfortunately, nucleotide sequences from one of the six gymnotiform
families could not be obtained, and further analytical techniques to obtain them were outside the
scope of this project. The maximum parsimony phylogenetic reconstruction algorithm
provided a reasonably well supported phylogeny, while the Bayesian inference
algorithm did not. Nevertheless, the results indicate that the scn4aa 3’ locus contributed towards
a meaningful phylogenetic topology that provides a reasonable amount of resolution.
Further analyses of phylogenetic topology are needed prior to analyses of patterns of
selection, due to inconsistencies among existing phylogenetic topologies of gymnotiforms and
the unexpected inclusion of a siluriform species (Cetopsis coecutiens) within the order
Gymnotiformes (in the cytb phylogeny). Since the number of sites variable in amino acid identity
within the order Gymnotiformes is more than four times larger than that within the genus
Gymnotus (177 vs 43 sites), future analyses of scn4aa 3’ may identify additional amino acid
sites that contribute to knowledge of protein function.
Appendix A.1 Introduction
A.1.1 Significance and Objectives
Many advances have been made in recent years in identifying the roles of various motifs in
voltage-gated sodium channel protein (Nav) function and modulation (Chagot et al. 2009;
Miloushev et al. 2009; Payandeh et al. 2011; Sarhan et al. 2012; Zhang et al. 2012). However,
analyses of the roles of specific amino acid sites have largely been limited to the sites that are
known to be mutated in people with diagnosed neuromuscular disease (Lehmann-Horn and
Jurkat-Rott 1999). In this project, I will use the order Gymnotiformes as a model system to
investigate the evolution and function of amino acid sites on the Nav.
Fishes of the order Gymnotiformes produce species-specific electric organ discharges
(EODs) for electrolocation (foraging, navigation) and communication (Crampton and Albert
2006). EODs are the summation of action potentials produced at the electric organ(s) (EO) by
electrogenic cells (Bennett 1961; Mills and Zakon 1987). Navs at the plasma membranes of those
cells have a key role in supporting action potentials (Agnew 1984; Catterall 1984; Noda et al.
1984). Upon neuronally triggered changes in voltage, Navs activate to allow specific ions to
discharge through their pores, across the membranes. Those same changes in voltage also trigger
Navs to inactivate, to allow the membrane voltage gradient to recover, in preparation for the next
discharge.
Navs are encoded by a family of paralogous genes that translate to highly conserved
amino acid sequences and motifs (Catterall et al. 2005). Gene duplication among teleosts and
preferential expression in various tissues (Lopreato et al. 2001; Lynch et al. 2001; Goldin 2002;
Novak et al. 2006; Widmark et al. 2011) have been predicted to allow paralogs to evolve
independently without compromising functions of Navs in other tissues. Analyses of nucleotide
sequences encoding the carboxyl-terminus of the EO paralog (scn4aa 3’) from a genus of
gymnotiform fishes (genus Gymnotus) identified positive, neutral, and purifying selection on
the protein (Nav1.4a) among certain lineages, as well as positively selected
amino acid sites (Chapters 1-4).
The carboxyl-terminus (C-terminus) of Navs includes key motifs that are involved in
regulation of protein internalization, fast inactivation, and possibly also resurgent current.
Modulation of these Nav1.4a activities affects the amplitude and frequency of action potentials at
the EO, which may in turn affect those components of the EODs. Variations in EOD amplitude
may be associated with variations in multiple anatomical, cellular, and molecular characteristics
(Gotter 1998; Caputi 1999). However, variations in EOD frequency among gymnotiforms with
myogenic electric organs are likely limited to those associated with variations in Nav1.4a
function.
Since species-specific characteristics of EODs among gymnotiforms (especially variation
in frequency) are the result of adaptations to abiotic and biotic selective pressures in their varied
habitats (Stoddard 2002), I predict that amino acid sites of the Nav1.4a 3’ that contribute to
variation in (but do not abolish) protein function will show evidence of positive selection in
gymnotiform fishes. I also predict that the Nav1.4a 3’ will only show evidence of positive
selection in some lineages of gymnotiforms, as has been observed for other portions of Nav1.4a
sequences from a limited sample of gymnotiform fishes (Zakon et al. 2006; Arnegard et al.
2010). To assess patterns of variation on the Nav1.4a 3’ among gymnotiforms, I will analyze the
corresponding nucleotide sequences.
Existing phylogenetic relationships among gymnotiforms consistently resolve 6 families
(Electrophoridae, Gymnotidae, Hypopomidae, Rhamphichthyidae, Apteronotidae, and
Sternopygidae). However, relationships among the families are inconsistent. I will use additional
taxa and molecular characters, to contribute towards resolving these inconsistencies (Wiens
1998). The additional characters that I will use are the gymnotiform scn4aa nucleotide sequences
that encode the protein's C-terminus. Since this portion of scn4aa has been used for successful
clarification of the phylogeny of a genus of gymnotiform fishes (Chapters 1-4), I predict that
this portion of the gene will also contribute towards clarification of phylogenetic relationships
among gymnotiform species.
The objectives of this project can be summarized as follows:
1) To clarify evolutionary relationships among known and newly discovered species of
gymnotiforms using orthologous genetic loci, including the scn4aa C-terminus;
2) To determine the utility of the scn4aa 3’ locus for reconstruction of phylogenetic
relationships; and
3) To assess patterns of variation at the Nav1.4a C-terminus, thereby contributing towards
understanding the evolutionary history of Gymnotiformes, and molecular mechanisms of
the protein.
Appendix A.2 Materials and Methods
A.2.1 Taxon Sampling
Efforts were made to comprehensively sample gymnotiform species from as many genera as
possible among all six families described in published phylogenies (Figure 2). Outgroup species
were comprehensively sampled from multiple other ostariophysan orders since there is no
consensus on the closest order to Gymnotiformes (Fink and Fink 1981; Saitoh et al. 2003). These
other orders were Characiformes (Calcagnotto et al. 2005), Cypriniformes (Mayden et al.
2009), Siluriformes (Sullivan et al. 2006), and Gonorhynchiformes. Efforts were made during
outgroup species sampling so that: nucleotide sequences were available from GenBank for as
many loci as possible; specimens were more likely to be easily available (i.e. through the hobby
aquarium trade); and taxa were phylogenetically diverse. More than one individual was sampled
per species whenever possible, as a control for variation within species.
Tissues for DNA extraction were stored in either 95-100% ethanol or salt saturated buffer
(20% DMSO, 0.25 M EDTA pH 8, saturated with NaCl). Tissue samples were from the
collections of Nathan Lovejoy, William Crampton, James Albert, and Javier Maldonado.
A.2.2 Locus and Primer Selection
The loci selected were: mitochondrial gene cytochrome b (cytb); and nuclear genes
recombination activating gene 2 (rag2) and the portion of the voltage-gated sodium channel gene
scn4aa that encodes the protein’s carboxyl-terminus (scn4aa 3’). They were selected for similar
reasons as those for the Gymnotus phylogeny in Chapter 2.
Appendix A Table 1 lists the primer sequences used for DNA amplification and
sequencing. Amplification primers for cytb and rag2 have been previously published.
Amplification primers for scn4aa 3’ were designed as per Chapter 2. Sequencing primers for all
loci selected were designed as necessary, also as per Chapter 2.
Appendix A Table 1. Primer Sequences
Primers used for polymerase chain reaction and sequencing are identified by their target loci, name, annealing direction, sequence, and
source.
Target Locus Name Amplification/Sequencing Direction 1 Sequence (listed as 5' → 3') Source of Sequence
scn4aa 3’ (6)1F 5' → 3' TCCTCCTGACTGTGACCCTG Chapter 2 Table 1
(6)2F 5' → 3' GGGCTTCTCCTSCCAACTC This study
(6)3F 5' → 3' GCTTCTCCTSCCAACTCTAAACA This study
(6)1R 3' ← 5' CATTTTTACACTTCATCACTCTCCAC Chapter 2 Table 1
(6)2R 3' ← 5' TCATTCCTAGACACCARCAAACAT This study
(6)3R 3' ← 5' CATCATTCCTAGACACCAGCAAACAT This study
(6)Seq1F TTGTAATGGGAGACAANATCC This study
(6)Seq2F GTCACTCARGAGGTCCT This study
(6)Seq1R GGCCGCATASWCCTCCTCCTT This study
(6)Seq2R TGAGGAGGTRYTGGCGGTA This study
(4)2R TTCCTGCAGTGCATCAACAAAG This study
(4)3R TGGGAATACGCATGGGTTC This study
cytochrome b GLU-L-CARP (AKA CytbF) 5' → 3' TGACTTGAAGAACCACCGTTG Palumbi et al. 1991
GLUDG-L 5' → 3' CGAAGCTTGACTTGAARAACCAYCGTTG Palumbi et al. 1991
L14841 5' → 3' AAAAAGCTTCCATCCAACATCTCAGCATGATGAAA Kocher et al. 1989
HA-danio (AKA CytbR) 3' ← 5' CTCCGATCTTCGGATTACAAG Mayden et al. 2007
CytbH15915 3' ← 5' AACTGCAGTCATCTCCGGTTTACAAGA Irwin et al. 1991
(C)Seq1F CAATGAGTCTGAGGAGGNTT Chapter 2 Table 1
(C)Seq2F CAATGAGTATGAGGAGGNTT This study
(C)Seq3F CAATGAGTTTGAGGGGGNTT Chapter 2 Table 1
(C)Seq4F CAATGAGTGTGGGGGGGNTT This study
(C)Seq5F CAATGAGTCTGAGGGGGNTT Chapter 2 Table 1
(C)Seq6F CAATGAGTATGAGGGGGNTT This study
(C)Seq7F CAATGAGTATGAGGGGGNTT This study
(C)Seq8F CAATGAGTTTGAGGCGGNTT Chapter 2 Table 1
(C)Seq9F CAATGAGTCTGAGGCGGNTT This study
(C)Seq10F CAATGAGTTTGAGGTGGNTT This study
(7)7R TCTAGTTCCTCTGGCTCCTC This study
recombination activating gene 2 Rag2GyF 5' → 3' ACAGGCRTCTTTGGKRTTCG Lovejoy et al. 2010
Rag2-F1 (this was only used for amplification) TTTGGRCARAAGGGCTGGCC Lovejoy and Collette 2001
MHRag2-F1 (AKA Rag2MHF1) 5' → 3' Hardman 2003
Rag2GyR 3' ← 5' TCATCCTCCTCATCTTCCTC Lovejoy et al. 2010
Rag2-R6 (this was only used for amplification) TGRTCCARGCAGAAGTACTTG Lovejoy and Collette 2001
MHRag2-R1 (AKA Rag2MHR1) 3' ← 5' Hardman 2003
(R)Seq1F AGAACCACAGAGAACTGGAACAC Chapter 2 Table 1
(R)Seq1R CTCTACACGCAGCCTGAACA Chapter 2 Table 1
(R)Seq2R TGCATTCGCTTYTGGGA Chapter 2 Table 1
16S mitochondrial ribosomal subunit 16sar-L 5' → 3' CGCCTGTTTATCAAAAACAT Palumbi et al. 1991
16sbr-H 3' ← 5' CCGGTCTGAACTCAGATCACGT Palumbi et al. 1991
1 Amplification/sequencing direction is only identified for primers used for both amplification and sequencing, since sequencing-only
primers may have been used to sequence nucleotides in different directions.
A.2.3 DNA Extraction, Nucleotide Amplification, and Sequencing
To obtain DNA, excised muscle tissue was processed using the DNeasy Blood and Tissue Spin-
Column Kit (Qiagen). Nucleotide sequences from previous studies were obtained from
GenBank. This includes most of the cytb and rag2 data. All of the scn4aa 3’ sequences were
experimentally obtained as part of this study. See Appendix A Table 2 for the source of each
sequence. Nucleotide amplification and sequencing methods were the same as those for Chapter
2.
A.2.4 Nucleotide Sequence Verification and Alignment
All sequences experimentally obtained for this study were visually inspected for misreads, and
edited using Sequencher™ (Gene Codes Corporation, Ann Arbor, MI). Ambiguous base calls
were considered as possibly any nucleotide. For scn4aa sequences, amplification and sequencing
of the exon encoding the protein’s carboxyl-terminus (scn4aa 3’) from the desired member of the
gene family was verified as per Chapter 2.
Directions and applicable codon positions of the nucleotide sequences were determined
by comparison with published Danio rerio (rag2 Accession # NM_131385, cytb Accession #
NC_002333) and Electrophorus electricus (scn4aa 3’ Accession # M22252) sequences.
Nucleotides from the protein coding loci (cytb, rag2, and scn4aa 3’) were aligned based on their
amino acid alignments, as per Chapter 2.
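The idea of aligning coding nucleotides on the basis of an amino acid alignment can be sketched as follows. This is a minimal illustration only, not the procedure of Chapter 2: the function name `thread_codons` and the toy sequences are hypothetical, and the sketch assumes the coding sequence is in frame and contains exactly one codon per aligned residue.

```python
def thread_codons(protein_row: str, cds: str, gap: str = "-") -> str:
    """Place the codons of an ungapped coding sequence onto one gapped
    row of a protein alignment, so each amino acid gap becomes three
    nucleotide gaps and codons are never split."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    residues = sum(1 for aa in protein_row if aa != gap)
    if residues != len(codons):
        raise ValueError("coding sequence does not match the aligned protein")
    codon_iter = iter(codons)
    # One codon per residue, three gap characters per protein gap.
    return "".join(gap * 3 if aa == gap else next(codon_iter)
                   for aa in protein_row)


# Toy example with made-up sequences (not data from this study).
aligned = thread_codons("M-KL", "ATGAAACTG")
print(aligned)  # ATG---AAACTG
```

Threading codons in this way keeps the reading frame intact, which is what allows codon positions to be assigned consistently across the alignment.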
A.2.5 Phylogenetic Reconstruction
Phylogenetic reconstruction was conducted using the total evidence alignment. The resulting
phylogeny was compared with separate analyses of the following alignments: cytb; rag2; and
scn4aa 3’.
Appendix A Table 2. Specimens and Nucleotide Sequences Used for Gymnotiformes Analyses
Specimens used for analysis are identified by their scientific names, tissue sample numbers, museum catalogue numbers, collection
localities, and applicable GenBank Accession numbers. Sequences obtained by the author for this project are identified with “**”.
Sequences obtained from lab records are identified with “*” or their GenBank Accession Number, if applicable.
Genus Species Tissue sample number Museum catalog number Collection Locality Nucleotide sequences (scn4aa 3’; cytochrome b; recombination activating gene 2)
Order Gymnotiformes: Family Apteronotidae
Adontosternarchus sachsi 2877 (unknown) (unknown) ** **
Adontosternarchus sachsi 2888 (unknown) (unknown) ** **
Apteronotus albifrons 7301 (unknown) Brazil ** **
Apteronotus albifrons 2615 (unknown) (aquarium specimen) ** **
Apteronotus bonapartii 2914 (unknown) (unknown) ** **
Apteronotus bonapartii 2616 (unknown) (aquarium specimen) ** **
Apteronotus (Ubidia) magdalenensis 4008 (unknown) Colombia ** **
Apteronotus (Ubidia) magdalenensis 4009 (unknown) Colombia ** **
Compsaraia n. sp. B 1991 (unknown) Peru ** **
Magosternarchus raptor 2838 (unknown) (unknown) ** **
Magosternarchus raptor 2910 (unknown) (unknown) ** **
Orthosternarchus tamandua 2447 (unknown) Peru ** **
Orthosternarchus tamandua 2625 (unknown) (aquarium specimen) ** **
Parapteronotus hasemani 2626 (unknown) (aquarium specimen) ** **
Parapteronotus hasemani 2627 (unknown) (aquarium specimen) ** **
Platyurosternarchus macrostomus 7302 (unknown) Brazil ** **
Platyurosternarchus macrostomus 2629 (unknown) (aquarium specimen) ** **
Porotergus gimbeli 2889 (unknown) (unknown) ** **
Porotergus gimbeli 2902 (unknown) (unknown) ** **
Sternarchella schotti 2860 (unknown) (unknown) ** **
Sternarchella schotti 2876 (unknown) (unknown) ** **
Sternarchogiton nattereri 2863 (unknown) (unknown) ** **
Sternarchogiton nattereri 2864 (unknown) (unknown) ** **
Sternarchorhamphus muelleri 2103 (unknown) (unknown) ** **
Sternarchorhamphus muelleri 2102 (unknown) (unknown) ** **
Sternarchorhynchus roseni 2920 (unknown) (unknown) ** **
Sternarchorhynchus oxyrhynchus 7303 (unknown) Brazil ** **
Sternarchorhynchus oxyrhynchus 7304 (unknown) (unknown) ** **
Order Gymnotiformes: Family Electrophoridae
Electrophorus electricus 2026 MZUSP 103218 Lago Secretaria, Tefé, Brazil ** GQ862593 GQ862541
Electrophorus electricus 2619 UF 116585 Rio Nanay, Peru ** GQ862592 GQ862540
Order Gymnotiformes: Family Gymnotidae
Gymnotus arapaima 2002 MZUSP 75179 Lago Mamirauá, Tefé, Amazonas, Brazil ** GQ862595 GQ862543
Gymnotus arapaima 2003 MZUSP 103219 Lago Mamirauá, Tefé, Amazonas, Brazil ** GQ862596 GQ862544
Gymnotus cataniapo 2062 UF 174330 Rio Atabapo, Venezuela ** GQ862603 GQ862552
Gymnotus cataniapo 2063 UF 174332 Rio Cataniapo, Venezuela ** GQ862604 GQ862579
Gymnotus cylindricus 2092 ROM 84772 Rio Tortuguero, Costa Rica ** GQ862615 GQ862563
Gymnotus cylindricus 2093 ROM 84772 Rio Tortuguero, Costa Rica ** GQ862616 GQ862564
Gymnotus jonasi 2016 MZUSP 103220 Rio Solimões, Tefé, Amazonas, Brazil ** GQ862619 GQ862567
Gymnotus jonasi 2471 UF 131410 Rio Ucayali, Pacaya Samiria Reserve, Peru ** GQ862620 GQ862568
Gymnotus mamiraua 2012 MZUSP 103221 Rio Solimões, Tefé, Amazonas, Brazil ** GQ862621 GQ862569
Gymnotus mamiraua 2013 MCP 29805 Rio Solimões, Tefé, Amazonas, Brazil ** GQ862622 GQ862570
Gymnotus obscurus 2017 MZUSP 75155 Lago Mamirauá, Tefé, Amazonas, Brazil ** GQ862623 GQ862571
Gymnotus obscurus 2018 MZUSP 75157 Lago Mamirauá, Tefé, Amazonas, Brazil ** GQ862624 GQ862572
Gymnotus pantherinus 2039 (no voucher) Rio Perequê-Açu, Brazil ** GQ862625 GQ862573
Gymnotus pantherinus 2945 MZUSP 87564 Rio Vermelho, Sao Paulo, Brazil ** * *
Gymnotus tigre 7090 (not catalogued) (aquarium specimen) ** ** **
Gymnotus tigre 7349 (not catalogued) (aquarium specimen) ** ** **
Gymnotus varzea 2014 MZUSP 75163 Rio Solimões, Tefé, Amazonas, Brazil ** * *
Gymnotus varzea 2015 MZUSP 75164 Rio Solimões, Tefé, Amazonas, Brazil ** * *
Gymnotus n. sp. fritzi 7109 (not catalogued) Tefé, Amazonas, Brazil ** ** **
Gymnotus cf. tigre 2024 UF 122821 Rio Amazonas, Peru ** GQ862632 GQ862580
Order Gymnotiformes: Family Hypopomidae
Brachyhypopomus beebei 2510 (unknown) Peru ** ** **
Brachyhypopomus beebei 2524 (unknown) Peru ** **
Brachyhypopomus brevirostris 2617 UF 116556 Rio Nanay, Peru GQ862588 GQ862536
Brachyhypopomus brevirostris 7019 (unknown) Suriname ** **
Brachyhypopomus diazi 305 UF 174334 Rio Los Marias, Venezuela ** GQ862589 GQ862537
Brachyhypopomus diazi 2408 UF 174334 Rio Alpargatón, Venezuela ** GQ862590 GQ862538
Brachyhypopomus occidentalis 2948 (unknown) Rio Atrato, Choco, Colombia ** ** **
Brachyhypopomus occidentalis 2949 (unknown) Rio Atrato, Choco, Colombia ** ** **
Brachyhypopomus occidentalis 7156 (unknown) Panama ** ** **
Brachyhypopomus occidentalis 7162 (unknown) Panama ** ** **
Brachyhypopomus n. sp. PAL 2432 UF 148572 Rio Palenque, Ecuador ** GQ862591 GQ862539
Brachyhypopomus n. sp. PAL 2433 (unknown) Rio Palenque, Ecuador ** ** **
Brachyhypopomus pinnicaudatus 2121 (unknown) Tefé, Brazil ** ** **
Brachyhypopomus pinnicaudatus 2122 (unknown) Tefé, Brazil ** ** **
Hypopomus artedi 2232 ANSP 179505 Rio Mazaruni, Guyana ** GQ862637 GQ862585
Hypopomus artedi 2233 AUM 35574 Rio Mazaruni, Guyana ** ** **
Hypopygus lepturus 2438 (unknown) Rio Nanay, Peru ** **
Hypopygus lepturus 2439 (unknown) Rio Nanay, Peru ** **
Microsternarchus bilineatus 2396 (unknown) Rio Atabapo, Venezuela ** **
Racenisia fimbripinna 2339 (unknown) Rio Atabapo, Santa Barbara, Venezuela ** ** **
Racenisia fimbripinna 2340 (unknown) Rio Atabapo, Santa Barbara, Venezuela ** ** **
Steatogenys duidae 2146 (unknown) Tefé, Brazil ** ** **
Steatogenys duidae 2147 (unknown) Tefé, Brazil ** ** **
Stegostenopos cryptogenes 2322 (unknown) Rio Atabapo, Santa Barbara, Venezuela ** **
Order Gymnotiformes: Family Rhamphichthyidae
Gymnorhamphichthys rondoni 2153 (unknown) Brazil ** **
Gymnorhamphichthys rondoni 2154 (unknown) Brazil ** **
Rhamphichthys "saddled" 7282 (unknown) (unknown) ** **
Rhamphichthys "saddled" 7283 (unknown) (unknown) ** **
Rhamphichthys "clear" 7284 (unknown) (unknown) ** ** **
Rhamphichthys "clear" 7285 (unknown) (unknown) ** ** **
Rhamphichthys hypostomus 7309 (unknown) Brazil ** **
Rhamphichthys hypostomus 7310 (unknown) Brazil ** **
Rhamphichthys lineatus 2630 (unknown) (aquarium specimen) ** ** **
Rhamphichthys lineatus 2158 (unknown) Brazil ** ** **
Rhamphichthys sp. 7286 (unknown) (unknown) **
Rhamphichthys sp. 7287 (unknown) (unknown) ** **
Order Gymnotiformes: Family Sternopygidae
Archolaemus blax 7307 (unknown) Brazil ** ** **
Archolaemus blax 7308 (unknown) (unknown) ** ** **
Distocyclus conirostris 7306 (unknown) (unknown) ** **
Distocyclus conirostris 2911 (unknown) (unknown) ** **
Eigenmannia humboldtii 2811 (unknown) Colombia ** ** **
Eigenmannia humboldtii 2822 (unknown) Colombia ** ** **
Eigenmannia limbata 1938 UF 126255 Peru ** ** **
Eigenmannia limbata 1939 UF 126255 Peru ** ** **
Eigenmannia virescens 2817 (unknown) Colombia ** ** **
Eigenmannia virescens 2818 (unknown) Colombia ** **
Eigenmannia virescens 2309 (unknown) Venezuela ** ** **
Eigenmannia virescens 2310 (unknown) Venezuela ** ** **
Rhabdolichops caviceps 2883 (unknown) (unknown) ** ** **
Rhabdolichops caviceps 2887 (unknown) (unknown) ** ** **
Rhabdolichops eastwardi 2105 (unknown) (unknown) ** **
Rhabdolichops eastwardi 2104 (unknown) (unknown) ** **
Sternopygus aequilabiatus 2819 (unknown) Colombia ** * **
Sternopygus aequilabiatus 2820 (unknown) Colombia ** * **
Sternopygus astrabes 2203 (unknown) Lago Tefé, Igarapé Repartimento, Brazil ** * **
Sternopygus astrabes 2204 (unknown) Brazil ** * **
Sternopygus dariensis 7223 (unknown) West of the Andes ** ** **
Sternopygus dariensis 7224 (unknown) West of the Andes ** ** **
Sternopygus macrurus 2507 UF 131396 Peru ** * **
Sternopygus macrurus 2639 UF 117121 Rio Nanay, Peru ** GQ862639 GQ862587
Order Characiformes: Family Alestidae
Alestes baremoze AMNH 226451 AY791360 AY804029
Alestopetersius hilgendorfi AMNH 233438 AY791432 AY804114
Arnoldichthys spilopterus AMNH 233399 AY791364 AY804032
Bathyaethiops breuseghemi AMNH 233422 AY791430 AY804113
Brycinus carolinae 1 RUSI 065136 AY791359 AY804028
Brycinus carolinae 2 AMNH 233628 AY791373 AY804045
Brycinus nurse AMNH 233415 AY804034
Brycinus schoutedeni AY791377 AY804050
Bryconaethiops microstoma AMNH 233390 AY791371 AY804041
Hydrocynus vittatus 1 AMNH 233623 AY791404 AY804083
Hydrocynus vittatus 2 RUSI 061489 AY791410 AY804091
Ladigesia roloffi AMNH 233394 AY791417 AY804097
Micralestes occidentalis RUSI 065135 AY791358 AY804027
Phenacogrammus interruptus 1 AMNH 233442 AY791421 AY804102
Phenacogrammus interruptus 2 AMNH 233444 AY791434 AY804116
Order Characiformes: Family Characidae
Chalceus macrolepidotus AMNH 233404 AY791385 AY804060
Exodon paradoxus AMNH 233426 AY791397 AY804072
Salminus maxillosus AY791438 AY804124
Order Characiformes: Family Crenuchidae
Characidium fasciatum AMNH 233251 AY791380 AY804055
Characidium vidali MNRJ 12838 AY791388 AY804064
Order Characiformes: Family Ctenolucidae
Ctenolucius hujeta AMNH 233412 AY791384 AY804059
Order Characiformes: Family Distichodontidae
Distichodus notospilus AMNH 231537 AY791395 AY804069
Distichodus sexfasciatus AMNH 233393 AY791396 AY804071
Hemigrammocharax multifasciatus RUSI 63497 AY791407 AY804085
Neolebias ansorgii AY791423 AY804106
Neolebias trilineatus AMNH 233439 AY791425 AY804108
Order Characiformes: Family Hepsetidae
Hepsetus odoe AMNH 231495 AY791408 AY804086
Order Characiformes: Family Prochilodontidae
Prochilodus nigricans AMNH 233305 AY791437 AY804120
Order Characiformes: Family Serrasalmidae
Colossoma macropomum AY791386 AY804061
Piaractus brachypomus MZUSP 85849 AY791429 AY804112
Pygocentrus nattereri AY791436 AY804119
Order Cypriniformes: Family Catostomidae
Myxocyprinus asiaticus AP006764 DQ367043
Order Cypriniformes: Family Cyprinidae
Danio rerio NM_001039825 NC_002333 NM_131385
Barbus barbus AB238965 DQ366990
Carassius auratus DQ366941
Order Cypriniformes: Family Gobioninae
Gobio gobio AB239596 DQ367015
Order Cypriniformes: Family Leuciscinae
Cyprinella lutrensis AB070206 DQ367019
Phoxinus phoxinus EF094550 DQ367022
Order Cypriniformes: Family Tincinae
Tinca tinca AB218686 DQ367029
Order Cypriniformes: Family Xenocyprinae
Xenocypris argentea AP009059 DQ367024
Order Siluriformes: Family Akysidae
Acrochordonichthys rugosus INHS 93578 EU490899 DQ492332
Order Siluriformes: Family Anchariidae
Gogo arcuatus UMMZ 238042 FJ013160 DQ492415
Order Siluriformes: Family Ariidae
Bagre marinus CU 906192 AJ581355 DQ492411
Order Siluriformes: Family Aspredinidae
Micromyzon akamai ANSP 182777 EU490892 DQ492424
Order Siluriformes: Family Auchenipteridae
Ageneiosus ucayalensis INHS 52920 EU490898 DQ492351
Order Siluriformes: Family Bagridae
Bagrus docmak CU 90408 EU490906 EU490906
Hemibagrus wyckioides INHS 93682 EU490911 DQ492349
Heterobagrus bocourti INHS 93586 EU490912 DQ492350
Olyra longicaudatus Private Collection H.H. Ng EU490918 DQ492347
Rita rita Private Collection H.H. Ng EU490921 DQ492405
Order Siluriformes: Family Cetopsidae
Cetopsis coecutiens INHS 52923 DQ486759 DQ492419
Order Siluriformes: Family Clariidae
Clarias gabonensis CU 803712 AY995129 DQ492406
Order Siluriformes: Family Cranoglanididae
Cranoglanis bouderius ASIZB 1383452 AF475155 DQ492401
Order Siluriformes: Family Doradidae
Acanthodoras cataphractus ANSP 179854 EU490895 DQ492354
Order Siluriformes: Family Horabagridae
Horabagrus brachysoma INHS 935851; INHS 9359052 EU490913 DQ492409
Order Siluriformes: Family Ictaluridae
Ictalurus punctatus INHS 939041; ANSP 1803682 AY184254 DQ492398
Order Siluriformes: Family Pimelodidae
Phractocephalus hemioliopterus ANSP 179452 DQ486763 DQ492364
Pimelodus ornatus uncat., Coll. M. Azpelicueta P2791; INHS 491022 EF564741 DQ492363
Order Siluriformes: Family Plotosidae
Plotosus lineatus ANSP 182776 EU490919 DQ492418
Order Siluriformes: Family Schilbidae
Ailia coila Private Collection H.H. Ng EU490901 DQ492340
Schilbe intermedius CU 882512 AJ245673 DQ492395
Order Siluriformes: Family Siluridae
Kryptopterus minor TNHC 293491; ANSP 1827782 AY458895 DQ492373
Order Siluriformes: Family Sisoridae
Bagarius yarrelli INHS 93673 EU490904 DQ492334
Order Siluriformes: Family Trichomycteridae
Trichomycterus guianense INHS 49567 DQ486760 DQ492319
Order Gonorhynchiformes: Family Chanidae
Chanos chanos AB054133
1 Museum catalog number associated with cytb sample only.
2 Museum catalog number associated with rag2 sample only.
Parsimony-based phylogenetic reconstruction was implemented in PAUP* (Swofford
2002) using the stepwise heuristic search algorithm with the following parameters for 2000
search replicates: tree bisection and reconnection (TBR) branch swapping; and holding 10
variants at each step. Bootstrapping was also conducted for 2000 search replicates with the same
parameters (Müller 2005).
Bayesian phylogenetic reconstruction was implemented in MrBayes 3.1.2 (Huelsenbeck
and Ronquist 2001), using the model of molecular evolution that best fit the data as determined
using MrModeltest 2.3 (Nylander 2004). It was the same model for the total evidence and
individual locus alignments – general time-reversible model, with a proportion of nucleotide
sites that are invariant, and the variation in nucleotide substitution rates across the variant
nucleotide sites estimated from a gamma distribution (GTR + I + G; Brinkman and Leipe 2001).
The total evidence alignment was partitioned into the three loci, and analyzed with temp = 0.2.
The cytb, rag2, and scn4aa 3’ alignments were analyzed with temp = 0.05, 0.1, and 0.2. Each of
these four alignments was analyzed with nperts = default, 2, and 4. The analyses were run for up
to 10 million generations with four chains each to reach an average standard deviation of split
frequencies of 0.01 or less. All other parameters were program defaults.
Appendix A.3 Results
A.3.1 Nucleotide Sequence Data
Nucleotide sequences were obtained from 110 gymnotiform individuals representing all 6
families: 99 of which represent 49 recognized species; 11 of which represent up to another 8
undescribed species. The numbers of species sampled per family are as follows: Apteronotidae
(15); Electrophoridae (1); Gymnotidae (11); Hypopomidae (13); Rhamphichthyidae (7); and
Sternopygidae (10). Sequences were also obtained from 65 outgroup individuals, which
represent 62 species from other ostariophysan orders. The numbers of species sampled per order
are as follows: Characiformes (29); Cypriniformes (8); Siluriformes (24); and
Gonorhynchiformes (1). Appendix A Table 2 identifies the specimens used for analysis by their
scientific names, tissue sample numbers, museum catalogue numbers, and collection localities.
A total of 409 nucleotide sequences were obtained for phylogenetic analyses. For
cytochrome b (cytb), recombination activating gene 2 (rag2), and the portion of the voltage-gated
sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus (scn4aa 3’): 85, 86,
and 66 sequences were from experiments for this study, respectively; and 85, 86, and 1 were
from GenBank, respectively.
The total evidence nucleotide alignment consisted of 3570 nucleotide positions, 1873 of
which were parsimony informative, and another 291 were variable but parsimony uninformative.
The alignment consisted of nucleotide positions from the following loci: 1146 from cytb; 1611
from rag2; and 813 from scn4aa 3’. Nucleotide positions among the housekeeping mitochondrial
locus cytb included 730 variable positions, of which 671 were parsimony informative. Those
among rag2 included 870 variable positions, of which 744 were parsimony informative. Those
among scn4aa 3’ included 564 variable positions, of which 458 were parsimony informative.
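As a quick consistency check, the per-locus counts reported above can be tallied. The short sketch below only re-adds the numbers given in the text; the dictionary layout and variable names are illustrative.

```python
# Per-locus counts as reported in the text:
# (aligned positions, variable positions, parsimony-informative positions)
loci = {
    "cytb": (1146, 730, 671),
    "rag2": (1611, 870, 744),
    "scn4aa 3'": (813, 564, 458),
}

total_positions = sum(n for n, _, _ in loci.values())
total_variable = sum(v for _, v, _ in loci.values())
total_informative = sum(p for _, _, p in loci.values())
# Variable sites that are not parsimony informative.
total_uninformative = total_variable - total_informative

print(total_positions)      # 3570
print(total_variable)       # 2164
print(total_informative)    # 1873
print(total_uninformative)  # 291
```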
Scn4aa 3’ sequences were not obtained for Apteronotidae and outgroup taxa, despite the
use of various primers (amplification and sequencing) and polymerase chain reaction parameters
(Appendix A Table 2). Among the nucleotide sequences obtained, 18.60% of nucleotides were
ambiguous (proportion of ambiguous sites among nucleotides: 9743/194820 cytb nucleotides;
86260/277092 rag2 nucleotides; and 2343/56910 scn4aa 3’ nucleotides). The ambiguous sites
have chromatograms that do not clearly show a single nucleotide identity. Although it is possible
some are polymorphic sites, it was assumed that they were due to experimental error for the
purposes of phylogenetic analyses.
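The overall proportion of ambiguous nucleotides follows from pooling the per-locus counts given above. The sketch below reproduces that arithmetic; the variable names are illustrative, and the per-locus percentages it prints are derived values that are not stated in the text.

```python
# (ambiguous nucleotides, total nucleotides) per locus, as reported above
ambiguous_counts = {
    "cytb": (9743, 194820),
    "rag2": (86260, 277092),
    "scn4aa 3'": (2343, 56910),
}

ambiguous_total = sum(a for a, _ in ambiguous_counts.values())
nucleotide_total = sum(n for _, n in ambiguous_counts.values())
overall_percent = 100 * ambiguous_total / nucleotide_total

for locus, (a, n) in ambiguous_counts.items():
    print(f"{locus}: {100 * a / n:.2f}% ambiguous")
print(f"overall: {overall_percent:.2f}% ambiguous")  # overall: 18.60% ambiguous
```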
This dataset represents the most complete sampling of gymnotiform nucleotide sequence
data. Compared to the most recent molecular phylogenetic reconstruction of Gymnotiformes
(Alves-Gomes et al. 1995), this dataset includes 9 additional gymnotiform genera as well as an
additional locus.
A.3.2 Phylogenetic Reconstruction
Molecular phylogenetic analyses were conducted using nucleotide alignments of various loci
(cytb, rag2, and scn4aa 3’) and the total evidence alignment, using both maximum parsimony
(MP) and Bayesian inference (BI) algorithms. The 50% majority-rule consensus topologies from
the MP analyses are shown in Appendix A Figures 1-4. The MP consensus topologies were
produced from the most parsimonious trees based on analyses of various loci: cytb (4 trees); rag2
(46 trees); scn4aa 3’ (2822 trees); and the total evidence nucleotide alignment (7 trees). The BI
analyses did not converge to the target average standard deviation of split frequencies (≤ 0.01).
This was after more than two months of analysis per nucleotide alignment using a 2.50 GHz
quad-core computer with 4 GB of random access memory.
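The 50% majority-rule consensus used to summarize the most parsimonious trees can be illustrated with a minimal sketch. It is not the PAUP* implementation: it assumes each tree has already been reduced to its set of clades (each clade a set of taxon names), and the function name `majority_rule_clades` and the taxa A, B, and C are hypothetical.

```python
from collections import Counter


def majority_rule_clades(trees, threshold=0.5):
    """Return the clades present in more than `threshold` of the trees,
    mapped to the fraction of trees that contain them.

    Each tree is a collection of clades; each clade is a frozenset of
    taxon names."""
    counts = Counter(clade for tree in trees for clade in set(tree))
    n_trees = len(trees)
    return {clade: count / n_trees
            for clade, count in counts.items()
            if count / n_trees > threshold}


# Three toy trees over hypothetical taxa A, B, and C.
trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("BC"), frozenset("ABC")},
]
consensus = majority_rule_clades(trees)
for clade, freq in sorted(consensus.items(), key=lambda kv: -kv[1]):
    print(sorted(clade), round(freq, 2))
```

Here the clade {A, B} appears in two of the three trees and is retained, while {B, C} appears in only one and is collapsed in the consensus.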
The order Gymnotiformes was resolved as a monophyletic group based on phylogenetic
reconstruction of the rag2 locus (Appendix A Figure 2). It would also have been resolved as a
monophyletic group based on the cytb locus and total evidence nucleotide alignment if it were
not for the inclusion of a siluriform species (Cetopsis coecutiens) in the group (Appendix A
Figures 1 and 4). The most distant outgroup to order Gymnotiformes was identified as
Cypriniformes based on the cytb and total evidence nucleotide alignments (Appendix A Figures
1 and 4). The closest outgroup was identified as order Characiformes based on the total evidence
nucleotide alignment (Appendix A Figure 4); however, this was not well supported.
Appendix A Figure 1. Molecular Phylogeny for Gymnotiformes Based on the cytb
Nucleotide Alignment Using Maximum Parsimony
Phylogenetic reconstruction was conducted based on the nucleotide alignment of cytochrome b
(cytb) using maximum parsimony. Individuals for which nucleotide sequences had not been
obtained for that locus were pruned from the 50% majority-rule consensus topologies. Numbers
above the branches indicate bootstrap values. The families are coloured as follows:
Apteronotidae (light blue); Electrophoridae (dark blue); Gymnotidae (violet); Hypopomidae
(red); Rhamphichthyidae (yellow); Sternopygidae (green).
Appendix A Figure 2. Molecular Phylogeny for Gymnotiformes Based on the rag2
Nucleotide Alignment Using Maximum Parsimony
Phylogenetic reconstruction was conducted based on the nucleotide alignment of recombination
activating gene 2 (rag2) using maximum parsimony. Individuals for which nucleotide sequences
had not been obtained for that locus were pruned from the 50% majority-rule consensus
topologies. Numbers above the branches indicate bootstrap values. The families are coloured as
follows: Apteronotidae (light blue); Electrophoridae (dark blue); Gymnotidae (violet);
Hypopomidae (red); Rhamphichthyidae (yellow); Sternopygidae (green).
Appendix A Figure 3. Molecular Phylogeny for Gymnotiformes Based on the scn4aa 3’
Nucleotide Alignment Using Maximum Parsimony
Phylogenetic reconstruction was conducted using the nucleotide alignment of the portion of the
voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus (scn4aa
3’) using maximum parsimony. Individuals for which nucleotide sequences had not been
obtained for that locus were pruned from the 50% majority-rule consensus topologies. Numbers
above the branches indicate bootstrap values. The families are coloured as follows:
Apteronotidae (light blue); Electrophoridae (dark blue); Gymnotidae (violet); Hypopomidae
(red); Rhamphichthyidae (yellow); Sternopygidae (green).
Appendix A Figure 4. Molecular Phylogeny of Gymnotiformes Based on the Total Evidence Alignment
Phylogenetic reconstruction was conducted based on the total evidence nucleotide alignment from Gymnotiformes, consisting of nucleotide sequences from cytochrome b, recombination activating gene 2, and the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus. The 50% majority-rule consensus topology is shown. Numbers above the branches indicate bootstrap values. The families are coloured as follows: Apteronotidae (light blue); Electrophoridae (dark blue); Gymnotidae (violet); Hypopomidae (red); Rhamphichthyidae (yellow); Sternopygidae (green).
The families Apteronotidae and Sternopygidae were resolved as monophyletic groups
based on the cytb, rag2, and total evidence nucleotide alignments (Appendix A Figures 1-2 and
4). The family Gymnotidae was resolved as a monophyletic group based on the rag2, scn4aa 3’,
and total evidence nucleotide alignments (Appendix A Figures 2-4). It would also have been
resolved as a monophyletic group based on the cytb locus if it were not for the inclusion of a
siluriform species (C. coecutiens) in the group (Appendix A Figure 1). The families
Electrophoridae (consisting of only 1 species) and Rhamphichthyidae, as well as a clade uniting
Rhamphichthyidae with its sister sub-group of the family Hypopomidae
(including Steatogenys duidae), were consistently resolved as monophyletic groups (Appendix A
Figures 1-4). Based on the cytb, rag2, and total evidence nucleotide alignments, the families
Rhamphichthyidae and Hypopomidae together formed a monophyletic group in which the former
was nested within the latter.
Within the family Gymnotidae, three major monophyletic clades were consistently
resolved (Appendix A Figures 1-2 and 4; clade names as per Lovejoy et al. 2010): Gymnotus
carapo group; G2 group; and G1 group. Within the family Sternopygidae, the genus Sternopygus
was consistently resolved as a monophyletic group with the possible exception of Sternopygus
aequilabiatus (Appendix A Figures 1-4).
The family Electrophoridae was identified as the sister clade of a group composed of all the
other gymnotiform families based on the rag2 locus (Appendix A Figure 2). However, based
on the total evidence nucleotide alignment: the family Sternopygidae was the sister clade of the other
gymnotiform families; the family Electrophoridae was the sister clade of the family Gymnotidae;
and the family Apteronotidae was the sister clade of the Electrophoridae + Gymnotidae group
(Appendix A Figure 4).
A.3.3 Variation in the Nav1.4a C-terminus
There were 177 variable sites in the gymnotiform voltage-gated sodium channel protein Nav1.4a
carboxyl-terminus (C-terminus) amino acid alignment compared with 43 in the Gymnotus
alignment (Table 7). The eight sites that were identified as positively selected from the
Gymnotus alignment were also variable in amino acid identity in the gymnotiform alignment.
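The count of variable sites reported above can be illustrated with a minimal Python sketch. It assumes that the alignment is held as a list of equal-length amino acid strings and that gaps and unknown residues (-, X, ?) are ignored when comparing a column. The function name variable_sites and the toy sequences are hypothetical and are not taken from the analyses in this project.

```python
def variable_sites(alignment, ignore="-X?"):
    """Return 0-based column indices where more than one residue is observed.

    Assumption: characters listed in `ignore` (gaps, unknown residues) do not
    count as a distinct amino acid identity.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    sites = []
    for col in range(length):
        # Collect the distinct residues seen in this column.
        residues = {seq[col].upper() for seq in alignment} - set(ignore)
        if len(residues) > 1:
            sites.append(col)
    return sites


# Hypothetical toy alignment of four short peptide sequences.
toy_alignment = ["MKLVT", "MRLVT", "MKLIT", "MK-VT"]
print(variable_sites(toy_alignment))
```

Applying the same function to the Gymnotus and gymnotiform alignments separately would, under these assumptions, allow the two sets of variable sites to be compared position by position.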
Appendix A.4 Discussion
A.4.1 Gymnotiform Phylogeny
There are several phylogenies of order Gymnotiformes (Chapter 1 Figure 2). Some of the proposed
phylogenetic relationships are consistent with each other, while some are unclear. This project
used additional taxa and nucleotide sequences to provide further evidence towards clarifying
proposed phylogenetic relationships among gymnotiforms. However, it is still not clear which
family is the most basal.
Phylogenies were reconstructed using maximum parsimony (MP) and Bayesian inference
(BI). For both MP and BI analyses, the length of time required generally depends on the number
of taxa, number of non-ambiguous characters, and number of sequence replicates or generations
specified. How well the consensus phylogenies fit the data is estimated by bootstrap values and
posterior probabilities, respectively. For MP analyses, individual sequence replicates represent
independent calculations. For BI analyses, each generation uses previous generations as a basis for
improvement, so consensus phylogenies are estimated from recent generations
that are similar to each other. Unfortunately, BI analyses based on individual loci and the total
evidence alignment did not result in recent generations that met the target amount of similarity
with each other (average standard deviation of split frequencies ≤ 0.01). The possibility of
meeting this target through additional generations and computer power is low, given experience
with other datasets (Hall 2011). The possibility of meeting this target through decrease in non-
ambiguous characters exists (Wiens 1998). The possibility of meeting this target through
increase in taxa exists. Taxon sampling was deliberately diverse and inclusive (approximately 2
and 10 times the number of ingroup and outgroup species compared with Chapters 1-4,
respectively). However, none of the individual loci analyses met the target. The possibility of
using other phylogenetic reconstruction algorithms such as minimum evolution and maximum
likelihood also exists (Alves-Gomes 1999).
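The convergence target mentioned above (average standard deviation of split frequencies ≤ 0.01) can be sketched in Python as follows. This is a minimal illustration, not the calculation performed by the software used in this project. It assumes that the frequency of each split has already been tabulated for each independent run, that splits below a minimum frequency of 0.10 in every run are excluded, and that the sample standard deviation is used; the function name, the split labels, and the frequencies are all hypothetical.

```python
from statistics import stdev


def average_sd_split_freqs(runs, min_freq=0.10):
    """Average, over splits, of the standard deviation of split frequency across runs.

    runs: list of dicts, one per independent run, mapping a split label to the
    fraction of sampled trees that contain that split.
    Assumptions: a split absent from a run has frequency 0.0, and splits whose
    frequency is below `min_freq` in every run are ignored.
    """
    if len(runs) < 2:
        raise ValueError("need at least two independent runs")
    all_splits = set().union(*runs)
    deviations = []
    for split in all_splits:
        freqs = [run.get(split, 0.0) for run in runs]
        if max(freqs) >= min_freq:
            deviations.append(stdev(freqs))
    return sum(deviations) / len(deviations) if deviations else 0.0


# Hypothetical split frequencies from two runs on a four-taxon problem.
run_a = {"AB|CD": 0.90, "AC|BD": 0.10}
run_b = {"AB|CD": 0.70, "AD|BC": 0.30}
value = average_sd_split_freqs([run_a, run_b])
print(f"average SD of split frequencies: {value:.4f}; target met: {value <= 0.01}")
```

In this toy case the two runs disagree strongly, so the value falls well above 0.01, which corresponds to the situation described for the BI analyses here.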
The outgroup consisted of ostariophysian species belonging to orders outside of order
Gymnotiformes (Characiformes, Cypriniformes, Gonorhynchiformes, and Siluriformes). The
order Gymnotiformes was resolved as monophyletic when reconstructed with the recombination
activating gene 2 (rag2) locus, consistent with all the existing phylogenies (Appendix A Figure
2). However, when reconstructed with the cytochrome b (cytb) locus, a siluriform species of
family Cetopsidae (Cetopsis coecutiens) was well supported as part of the gymnotiform clade
(Appendix A Figure 1). The possibility of technical error during and downstream from GenBank
sequence download is low, since the C. coecutiens cytb nucleotide sequence in the alignment is
identical to that listed in GenBank. The possibility of technical error during initial collection of
sequence data exists. According to listings in GenBank, the C. coecutiens rag2 and cytb
sequences are associated with the same museum catalog number, but 2 different publications
(Hardman and Lundberg 2006; Sullivan et al. 2006). Although the topology of the C. coecutiens
specimen was consistent between the 2 publications, the topologies may not be comparable due
to key differences in species sampled. Other explanations are also possible, given the
intriguing observation that family Cetopsidae may be the only non-gymnotiform fish within
superorder Ostariophysi with tuberous electroreceptors (Alves-Gomes 2001). The most distant
outgroup to Gymnotiformes was identified as Cypriniformes, consistent with most Ostariophysi
phylogenies (Saitoh et al. 2003). The closest outgroup to Gymnotiformes was identified as
Characiformes, but this was not well supported.
Within the order Gymnotiformes, families Electrophoridae, Gymnotidae,
Rhamphichthyidae, Apteronotidae, and Sternopygidae were resolved as monophyletic, as
expected, with the exception of non-gymnotiform taxon C. coecutiens (Appendix A Figures 1-2
and 4). There was little evidence to support families Electrophoridae and Gymnotidae as a
monophyletic clade of sister families, consistent with the existing nucleotide phylogeny and
inconsistent with morphological phylogenies. The families Hypopomidae and Rhamphichthyidae
were resolved as a monophyletic clade, consistent with existing phylogenies. However, the
topology within this clade differs from existing phylogenies. It contains 2 sister clades, one of
which includes genera from both families (Gymnorhamphichthys, Rhamphichthys,
Microsternarchus, Steatogenys, and Stegostenopos), and the other includes only genera from
family Hypopomidae (Brachyhypopomus, Hypopomus, and Racenisia). It was not clear which of
those sister clades genus Hypopygus is more closely related to.
Within family Gymnotidae, 3 major monophyletic clades were resolved, consistent with
Chapters 1-4 and the existing nucleotide phylogeny (Lovejoy et al. 2010; Appendix A Figures 1-
2 and 4). Within family Sternopygidae, the genus Sternopygus was resolved as monophyletic,
consistent with existing phylogenies (Appendix A Figures 1-4).
Within Gymnotiformes, various families have been proposed as the most basal. Data
from this project support either Electrophoridae or Sternopygidae as the most basal clade
(Appendix A Figures 2 and 4, respectively).
A.4.2 Utility of the scn4aa 3’ for Phylogenetic Reconstruction
The locus comprising the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s
carboxyl-terminus (scn4aa 3’) was one of several loci used to reconstruct the Gymnotus phylogeny.
This locus is approximately 800 nucleotides long (Noda et al. 1984), and nucleotide sequences
were obtained from 57 gymnotiform species. Analyses of these sequences showed that the
scn4aa 3’ locus contributes towards a meaningful and accurate phylogenetic topology with a
reasonable amount of resolution. However, better use of phylogenetic reconstruction algorithms
is needed, as well as additional nucleotide sequences from certain clades of gymnotiforms and
outgroups.
The aligned nucleotides were from an orthologous locus, which contributed towards
meaningful reconstruction of the phylogeny among gymnotiform species (Fitch 2000). The
scn4aa gene is one of two paralogs expressed in actinopterygian myogenic tissue, and one of
eight paralogs encoded in the actinopterygian genome (Novak et al. 2006; Widmark et al.
2011). The scn4aa 3’ amplification primers were designed to be specific for those orthologous
sequences and amplified only them, rather than sequences from other paralogs.
The nucleotide character alignment had a large proportion of parsimony-informative
characters, and the proportion of ambiguous characters was low. This contributed towards
accurate reconstruction of the phylogeny among gymnotiform species (Wiens 1998; Hall 2011).
Previous analyses had confirmed the absence of introns at this locus in gymnotiforms (Chapters
1-4; Widmark et al. 2011). This improves the accuracy of the alignment, since introns tend to be
more variable in length (Hughes and Yeager 1997). Also, amino acids are more conserved than
nucleotides, and alignments of those sequences can be used to mitigate mis-alignment of exon
indels among species (Wernersson and Pedersen 2003). The scn4aa C-terminus sequences
contained the highest proportion of parsimony-informative characters per total characters among
the loci in the dataset. In addition, the scn4aa 3’ sequences only contained 4.11% ambiguous
characters, compared with 18.60% from the whole dataset. However, the scn4aa 3’ primers were
not specific enough to only amplify scn4aa sequences from the family with neurogenic electric
organs (Apteronotidae) and outgroup ostariophysians. To obtain these nucleotide identities,
perhaps specific internal sequencing primers could be designed to be used with the Polymerase
Chain Reaction (PCR) products, or specific PCR products can be purified by gel filtration and
sequenced using existing primers.
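The two alignment statistics discussed above, the proportion of parsimony-informative characters and the proportion of ambiguous characters, can be illustrated with a minimal Python sketch. It assumes the standard definition of a parsimony-informative site (at least two character states, each present in at least two sequences) and, as a simplifying assumption, treats every cell that is not A, C, G, or T (including gaps and missing data) as ambiguous. The function name and toy sequences are hypothetical, and the percentages reported in this project were not produced by this code.

```python
from collections import Counter

UNAMBIGUOUS = set("ACGT")


def alignment_summary(alignment):
    """Return (proportion of parsimony-informative sites, proportion of ambiguous cells).

    Assumptions: a site is parsimony-informative when at least two nucleotides
    each occur in at least two sequences; any cell outside A/C/G/T is ambiguous.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if length == 0 or any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be non-empty and of equal aligned length")
    informative_sites = 0
    ambiguous_cells = 0
    for col in range(length):
        column = [seq[col].upper() for seq in alignment]
        counts = Counter(base for base in column if base in UNAMBIGUOUS)
        ambiguous_cells += len(column) - sum(counts.values())
        # Count the states that are shared by two or more sequences.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            informative_sites += 1
    return informative_sites / length, ambiguous_cells / (length * len(alignment))


# Hypothetical toy nucleotide alignment of four sequences.
toy_alignment = ["ACGTA", "ACGTN", "ATGCA", "ATG-A"]
informative, ambiguous = alignment_summary(toy_alignment)
print(f"parsimony-informative: {informative:.2%}, ambiguous: {ambiguous:.2%}")
```

Computing these two proportions per locus and for the concatenated dataset would permit the kind of comparison made above between scn4aa 3’ and the other loci.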
The nucleotide characters of scn4aa 3’ seemed to be reasonably variable, which
contributed towards resolution of the phylogeny among gymnotiform species (Brown et al.
1979). Voltage-gated sodium channels are highly conserved in nucleotide sequence and function
across species (Goldin 2002). However, scn4aa in Actinopterygii had been predicted to vary in
nucleotide sequence (Novak et al. 2006). This variability was previously confirmed among the
actinopterygian order Gymnotiformes with a small sample of species (Zakon et al. 2006;
Arnegard et al. 2010), and confirmed again with a more comprehensive sample of species (in
this project). When scn4aa 3’ sequences are included for phylogenetic reconstruction, the
proposed evolutionary relationships among gymnotiforms are generally consistent with existing
published phylogenies.
When characters at a locus vary at similar rates among lineages, the resulting phylogeny
may be used as a primary means for estimation of species divergence timing (Schwartz 2007).
However, characters are unlikely to vary at similar rates among lineages, if they were subjected
to selective pressures that resulted in divergence of those species. The voltage-gated sodium
channel protein Nav1.4a in Gymnotiformes may be an example of the latter case, since the
protein has an important role in characteristics that may be under selective pressure among some
lineages.
A.4.3 Variation at the Nav1.4a C-terminus
The existing analyses of patterns of selection at the voltage-gated sodium channel protein
Nav1.4a among gymnotiforms and non-electric fish focused on motifs at and between the
homologous domains of the protein (Zakon et al. 2006; Arnegard et al. 2010). Purifying
selection was detected among lineages of non-electric fish, and neutral (or relaxed) selection was
detected among basal lineages of gymnotiform fishes. Positive selection was also detected
among gymnotiform lineages, but the analysis only included four species representing four
gymnotiform families (Zakon et al. 2006). In contrast, the project described here focused on
motifs of the Nav1.4a carboxyl-terminus (C-terminus) that may be involved in varying
amplitudes and frequencies of electric organ discharges (EODs). All 6 of the gymnotiform
families were represented, and the species sample was larger by more than 14 times. Further
analyses of phylogenetic topology are needed prior to analyses of patterns of selection, due to
inconsistencies among existing phylogenetic topologies of gymnotiforms and the unexpected
inclusion of a siluriform species (Cetopsis coecutiens) within order Gymnotiformes.
There were 8 positively selected Nav1.4a C-terminus amino acid sites out of the 43 sites
variable in amino acid identity among genus Gymnotus (Table 7 in Chapter 3). It is possible that
other gymnotiforms are also positively selected at some of those amino acid sites, since some of
them are even more variable in amino acid identity when other gymnotiform families are
included. It is also possible that more positively selected sites would be identified among order
Gymnotiformes, since the number of sites variable in amino acid identity is more than four times
larger (at least 177 sites).
A.4.4 Summary and Future Directions
Some aspects of evolutionary relationships among Gymnotiformes were clarified using
additional taxa and nucleotide sequences. However, it is still not clear which family is the most
basal. In addition, the novel possibility of a species from family Cetopsidae (order Siluriformes)
being evolutionarily closer to order Gymnotiformes needs to be further verified.
The scn4aa 3’ locus contributed towards a meaningful phylogenetic topology that
provides a reasonable amount of resolution. Future phylogenetic reconstruction analyses that
make better use of existing algorithms would provide better accuracy and confidence in the
topology. In addition, future analyses would likely benefit from inclusion of scn4aa 3’ sequences
of family Apteronotidae among Gymnotiformes and of the majority of gymnotiform outgroups.
Patterns of selection may be assessed once evolutionary relationships among
gymnotiform fishes are further clarified (Yang 2007). Previous analyses of other motifs of the
scn4aa paralog among a small sample of gymnotiform species found evidence of purifying,
neutral (relaxed), and positive selection among specific lineages (Zakon et al. 2006; Arnegard et
al. 2010). In addition, previous analyses of scn4aa 3’ among genus Gymnotus found evidence of
purifying, neutral (relaxed), and positive selection among specific lineages and at specific amino
acid sites (Chapters 1-4). Since the number of sites variable in amino acid identity among order
Gymnotiformes is more than four times larger than those among genus Gymnotus (177 vs 43
sites), future analyses of scn4aa 3’ may identify additional amino acid sites that contribute to
knowledge of protein function.
Appendix A.5 References
Agnew, W. S. (1984). Voltage-regulated sodium channel molecules. Annu Rev Physiol. 46, 517-
30.
Alves-Gomes, J. A. (2001). The evolution of electroreception and bioelectrogenesis in teleost
fish: a phylogenetic perspective. J Fish Biol. 58, 1489-1511.
Alves-Gomes, J. A., Ortí, G., Haygood, M., Heiligenberg, W. and Meyer, A. (1995).
Phylogenetic analysis of the South American electric fishes (Order Gymnotiformes) and
the evolution of their electrogenic system: a synthesis based on morphology,
electrophysiology, and mitochondrial sequence data. Mol Biol Evol. 12(2), 298-318.
Arnegard, M. E., Zwickl, D. J., Lu, Y. and Zakon, H. H. (2010). Old gene duplication facilitates
origin and diversification of an innovative communication system – twice. Proc Natl
Acad Sci. 107(51), 22172-22177.
Bennett, M. V. L. (1961). Modes of operation of electric organs. Ann N Y Acad Sci. 94, 458-
509.
Bergsten, J. (2005). A review of long-branch attraction. Cladistics. 21(2), 163-193.
Brinkman, F. S. L. and Leipe, D. D. (2001). Chapter 14: Phylogenetic analysis (In: Baxevanis,
A. D. and Ouellette, B. F. F. Eds.), Bioinformatics: A practical guide to the analysis of
genes and proteins, Second Edition. John Wiley & Sons Inc. (Electronic), pp. 323-358.
ISBN 0-471-22392-1.
Brown, W. M., George, M. and Wilson, A. C. (1979). Rapid evolution of animal mitochondrial
DNA. Proc Natl Acad Sci. 76(4), 1967-1971.
Calcagnotto, D., Schaefer, S. A. and DeSalle, R. (2005). Relationships among characiform fishes
inferred from analysis of nuclear and mitochondrial gene sequences. Mol Phylogenet
Evol. 36, 135-153.
Caputi, A. A. (1999). The electric organ discharge of pulse Gymnotiforms: The transformation
of simple impulse into a complex spatio-temporal electromotor pattern. J Exp Biol. 202,
1229-1241.
Catterall, W. A. (1984). The molecular basis of neuronal excitability. Science. 223(4637), 653-
661.
Catterall, W. A., Goldin, A. and Waxman, S. G. (2005). International Union of Pharmacology.
XLVII. Nomenclature and structure-function relationships of voltage-gated sodium
channels. Pharmacol Rev. 57 (4), 397-409.
Crampton, W. G. R. and Albert, J. S. (2006). Evolution of electric signal diversity in
Gymnotiform fishes (In: Ladich, F., Collin, S. P., Moller, P. and Kapoor, B. G. Eds.),
Communication in Fishes. Science Publishers, Enfield, New Hampshire, pp. 657-731.
Fink, S. V. and Fink, W. L. (1981). Interrelationships of the Ostariophysan fishes (Teleostei).
Zool J Linn Soc. 72, 297-353.
Fitch, W. M. (2000). Homology: a personal view on some of the problems. Trends Genet. 16(5),
227-31.
Goldin, A. L. (2002). Evolution of voltage-gated Na+ channels. J Exp Biol. 205, 575-584.
Gotter, A. L., Kaetzel, M. A. and Dedman, J. R. (1998). Electrophorus electricus as a model
system for the study of membrane excitability. Comp Biochem Physiol. 119A (1), 225-
241.
Hall, B. G. (2011). Phylogenetic trees made easy: a how-to manual, 4th Edition. Sinauer
Associates, Inc., Sunderland, MA.
Hardman, M. and Lundberg, J. G. (2006). Molecular phylogeny and a chronology of
diversification for "phractocephaline" catfishes (Siluriformes: Pimelodidae) based on
mitochondrial DNA and nuclear recombination activating gene 2 sequences. Mol
Phylogenet Evol. 40(2), 410-418.
Hardman, M. and Page, L.M. (2003). Phylogenetic relationships among bullhead catfishes of the
genus Ameiurus (Siluriformes: Ictaluridae). Copeia. 2003 (1), 20-33.
Huelsenbeck, J.P. and Ronquist, F. (2001). MrBayes: Bayesian inference of phylogenetic trees.
Bioinformatics. 17, 754–755.
Hughes, A. L. and Yeager, M. (1997). Comparative evolutionary rates of introns and exons in
murine rodents. J Mol Evol. 45(2), 125-130.
Irwin, D.M., Kocher, T.D. and Wilson, A.C. (1991). Evolution of the cytochrome b gene of
mammals. J Mol Evol. 32, 128-144.
Kocher, T.D., Thomas, W.K., Meyer, A., Edwards, S.V., Paabo, S., Villablanca, F.X. and
Wilson, A.C. (1989). Dynamics of mitochondrial DNA evolution in mammals:
amplification and sequencing with conserved primers. Proc Natl Acad Sci USA. 86,
6196-6200.
Lopreato, G. F., Lu, Y., Southwell, A., Atkinson, A. S., Hillis, D. M., Wilcox, T. P. and Zakon,
H. H. (2001). Evolution and divergence of sodium channel genes in vertebrates. Proc Natl
Acad Sci. 98(13), 7588-7592.
Lovejoy, N. R. and Collette, B. (2001). Phylogenetic relationships of new world needlefishes
(Teleostei: Belonidae) and the biogeography of transitions between marine and
freshwater habitats. Copeia 2, 324-338.
Lovejoy, N. R., Lester, K., Crampton, W. G. R., Marques, F. P. L. and Albert, J. S. (2010).
Phylogeny, biogeography, and electric signal evolution of Neotropical knifefishes of the
genus Gymnotus (Osteichthyes: Gymnotidae). Mol Phylogenet Evol. 54, 278-290.
Lynch, M., O'Hely, M., Walsh, B. and Force, A. (2001). The probability of preservation of a
newly arisen gene duplicate. Genetics. 159, 1789-1804.
Mayden, R. L., Chen, W.-J., Bart, H. L., Doosey, M. H., Simons, A. M., Tang, K. L., Wood, R.
M., Agnew, M. K., Yang, L., Hirt, M. V., Clements, M. D., Saitoh, K., Sado, T., Miya, M.
and Nishida, M. (2009). Reconstructing the phylogenetic relationships of the earth's most
diverse clade of freshwater fishes – order Cypriniformes (Actinopterygii: Ostariophysi): A
case study using multiple nuclear loci and the mitochondrial genome. Mol Phylogenet
Evol. 51, 500-514.
Mayden, R. L., Tang, K. L., Conway, K. W., Freyhof, J., Chamberlain, S., Haskins, M.,
Schneider, L., Sudkamp, M., Wood, R. M., Agnew, M., Bufalino, A., Sulaiman, Z., Miya,
M., Saitoh, K. and He, S. P. (2007). Phylogenetic relationships of Danio within the order
Cypriniformes: a framework for comparative and evolutionary studies of a model species.
J Exp Zool B Mol Dev Evol. 308B, 642-654.
Mills, A. and Zakon, H. H. (1987). Coordination of EOD frequency and pulse duration in a
weakly electric wave fish: the influence of androgens. J Comp Physiol A. 161, 417-430.
Müller, K. F. (2005). The efficiency of different search strategies in estimating parsimony
jackknife, bootstrap, and Bremer support. BMC Evol Biol. 5(58).
Noda, M., Shimizu, S., Tanabe, T., Takai, T., Kayano, T., Ikeda, T., Takahashi, H., Nakayama,
H., Kanaoka, Y., Minamino, N., Kangawa, K., Matsuo, H., Raftery, M. A., Hirose, T.,
Inayama, S., Hayashida, H., Miyata, T. and Numa, S. (1984). Primary structure of
Electrophorus electricus sodium channel deduced from cDNA sequence. Nature 312,
121-127.
Novak, A. E., Jost, M. C., Lu, Y., Taylor, A. D., Zakon, H. H. and Ribera, A. B. (2006). Gene
duplications and evolution of vertebrate voltage-gated sodium channels. J Mol Evol. 63,
208-221.
Nylander, J.A.A. (2004). MrModeltest. Technical report. Evolutionary Biology Centre, Uppsala
University, Uppsala.
Palumbi, S., Martin, A., Romano, S., McMillan, W.O., Stice, L. and Grabowski, G. (1991). The
simple fool’s guide to PCR, version 2.0. Honolulu: Department of Zoology and Kewalo
Marine Laboratory, University of Hawaii.
Saitoh, K., Miya, M., Inoue, J., Ishiguro, N. B. and Nishida, M. (2003). Mitochondrial genomics
of Ostariophysan fishes: perspectives on phylogeny and biogeography. J Mol Evol. 56,
464-472.
Schwartz, J. H. (2007). Do molecular clocks run at all? A critique of molecular systematics. Biol
Theory. 1(4), 357-371.
Stoddard, P. K. (2002). Electric signals: predation, sex, and environmental constraints. Advances
in the Study of Behaviour. 31, 201-242.
Sullivan, J. P., Lundberg, J. G. and Hardman, M. (2006). A phylogenetic analysis of the major
groups of catfishes (Teleostei: Siluriformes) using rag1 and rag2 nuclear gene sequences.
Mol Phylogenet Evol. 41(3), 636-662.
Swofford, D.L. (2002). PAUP* 4:40: Phylogenetic analysis using parsimony *and other
methods. Sinauer Associates, Sunderland, MA.
Widmark, J., Sundström, G., Daza, D. O. and Larhammar, D. (2011). Differential evolution of
voltage-gated sodium channels in tetrapods and teleost fishes. Mol Biol Evol. 28(1), 859-
871.
Wiens, J. J. (1998). Does adding characters with missing data increase or decrease phylogenetic
accuracy? Syst Biol. 47(4), 625-640.
Zakon, H. H., Lu, Y., Zwickl, D. J. and Hillis, D. M. (2006). Sodium channel genes and the
evolution of diversity in communication signals of electric fishes: convergent molecular
evolution. Proc Natl Acad Sci. 103, 3675-3680.