An Assessment of Potential Diatom ‘‘Barcode’’ Genes (cox1 ... · ing fungal taxa. In...
-
Upload
nguyendung -
Category
Documents
-
view
213 -
download
0
Transcript of An Assessment of Potential Diatom ‘‘Barcode’’ Genes (cox1 ... · ing fungal taxa. In...
ARTICLE IN PRESS
http://www.elsevier.de/protisPublished online date 22 June 2007
1
Correspondinfax +44 131 24e-mail k.evans
& 2007 Elsevdoi:10.1016/j
158, 349—364, July 2007
Protist, Vol.ORIGINAL PAPER
An Assessment of Potential Diatom ‘‘Barcode’’Genes (cox1, rbcL, 18S and ITS rDNA) and theirEffectiveness in Determining Relationships inSellaphora (Bacillariophyta)
Katharine M. Evans1, Alexandra H. Wortley, and David G. Mann
Royal Botanic Garden Edinburgh, 20A Inverleith Row, Edinburgh EH3 5LR, UK
Submitted December 13, 2006; Accepted April 7, 2007Monitoring Editor: Robert A. Andersen
Due to limited morphological differentiation, diatoms can be very difficult to identify and crypticspeciation is widespread. There is a need for a narrower species concept if contentious issues suchas diatom biodiversities and biogeographies are to be resolved. We assessed the effectiveness ofseveral genes (cox1, rbcL, 18S and ITS rDNA) to distinguish cryptic species within the model‘morphospecies’, Sellaphora pupula agg. This is the first time that the suitability of cox1 as anidentification tool for diatoms has been assessed. A range of cox1 primers was tested on Sellaphoraand various outgroup taxa. Sequences were obtained for 34 isolates belonging to 22 Sellaphora taxaand three others (Pinnularia, Eunotia and Tabularia). Intraspecific divergences ranged from 0 to 5 bp( ¼ 0.8%) and interspecific levels were at least 18 bp ( ¼ c. 3%). Cox1 divergence was usually muchgreater than rbcL divergence and always much more variable than 18S rDNA. ITS rDNA sequenceswere more variable than cox1, but well-known problems concerning intragenomic variability cautionagainst its use in identification. More information and less sequencing effort mean that cox1 can be avery useful aid in diatom identification. The usefulness of cox1 for determining phylogeneticrelationships among Sellaphora species was also assessed and compared to rbcL. Tree topologieswere very similar, although support values were generally lower for cox1.& 2007 Elsevier GmbH. All rights reserved.
Key words: cox1; diatom; DNA barcoding; microbial species concept; rbcL; Sellaphora pupula agg.
Introduction
Diatoms are single-celled eukaryotes of enormousecological importance, accounting for at least20% of all carbon fixed globally through photo-synthesis each year (Mann 1999). They are majorconstituents of benthic and planktonic commu-nities, occurring in terrestrial, marine and fresh-
g author;8 [email protected] (K.M. Evans).
ier GmbH. All rights reserved..protis.2007.04.001
water environments worldwide. Despite their keyrole in the functioning of many ecosystems andtheir use as bioindicators of water quality and pastclimates, major aspects of their biology are poorlyunderstood and consequently contentious, forexample their biodiversity and geographical dis-tributions (e.g. Finlay et al. 2002; Mann and Droop1996; see also Hughes Martiny et al. 2006). Incommon with all microbial groups, this is in largepart due to the lack of a robust taxonomic system.
ARTICLE IN PRESS
350 K.M. Evans et al.
Diatom species have previously been distinguishedaccording to the shape, size, patterning andultrastructure of their silica exoskeleton, with littleor no understanding of the genetic basis of thesefeatures. Recent evidence from molecular data andmating experiments has shown that traditionalclassifications are too coarse: ‘morphospecies’(the outcome of applying the ‘morphological speciesconcept’ sensu Mayr 1942) can contain several tomany pseudo- or semi-cryptic taxa, for examplein Pseudo-nitzschia (e.g. Lundholm et al. 2006),Skeletonema (Sarno et al. 2005) and Neidium (Mannand Chepurnov 2005).
Perhaps the best studied example of crypticspeciation in diatoms is in Sellaphora pupula agg.,a common and cosmopolitan freshwater benthicspecies complex, for which many morphological,mating and sequence data have been generated(Behnke et al. 2004; Evans et al. unpubl.; Mann 1984,1989, 1999; Mann and Droop 1996; Mann et al. 1999,2004). These studies have demonstrated the exis-tence of a large number of pseudo- and semi-crypticspecies. For example, within Blackford Pond inEdinburgh, nine S. pupula agg. taxa live sympatricallywithout intergradation, of which six have now beendescribed as separate species (Mann et al. 2004).Although these species are morphologically similar,they are distinct and reproductively isolated, possessdifferent mating systems, exhibit different degrees ofgenetic relationship to each other, and differ in theirsensitivity to parasites, and molecular data and thefossil record suggest that they have been diversifyingfor at least 12 million years (Behnke et al. 2004; Mannet al. 2004).
Morphology is not always a reliable indicator ofspecies boundaries and phylogenetic relation-ships in S. pupula agg. and even where morpho-logical differences do exist, for example betweenthe sister taxa S. pupula agg. ‘‘pseudocapitate’’and S. blackfordensis, they are so subtle that theycannot be used for routine identification by eye: atfirst, despite our long experience with the genus,S. pupula agg. ‘‘pseudocapitate’’ clones wereidentified as S. blackfordensis (Behnke et al.2004). Consequently, sequence data are beingused increasingly to aid the recognition of speciesboundaries in S. pupula agg. and other diatomtaxa (Amato et al. 2007; Behnke et al. 2004;Medlin et al. 1991; Sarno et al. 2005; Vanormelin-gen et al. 2007). We initially chose rbcL as the coreidentification region because (1) it is proteincoding and so alignment is straightforward, (2) itis more variable than 18S rDNA and (3) being aplastid gene, the likelihood of amplifying contami-nant DNA (e.g. fungal) is much reduced.
Sequence data have also been used to identifymacroalgae (e.g. Brodie et al. 1996), fungi(e.g. Taylor et al. 2000) and prokaryotes (e.g. Pace1997). In recent years, there has been a push toextend this approach to animals and higher plantsunder the banner ‘DNA barcoding’ (Hebert et al.2003; Savolainen et al. 2005). For animals, anapproximately 650 bp fragment of the mitochon-drial gene, cytochrome c oxidase (cox1), has beenchosen after promising results from tests on awide range of organisms (e.g. Blaxter et al. 2004;Hajibabaei et al. 2006; Hebert et al. 2004; Wittet al. 2006). Saunders (2005) and Robba et al.(2006) have shown that cox1 is also suitable fordiscriminating red algal taxa. Recently, Seifertet al. (2007) have demonstrated its use in identify-ing fungal taxa. In contrast, in higher plants, cox1is insufficiently variable and other genes are beinginvestigated instead (Chase et al. 2005; Kress et al.2005; Newmaster et al. 2006). The only study ofcox1 in diatoms, by Ehara et al. (2000a), was notdesigned to test its use as a barcode for species,but showed that the gene is sufficiently variable toresolve some higher order relationships.
For organisms with robust taxonomies (forexample most higher plants and many animalgroups), the main role of DNA barcoding is to aidthe identification of previously described species(e.g. Cowan et al. 2006; Hajibabaei et al. 2006).The ultimate goal is the same in diatoms and othermicrobes, but because the true diversities of theseorganisms are often unknown (Pedros-Alio 2006;Slapeta et al. 2005), it would be an advantage if abarcode region gave preliminary information oninterspecific relationships.
The pros and cons surrounding a DNA barcod-ing approach to identification and discovery ofnew species have been discussed extensivelyelsewhere (e.g. Savolainen et al. 2005 andreferences therein). Two of the major concernsare whether rigid threshold approaches to intraversus interspecific variation are appropriate(e.g. Meyer and Paulay 2005; Witt et al. 2006)and whether reliance on a single, uniparentallyinherited gene provides sufficiently accurate infor-mation (e.g. Hurst and Jiggins 2005; Rubinoff2006). Whilst we acknowledge that DNA barcod-ing needs refinement (e.g. Pons et al. 2006) andfurther testing, such an approach is vital foridentifying microbial species because of theprevalence of cryptic diversity (Fig. 1). In thispaper we aim to:
(1)
Examine the performance of existing algalcox1 primer sets in amplifying the DNA ofARTICLE IN PRESS
Figure 1. Sellaphora pupula agg.: valves of selecteddemes. A. ‘‘Small capitate’’ DUN7. B. ‘‘Europa’’ GER1.C. ‘‘Southern pseudocapitate’’ AUS4. D. ‘‘Southerncapitate’’ AUS1. E. ‘‘Afro’’ AFR1. F. ‘‘Pseudocapitate’’L845 (parental to F1 clone; Table 1). Bar ¼ 5mm.Diatoms become progressively smaller with eachasexual division; size of valves is therefore of limiteduse in distinguishing taxa.
351DNA Barcoding of Diatoms
Sellaphora and other diatoms and, if neces-sary, design and test additional primers.
(2)
Assess how well cox1 and rbcL perform asbarcode regions, by reference to levels ofdivergence within and among Sellaphora taxa.We also discuss more limited data-sets avail-able for 18S and ITS rDNA.(3)
Compare the effectiveness of cox1 and rbcLin determining phylogenetic relationships inSellaphora.Results
Sellaphora pupula agg. demes that we recognizeas separate biological entities, but have not yetdescribed as distinct species, are referred to asS. pupula agg. followed by a morphological and/orgeographical identifier, e.g. S. pupula agg. ‘‘smalllanceolate’’ and S. pupula agg. ‘‘southern capi-tate’’ (Table 1); from here onwards, we will omit theS. pupula agg. prefix. Sellaphora isolates arecoded depending on their geographic origin, forexample THR isolates were collected from Threip-muir Reservoir (Table 1).
Algal cox1 Primers
The success of the various algal cox1 primers issummarized in Table 2. Briefly, the diatom primerspC1 and pB1 (Ehara et al. 2000a) amplified theexpected 1100 bp product in all 13 Sellaphora,Pinnularia cf. gibba, Seminavis cf. robusta andParalia sulcata DNAs that were tested, but only thered algal primer GazR1 (Saunders 2005) wassuccessful as a sequencing primer (in seveninstances: BLA2, BLA16, F1, BEL3, RBG1, THR2and P. cf. gibba; Table 1). Alignment of these andten GenBank diatom cox1 sequences enabled thedesign of the internal primer KEintF, whichextended the seven sequences in the 30 direction,allowing the design of a diatom-specific reversebarcode primer approximately 700 bp into thegene, KEdtmR (Table 3).
The brown algal cox1 barcode forward primer,GazF2 (Table 3) and KEdtmR amplified a singleband of the expected size (707 bp) for 28 of the30 Sellaphora and outgroup taxa (exceptions wereS. auldreekie and ‘‘small lanceolate’’). GazF2 andKEdtmR were also successful sequencing primersfor all Sellaphora taxa except S. lanceolata,S. auldreekie and ‘‘small lanceolate’’; the failureof the latter two was not unexpected, becausemore than one product was amplified. In theseinstances, substituting KEdtmR with the brownalgal cox1 reverse barcode primer GazR2 led tono improvement in either amplification or sequen-cing success (Table 2).
Sequences were also obtained for three of theseven outgroup taxa: the two raphid pennatesP. cf. gibba and Eunotia sp., and the araphidpennate Tabularia sp. For outgroup and ingroupsequencing failures, internal primers, KEint2Fand KEintR (Table 3) were trialled, and in thecases of S. auldreekie (BLA2), S. lanceolata andTabularia sp., extra information was gained. WhenPCRs and sequencing were repeated on other
ARTICLE IN PRESS
Ta
ble
1.
Identities
and
so
urc
es
of
iso
late
s(U
KN
atio
nalG
rid
Refe
rences
or
latitu
des
and
long
itud
es
are
giv
en
where
po
ssib
le),
vo
ucher
slid
es
(all
inth
ed
iato
mherb
arium
of
the
Ro
yal
Bo
tanic
Gard
en
Ed
inb
urg
h),
and
GenB
ank
accessio
nnum
bers
for
co
x1and
rbcL
seq
uences.
Taxo
nId
entifier
Orig
inal
desig
natio
nand
syno
nym
s
Vo
ucher
Orig
inIs
ola
ted
by
Date
GenB
ank
accessio
ns
Co
x1R
bcL
Sella
pho
raauld
reeki
eB
LA
2a
SE
L504B
E3652
Sco
tland
:B
lackfo
rdP
ond
(NT
253709)
D.G
.M
ann/A
.P
oulıc
ko
va
2004
EF
164932
EF
143303
Sella
pho
raauld
reeki
eD
UN
125D
E3018
Sco
tland
:D
unsap
ieLo
ch
(NT
281731)
V.A
.C
hep
urn
ov
1996
—E
F143276
Sella
pho
rab
acill
um
BLA
4S
EL438B
E3640
Sco
tland
:B
lackfo
rdP
ond
V.A
.C
hep
urn
ov
2003
EF
164941
EF
143311
Sella
pho
rab
acill
um
ST-
MA
R1
17S
ME
3149
Sco
tland
:S
tM
arg
are
t’s
Lo
ch
(NT
276739)
V.A
.C
hep
urn
ov
1996
EF
164930
EF
143273
Sella
pho
rab
lackf
ord
ensi
sD
UN
5a
SE
L223D
E3632
Sco
tland
:D
unsap
ieLo
ch
V.A
.C
hep
urn
ov
2003
EF
164949
EF
143319
Sella
pho
rab
lackf
ord
ensi
sB
LA
17
aB
l.85
—S
co
tland
:B
lackfo
rdP
ond
D.G
.M
ann
2004
EF
164948
—
Sella
pho
rab
lackf
ord
ensi
sB
AL1
a(B
al7
)3E
3266
Sco
tland
:B
alg
avie
sLo
ch
(NO
531508)
D.G
.M
ann
1997
EF
164935
EF
143310
Sella
pho
racap
itata
DU
N6
aD
unsap
ie138
E3648
Sco
tland
:D
unsap
ieLo
ch
D.G
.M
ann/A
.P
oulıc
ko
va
2004
EF
164946
EF
143286
Sella
pho
racap
itata
BE
L3
aM
f21
—B
elg
ium
:D
eo
ud
eS
cheld
earm
(511010 N
,31450 E
)
V.A
.C
hep
urn
ov
2005
EF
164939
—
Sella
pho
racap
itata
BLA
18
aB
l.89
E3608
Sco
tland
:B
lackfo
rdP
ond
D.G
.M
ann
2004
EF
164947
—
Sella
pho
rala
evi
ssim
aT
HR
224Tr
E3174
Sco
tland
:T
hre
ipm
uir
Reserv
oir
(NT
168636;
NT
163634)
V.A
.C
hep
urn
ov
1996
EF
164931
EF
143265
Sella
pho
rala
evi
ssim
aT
HR
4S
EL301T
ME
3651
Sco
tland
:T
hre
ipm
uir
Reserv
oir
D.G
.M
ann
2003
EF
164943
EF
143309
Sella
pho
rala
nceo
lata
BLA
12
SE
L432B
E3635
Sco
tland
:B
lackfo
rdP
ond
V.A
.C
hep
urn
ov
2003
EF
164953
EF
143315
Sella
pho
rap
up
ula
ag
g.
‘‘afr
o’’
AF
R1
Ng
Ng
42
E3622
Tanza
nia
:N
go
roN
go
rocra
ter
V.A
.C
hep
urn
ov
2002
EF
164940
EF
143300
Sella
pho
rap
up
ula
ag
g.
‘‘cf.
cap
itata
’’T
HR
7a
Thr3
7E
3365
Sco
tland
:T
hre
ipm
uir
Reserv
oir
D.G
.M
ann
1998
EF
164937
EF
143302
Sella
pho
rap
up
ula
ag
g.
‘‘cf.
ob
ese’’
BE
L2
aV
8E
3620
Belg
ium
:K
raenep
oel
(511040 N
,31290 E
)V.
A.
Chep
urn
ov
2001
EF
164933
EF
143266
Sella
pho
rap
up
ula
ag
g.
‘‘cf.
sm
all
blu
nt-
cap
itate
’’
TH
R8
SE
L292T
ME
3628
Sco
tland
:T
hre
ipm
uir
Reserv
oir
D.G
.M
ann
2003
EF
164950
EF
143308
Sella
pho
rap
up
ula
ag
g.
‘‘elli
ptical’’
RB
G1
13R
E3024
Sco
tland
:R
oyal
Bo
tanic
Gard
en
Ed
inb
urg
hP
ond
(NT
248753)
V.A
.C
hep
urn
ov
1996
EF
164957
EF
143271
352 K.M. Evans et al.
ARTICLE IN PRESS
Sella
pho
rap
up
ula
ag
g.
‘‘elli
ptical’’
BLA
14
aS
EL9B
E3634
Sco
tland
:B
lackfo
rdP
ond
D.G
.M
ann
2003
EF
164944
EF
143294
Sella
pho
rap
up
ula
ag
g.
‘‘elli
ptical’’
TH
R9
SE
L293T
ME
3625
Sco
tland
:T
hre
ipm
uir
Reserv
oir
D.G
.M
ann
2003
EF
164958
EF
143297
Sella
pho
rap
up
ula
ag
g.
‘‘elli
ptical’’
TH
R10
aS
EL299T
ME
3642
Sco
tland
:T
hre
ipm
uir
Reserv
oir
D.G
.M
ann
2003
EF
164952
EF
143296
Sella
pho
rap
up
ula
ag
g‘‘e
uro
pa’’
GE
R1
aF
r.S
ee.1
8E
3325
Germ
any:
Fried
heim
er
See
(531310 2
300N
,81390 5
400E
)
V.A
.C
hep
urn
ov
1997
EF
164955
EF
143318
Sella
pho
rap
up
ula
ag
g.
‘‘larg
e’’
TH
R11
79T
mN
EE
3415
Sco
tland
:T
hre
ipm
uir
Reserv
oir
V.A
.C
hep
urn
ov
1997
EF
164945
EF
143285
Sella
pho
rap
up
ula
ag
g.
‘‘little’’
RB
G2
20R
E3020
Sco
tland
:R
oyal
Bo
tanic
Gard
en
Po
nd
V.A
.C
hep
urn
ov
1996
EF
164928
EF
143275
Sella
pho
rap
up
ula
ag
g.
‘‘pseud
ocap
itate
’’F
1a
(L845.5�
Thr4
2)
E3384
F1
clo
ne
(see
Meth
od
s)
V.A
.C
hep
urn
ov
1999
EF
164942
EF
143291
Sella
pho
rap
up
ula
ag
g.
‘‘sm
all
blu
nt-
cap
itate
’’T
HR
13
6T
mE
3026,
E3161-2
Sco
tland
:T
hre
ipm
uir
Reserv
oir
D.G
.M
ann
1996
EF
164954
EF
143272
Sella
pho
rap
up
ula
ag
g.
‘‘sm
all
cap
itate
’’D
UN
7S
EL92D
E3638
Sco
tland
:D
unsap
ieLo
ch
D.G
.M
ann
2003
EF
164936
EF
143269
Sella
pho
rap
up
ula
ag
g.
‘‘sm
all
lanceo
late
’’T
HR
14
5T
ME
3160
Sco
tland
:T
hre
ipm
uir
Reserv
oir
V.A
.C
hep
urn
ov
1996
—E
F143264
Sella
pho
rap
up
ula
ag
g.
‘‘so
uth
ern
cap
itate
’’A
US
1a
HL5
E3623
Austr
alia
:H
ackin
gR
iver
(341040 2
200S
,1511030 2
700E
)
V.A
.C
hep
urn
ov
2001
EF
164959
EF
143312
Sella
pho
rap
up
ula
ag
g.
‘‘so
uth
ern
pseud
ocap
itate
’’
AU
S4
HL18
E3466
Austr
alia
:H
ackin
gR
iver
V.A
.C
hep
urn
ov
2001
EF
164934
EF
143317
Sella
pho
rap
up
ula
ag
g.
‘‘sp
ind
le’’
BLA
16
aK
E68
E3658
Sco
tland
:B
lackfo
rdP
ond
K.M
.E
vans
2005
EF
164951
EF
143298
Sella
pho
racf.
min
ima
—B
M42
E3211
Sco
tland
:B
lackfo
rdP
ond
D.G
.M
ann
1997
EF
164929
EF
143279
Sella
pho
racf.
sem
inulu
m—
TM
37
E3168
Sco
tland
:T
hre
ipm
uir
Reserv
oir
D.G
.M
ann
1997
EF
164956
EF
143280
Euno
tia
sp
.—
EU
N392T
—S
co
tland
:Lo
ch
Tulla
(NN
294428)
D.G
.M
ann
2003
EF
164960
—
Melo
sira
num
mulo
ides
—B
4—
Sco
tland
:N
ort
hB
erw
ick
(NT
553858)
D.G
.M
ann
c.
2000
——
Od
onte
llasi
nensi
s—
Od
onte
llaC
an.
Is.
—C
anary
Isle
s:
Gra
nC
anaria
A.
de
laJara
/D
.G.
Mann
1999
——
Para
liasu
lcata
—a
NB
K-2
—S
co
tland
:N
ort
hB
erw
ick
D.G
.M
ann
1999
——
Pin
nula
ria
cf.
gib
ba
—a
Clo
ne
12
E3603
Sco
tland
:Lo
ch
of
Cra
iglu
sh
(NO
041444)
A.
Po
ulıc
ko
va
2004
EF
164938
EF
143304
Sem
inavi
scf.
rob
ust
a—
aF
1-1
0A
—F
1clo
ne
(see
Meth
od
s)
T.M
oens
2000
——
Tab
ula
ria
sp
.—
Tab
ula
ria
Nap
les
—It
aly
:N
ap
les
W.H
.C.F
.K
oo
istr
a—
EF
164927
—
aC
lones
are
aliv
eand
availa
ble
up
on
req
uest;
the
rem
ain
der
are
dead
,p
red
om
inantly
due
toasexualre
pro
ductio
nw
hic
heventu
ally
results
incells
that
are
too
sm
all
tore
main
via
ble
.
353DNA Barcoding of Diatoms
ARTICLE IN PRESS
Table 4. Levels of cox1 (c. 624 bp) and rbcL(c. 1400 bp) divergence among Sellaphora isolatesoriginally thought to belong to the same taxon:percentage divergence and number of bp differ-ences. The degree of divergence among ‘‘elliptical’’and Sellaphora laevissima isolates indicates addi-tional cryptic diversity.
Taxon cox1 rbcL%divergence
%divergence
(no. bp) (no. bp)
Sellaphora capitata 0 0Sellaphora bacillum 0 0Sellaphorablackfordensis
0.8% (5) 0
‘‘Elliptical’’ 3.2% (20) 0.4% (5)Sellaphoralaevissima
7.9% (49) 2.2% (30)
Table 2. Amplification and sequencing successes ofa range of algal cox1 primers.
Primer set/ Amplification Sequencingprimer success success
GazF1 0/13 0/13GazR1 7/13a
pC1 13/13 0/13pB1 0/13GazF2 37/40 32/40KEdtmR 32/40GazF2 0/3b n/aGazR2 0/4b
KEintF n/a 7/7KEint2F n/a 3/8KEintR n/a 3/8
aAs a sequencing primer of the pC1/pB1 PCRproduct.
bGazR2 only tested in instances where KEdtmRhad failed.
Table 3. Sequences of brown algal and diatom cox1primers used in this study.
Name Sequence (50-30)
Brown algalGazF2 CAA CCA YAA AGA TAT WGG TACGazR2 GGA TGA CCA AAR AAC CAA AADiatomKEintF GAG AGC AAA AAG TTT ACC ATT
TCAKEdtmR AAA CTT CWG GRT GAC CAA AAAKEint2F GAA GCW GGW GTW GGT ACW
GGW TGKEintR CAA ATA AAA TTR ATW GCW CCT
AA
354 K.M. Evans et al.
representatives of the same outgroup taxa, in twocases (S. cf. robusta and Melosira nummuloides),non-diatom sequences were obtained; BLASTsearches listed an invertebrate and oomycete,respectively, as the closest matches, both possi-ble contaminants in the cultures.
In summary, cox1 barcodes were obtained for22 of the 23 Sellaphora taxa (‘‘small lanceolate’’was the one exception) and for three of the sevenoutgroup taxa. In all instances sequences wereunambiguous. Primer set GazF2 and KEdtmR willprobably work well for most species within theraphid pennate lineage and for some araphids, butprobably not for those belonging to the centriclineage. Although GazR2 was not tested exten-sively, due to sequence overlap with KEdtmR
(Table 3), it is likely that it will perform comparablyto KEdtmR when used together with GazF2. The34 (31 ingroup and 3 outgroup) cox1 sequenceswere aligned with no gaps over 624 bp (thebarcode region, excluding the primers was665 bp). Seven isolates lacked some sequenceinformation: up to 16 bp could not be determinedfor six isolates and 115 bp for Tabularia sp.
Intra- and Interspecific cox1 and rbcLDivergences
Cox1 data were obtained for five Sellaphora taxarepresented by more than one isolate, viz.S. capitata (n ¼ 3), S. blackfordensis (n ¼ 3), ‘‘ell-iptical’’ (n ¼ 4), S. bacillum (n ¼ 2) and S. laevis-sima (n ¼ 2; Table 1). Despite the geographicrange over which the S. capitata isolates weresampled (Scotland and Belgium), all three hadidentical cox1 and rbcL sequences; the twoS. bacillum isolates were also identical at bothregions (Table 4). In contrast, although the threeS. blackfordensis isolates had identical rbcLsequences, in cox1 there were 5 bp differencesbetween the two Edinburgh isolates (BLA17 andDUN5; lakes separated by c. 5 km) and 1 bpdifference between the isolate from DunsapieLoch (DUN5) and an isolate from Balgavies Loch(BAL1; lakes separated by c. 80 km; Table 4). Thecox1 data for S. laevissima and ‘‘elliptical’’ isolatesstrongly support our previous suspicions of addi-tional cryptic diversity (Table 4). The cox1sequences of the two S. laevissima isolatesdiffered by 49 bps ( ¼ 7.9%) compared with 30rbcL bp differences ( ¼ 2.2%; Fig. 3). There were
ARTICLE IN PRESS
355DNA Barcoding of Diatoms
up to 20 cox1 bp differences ( ¼ 3.2%) between‘‘elliptical’’ isolates compared to 5 rbcL differ-ences ( ¼ 0.4%).
In all cases, interspecific cox1 divergencewas much greater than the corresponding rbcLdivergence (Fig. 3). For example, betweenS. blackfordensis and its closest known relative,‘‘pseudocapitate’’, with which it can form an F1
(Behnke et al. 2004), there were up to 23 bpdifferences ( ¼ 3.7%) in cox1 compared to 21 rbcLdifferences ( ¼ 1.6%); and between S. capitata andthe closely related ‘‘afro’’, there were 18 cox1differences ( ¼ 2.9%) compared to 13 rbcLdifferences ( ¼ 0.9%; no data are available forreproductive compatibility of these two taxa).
Cox1 and rbcL Phylogenetic Analyses
For cox1 (alignment 624 bp), 298 characters werevariable and 206 were parsimony-informative.Within Sellaphora itself, 225 characters werevariable and 158 were parsimony-informative.Maximum parsimony (MP) analysis found fourtrees of length 899 steps. No new islands of treeswere found using the second search strategy.
Figure 2. Phylogenies obtained from ML analysis of coxidentified by species or deme name plus a suffix indicati(B), three Sellaphora isolates are different to those usufficiently closely related (i.e. closer than to any other ttaxa (see Methods). Branch lengths are proportional to tupon them. Numbers above branches represent JK su
For maximum likelihood (ML) analyses, the hier-archical likelihood ratio (HLR) test selected themodel GTR+I+G, for which —ln L ¼ 4755.8711,K ¼ 10, and base frequencies were A ¼ 0.2910,C ¼ 0.1123, G ¼ 0.1375 and T ¼ 0.4592. The ratematrix was A—C ¼ 1.0447, A—G ¼ 3.6703,A—T ¼ 0.7520, C—G ¼ 0.0005, C—T ¼ 6.7541and G—T ¼ 1.0000. The gamma distributionshape parameter was 0.6504. ML analysis pro-duced a tree with likelihood —ln L ¼ 4751.08707(Fig. 2A), which is very similar in topology to theMP strict consensus tree (not shown).
For rbcL (alignment 1399 bp), 252 characterswere variable and 150 were parsimony-informa-tive. Within Sellaphora itself, 188 characters werevariable and 111 were parsimony-informative. MPanalysis found 369 trees of length 577 steps.No new islands of trees were found using thesecond search strategy. The HLR test selected themodel TrN+I+G, for which —ln L ¼ 5021.2383,K ¼ 7, and base frequencies were A ¼ 0.2997,C ¼ 0.1689, G ¼ 0.1984 and T ¼ 0.3330, withrates of change A—G ¼ 3.0015 and C—T ¼5.3232. The gamma distribution shape parameterwas 0.6793. ML analysis produced a tree with
1 (A) and rbcL (B) sequences. Sellaphora isolates areng their provenance (see Table 1). In the rbcL analysissed in the cox1 analysis (A), but are consideredaxon in the tree) to act as placeholders for the samehe number of substitutions inferred to have occurredpport values where greater than 50%.
ARTICLE IN PRESS
356 K.M. Evans et al.
likelihood —ln L ¼ 4998.8777 (Fig. 2B), which isvery similar in topology to the MP strict consensustree (not shown).
As Figure 2A shows, in cox1 analysis, allSellaphora taxa formed a monophyletic group,with 84% bootstrap (BS) in MP analysis (notshown), although jack-knife (JK) support in MLanalysis was lower. Whereas S. pupula agg. wasparaphyletic, S. bacillum was strongly supportedas a monophyletic group in both types of analysis(100% MP BS, 96% ML JK) and S. laevissima waswell supported in MP (97% BS); strong supportwas also found for the monophyly of two recentlydescribed species (Mann et al. 2004) for whichthree clones of different provenance wereincluded, viz. S. blackfordensis (100% MP BS,97% ML JK) and S. capitata (98% MP BS, 91%ML JK). The two ‘‘elliptical’’ groupings (Table 4)were strongly supported (over 98% MP BS, over85% ML JK), as were two deme pairs (100% MPBS, over 95% ML JK): ‘‘small blunt-capitate’’+‘‘cf.small blunt-capitate’’ and S. capitata + ‘‘afro’’.Within Sellaphora, one clade, the ‘‘blackfordensis’’group was particularly well supported (98% MPBS, 85% ML JK). This group comprisesS. blackfordensis, S. capitata, ‘‘afro,’’ ‘‘cf. obese,’’‘‘southern capitate,’’ ‘‘cf. capitata,’’ ‘‘pseudocapi-tate’’ and ‘‘southern pseudocapitate’’.
A comparison of Figure 2A and 2B shows that thecox1 and rbcL data-sets give very similar results interms of topology, although support values aregenerally higher for rbcL. As with cox1, with rbcL allSellaphora formed a monophyletic group, althoughwith stronger support (94% MP BS, 100% ML JK).Sellaphora bacillum, S. blackfordensis and S. capitatawere also all strongly supported as monophyleticgroups (at least 95% MP BS and ML JK). Both‘‘elliptical’’ clades were supported although lessstrongly than in the cox1 analysis (over 80% MPBS, over 55% ML JK). As for cox1, the deme pair‘‘small blunt-capitate’’+‘‘cf. small blunt-capitate’’ waswell supported (97% MP BS, 88% ML JK). The samewas true for the ‘‘blackfordensis’’ group (93% MP BS,91% ML JK). The major difference between the twophylogenies is the position of S. cf. minima, which isresolved outside S. pupula agg. using cox1 andwithin S. pupula agg. using rbcL.
Discussion
Is cox1 a Suitable Gene for DiatomIdentification?
Level of Variation Within and AmongSellaphora Taxa: Despite the much shorter
length of the cox1 sequences, the absoluteamount of intraspecific divergence was mostlyeither equal or greater than that for rbcL. Forexample, for S. capitata and S. bacillum isolatesthere was neither cox1 nor rbcL divergence,whereas for the three S. blackfordensis isolatesfrom Dunsapie Loch, Blackford Pond andBalgavies Loch, although there was no rbcLdivergence, there were up to five cox1 bpdifferences (Table 4). Therefore, for some species,cox1 may provide useful information on popula-tion structure or racial differences: breedingexperiments have shown no intrinsic reproductivebarriers between Dunsapie, Blackford and Balga-vies clones of S. blackfordensis (Behnke et al.2004; Chepurnov and Mann, unpubl.), thoughthese data do not refer to the same clones asthose used in the present study. ‘‘Elliptical’’ andS. laevissima have been shown by rbcL (and in thecase of S. laevissima, also by rDNA) analysis tocontain two molecular clades (Behnke et al. 2004;Evans et al. unpubl.) and in both cases the cox1sequences were more divergent than the rbcLdata (Table 4 and Fig. 2). Two groupings of‘‘elliptical’’ isolates were well supported (Fig. 2),viz. the ‘upland’ isolates, THR9 and THR10(identical cox1 sequences, but one rbcL bpdifference), both from the mesotrophic, uplandThreipmuir Reservoir and the ‘urban’ isolates,RBG1 and BLA14 (two cox1 bp differences, butrbcL sequences identical), both from highlyeutrophic ponds in Edinburgh. The degree ofdivergence between the two S. laevissima isolatesand the two ‘‘elliptical’’ subclades supports theirdelineation as separate taxa (Table 4: both sets oftaxa also differ subtly in their morphologies; Evanset al. unpubl.), but before formalizing this, it will beimportant to obtain additional data from nuclearmarkers (e.g. rDNA). Levels of interspecific cox1divergence between previously recognized taxa orclosely related demes were at least 18 bp or 2.9%(between S. capitata and ‘‘afro’’; Fig. 3). Inter-specific cox1 divergences were always muchgreater than corresponding rbcL data (Fig. 3).
Other genes commonly used to identify diatomsand to construct phylogenies are the nuclear rDNAregions, 18S and ITS, which are approximately1800 bp and 650—850 bp in length, respectively(e.g. Beszteri et al. 2005; Orsini et al. 2004). As yet,there are few 18S and ITS data for Sellaphora taxa(Behnke et al. 2004, Evans et al. unpubl.) and so itis possible only to make a few comparisons withcox1 and rbcL and these are not direct (i.e. thesequences were not all obtained from the sameisolates). Nevertheless, it is apparent that the 18S
ARTICLE IN PRESS
Figure 3. Divergences of cox1 (c. 624 bp) and rbcL(c. 1400 bp) sequences between selected taxa. Thedashed line indicates identical levels of divergence.
357DNA Barcoding of Diatoms
region is less variable than rbcL and much lessvariable than cox1; for example between twoS. laevissima isolates that belong to different rbcLclades (THR1 and THR4: GenBank accessionsEF151979 and EF151981, respectively), there arejust 8 bp differences in 18S rDNA ( ¼ c. 0.4%,compared with 7.9% for cox1 and 2.2% for rbcL);and between S. blackfordensis and ‘‘pseudocapi-tate’’ isolates (BLA6 and US1; EF151969 andAJ544649, respectively) there are just 3 bp differ-ences ( ¼ c. 0.2%, compared with 3.7% for cox1and 1.6% for rbcL). In contrast to 18S, the ITSregion (ITS-1—5.8S—ITS-2) is highly variableand is often chosen to investigate within-popula-tion genetic variation, for example in Pseudo-nitzschia (e.g. Orsini et al. 2004). Behnke et al.(2004) reported 23 differences ( ¼ c. 2.8%) amongsix S. blackfordensis isolates, five of whichoriginated from the same pond; and 92 differences( ¼ c. 11%) between S. laevissima isolatesobtained from Scotland and the US (as for theS. laevissima isolates included in this study, theseprobably represent distinct taxa).
To sum up, we have shown that cox1 issufficiently variable to (a) distinguish previously
recognized taxa, (b) draw attention to potentialcryptic taxa and (c) act in some cases as ageographical marker. This accords with thediagnostic potential of cox1 in metazoa (e.g.Hajibabaei et al. 2006) and red algae (Robbaet al. 2006; Saunders 2005).Other Considerations: Besides being suitablyvariable, there are a number of other factors toconsider before deciding on a gene(s) for barcodeidentification and preliminary phylogenetic ana-lyses. The first is the mode of inheritance.Uniparental inheritance of organelles carries therisk that sequence data will not accurately reflectspecies boundaries if hybridization and subse-quent introgression have occurred (e.g. Gompertet al. 2006). This scenario is probably lesscommon in diatoms than in many land plants,because most diatom clones, like individualanimals, have a finite and short life (of a few yearsonly) because of cell size reduction, after whichregeneration via a sexual phase is usually obliga-tory. Hence reduced fertility of hybrids could notbe circumvented indefinitely by vegetative propa-gation, leading to the likelihood of strong selectionfor effective mechanisms of reproductive isolation,or restoration of full fertility, wherever incipientdiatom species occur in sympatry. Nevertheless, itis desirable to also sequence a nuclear gene(Rubinoff 2006), particularly when examiningclosely related species.
In animals and land plants (excluding mostconifers), mitochondria and chloroplasts arematernally inherited. In algae, maternal inheritanceof organelles also appears to be dominant(e.g. Coyer et al. 2002), and may even occur,surprisingly, in organisms that reproduce isoga-mously. Thus, the isogamous brown alga Scytosi-phon lomentaria exhibits uniparental inheritance ofmitochondria, but biparental inheritance of thechloroplasts (Kato et al. 2006). The isogamousgreen alga Chlamydomonas reinhardtii transmitsboth mitochondria and chloroplasts uniparentally,but from different mating types (Aoyama et al.2006). How diatoms inherit organelles during thesexual phase is poorly known. The sperm of someoogamous centric diatoms certainly lose anyvisible sign of chloroplasts (during hologenousgametogenesis: Jensen et al. 2003), implyinginheritance via the egg alone; however manycentric diatoms are monoecious, producing eggsand sperm during different phases of the samelife cycle (Chepurnov et al. 2004), so that eachclone will transmit its chloroplasts to progeny.In pennate diatoms the parental chloroplastsappear to retain their integrity throughout sexual
ARTICLE IN PRESS
358 K.M. Evans et al.
reproduction and are segregated at the firstdivision of the F1 initial cell (Mann 1996). Nothingis known about mitochondrial inheritance indiatoms; we are currently researching this issue,using the cox1 variation we detected amongS. blackfordensis (Fig. 2A).
Second, it is important to know to whichgenome the gene belongs. For example in higherplants, gene transfer from the mitochondria to thenucleus is a frequent, ongoing process, leading toparalogous gene sequences (see references inGray et al. 2004). So far, only one diatommitochondrial genome has been sequenced in itsentirety, that of Thalassiosira pseudonana. At leastin this diatom, there is no evidence for recentlarge-scale transfers of mitochondrial DNA to thenucleus and cox1 is part of the mitochondrialgenome (Armbrust et al. 2004).
Third, evolutionarily frequent horizontal transferof mitochondrial genes between distantly relatedangiosperm taxa has been documented; in con-trast, horizontal transfer of chloroplast DNA in landplants is likely to be much lower (Bergthorssonet al. 2003). Whether horizontal transfer ofmitochondrial genes also occurs between algaltaxa is unknown since much fewer mitochondrion-based phylogenetic analyses have been con-ducted.
Fourth, if there are introns within the barcodinggene, or if the gene is split into two or moresegments, there could be amplification andsequencing problems. Unlike animals (Beagleyet al. 1996), introns are abundant in land-plantmitochondria (Knoop 2004). They are also found insome algal mitochondrial genes, for example thecox1 genes of the diatoms T. pseudonana(2338 bp in length; Armbrust et al. 2004) andT. nordenskioeldii (2311—2468 bp in length; Eharaet al. 2000b) and several green algae (Watanabeet al. 1998). An isolate of the brown alga Pylaiellalittoralis possessed three introns in a cox1 genethat was approximately 9000 bp in length (Fon-taine et al. 1997); however, no introns were foundin the cox1 genes of four other brown algae(Oudot-Le Secq et al. 2002, 2006). Lastly, in thehaptophyte Emiliania huxleyi, cox1 is splicedfrom two mitochondrial segments, 10 kbp apart(Sanchez Puerta et al. 2004). So far, introns foundin diatom cox1 genes occur outside the ‘barcoderegion’, i.e. beyond the first 700 bp. RbcL can alsocontain introns (e.g. in green algae: Nozaki et al.2002), but we have detected none in over 150species of diatoms (unpubl.).
Fifth, although not an absolute requirement forDNA barcoding, ease of alignment is an advan-
tage if the sequence data are to contribute tophylogenetic analyses. In contrast to 18S and ITSrDNA, both cox1 and rbcL are coding genes andconsequently contain few indels, makingsequences easy to align. For ITS in particular, thisis not the case. For example, Behnke et al. (2004)found that the length of the ITS region (excluding5.8S) in selected Sellaphora taxa ranged from 519to 734 bp. Furthermore, ITS and to a lesser degree18S rDNA have the additional problem of intra-individual variation (Alvarez and Wendel 2003;Alverson and Kolnick 2005; Vanormelingen et al.2007). Within one S. auldreekie isolate, Behnkeet al. (2004) recorded three types of ITSsequences that differed at 48 positions and twoindels of 50 and 4 bp. Cloning of PCR fragments istherefore necessary to obtain accurate measure-ments of intra-individual ITS variation. Conse-quently, we do not consider ITS to be a suitablebarcode region for diatoms. Surprisingly, ITS isone of the genes currently being considered forbarcoding higher plants (Kress et al. 2005); in largepart, this is because of the huge number of ITSsequences already available, which is not the casefor diatoms.
Sixth, it is desirable to reduce as much aspossible the likelihood of amplifying contaminantDNA. Although it is relatively easy to detect carry-over of other photosynthetic organisms duringisolation, small heterotrophic flagellates are lesseasy to exclude. In addition, under sub-optimalculture conditions, mite-mediated fungal transferbetween cultures can occur: two of the cox1sequences in this study, obtained from preservedDNA collected originally for rbcL analysis, turnedout to belong to non-diatoms. Using a chloroplastmarker, it is unlikely that any non-specific pro-ducts will be amplified as long as the culture isunialgal, which is easy to verify by low-resolutionmicroscopy.
So should cox1 be chosen as a barcode genefor diatoms? From our preliminary data, it iscertainly a serious contender, because it has ahigher degree of variability than any other genefor which we possess information, apart from ITS(so that sequencing effort and expenditure arelower: two sequencing reactions are requiredcompared to four for rbcL and up to six for 18SrDNA), and as far as we are aware there are noexamples of intragenomic cox1 variation. On thedownside, relatively little is known about theevolution of algal mitochondrial genomes. Inaddition, mitochondrial mode of inheritance hasnot yet been established (although the same istrue for chloroplasts), nor have diatom-wide
ARTICLE IN PRESS
359DNA Barcoding of Diatoms
primers been designed. Taking these factors intoaccount, we now use cox1 as a relatively quickand cheap screening method to determinewhether diatom clones are of particular interestto us; if so, we obtain rbcL, 18S rDNA or otherdata to place the clone phylogenetically.
How Widely Can the Diatom cox1 BarcodePrimers be Used?
The cox1 barcode primers, GazF2 and KEdtmRyielded a product of the anticipated size for alltaxa except two belonging to the S. pupulaspecies complex (Table 2). The same primerswere also suitable for sequencing the majority ofSellaphora isolates as well as the other raphidsPinnularia cf. gibba (which is fairly closely relatedto Sellaphora: Behnke et al. 2004, Evans et al.unpubl.) and Eunotia sp. (which is widely sepa-rated from Sellaphora: Sims et al. 2006). Surpris-ingly, cox1 sequences could not be obtained forSeminavis cf. robusta despite its closer relation-ship to Sellaphora than Eunotia sp. Outside theraphid pennates their use was limited, workingpartially only for the araphid pennate Tabularia sp.No cox1 sequences were obtained for centricdiatoms. Additional primers will therefore need tobe developed for a universal cox1 diatom barcod-ing system to be viable.
Cox1 as a Phylogenetic Marker
The cox1 Sellaphora phylogeny suggests essen-tially the same relationships as those that re-sulted from analysis of rbcL (Fig. 2), althoughsupport values are generally lower. In both cases,Sellaphora clearly emerged as a monophyleticgroup. Whereas S. bacillum and S. laevissima arenatural groups, S. pupula agg. is paraphyletic,because the clade that contains the demes andsegregate species of S. pupula agg. also includesS. bacillum. In the cox1 tree (Fig. 2A), the closestrelative to S. bacillum is the recently describedS. lanceolata (Mann et al. 2004). In the rbcLphylogeny (Fig. 2B), ‘‘small lanceolate’’ was evenmore closely related to S. bacillum, but we wereunable to obtain a cox1 sequence for this taxon.Sellaphora blackfordensis and S. capitata formedstrongly supported monophyletic groups in bothdata-sets, supporting the recent recognition ofthese as species (Mann et al. 2004). Two taxonpairs were very well supported in the cox1 analysis:‘‘small blunt-capitate’’+‘‘cf. small blunt-capitate’’and S. capitata + ‘‘afro’’. Cox1 sequences for ‘‘small
blunt-capitate’’+‘‘cf. small blunt-capitate’’ wereidentical (as were rbcL sequences) suggesting thatdespite slight morphological differences (Evans et al.unpubl.), both isolates belong to the same taxon.
The cox1 and rbcL analyses resolved a numberof identical clades within Sellaphora, one ofwhich was particularly well supported in both:the ‘‘blackfordensis’’ group (S. blackfordensis,S. capitata, ‘‘cf. obese’’, ‘‘afro’’, ‘‘pseudocapitate’’,‘‘southern pseudocapitate’’, ‘‘cf. capitata’’, and‘‘southern capitate’’). The strong support for thisgroup correlated with several cox1 amino-acidchanges in sites 55—65 (numbering according tothe T. pseudonana exon, GenBank NC007405),producing the unique motif RPDSDFLNYNH.Three of these (55Arg, 57Asp and 65His) representchanges in amino-acid type (acidic, basic, polar,non-polar) relative to all other Sellaphora. Mostother non-synonymous changes among Sella-phora cox1 sequences involved replacement ofamino-acids by others of the same charge group.In contrast, there were no obvious correlationswith specific synapomorphies in the rbcL data-setfor the blackfordensis group.
The major difference between the two phyloge-nies is in the position of S. cf. minima, which isresolved outside S. pupula agg. using cox1 andwithin S. pupula agg. with rbcL (Fig. 2). AllS. pupula agg. taxa and S. bacillum possesstransverse bars at the poles of each cell; thesebars are absent in the other Sellaphora taxaincluded in this study. Sellaphora cf. minima lackspolar bars and its position in the rbcL phylogeny,which has low support, is likely to be an anomaly.
Overall, our analyses show that the cox1barcode region, despite its short length, can bea valuable phylogenetic marker. Though obtainedprimarily to characterize diatom taxa for identifi-cation, cox1 barcode sequences can provideenough phylogenetic information to indicate whenmore extensive studies, using multiple longersequences, are likely to be rewarding.
Methods
Choice of isolates: We used an extensiverbcL data-set (unpubl.) as a basis for selectingSellaphora isolates for cox1 analysis. RbcL datawere available for isolates belonging to 23Sellaphora taxa (19 S. pupula (Kutz.) Mereschk.agg. demes or species and four other Sellaphoraspecies; see Table 1 for GenBank accessions);18S and ITS rDNA data-sets are currently muchmore limited (Behnke et al. 2004; unpubl.). One
ARTICLE IN PRESS
360 K.M. Evans et al.
isolate from each of the 23 Sellaphora taxa waschosen unless (a) isolates originated from morethan one water body (S. bacillum (Ehrenb.) D. G.Mann, S. blackfordensis D. G. Mann et S. Droop,and S. capitata D. G. Mann et S. M. McDonald)and/or (b) molecular data (Behnke et al. 2004,unpubl.) suggested they harbour cryptic diversity(S. laevissima (Kutz.) D. G. Mann and ‘‘elliptical’’).Although rbcL data suggested that isolates of‘‘small blunt capitate’’ (THR13) and ‘‘cf. smallblunt capitate’’ (THR8; Table 1) may belong to thesame species (despite displaying morphologicaldifferences: unpubl.), both were included inthis study to test for a similar lack of cox1divergence.
To allow a diatom-wide assessment of theeffectiveness of different primer sets in amplifyingthe cox1 region, seven non-Sellaphora taxa wereincluded that represented other major evolutionarylineages of diatoms (Round et al. 1990). Non-Sellaphora raphid pennates were represented byEunotia sp., Seminavis cf. robusta Danielidis etD. G. Mann and Pinnularia cf. gibba Krammer;araphid pennates by Tabularia sp.; and centrics byParalia sulcata (Ehrenb.) Cleve, Melosira nummu-loides (Dillw.) Ag. and Odontella sinensis (Grev.)Grun.
Isolation, culture and identification of iso-lates: Sellaphora and freshwater out-group taxawere obtained from epipelic communities growingon mud at 0.5—1 m depth in a range of fresh-waters; marine outgroup taxa were obtained fromvarious net plankton samples (Table 1; clonesavailable on request). Epipelic communities wereharvested as described by Mann and Chepurnov(2005). Clones were isolated either by pipette orby streaking on agar (freshwater taxa only). Cellswere then transferred into small volumes ofmedium in the wells of Repli dishes and grownas monoclonal cultures in 50 mm diameter Petridishes. Freshwater species were kept in WCmedium with silicate (Guillard and Lorenzen1972), marine species in Roshchin medium (Mannand Chepurnov 2004). Stock cultures were kept at15 1C on a 12:12 h L—D cycle at low irradiance(c. 5—20mmol photons m�2 s�1). For molecularstudies, two 90 mm Petri dishes were inoculated,one for DNA extraction, the other to provide cellsfor a voucher slide. Two clones were isolated fromamong the F1 progenies of two sexual crosses: (1)between a North American clone (L845: suppliedby the late Dr. D. B. Czarnecki) and a Scottishclone (Thr 42) of ‘‘pseudocapitate’’; and (2)between two clones of S. cf. robusta (75 and 88)isolated from Veerse Meer, Zeeland, The Nether-
lands (Chepurnov et al. 2002). Mating experimentswere performed as described by Mann et al.(1999).
The preparation of voucher slides and compila-tion of digital images for identification purposesare described in more detail in the SupplementaryMaterial. Identities were assigned to clones of theS. pupula complex (Table 1) on the basis of theirmorphological similarity to clones and demesstudied previously (e.g. Behnke et al. 2004; Mannet al. 1999, 2004; Taylor 1996).
DNA extraction, PCR, sequencing and cox1primer design: DNA was extracted using one ofthree methods, two of which are described inSluiman and Guihal (1999) and Jones et al. (2005);for a few samples, a Qiagen Plant DNeasy kit(Qiagen, Crawley, UK) was used. RbcL, ITS and18S rDNA regions were amplified and sequencedas described in Jones et al. (2005) and Behnkeet al (2004). The respective lengths of theseregions are approximately 1400 bp, 650—850 bpand 1800 bp.
For cox1, red algal (GazF1 and GazR1; Saun-ders 2005) and diatom (pC1 and pB1; Ehara et al.2000a) cox1 primers were tested, predominantlyusing Sellaphora DNAs (AUS1, BAL1, BLA2,BLA16, DUN6, F1, BEL3, RBG1, THR2 andTHR7), but also P. cf. gibba, P. sulcata and S. cf.robusta (Table 1). GazF1 and GazR1 weredesigned by Saunders to amplify a 710 bp cox1barcode in red algae, whereas pC1 and pB1 wereused to amplify 1100 bp of diatom cox1 (also fromnear the 50 end of the gene) to construct a limitedphylogeny. PCR conditions were as detailed inSaunders (2005) and Ehara et al. (2000a). Ampli-fication products were purified using a GFX PCRpurification kit (Amersham Biosciences, LittleChalfont, UK) and subsequent concentrationsestimated by electrophoresis on a 1% agarosegel. PCR fragments were sequenced directly in10ml volumes using 1 mM of PCR primer, 2 ml ofdye terminator cycle sequencing reagents and1.5ml of sequencing reaction buffer (BeckmanCoulter, High Wycombe, UK). Sequencing PCRconditions were 35 cycles of 96 1C for 20 s, 50 1Cfor 20 s and 60 1C for 4 min. Excess dye-labellednucleotides were removed by standard ethanol—sodium acetate precipitation. Sequence productswere re-suspended in 35ml of sample loadingsolution (Beckman Coulter) and run on a CEQ8000DNA sequencer (Beckman Coulter).
Sequences were edited using CEQ software(Beckman Coulter) and aligned in Sequencher 4.5(Gene Codes Corporation, Ann Arbor, MI, USA).Also included in the alignment were the ten publicly
ARTICLE IN PRESS
361DNA Barcoding of Diatoms
available (GenBank) diatom cox1 sequences:Thalassiosira pseudonana Hasle & Heimdal(NC007405, the focus of the diatom genomesequencing project; Armbrust et al. 2004),T. nordenskioeldii Cleve (AB020229), Aulacoseiraambigua (Grun. in Van Heurck) Simonsen(AB009418), Skeletonema costatum (Grev.) Cleve(AB020227), Ditylum brightwellii (T. West) Grun.(AB020223), Rhizosolenia setigera Brightwell(AB020226), Fragilaria striatula Lyngbye (AB020224),Thalassionema nitzschioides (Grun.) Grun. ex Hustedt(AB020228), Cylindrotheca closterium (Ehrenb.)Reimann & Lewin (AB020222) and Nitzschia frustu-lum (Kutz.) Grun. in Cleve & Grun. (AB020225).
PRIMER3 (Rozen and Skaletsky 2000) was used todesign additional primers, primarily focusing onSellaphora. These were tested in combination withbrown algal cox1 barcode primers, GazF2 andGazR2 (Table 3; these unpublished primersequences were kindly provided by Prof. GarySaunders subsequent to the initial tests). Reactionvolumes were 50ml and contained 2ml of DNAtemplate, 1�NH4 buffer, 2.5 mM MgCl2, 200mMof dNTPs, 0.3mM of each primer and 1.25 unitsof Taq (Bioline, London, UK). PCR conditionswere 95 1C for 3 min; 35 cycles of 95 1C for 30 s,50 1C for 1 min and 72 1C for 1.5 min; and, finally,72 1C for 5 min. PCR purifications, sequencing,editing and aligning were performed as detailedabove. GenBank accessions are provided inTable 1.
Assessing levels of intra- and interspecificcox1 and rbcL divergence: Intra- and interspe-cific thresholds of cox1 and rbcL divergenceswere estimated by calculating the number of bpdifferences between Sellaphora isolates belongingto the same taxa and those belonging to selectedclosely related taxa (at least 60% MP BS in eitherthe cox1 or rbcL phylogenetic analyses).
Cox1 and rbcL phylogenetic analyses: Cox1and rbcL phylogenies were constructed usingexactly the same Sellaphora isolates except fortwo S. capitata clones (rbcL data were from BEL1and BLA10: GenBank accessions EF143301 andEF143295, respectively) and one S. blackfordensisclone (rbcL data were from BLA6: EF143282);these clones were nevertheless from the samelocations. Two of the three outgroup taxa used inthe rbcL analysis differed from those used inthe cox1 analysis because they were chosenwith different criteria in mind (Fallacia cf. forcipata(Grev.) Stickle et Mann and Rossia sp.: EF143289and EF143281, respectively).
MP and ML analyses were carried out usingPAUP* 4.0b10 (Swofford 2002). MP analyses
implemented two heuristic search strategies.Search strategy I involved 1000 replicates, savingall trees and using tree-bisection-reconnection(TBR) branch-swapping, with all charactersequally weighted and gaps treated as missingdata; starting trees were obtained via randomstepwise addition. Strategy II involved 10,000replicates, saving only two trees at each step, tocheck for the presence of islands of trees notfound in the first search. Parsimony BS valueswere calculated with 100 replicates of a 1000-replicate heuristic search, saving all trees, withTBR branch-swapping. ML analyses were con-ducted using models selected by the HLR test inModelTest (Posada and Crandall 1998). Fullheuristic ML searches of 100 replicates werecarried out, with the molecular clock not enforced,using TBR branch-swapping, saving two treesper replicate. JK values were calculated from a100-replicate ML analysis, resampling 50% ofcharacters, with starting trees obtained byneighbour-joining.
Acknowledgements
We thank Prof. Gary Saunders for very kindlyproviding us with the brown algal cox1 barcodeprimer sequences, Dr. Michelle Hollingsworth foradvice on molecular work, Dr. Pete Hollingsworthfor fruitful discussions, and Prof. Wim Vyverman,Jeroen Van Wichelen, Dr. Wiebe Kooistra, Prof.A. Poulıckova, and especially Dr. Victor Chepur-nov, for providing material and for isolating clones.Lastly, we thank two anonymous reviewers fortheir constructive comments. This work wassupported by a Natural Environment ResearchCouncil Fellowship (NE/C518373/1) to KME.
Appendix A. Supplementary Material
Supplementary material associated with this articlecan be found in the online version at doi:10.1016/j.protis.2007.04.001.
References
Alvarez I, Wendel JF (2003) Ribosomal ITS sequences andplant phylogenetic inference. Mol Phylogenet Evol 29:417—434
Alverson AJ, Kolnick L (2005) Intragenomic nucleotidepolymorphism among small subunit (18S) rDNA paralogs inthe diatom genus Skeletonema (Bacillariophyta). J Phycol 41:1248—1257
ARTICLE IN PRESS
362 K.M. Evans et al.
Amato A, Kooistra WHCF, Levialdi Ghiron JH, Mann DG,Proschold T, Montresor M (2007) Reproductive isolationamong sympatric cryptic species in marine diatoms. Protist158: 193—207
Aoyama H, Hagiwara Y, Misumi O, Kuroiwa T, Nakamura S(2006) Complete elimination of maternal mitochondrial DNAduring meiosis resulting in the paternal inheritance of themitochondrial genome in Chlamydomonas species. Proto-plasma 228: 231—242
Armbrust EV, Berges JA, Bowler C, Green BR, Martinez D,Putnam NH, Zhou S, Allen AE, Apt KE, Bechner M,Brzezinski MA, Chaal BK, Chiovitti A, Davis AK, DemarestMS, Detter JC, Glavina T, Goodstein D, Hadi MZ, HellstenU, Hildebrand M, Jenkins BD, Jurka J, Kapitonov VV,Kroger N, Lau WWY, Lane TW, Larimer FW, Lippmeier JC,Lucas S, Medina M, Montsant A, Obornik M, Parker MS,Palenik B, Pazour GJ, Richardson PM, Rynearson TA,Saito MA, Schwartz DC, Thamatrakoln K, Valentin K, VardiA, Wilkerson FP, Rokhsar DS (2004) The genome of thediatom Thalassiosira pseudonana: ecology, evolution andmetabolism. Science 306: 79—86
Beagley CT, Okada NA, Wolstenholme DR (1996) Twomitochondrial group I introns in a metazoan, the sea anemoneMetridium senile: one intron contains genes for subunits 1 and3 of NADH dehydrogenase. Proc Natl Acad Sci USA 93:5619—5623
Behnke A, Friedl T, Chepurnov VA, Mann DG (2004)Reproductive compatibility and rDNA sequence analyses inthe Sellaphora pupula species complex (Bacillariophyta).J Phycol 40: 193—208
Bergthorsson U, Adams KL, Thomason B, Palmer JD(2003) Widespread horizontal transfer of mitochondrial genesin flowering plants. Nature 424: 197—201
Beszteri B, Acs E, Medlin LK (2005) Ribosomal DNAsequence variation among sympatric strains of the Cyclotellameneghiniana complex (Bacillariophyceae) reveals crypticdiversity. Protist 156: 317—333
Blaxter M, Elsworth B, Daub J (2004) DNA taxonomy of aneglected animal phylum: an unexpected diversity of tardi-grades. Proc R Soc Lond B 271: S189—S192
Brodie J, Hayes PK, Barker GL, Irvine LM (1996) Molecularand morphological characters distinguishing two Porphyraspecies (Rhodophyta: Bangiophycidae). Eur J Phycol 31:303—308
Chase MW, Salamin N, Wilkinson M, Dunwell JM, PrasadKesanakurthi R, Haidar N, Savolainen V (2005) Land plantsand DNA barcodes: short-term and long-term goals. PhilTrans R Soc Lond B 360: 1889—1895
Chepurnov VA, Mann DG, Sabbe K, Vyverman W (2004)Experimental studies on sexual reproduction in diatoms. IntRev Cytol 237: 91—154
Chepurnov VA, Mann DG, Vyverman W, Sabbe K, Danie-lidis DB (2002) Sexual reproduction, mating system andprotoplast dynamics of Seminavis (Bacillariophyceae).J Phycol 38: 1004—1019
Cowan RS, Chase MW, Kress WJ, Savolainen V (2006)300,000 species to identify: problems, progress, and pro-spects in DNA barcoding of land plants. Taxon 55: 611—616
Coyer JA, Peters AF, Hoarau G, Stam WT, Olsen JL (2002)Inheritance patterns of ITS1, chloroplasts and mitochondria inartificial hybrids of the seaweeds Fucus serratus and F.evanescens (Phaeophyceae). Eur J Phycol 37: 173—178
Ehara M, Inagaki Y, Watanabe KI, Ohama T (2000a)Phylogenetic analysis of diatom cox1 genes and implicationsof a fluctuating GC content on mitochondrial genetic codeevolution. Curr Genet 37: 29—33
Ehara M, Watanabe KI, Ohama T (2000b) Distribution ofcognates of group II introns detected in mitochondrial cox1genes of a diatom and a haptophyte. Gene 256: 157—167
Finlay BJ, Monaghan EB, Maberly SC (2002) Hypothesis:the rate and scale of dispersal of freshwater diatom species isa function of their global abundance. Protist 153: 261—273
Fontaine J-M, Goux D, Kloareg B, Loiseaux-de Goer S(1997) The reverse-transcriptase-like proteins encoded bygroup II introns in the mitochondrial genome of the brown algaPylaiella littoralis belong to two different lineages whichapparently coevolved with the group II ribosyme lineages.J Mol Evol 44: 33—42
Gompert Z, Nice CC, Fordyce JA, Forister ML, Shapiro AM(2006) Identifying units for conservation using molecularsystematics: the cautionary tale of the Karner blue butterfly.Mol Ecol 15: 1759—1768
Gray MW, Lang BF, Burger G (2004) Mitochondria of protists.Annu Rev Genet 38: 477—524
Guillard RRL, Lorenzen CL (1972) Yellow-green algae withchlorophyllide c. J Phycol 8: 10—14
Hajibabaei M, Janzen DH, Burns JM, Hallwachs W, HebertPDN (2006) DNA barcodes distinguish species of tropicalLepidoptera. Proc Natl Acad Sci USA 103: 968—971
Hebert PDN, Cywinska A, Ball SL, deWaard JR (2003)Biological identifications through DNA barcodes. Proc R SocLond B 270: 313—321
Hebert PDN, Stoeckle MY, Zemlak TS, Francis CM (2004)Identification of birds through DNA barcodes. PLoS Biology 2:1657—1663
Hughes Martiny JB, Bohannan BJM, Brown JH, ColwellRK, Fuhrman JA, Green JL, Horner-Devine MC, Kane M,Adams Krumins J, Kuske CR, Morin PJ, Naeem S, ØvreasL, Reysenbach A-L, Smith VH, Staley JT (2006) Microbialbiogeography: putting microorganisms on the map. Nat RevMicrobiol 4: 102—112
Hurst GDD, Jiggins FM (2005) Problems with mitochondrialDNA as a marker in population, phylogeographic andphylogenetic studies: the effects of inherited symbionts. ProcR Soc Lond B 272: 1525—1534
Jensen KJ, Moestrup Ø, Schmid A-MM (2003) Ultrastruc-ture of the male gametes from two centric diatoms,Chaetoceros laciniosus and Coscinodiscus wailesii (Bacillar-iophyceae). Phycologia 42: 98—105
Jones HM, Simpson GE, Stickle AJ, Mann DG (2005) Lifehistory and systematics of Petroneis (Bacillariophyta), withspecial reference to British waters. Eur J Phycol 40: 61—87
Kato Y, Kogame K, Nagasato C, Motomura T (2006)Inheritance of mitochondrial and chloroplast genomes in the
ARTICLE IN PRESS
363DNA Barcoding of Diatoms
isogamous brown alga Scytosiphon lomentaria (Phaeophy-ceae). Phycol Res 54: 65—71
Knoop V (2004) The mitochondrial DNA of land plants:peculiarities in phylogenetic perspective. Curr Genet 46:123—139
Kress WJ, Wurdack KJ, Zimmer EA, Weigt LA, Janzen DH(2005) Use of DNA barcodes to identify flowering plants. ProcNatl Acad Sci USA 102: 8369—8374
Lundholm N, Moestrup Ø, Kotaki Y, Hoef-Emden K,Scholin C, Miller P (2006) Inter- and intraspecific variationof the Pseudo-nitzschia delicatissima complex (Bacillariophy-ceae) illustrated by rRNA probes, morphological data andphylogenetic analyses. J Phycol 42: 464—481
Mann DG (1984) Observations on copulation in Naviculapupula and Amphora ovalis in relation to the nature of diatomspecies. Ann Bot 54: 429—438
Mann DG (1989) The species concept in diatoms: evidencefor morphologically distinct, sympatric gamodemes in fourepipelic species. Plant Syst Evol 164: 215—237
Mann DG (1996) Chloroplast Morphology, Movements andInheritance in Diatoms. In Chaudhary BR, Agrawal SB (eds)Cytology, Genetics and Molecular Biology of Algae. SPBAcademic Publishing, Amsterdam, pp 249—274
Mann DG (1999) The species concept in diatoms. Phycologia38: 437—495
Mann DG, Chepurnov VA (2004) What have the Romansever done for us? The past and future contribution ofculture studies to diatom systematics. Nova Hedwigia 79:237—291
Mann DG, Chepurnov VA (2005) Auxosporulation, matingsystem, and reproductive isolation in Neidium (Bacillario-phyta). Phycologia 44: 335—350
Mann DG, Droop SJM (1996) Biodiversity, biogeography andconservation of diatoms. Hydrobiologia 336: 19—32
Mann DG, Chepurnov VA, Droop SJM (1999) Sexuality,incompatibility, size variation, and preferential polyandry innatural populations and clones of Sellaphora pupula (Bacillar-iophyceae). J Phycol 35: 152—170
Mann DG, McDonald SM, Bayer MM, Droop SJM, Che-purnov VA, Loke RE, Ciobanu A, Hans du Buf JM(2004) The Sellaphora pupula species complex (Bacillariophy-ceae): morphometric analysis, ultrastructure and mating dataprovide evidence for five new species. Phycologia 43:459—482
Mayr E (1942) Systematics and the Origin of Species.Columbia University Press, New York
Medlin LK, Elwood HJ, Stickel S, Sogin ML (1991)Morphological and genetic variation within the diatomSkeletonema costatum (Bacillariophyta): evidence for a newspecies, Skeletonema pseudocostatum. J Phycol 27:514—524
Meyer CP, Paulay G (2005) DNA barcoding: error rates basedon comprehensive sampling. PLoS Biology 3: 2229—2238
Newmaster SG, Fazekas AJ, Ragupathy S (2006) DNAbarcoding in land plants: evaluation of rbcL in a multigenetiered approach. Can J Bot 84: 335—341
Nozaki H, Takahara M, Nakazawa A, Kita Y, Yamada T,Takano H, Kawano S, Kato M (2002) Evolution of rbcL groupIA introns and intron open reading frames within the colonialVolvocales (Chlorophyceae). Mol Phylogenet Evol 23:326—338
Orsini L, Procaccini G, Sarno D, Montresor M (2004)Multiple rDNA ITS-types within the diatom Pseudo-nitzschiadelicatissima (Bacillariophyceae) and their relative abun-dances across a spring bloom in the Gulf of Naples. MarEcol Progr Ser 271: 87—98
Oudot-le Secq MP, Kloareg B, Loiseaux-de Goer S (2002)The mitochondrial genome of the brown alga Laminariadigitata: a comparative analysis. Eur J Phycol 37: 163—172
Oudot-Le Secq M-P, Loiseaux-de Goer S, Stam WT, OlsenJL (2006) Complete mitochondrial genomes of the threebrown algae (Heterokonta: Phaeophyceae) Dictyota dichot-oma, Fucus vesiculosus and Desmarestia viridis. Curr Genet49: 47—58
Pace NR (1997) A molecular view of microbial diversity andthe biosphere. Science 276: 734—740
Pedros-Alio C (2006) Marine microbial diversity: can it bedetermined? Trends Microbiol 14: 257—263
Pons J, Barraclough TG, Gomez-Zurita J, Cardoso A,Duran DP, Hazell S, Kamoun S, Sumlin WD, Vogler AP(2006) Sequence-based species delimitation for the DNAtaxonomy of undescribed insects. Syst Biol 55: 595—609
Posada D, Crandall KA (1998) MODELTEST: testing themodel of DNA substitution. Bioinformatics 14: 817—818
Robba L, Russell SJ, Barker GL, Brodie J (2006) Assessingthe use of the mitochondrial cox1 marker for use in DNAbarcoding of red algae (Rhodophyta). Am J Bot 93:1101—1108
Round FE, Crawford RM, Mann DG (1990) The Diatoms.Cambridge University Press, Cambridge
Rozen S, Skaletsky HJ (2000) PRIMER3 on the WWW forGeneral Users and for Biologist Programmers. In Krawetz S,Misener S (eds) Bioinformatics Methods and Protocols:Methods in Molecular Biology. Humana Press, Totowa,pp 365—386
Rubinoff D (2006) Utility of mitochondrial DNA barcodes inspecies conservation. Conservat Biol 20: 1026—1033
Sanchez Puerta MV, Bachvaroff R, Delwiche CF (2004) Thecomplete mitochondrial genome sequence of the haptophyteEmiliania huxleyi and its relation to Heterokonts. DNA Res 11:1—10
Sarno D, Kooistra WHCF, Medlin LK, Percopo I, Zingone A(2005) Diversity in the genus Skeletonema (Bacillariophyceae).II. An assessment of the taxonomy of S. costatum-like specieswith the description of four new species. J Phycol 41:151—176
Saunders GW (2005) Applying DNA barcoding to redmacroalgae: a preliminary appraisal holds promise for futureapplication. Phil Trans R Soc Lond B 360: 1879—1888
Savolainen V, Cowan RS, Vogler AP, Roderick GK, Lane R(2005) Towards writing the encyclopaedia of life: an introduc-tion to DNA barcoding. Phil Trans R Soc Lond B 360:1805—1811
ARTICLE IN PRESS
364 K.M. Evans et al.
Seifert KA, Samson RA, deWaard JR, Houbraken J,Levesque CA, Moncalvo J-M, Louis-Seize G, Hebert PDN(2007) Prospects for fungus identification using CO1 DNAbarcodes, with Penicillium as a test case. Proc Natl Acad SciUSA 104: 3901—3906
Sims PA, Mann DG, Medlin LK (2006) Evolution of thediatoms: insights from fossil, biological and molecular data.Phycologia 45: 361—402
Slapeta J, Moreira D, Lopez-Garcıa P (2005) The extent ofprotist diversity: insights from molecular ecology of freshwatereukaryotes. Proc R Soc Lond B 272: 2073—2081
Sluiman HJ, Guihal C (1999) Phylogenetic position ofChaetosphaeridium (Chlorophyta), a basal lineage in theCharophyceae, inferred from 18S rDNA sequences. J Phycol35: 395—402
Swofford DL (2002) PAUP*. Phylogenetic Analysis UsingParsimony (*and other methods). 4.0b10. Sinauer, Sunderland
Taylor JW, Jacobson DJ, Kroken S, Kasuga T, Geiser DM,Hibbett DS, Fisher MC (2000) Phylogenetic species recognitionand species concepts in fungi. Fungal Genet Biol 31: 21—32
Taylor NG (1996) Variation Within Sellaphora pupula (Bacillar-iophyta) in Relation to the Species Concept. MSc Thesis,University of Edinburgh
Vanormelingen P, Chepurnov VA, Mann DG, Cousin S,Vyverman W (2007) Congruence of morphological, reproduc-tive and ITS rDNA sequence data in some AustralasianEunotia bilunaris (Bacillariophyta). Eur J Phycol 42: 61—79
Watanabe KI, Ehara M, Inagaki Y, Ohama T (1998)Distinctive origins of group I introns found in the cox1 genesof three green algae. Gene 213: 1—7
Witt JDS, Threloff DL, Hebert PDN (2006) DNA barcodingreveals extraordinary cryptic diversity in an amphipod genus:implications for desert spring conservation. Mol Ecol 15:3073—3082