ASSESSING MACROINVERTEBRATE BIODIVERSITY IN FRESHWATER ECOSYSTEMS: ADVANCES AND CHALLENGES IN DNA-BASED APPROACHES

Michael E. Pfrender*
Department of Biology, Utah State University, Logan, Utah 84322-5305 USA
e-mail: [email protected]

Charles P. Hawkins
Western Center for Monitoring and Assessment of Freshwater Ecosystems, Department of Watershed Science, Utah State University, Logan, Utah 84322-5210 USA
e-mail: [email protected]

Mark Bagley
National Exposure Research Laboratory, U.S. EPA, Cincinnati, Ohio 45268 USA
e-mail: [email protected]

Gregory W. Courtney
Department of Entomology, Iowa State University, Ames, Iowa 50011-3222 USA
e-mail: [email protected]

Brian R. Creutzburg
Western Center for Monitoring and Assessment of Freshwater Ecosystems, Department of Watershed Science, Utah State University, Logan, Utah 84322-5210 USA
e-mail: [email protected]

John H. Epler
Crawfordville, Florida 32327 USA
e-mail: johnepler3@comcast.net

Steve Fend
U.S. Geological Survey, Menlo Park, California 94025 USA
e-mail: [email protected]

Leonard C. Ferrington, Jr.
Department of Entomology, University of Minnesota, Saint Paul, Minnesota 55108-6125 USA
e-mail: [email protected]

Paula L. Hartzell
Pacific Cooperative Studies Unit, University of Hawaii at Mānoa, and the Hawaii Department of Land and Natural Resources, Honolulu, Hawaii USA
e-mail: [email protected]

Suzanne Jackson
National Exposure Research Laboratory, U.S. EPA, Cincinnati, Ohio 45268 USA
e-mail: [email protected]

David P. Larsen
Pacific States Marine Fisheries Commission, Corvallis, Oregon 97333 USA
e-mail: [email protected]

C. André Lévesque
Agriculture and Agri-Food Canada, Ottawa, Ontario, K1A 0C6 Canada
e-mail: [email protected]

John C. Morse
Department of Entomology, Soils, & Plant Sciences, Clemson University, Clemson, South Carolina 29634-0315 USA
e-mail: [email protected]

Matthew J. Petersen
Department of Entomology, Iowa State University, Ames, Iowa 50011-3222 USA
e-mail: mjp266@cornell.edu

Dave Ruiter
Centennial, Colorado 80121 USA
e-mail: [email protected]

David Schindel
Consortium for the Barcode of Life, National Museum of Natural History, Smithsonian Institution, Washington, DC 20013-7012 USA
e-mail: [email protected]

Michael Whiting
Department of Biology, Brigham Young University, Provo, Utah 84602 USA
e-mail: [email protected]

*Current address: Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana 46556 USA

The Quarterly Review of Biology, September 2010, Vol. 85, No. 3
Copyright © 2010 by The University of Chicago Press. All rights reserved.
0033-5770/2010/8503-0003$15.00

keywords
barcoding, invertebrates, bioassessment, biodiversity, freshwater, next-generation sequencing

abstract
Assessing the biodiversity of macroinvertebrate fauna in freshwater ecosystems is an essential component of both basic ecological inquiry and applied ecological assessments. Aspects of taxonomic diversity and composition in freshwater communities are widely used to quantify water quality and measure the efficacy of remediation and restoration efforts. The accuracy and precision of biodiversity assessments based on standard morphological identifications are often limited by taxonomic resolution and sample size. Morphologically based identifications are laborious and costly, significantly constraining the sample sizes that can be processed. We suggest that the development of an assay platform based on DNA signatures will increase the precision and ease of quantifying biodiversity in freshwater ecosystems. Advances in this area will be particularly relevant for benthic and planktonic invertebrates, which are often monitored by regulatory agencies. Adopting a genetic assessment platform will alleviate some of the current limitations to biodiversity assessment strategies. We discuss the benefits and challenges associated with DNA-based assessments and the methods that are currently available. As recent advances in microarray and next-generation sequencing technologies will facilitate a transition to DNA-based assessment approaches, future research efforts should focus on methods for data collection, assay platform development, establishing linkages between DNA signatures and well-resolved taxonomies, and bioinformatics.

Freshwater Biodiversity Assessment: Background and Significance

Quantifying species composition and richness is fundamental to the study of freshwater ecosystems, but obtaining accurate and precise estimates of these biodiversity metrics is both difficult and costly. We need accurate, precise, rapid, and cost-effective methods to assess the status of local and regional biodiversity and to predict responses to changes in climate, invasive species, and land and water alterations (Sharley et al. 2004; Carew et al. 2003, 2005, 2007a,b; Ball et al. 2005; Pfenninger et al. 2007; Sinclair and Gresens 2008). Resource scientists and managers need such information to understand the richness and variability of natural ecosystems, as well as how biological communities respond to both stress (e.g., natural and anthropogenic disturbances) and management (e.g., restoration practices). The implementation of a high-throughput, DNA-based identification system for biodiversity assessment could greatly improve data quality, while reducing both the costs of obtaining data and the time between sample collection and data compilation.

The availability of high-quality biodiversity data is especially critical for effective ecological assessment. Local, state, and federal environmental management agencies throughout the United States and elsewhere use biodiversity data derived from samples of benthic invertebrate assemblages to quantify the ecological condition of aquatic ecosystems (e.g., Rosenberg and Resh 1993; USEPA 2002). These assessments generally use community-level indices based on aspects of taxonomic composition to measure the degree to which biological communities differ from those that would be expected to occur under reference or baseline conditions (Hughes et al. 1986; Reynoldson and Wright 2000; Stoddard et al. 2006; Hawkins et al. 2010). Assessments based on these indices are key to quantifying both the biological impacts of pollution and the degree to which management practices are effective in restoring or rehabilitating damaged ecosystems. However, the utility of these indices is strongly influenced by their accuracy and precision, two statistical properties that are markedly affected by how thoroughly a site is sampled and how accurately the biota is identified.

For biological indices to realize their potential, they need to be as accurate and precise as possible; that is, they need to characterize the targeted biological assemblage without bias. Sampling error associated with estimates of the biota present at a site should be small enough to allow unambiguous detection of ecologically significant changes in condition. Recent empirical studies show how inadequate sample counts and coarse taxonomic resolution can independently hinder detection of real biological impacts. First, Cao et al. (2002a,b, 2007) and Cao and Hawkins (2005) showed how the use of small (i.e., 100–300 count) samples produces both imprecision and bias when estimating relative differences in taxa richness and composition among locations, regardless of the taxonomic resolution used in assessments. Second, the use of more highly resolved identifications can reveal the effects of landscape and waterway alteration on freshwater assemblages that are not detected when coarse taxonomy is used (review by Jones 2008). For example, Hawkins et al. (2000) showed that a measure of taxa completeness based on genus/species level identifications detected the effects of watershed alteration on stream invertebrate assemblages in the Sierra Nevada of California, USA, whereas an otherwise similar, family-based measure detected no difference between streams in reference and managed watersheds. Similar results based on a variety of assemblage-level indices and analyses for freshwater invertebrate assemblages have been reported in northeastern France (Guerold 2000); Florida, USA (King and Richardson 2002); New York, USA (Arscott et al. 2006); North Carolina, USA (Hawkins 2006); Northern Territory, Australia (Lamche and Fukuda 2008); and West Virginia, USA (Pond et al. 2008).

The potential costs of drawing incorrect inferences, as either false-negatives or false-positives, from inaccurate or imprecise indices can be staggering. The inability to detect ecological degradation when it actually exists condemns freshwater ecosystems to continued degradation. Incorrect assessments that systems are degraded when they really are not can trigger expensive but unwarranted restoration/remediation, as well as litigation and reduction in public support. Considering that approximately $110 million is spent each year in the U.S. on water quality assessments and that $260 million is viewed as the amount that is actually needed (ASIWPCA 2002), the development of indices that are as accurate and precise as possible, as well as cost-effective and able to be rapidly implemented, should be a national priority.

Recent and rapidly emerging developments in the analysis of genetic material (i.e., diagnostic DNA markers) should provide a fast, cost-effective means of addressing these needs. These DNA-based assay tools have the potential to greatly improve the quality of data collected from freshwater ecosystems, therefore allowing us to better assess and predict the consequences of landscape and waterway alteration on these systems. In this paper, we review the opportunities and challenges associated with moving toward the routine use of DNA markers for the identification of the taxonomic composition of bulk samples of freshwater invertebrates.

Limitations to Current Morphologically-Based Bioassessments

data quality and cost

The quality of biological surveys depends on the degree to which field samples accurately and precisely characterize the biota at sites. Data quality largely depends on two sample properties: (1) how well the collected sample represents the biota inhabiting the targeted site (i.e., the quality of the site scale design), and (2) how well the collected sample(s) are evaluated (i.e., quality of sample processing). The first property is a function of both the mix of habitats sampled and the number of individuals collected. The second property is dependent on the taxonomic resolution and the specimen misidentification rate. Selecting a coarse taxonomic resolution can obscure responses of one or more finer-level taxa (e.g., species) to environmental alteration (Hawkins et al. 2000; Jones 2008). Because processing samples of benthic invertebrates is time consuming, typically less than 1 m² of stream or lake bottom is sampled, and only 100–500 individuals from the sample are usually identified. This small number of individuals is then used to characterize the entire benthic assemblage (i.e., millions of individuals) at the site. Both assessment accuracy and precision improve when the area sampled or the number of individuals included in such a sample can be increased (Cao et al. 2002a,b; Lorenz et al. 2004; Ostermiller and Hawkins 2004; Cao and Hawkins 2005; Clarke et al. 2006; Nichols et al. 2006) (Figure 1), yet small subsamples continue to be used because of the unacceptable costs associated with processing larger samples (Carter and Resh 2001). Variability in descriptions of freshwater benthic invertebrate assemblages is considerably influenced by the type of water body as well as by the specific biological metric examined. For example, the percentage of total variance (i.e., across sites and among replicate samples) in metric values that was associated with sampling error (within-stream replicates) ranged from 0% to 99% across all combinations of 27 metrics and 19 types of European streams, and averaged 3–28% among the 27 different biological metrics (Clarke et al. 2006). This source of error could nearly be eliminated for many metrics if more extensive areas of stream could be sampled, thus resulting in the identification of a larger number of invertebrates.
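The effect of fixed-count subsampling on richness estimates can be illustrated with a small simulation. The sketch below is not drawn from the cited studies: the assemblage (5 common taxa and 45 rare taxa), the counts, and the function names are all hypothetical choices made only to show the direction of the bias.

```python
import random

def fixed_count_richness(assemblage, count, rng):
    """Number of distinct taxa in one random fixed-count subsample (no replacement)."""
    return len(set(rng.sample(assemblage, count)))

def mean_richness(assemblage, count, draws, rng):
    """Average observed richness over repeated fixed-count subsamples."""
    total = sum(fixed_count_richness(assemblage, count, rng) for _ in range(draws))
    return total / draws

# Hypothetical assemblage (an assumption for illustration only):
# 5 common taxa with 900 individuals each and 45 rare taxa with 10 each.
assemblage = []
for t in range(5):
    assemblage += [f"common_{t}"] * 900
for t in range(45):
    assemblage += [f"rare_{t}"] * 10
true_richness = len(set(assemblage))

rng = random.Random(1)  # fixed seed so the sketch is repeatable
results = {count: mean_richness(assemblage, count, 200, rng)
           for count in (100, 300, 1000, 3000)}

for count, richness in results.items():
    print(f"count={count:5d}  mean observed richness={richness:5.1f} of {true_richness}")
```

In this toy assemblage, the 100- and 300-count subsamples miss most of the rare taxa, so observed richness falls well short of the true value and approaches it only at much larger counts.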

While significant limitations can be overcome through increased sample size, data quality is further compromised by the fact that identifications are generally made at a level of taxonomic resolution above the species level (e.g., genus, family, or higher taxonomic levels), and both the consistency and the accuracy of identifications can vary greatly across laboratories. The use of coarse taxonomic resolution in bioassessments can blur species-specific signals and the sensitivity of assessments. This lack of sensitivity ultimately limits our ability to detect effects of either adding or removing stressors (Lenat and Resh 2001; Schmidt-Kloiber and Nijboer 2004; Arscott et al. 2006; Hawkins 2006). Unfortunately, one reason we are forced to use these coarse levels of taxonomic resolution is that most of the individuals in benthic samples are juveniles that cannot be identified to species based on their morphological traits. Juveniles, and specimens that have been damaged beyond recognition, may make up the bulk of a sample, resulting in poor and potentially misleading interpretations of species composition.

Inconsistencies among labs and individuals in identification skills will produce data of variable quality. For example, in the recent national assessment of wadeable streams and rivers in the USA (USEPA 2006), identification errors ranged between 8% and 30% (mean = 21%) across eight labs (Stribling et al. 2008). Also, some taxa are harder to identify to species than others. For instance, larvae of the ubiquitous Chironomidae (midges) are notoriously difficult to identify; Epler (2001) reported a 6–60% misidentification rate among these organisms. Such variability in data quality will cause variation in assessments as well as in assessment quality at specific sites. This variability compromises our ability to combine data sets for regional assessments. In order to use data from multiple labs or individuals for such assessments, it would first be necessary to post-process all samples to a common level of taxonomic resolution, which typically translates to the lowest quality data in the data sets of interest.
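The post-processing step described above, reducing every data set to the coarsest resolution that all of them share, can be sketched in a few lines. The record layout (a tuple of names from order down to the finest rank reached), the two example labs, and the function names are assumptions made for illustration; they do not describe any agency's actual data format.

```python
# Ranks from coarse to fine; each record lists names down to the finest rank reached.
RANKS = ["order", "family", "genus", "species"]

def coarsest_common_rank(datasets):
    """Finest rank to which every record in every data set is resolved."""
    depth = min(len(record) for records in datasets.values() for record in records)
    return RANKS[depth - 1]

def roll_up(records, rank):
    """Truncate each record to the given rank, discarding finer information."""
    depth = RANKS.index(rank) + 1
    return [record[:depth] for record in records]

# Hypothetical example: lab A identifies to species, lab B only to family.
lab_a = [
    ("Diptera", "Chironomidae", "Cricotopus", "Cricotopus bicinctus"),
    ("Diptera", "Chironomidae", "Cricotopus", "Cricotopus sylvestris"),
    ("Ephemeroptera", "Baetidae", "Baetis", "Baetis tricaudatus"),
]
lab_b = [
    ("Diptera", "Chironomidae"),
    ("Ephemeroptera", "Baetidae"),
]

common = coarsest_common_rank({"lab_a": lab_a, "lab_b": lab_b})
taxa_before = len(set(lab_a))                  # distinct taxa at lab A's own resolution
taxa_after = len(set(roll_up(lab_a, common)))  # distinct taxa after rolling up
print(common, taxa_before, taxa_after)
```

Here the two Cricotopus species recorded by lab A collapse into a single family-level taxon once the data are made comparable with lab B, which is the loss of information the text refers to.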

turn-around time

The time required to conduct a biological assessment is largely a function of the time it takes to identify the taxa collected in a sample. For assessments based on benthic macroinvertebrates, the set of taxa used most widely across the U.S. and elsewhere (USEPA 2002), the turn-around time can often be several months. Although there are many labs around the world that specialize in the identification of freshwater benthic invertebrates, their throughput is affected by the timing of sample delivery, the time it takes to process an individual sample, and by economies of scale. For example, samples are typically collected in the field from late spring to early autumn, and accumulated batches of samples are sent to contracted labs. These samples are then typically processed in the order that they are received. Because individual samples can take 2–8 hours to process, many samples will sit for several weeks or months before they are processed. Reduction in sample processing time would improve our capability to act promptly in response to environmental degradation.

Figure 1. Effect of Sample Size on Bioassessment
Examples of how both assessment precision (top panel) and accuracy (bottom panel) are affected by the size of fixed-count samples. The graph in the top panel is modified from data presented in Ostermiller and Hawkins (2004) and illustrates how the precision of an O/E index used in bioassessment is affected by the sample counts used to calibrate models that predict the number of taxa expected to occur at a site (E). There is a strong relationship (r² = 0.84) between the precision of the O/E index (i.e., the standard deviation of O/E values observed at reference quality sites) and the sample count. This relationship implies that samples would need to contain about 2,000 individuals (SD = 0.224 − 0.000119 × count) in order to achieve perfect precision. In reality, the linear relationship observed here would likely become asymptotic (concave up) at higher sample counts, thus requiring even higher counts to achieve high precision. The graph in the bottom panel is modified from data presented by Cao and Hawkins (2005) and illustrates how the numbers of taxa lost with increasing simulated stress are underestimated when small fixed-count samples are used in biological assessments. Each data point is a mean value derived from 11 fixed-count resamplings of the assemblage that resulted following the application of 9 increasingly severe levels of stress. The dashed line represents a 1:1 correspondence between estimated and true taxa loss. The magnitude of difference between estimated and true taxa loss increases with decreasing sample size, and estimates can often imply that assemblages are either not losing or are even gaining taxa (negative values of estimated taxa loss) when taxa loss is actually occurring.
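The "about 2,000 individuals" figure quoted in the Figure 1 caption follows directly from the fitted line given there. The short calculation below uses only the intercept and slope stated in the caption; the function name and the floor at zero are illustrative additions, and the linear extrapolation is subject to the caveat in the caption that the true relationship probably flattens at high counts.

```python
INTERCEPT = 0.224     # fitted SD of O/E at a count of zero (from the Figure 1 caption)
SLOPE = -0.000119     # fitted change in SD per individual counted (same source)

def predicted_sd(count):
    """Linear prediction of the SD of O/E, floored at zero (an SD cannot be negative)."""
    return max(0.0, INTERCEPT + SLOPE * count)

# Count at which the fitted line reaches SD = 0, i.e. "perfect precision".
zero_sd_count = INTERCEPT / -SLOPE

for count in (100, 300, 500, 1000):
    print(f"count={count:5d}  predicted SD={predicted_sd(count):.3f}")
print(f"line reaches SD = 0 at about {zero_sd_count:.0f} individuals")
```

The line crosses zero at roughly 1,880 individuals, consistent with the rounded figure in the caption, while a typical 300-count sample is predicted to have an SD of about 0.19.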

Improving the reliability and defensibility of general survey data, as well as those data used in bioassessments, will require larger samples and more accurate and refined identification of the taxa contained therein. More rapid turnaround in the time it takes to convert field samples into usable data would greatly enhance the use of biological information for management purposes. The present constraints could be minimized if we could rapidly assess the identities of all species in a large, bulk sample of invertebrates, and the use of diagnostic DNA markers, as we discuss below, has great potential to provide us with that ability.

Genetic Approaches to Bioassessment

A number of DNA-based assay platforms are currently available and are being used in increasingly diverse ecological contexts (Thomas and Klaper 2004). Most notably, genetic characterization of prokaryotic communities through the application of next-generation DNA sequencing and DNA microarrays is becoming commonplace (DeSantis et al. 2005; Brodie et al. 2007; He et al. 2007; Dinsdale et al. 2008; Zhou et al. 2008). A similar genetic characterization and measurement of diversity in eukaryotic communities is now emerging (e.g., Creer et al. 2010). The same basic approaches used in microbial communities are applicable to eukaryotic communities and should prove to be particularly useful in benthic invertebrate biodiversity assessment. These technical approaches differ substantially in their data requirements, application, and utility for high-throughput assays. The first set of these approaches, including PCR-based fragment analysis and the use of micro- and macroarrays, requires considerable up-front DNA data infrastructure to design assay tools that target a predefined and well-characterized biota. These techniques rely on a priori knowledge to design effective PCR primers for DNA amplification and/or hybridization probes on arrays.

An alternative approach, using next-generation sequencing strategies (Margulies et al. 2005), requires minimal initial development, but relies heavily on computationally intensive data processing of DNA fragment (sequence) data to detect the occurrence of unique sequences of DNA that are assumed to represent different taxa. These inferred taxa form operational taxonomic units, which may or may not be linked to an established taxonomic framework. Ultimately, as emphasized below, establishing the link between DNA-level data and ecological assessments based on traditional taxonomy is a necessary component of any of these efforts.
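The idea of grouping sequence reads into operational taxonomic units can be illustrated with a deliberately simple greedy clustering. This is a toy sketch rather than the procedure used by any particular platform or pipeline: it assumes equal-length, pre-aligned reads, uses a plain mismatch fraction as the distance, and the 3% threshold and function names are arbitrary choices for the example.

```python
def mismatch_fraction(a, b):
    """Fraction of positions at which two equal-length reads differ."""
    if len(a) != len(b):
        raise ValueError("reads must be the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def greedy_otus(reads, threshold=0.03):
    """Assign each read to the first OTU whose seed is within `threshold`;
    a read that matches no existing seed founds a new OTU."""
    otus = []  # list of (seed_read, member_reads)
    for read in reads:
        for seed, members in otus:
            if mismatch_fraction(seed, read) <= threshold:
                members.append(read)
                break
        else:
            otus.append((read, [read]))
    return otus

# Toy 40-bp reads (invented sequences): two distinct variants,
# each also present as a copy carrying a single mismatch.
variant_a = "ACGT" * 10
variant_a_err = "ACGT" * 9 + "ACGA"
variant_b = "TTGACCGATA" * 4
variant_b_err = "TTGACCGATA" * 3 + "TTGACCGATT"

otus = greedy_otus([variant_a, variant_a_err, variant_b, variant_a, variant_b_err])
otu_sizes = [len(members) for _, members in otus]
print(len(otus), otu_sizes)
```

The five reads fall into two units. Nothing in the clustering itself says which named species, if any, each unit corresponds to; that link has to come from reference sequences tied to an established taxonomy.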

In the following sections, we provide a brief overview of genetic techniques that have been used in biodiversity assessment as well as emerging techniques amenable to high-throughput assessment, and we discuss the data infrastructure required to employ these techniques. Our goal is to highlight some of the currently used and most promising avenues for high-throughput systems, not to provide a comprehensive explanation of all possible techniques. In particular, we emphasize the limitations and a priori data requirements for these diverse approaches. For a brief description of a wide variety of genetic techniques suitable for freshwater bioassessment, see Box 1.

PCR-based fragment and sequenceanalysis

Initial DNA-based applications for biodi-versity assessment implemented PCR-basedtechniques targeting defined DNA regionsin the mitochondrial or nuclear genomes inspecific taxa. One widely used approach is

324 Volume 85THE QUARTERLY REVIEW OF BIOLOGY

Page 7: ASSESSING MACROINVERTEBRATE BIODIVERSITY IN FRESHWATER ...mpfrende/PDFs/Pfrender_et_al_QRB_2010.pdf · FRESHWATER ECOSYSTEMS: ADVANCES AND CHALLENGES IN DNA-BASED APPROACHES Michael

PCR restriction fragment length polymor-phism (PCR-RFLP). PCR-RFLP uses a com-bination of the PCR amplification of apredefined polymorphic DNA region andthe subsequent fragmentation of the ampli-fied DNA by restriction enzyme digestion.Characteristic patterns of fragment lengthsproduced by the restriction enzymes occurbecause of nucleotide variation in the DNA

sequences that alter the number of restric-tion enzyme recognition sites. This tech-nique is relatively inexpensive, requires onlybasic and widely available molecular labora-tory equipment, and has the potential forhigh-throughput. Start-up information isminimal, simply requiring sets of PCR prim-ers that will amplify the target DNA region inall the relevant taxa, as well as polymorphic

Box 1Techniques and approaches to genetic assessment of biodiversity. This box pro-vides some short definitions of the genetic techniques referred to in the text, as wellas a brief overview of the techniques that have been applied to bioassessment andof techniques that may become useful in this rapidly growing field.

Polymerase Chain Reaction (PCR): A variety of genetic assay techniques rely on theuse of PCR amplification to target genetic variation in defined fragments of nuclear ormitochondrial genomic DNA.

PCR-RFLP: In this widely used approach, a predefined gene region is amplified withDNA primers, and the amplified DNA segment is fragmented with restriction enzymes(REs). REs have specific nucleotide sequence recognition sites and cut the DNA atthese sites. Variation in the sequence at these recognition sites in the amplified DNAfragment yields different size fragments among taxa, hence the name restrictionfragment length polymorphism (RFLP). This technique involves sets of DNA primersthat bind to and amplify the target fragments in all the study taxa. Products arevisualized with an agarose gel, or the amplified PCR products are labeled with fluores-cent dyes and visualized in a capillary DNA sequencing machine. Banding patterns willdiffer if communities are different. Sequencing of the bands—something that is quitetechnically challenging—is required in order to identify the species affected.

Real-Time PCR: A modification of standard PCR amplification that monitors theamount of product formed in the reaction at each cycle. The rate of increase is relatedto the initial concentration of the template DNA in the PCR. This technique is usefulfor detecting low abundance DNA and can be used to quantify relative abundance atvarious taxonomic levels, depending on the specificity of the primers and fluorescentprobes. Multiplexing, i.e., the ability to detect multiple species, is limited with thecurrent technology.

Barcoding: The largest coordinated effort of the CBOL is devoted to characterizingthe DNA sequence at predefined, highly informative gene regions. The most com-monly used barcoding region is the cytochrome oxidase I (COI) gene in the mito-chondrial genome. Unique nucleotide sequences are a “barcode” that can be linked totraditional taxonomic designations and used for identification. This approach usesPCR and DNA sequencing.

Universal and specific PCR primers: The utility of all the PCR based techniques iscontingent on the performance of the PCR primers. In some applications (e.g.,Barcoding), it is highly advantageous to have universal primers that amplify any cleanspecimen, but, in order to detect an important species directly in the environment,primers need to be specific at the species level (see Real-time PCR above).

September 2010 325DIAGNOSTIC DNA MARKERS AND BIOASSESSMENT

Page 8: ASSESSING MACROINVERTEBRATE BIODIVERSITY IN FRESHWATER ...mpfrende/PDFs/Pfrender_et_al_QRB_2010.pdf · FRESHWATER ECOSYSTEMS: ADVANCES AND CHALLENGES IN DNA-BASED APPROACHES Michael

nucleotides in DNA sequences that fallwithin the recognition sites of the restrictionenzymes. Given a set of known sequences, itis possible to define a priori the number ofunique sequence variants that can be identi-fied by an optimized set of restrictionenzymes. A number of taxonomic groups,

including benthic invertebrate assemblages,have been characterized through PCR-RFLP(e.g., Carew et al. 2003). The major disad-vantage to the PCR-RFLP approach is its lim-ited ability to screen polymorphic sites in thetarget DNA region. Because variation inDNA fragment size is the result of nucleotide

Box 1Continued

Micro- and macroarrays: Unique single-stranded DNA oligonucleotide fragments arebound to a substrate (e.g., nylon membrane, glass slides or computer chip) in sets ofdistinct spots. Many small spots make microarrays, while fewer large spots producemacroarrays. Sample DNA is labeled (e.g., with fluorescent dyes) and hybridized to thearrays. Spots with hybridized DNA fluoresce and indicate the presence of complemen-tary DNA in the sample. This technique is commonly used for expression profiling tosurvey the activity of genes, and has been used in a number of studies to detect thepresence of species DNA markers in an environmental sample. Often arrays arecoupled with PCR to enrich a sample for specific DNA targets prior to hybridization.The major advantage is the ability to survey many different DNA variants simulta-neously. The disadvantages include the relatively high cost when compared to RT PCRand the inability to detect unknown sequences in a sample when compared to sequenc-ing techniques.

Illumina Inc., bead-based arrays: This platform utilizes a combination of a marker-specific nested PCR and a novel hybridization approach to survey polymorphism at alarge number of DNA sites. Originally, this technique was designed to assay singlenucleotide polymorphisms (SNPs) in human genomes. Bead-based arrays can assay upto 1500 polymorphic sites in a single assay and can be scaled up to run 96-well plates,thus giving it high-throughput potential.

Next-generation DNA sequencing: Unlike the PCR and hybridization approachesthat require DNA data for design, direct next-generation sequencing can be doneanonymously on virtually any sample. The sequence data are collected in Mb quanti-ties, and then unique variants are filtered out bioinformatically. There are threeleading platforms, each with a different sequencing methodology and differentstrengths and advantages. The Roche GS FLX generates �500 Mb per run, with readlengths of 400–500 bp. Illumina Inc. has partnered with Solexa to provide a GenomeAnalyzer System producing �2.3 Gb per run, with read lengths of 70 bp. AppliedBiosystems Inc. provides a relatively new addition to next-generation sequencing withtheir SOLiD System 2.0. This platform produces an impressive 5 Gb of data per run,with read lengths of 35 bp. The limitation to all these systems is the current high costper run. However, it is becoming practical to combine multiple uniquely taggedsamples in a single run. Given the rapid acceleration of these technologies and theincreasingly lower costs, next-generation sequencing holds tremendous promise forfuture application in biodiversity assessment.

Flow cytometry: In a clever combination of PCR, DNA hybridization, and flowcytometry, Diaz et al. (2006) developed a fungal identification system. This approachcould be tailored to high-throughput. The remaining challenge is to establish theupper bounds of unique variants that can be detected in a sample.


variation in restriction enzyme recognition sites, this approach is, in practice, only assaying variation at a small fraction of the nucleotide sites in a DNA sequence. Much of the potentially informative variation in DNA sequences is cryptic to PCR-RFLP, thus reducing the information content and making it difficult to identify novel sequence variants. In particular, distinguishing among closely related taxa can be problematic, as diagnostic nucleotide variation may not coincide with restriction sites, thereby limiting this approach in fine-scale taxonomic resolution.
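The point that most sequence variation is cryptic to PCR-RFLP can be illustrated with a minimal in silico digest. The sketch below is an illustration only and is not drawn from any cited protocol: the three short sequences are invented, the function name `digest` is hypothetical, and EcoRI (recognition site GAATTC, cut after the first G) is used simply as a familiar example enzyme. Taxa a and b differ by a substitution outside the recognition site and give the same fragment pattern, whereas taxon c carries a substitution inside the site and loses the cut.

```python
import re

def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the cut position inside the recognition site
    (1 for EcoRI, which cuts G^AATTC).
    """
    # A zero-width lookahead finds every occurrence, including overlapping ones.
    pattern = f"(?={re.escape(site)})"
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

# Hypothetical sequences, invented for illustration only.
taxon_a = "ATGCCGAATTCTTAGGCATCGGATTACA"
taxon_b = "ATGCCGAATTCTTAGACATCGGATTACA"  # substitution outside the site
taxon_c = "ATGCCGAGTTCTTAGGCATCGGATTACA"  # substitution inside the site

for name, seq in [("a", taxon_a), ("b", taxon_b), ("c", taxon_c)]:
    print(name, digest(seq))
```

Only the substitution in taxon c changes the banding pattern; the difference between a and b would be visible only by sequencing.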

One approach to circumventing these data limitations is PCR amplification and DNA sequencing of the target region. Direct DNA sequencing reveals all polymorphic sites and, thus, yields greater information. More data result in greater power to discriminate among taxa, but this gain in resolution comes at a greater cost in both time and expense. Samples need to be processed one individual at a time, and the costs of processing samples and sequencing DNA greatly limit the sample sizes that can be achieved. An additional complication shared by these approaches arises when dealing with communities composed of highly divergent taxa. As taxonomic distance increases, it becomes progressively more difficult to design “universal” PCR primers. For example, capturing the taxonomic diversity typically present in assemblages of benthic invertebrates would likely require the development of multiple sets of PCR primers, each targeting the same DNA region in different sets of species.

A direct application of a PCR-DNA sequencing approach is illustrated by the substantial and coordinated efforts of the Barcode of Life Initiative (http://www.dnabarcodes.org). The ambitious aim of this group is to characterize DNA sequence variation in all the major eukaryotic groups based on predefined gene region(s), and to link that variation to traditional taxonomic identity. Sequence variation in a diagnostic ~650 base pair portion of the mitochondrial gene cytochrome c oxidase I (COI) is the target for eukaryotic identification. The barcoding approach based on COI as well as other DNA regions, such as intergenic spacers, has been highly successful in a wide variety of taxonomic groups, including terrestrial (Hebert et al. 2004; Barrett and Hebert 2005) and aquatic taxa (Neigel et al. 2007), and has been applied to a number of pressing ecological issues (e.g., invasive species [Armstrong and Ball 2005; Harvey et al. 2009]). Sequence variation in the COI region has been used effectively in a number of studies to characterize benthic invertebrate diversity (Sharley et al. 2004; Carew et al. 2005, 2007a,b).

DNA hybridization-based approaches

Given a comprehensive database of diagnostic DNA markers, a spectrum of currently available technologies is amenable to the development of genetic assay tools; we will expand on the development of such a database and the relationship between DNA markers and taxonomy below. These technologies range from oligonucleotide microarray platforms to single nucleotide polymorphism (SNP) bead-based arrays. Ultimately, the choice of platforms will depend on the practical consideration of the number of taxa included, as well as per-sample costs. A well-characterized library of diagnostic DNA markers for species allows the development of genetic approaches based on DNA-DNA hybridization techniques as a reasonable alternative to PCR and DNA sequencing. DNA hybridization takes advantage of the complementary base-paired structure of double-stranded DNA. In these hybridization techniques, a unique single strand of DNA is bound to a membrane, glass slide, or bead. These platforms of bound DNA can contain a few unique sequences in a macroarray, or can be constructed with a high density of many sequence variants into microarrays. DNA extracted from an environmental sample is labeled with fluorescent dye and washed over the DNA array. Complementary sequences in the sample hybridize to the DNA captured on the array, and the presence or absence of any particular sequence can be distinguished by the intensity of the fluorescence. The major potential gain of an array-based platform is the ability to batch-process whole community-level samples. This capability alleviates the laborious one-by-one approach currently used for morphological identification and required for standard PCR and DNA sequencing.

Microarrays have been commonly used as a tool for functional studies of gene expression (Gracey and Cossins 2003; Stoughton 2005) and for detection of single nucleotide polymorphisms (SNPs) in single taxa (Comai et al. 2004; Gilchrist et al. 2006; Gresham et al. 2006), but applications to biodiversity assessment are just beginning to appear in the literature. The taxonomic groups that have thus far received the most attention are prokaryotic communities (Gentry et al. 2006; Wagner et al. 2007). Microarrays based on sequence variation in the 16S ribosomal RNA gene provide broad coverage of prokaryotic lineages and have been used to categorize bacterial diversity in environmental samples (Castiglioni et al. 2002; Call et al. 2003; Loy et al. 2005; Lozupone and Knight 2007). In these studies, wide taxonomic coverage is achieved by using sets of “universal” PCR primers to amplify the 16S rRNA gene, and the amplification products are then hybridized to arrays containing taxon-specific DNA fragments. Through the incorporation of DNA sequences of genes found within common metabolic pathways, the design of prokaryotic arrays and the objectives of these studies have recently shifted to the quantification of functional diversity in prokaryotic communities (Dinsdale et al. 2008). Interestingly, this functional view of prokaryotic diversity is, in turn, causing a shift in the emphasis of studies on biodiversity in prokaryotic communities away from a description of the abundance and distribution of unique lineages defined by characteristic 16S rRNA sequences, and instead toward the description of the abundance and distribution of genes and metabolic pathways in communities (Dinsdale et al. 2008). In eukaryotic taxa, array-based approaches have been applied to diversity studies of fungal communities (Lévesque et al. 1998; Siefert and Lévesque 2004; Tambong et al. 2006) and mammalian taxa (Pfunder et al. 2004), and have been used as forensic tools for detecting endangered vertebrates (Teletchea et al. 2008). A barrier to the wider application of arrays to eukaryotic biodiversity assessment is the lack of DNA sequence data available to design arrays in target taxonomic groups.

The greatest advantages to a microarray approach will be gained in contexts where a large number of species are present within or among the target communities, and when there is a need to process a high volume of samples. The initial investment in array design and the relatively high per-array cost make this approach an impractical option for assessments targeting a relatively small number of taxa, or in situations where the number of samples is small enough that PCR or sequencing strategies can more easily be employed. However, large increases in the number of individuals in an environmental sample require a substantial increase in labor and cost when using PCR-based strategies such as PCR-RFLP or DNA sequencing. Arrays, in contrast, may contain unique DNA signatures from a large variety of taxa that can be assayed simultaneously. Currently available high-density arrays may contain several hundred thousand unique DNA fragments. Also, strategies to overcome the high per-sample cost of array processing are now available. An example is the bead-based array, produced by Illumina, Inc., which uses a nested PCR approach. This array can survey up to 1500 unique polymorphisms and can be scaled to a 96-well format, thereby allowing the simultaneous processing of multiple samples. Membrane-based arrays can also be stripped and reused multiple times (Fessehaie et al. 2003; Tambong et al. 2006) and are amenable to spotting with microarrayers at near microarray density (Chen et al. 2009).

A number of technical challenges in the design of arrays for biodiversity assessment need to be addressed. What is the optimal length of DNA probes needed to reduce nonspecific hybridization that may generate a false positive signal? How much redundancy in the number of probes should be incorporated into a microarray for the assessment of highly diverse and complex communities of eukaryotes? How many gene regions will be needed to identify a given number of target taxa? It is possible that a single gene or few genes (e.g., the COI barcoding region) and a small number of unique probes may be suitable for array design. In a modeling exercise, Hajibabaei et al. (2007) used the design specification for an oligonucleotide-based array consisting of short 25-mer probes, as well as the information content in the COI and cytochrome b (cytb) genes, to gauge the feasibility of constructing a mammalian identification array in silico. By designing three unique probes for each species, they could unambiguously identify more than 90% of the species in the original data set, based upon the level of sequence variation observed in either of the two DNA fragments.
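The logic of an in silico feasibility test of this kind can be sketched as follows. This is a simplified illustration and not the procedure of Hajibabaei et al. (2007): it treats a probe as unique when its exact sequence occurs in only one species, which ignores cross-hybridization between near-identical probes, and the function names and toy sequences are hypothetical. The toy data use 4-mers and two probes per species so that the result is easy to inspect; the defaults correspond to a 25-mer, three-probe design.

```python
from collections import defaultdict

def unique_probes(seqs, k=25, per_species=3):
    """Map each species to up to per_species k-mers found in no other species."""
    owners = defaultdict(set)  # k-mer -> set of species that contain it
    for name, seq in seqs.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)
    probes = {}
    for name, seq in seqs.items():
        found = []
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if owners[kmer] == {name} and kmer not in found:
                found.append(kmer)
            if len(found) == per_species:
                break
        probes[name] = found
    return probes

def fraction_identifiable(probes, per_species=3):
    """Fraction of species for which the full set of unique probes was found."""
    enough = sum(1 for p in probes.values() if len(p) >= per_species)
    return enough / len(probes)

# Hypothetical sequences, invented for illustration only.
toy = {"sp_a": "AAAACCCCGG", "sp_b": "AAAACCCCGT", "sp_c": "TTTTGGGGAA"}
toy_probes = unique_probes(toy, k=4, per_species=2)
print(toy_probes)
print(fraction_identifiable(toy_probes, per_species=2))
```

In the toy data, the two near-identical species each yield only one unique probe, so only one of the three species meets the two-probe requirement. Closely related taxa are where a single-gene design would be expected to struggle.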

The work of Hajibabaei et al. (2007) and others (e.g., Zahariev et al. 2009) is an important first step in examining these critical design issues. Both bioinformatics and empirical testing are still required to determine the potential accuracy of such a single-gene array and the extent to which multiple gene regions would be required in order to avoid cross-hybridization of closely related taxa, as well as false positives. Inevitably, as the number of taxa incorporated in an array increases, the scope of these problems becomes increasingly complex. In addition to the refinement of array design for whole communities, more powerful analytic approaches are needed that maximize the information content from multiple DNA probes in order to identify closely related species (Engelmann et al. 2009). In future applications, specifically designed arrays could be used as tools for developing characteristic DNA signatures (Cannon et al. 2006), similar to the way in which community typing is currently used for microbial communities through RFLP techniques.

next-generation sequencing strategies

The approaches that we have discussed thus far require that substantial DNA sequence information be linked to the particular taxa that would be sampled in a survey. This requirement is true for both PCR-based and hybridization platforms, as well as for post-data collection processing to generate measures of biodiversity. An alternative strategy is to apply next-generation DNA sequencing to whole community DNA extractions, or to sequence amplified DNA from the products of universal primers applied to whole community DNA. Prokaryotic metagenomic projects based on next-generation DNA sequencing generally take this approach (Angly et al. 2006; Dinsdale et al. 2008). The two major advantages to a next-generation sequencing approach are that it is possible to generate large amounts of DNA sequence information from environmental samples, and that little upfront development is required. For example, a single run of a 454-Roche Inc. machine generates in excess of 500 million bases of sequence. Other platforms (Illumina Inc. and ABI's SOLiD) generate substantially more bases in total, but shorter individual sequences. Applying a tagging strategy to uniquely identify samples would allow the combination of multiple whole community assays in a single run, thereby reducing the per-sample cost (Meyer et al. 2007, 2008; Parameswaran et al. 2007).
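The tagging strategy can be illustrated with a minimal demultiplexing sketch. It assumes that every read begins with a fixed-length sample tag and that tags are matched exactly; the tag sequences, sample names, and the function `demultiplex` are hypothetical, and the cited protocols additionally deal with sequencing errors within tags, which this sketch does not.

```python
from collections import defaultdict

def demultiplex(reads, tag_to_sample):
    """Assign each read to a sample by its leading tag and strip the tag."""
    tag_len = len(next(iter(tag_to_sample)))  # assumes all tags share one length
    by_sample = defaultdict(list)
    for read in reads:
        sample = tag_to_sample.get(read[:tag_len], "unassigned")
        # Keep unassigned reads whole so that they can be inspected later.
        insert = read[tag_len:] if sample != "unassigned" else read
        by_sample[sample].append(insert)
    return dict(by_sample)

# Hypothetical tags and reads, invented for illustration only.
tags = {"ACGT": "stream_A", "TGCA": "stream_B"}
reads = ["ACGTGGGCCC", "TGCAAATTT", "ACGTTTTAAA", "NNNNGGGG"]
pooled = demultiplex(reads, tags)
print(pooled)
```

Because the cost of a run is fixed, dividing it among the tagged samples recovered in this way is what lowers the per-sample cost.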

Since the use of next-generation sequencing for whole community diversity assessment in eukaryotic biotas is still in its earliest stages, there are relevant issues to be addressed. One important factor will be to establish the lower detection limit with regard to the numerical abundance of individuals in rare taxa (i.e., the relative contribution of DNA to a sample). There are also pitfalls in the commonly used strategy of PCR amplification and in the sequencing of target genes. The lack of truly universal PCR primers allows for potential taxon-specific amplification bias due to primer binding inefficiencies. It has also been suggested that PCR amplification can introduce sequence variants that are the result of errors in the PCR and sequencing process and that are not reflective of true variation in the original sample. For example, as many as 16% of the sequences in a benthic sample have been reported to be chimeric (Porazinska et al. 2009). These errors, which could inflate the estimates of sequence diversity in a sample and make detection of unknown taxa challenging, can in large part be overcome with a comprehensive reference database. A remaining challenge is to develop approaches that extend next-generation sequencing beyond presence-absence determination to a quantitative assessment of taxonomic diversity. Along these lines, controlled and replicated experimental tests using nematode communities with differing abundance among taxa have been promising. The high level of repeatability in sequence coverage among replicates suggests that a quantitative assessment may eventually be achievable (Porazinska et al. 2010); however, achieving this goal in a benthic invertebrate community will certainly be a complicated endeavor, given the dramatic size differences among the most commonly observed species (see below).

A bioinformatics approach can be used on these data to bin or cluster the sequences into unique variants at particular genes, for instance, using the partial COI locus as in standard barcoding. This approach, in which the variation in the sequences informs the assessment of taxonomic diversity, has been referred to as reverse taxonomy (Markmann and Tautz 2005). In principle, biodiversity metrics could be based completely upon the clustering of taxonomically anonymous DNA sequences, with sequence divergence criteria used to assign groups of similar sequences to molecular operational taxonomic units (MOTUs) (Floyd et al. 2002; Blaxter et al. 2005). However, caution should be used in inferring the validity of taxonomic assessments based on DNA markers without first making a significant effort to establish the relationship between the markers and taxonomy. Phenetic clustering of DNA sequences into MOTUs ignores the detailed taxonomic frameworks available for many taxa that incorporate an evolutionary phylogenetic perspective that cannot be readily matched through a single-gene molecular genetic data set. A more powerful approach would be to compare DNA data from next-generation sequencing to a well-vetted reference database that relates sequence variants to formally described taxa. Here again, we emphasize that it is necessary for a comprehensive DNA sequence library to be a prominent component of a mature, DNA-based biodiversity assessment tool. Linking DNA sequence data to established taxonomic classifications remains a significant but essential challenge in maximizing the utility of DNA-based assessment strategies. The current lack of a fully linked DNA sequence database and taxonomy highlights the value of applying next-generation sequencing approaches for the development of a comprehensive characterization of the genetic diversity found in target biotas. Aquatic invertebrate biodiversity assessment efforts can provide the essential data with which to synergistically focus taxonomists in their efforts to resolve areas of ambiguity.
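The assignment of anonymous sequences to MOTUs by a divergence criterion can be sketched with a simple greedy clustering. The sketch rests on several assumptions that are ours rather than those of the cited studies: sequences are already aligned and of equal length, divergence is measured as the uncorrected proportion of differing sites, each sequence joins the first cluster whose founding sequence lies within the threshold, and the threshold value and toy sequences are arbitrary.

```python
def p_distance(a: str, b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def cluster_motus(seqs, threshold=0.03):
    """Greedy clustering: the first member of each cluster is its representative."""
    clusters = []
    for seq in seqs:
        for cluster in clusters:
            if p_distance(seq, cluster[0]) <= threshold:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])  # no cluster was close enough, so start a new MOTU
    return clusters

# Hypothetical aligned sequences, invented for illustration only.
s1 = "ACGTACGTACGTACGTACGT"
s2 = "ACGTACGTACGTACGTACGA"  # 1 of 20 sites differs from s1
s3 = "TTTTACGTACGTACGTTTTT"  # 6 of 20 sites differ from s1
motus = cluster_motus([s1, s2, s3], threshold=0.10)
print(len(motus), "MOTUs")
```

The number of MOTUs recovered depends directly on the threshold chosen, which is one reason that clusters of this kind are best interpreted against a vetted reference database.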

Challenges

The practical realization of a rapid DNA-based method for assessing the biodiversity of freshwater and other ecosystems will require a focused and coordinated research effort if it is to overcome a number of technical and conceptual challenges (Figure 2). To realize the potential of this methodology we must address three primary issues:

1. The development and management of the primary data on which DNA-based surveys will depend, including establishing the relationship between a well-resolved taxonomic framework and diagnostic DNA signatures

2. The development and validation of the technical methods that can most efficiently, accurately, and cost-effectively produce the DNA sequences used to identify taxa

3. The development of the bioinformatics necessary to store and efficiently translate DNA data into usable information, and provide public access to these data.

data needs

The success of DNA-based surveys will ultimately depend on the development of a database that relates DNA sequence information to an accepted and usable taxonomy, even if that taxonomy is, in the short term, partially based on MOTUs. This work has already been started via the Consortium for the Barcode of Life (CBOL). CBOL focuses on producing barcodes for species within different groups of taxa. The proposed work would build on the CBOL model but would focus on establishing sequences for multiple taxonomic groups within a single type of ecosystem. Given the increased ease of collecting genetic data, taxonomically informative markers (e.g., COI gene) will inevitably be supplemented by a set of genes that have characteristic transcriptional patterns and that are bioindicators of environmental quality, similar to the functional approach of microbial community assays, thus allowing for a single assay or set of assays that will characterize both biodiversity and organismal function in a particular environmental context.

The efficacy of a DNA-based biodiversity assay depends on the identification of unique sequence variants for all of the target taxa; therefore, the development of a database of unique DNA sequence variants suitable for taxonomic identification is the primary challenge faced in advancing this research agenda. The scope of the task is defined by the invertebrate diversity in North American freshwaters and the requirements for the accurate biodiversity assessment of this fauna. The development of an assessment tool applicable to bodies of freshwater across North America could potentially require a database including all of the ~15,000 freshwater invertebrate species (Thorp and Covich 2001), but, in reality, only a fraction of this fauna needs to be characterized in order to construct an effective DNA-based assessment tool. The benthic invertebrates that are typically considered in current assessment programs and that are representative of the major focal taxonomic groups (e.g., arthropods, annelids, and mollusks) comprise 2000–3000 species. Given current next-generation sequencing approaches, generating the genetic data necessary to characterize these taxa and subsequently developing a DNA-based assay tool that would be applicable in the majority of North American aquatic ecosystems is a readily achievable goal. To gauge the scale of this effort, it is important to realize that population-level genetic variation must be considered. All target taxa must be characterized by multiple DNA sequences covering the range of genetic diversity found in the natural populations that will be biosurveyed. Significant inroads towards a genetic characterization of this freshwater fauna have already been made. For example, the Barcode of Life Project (http://www.barcodinglife.org/) has started to develop this type of genetic database for species-level identifications, and the barcoding approach is being applied with great success to a growing range of taxa, including benthic invertebrates (Ratnasingham et al. 2007).

Figure 2. Coordinated Research to Achieve Effective Genetic-Based Bioassessment
Schematic overview of the components required for a coordinated research program to develop a genetic biodiversity assessment tool for North American benthic invertebrates. The critical elements are shown in boxes. Oversight and coordination among government and academic research labs is at the top, along with substantial funding from U.S. agencies whose mission includes direct involvement in the development and utilization of genetic tools. Three interdependent areas of research priority include: (1) sampling in applicable freshwater ecosystems and the generation of genetic DNA signature data, (2) development of protocols for DNA extraction and genetic assay platforms, and (3) bioinformatics infrastructure to process and archive data.

Two important issues to address include: (1) how to prioritize the sequencing among various groups of freshwater taxa, and (2) what sequence information should be catalogued. Although, ultimately, sequence information on all freshwater species should be collected, the species-level taxonomy of some groups is much better understood than that of others. For example, most species of Ephemeroptera, Plecoptera, Trichoptera, and Coleoptera from North American freshwater ecosystems have been described (Balian et al. 2008), whereas dipteran species, especially those in hyperdiverse groups such as the Chironomidae (Cranston 1995), are poorly known. This gap in our knowledge base also extends to the association of life stages; for instance, within the Tipuloidea (Diptera), fewer than 4% of the 15,000 species described have immature life stages associated with the adult life stage. We believe it would be most prudent to establish sequences for species in the better known groups first, while still working to develop strategies to better characterize the taxonomic diversity in the less well-known groups. To date, a ~650 bp sequence of the mitochondrial COI gene has been viewed as a standard DNA barcode. However, a single gene region may not be sufficient for the identification of all taxa in a community sample. For example, in an assessment of nematode diversity using next-generation sequencing of a pool of known taxonomic composition, Porazinska et al. (2009) found that analysis based on a single gene sequence (either the small or the large subunit rRNA sequence) underestimated the number of species. Using both sequences, the detection ability was increased from ~90% to 95%. Unambiguous identifications based on sequence data may well require information from more than one DNA region to completely resolve the diversity in a community sample.

integrating DNA information and taxonomy

The relationship between diagnostic DNA markers and systematics has been more than a little contentious over the past few years. This conflict is illustrated by a number of issues with barcoding. As with any biodiversity assessment approach, barcoding has acknowledged limitations. For example, the general utility of a barcoding strategy as a characterization of species-level biodiversity has been questioned because, in some groups, the level of nucleotide polymorphism within species is comparable to the level of divergence among species (Meier et al. 2006; Skevington et al. 2007). Moreover, current DNA barcoding methods do not distinguish between the true mitochondrial target markers and nuclear mitochondrial pseudogenes (numts), which are portions of the mitochondrial genome that have been incorporated into the nuclear genome over the evolutionary history of a group. The number of species can be overestimated by 100% or more when numts are coamplified with the target barcode region (Song et al. 2008). The distribution of numts throughout arthropod and other invertebrate taxa has not yet been explored, although they appear to be widespread in grasshoppers and crayfish, and are likely present in other aquatic insects such as Plecoptera and Ephemeroptera as well (M. Whiting, unpublished data).

The relative level of within- and among-species variation has a potentially large impact on the link between DNA variation and taxonomic classification. Although the utility of barcoding across taxonomic groups is an ongoing empirical issue, the important outcome of these efforts is the rapid expansion of a DNA database of characteristic sequences associated with specific taxonomic groups, including knowledge of intra/interspecific variation among these groups. For many of the candidate genetic platforms for benthic invertebrate biodiversity assessment, the existence of these data is absolutely critical.
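The comparison of within- and among-species variation can be made concrete with a small calculation. The sketch below is a hypothetical illustration rather than a published method: it assumes aligned sequences of equal length with species labels already attached, uses the uncorrected proportion of differing sites as the distance, and reports the largest within-species distance alongside the smallest among-species distance. The function names and toy records are invented.

```python
from itertools import combinations

def p_distance(a: str, b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def within_and_among(records):
    """Return (largest within-species distance, smallest among-species distance)."""
    within, among = [], []
    for (sp1, seq1), (sp2, seq2) in combinations(records, 2):
        target = within if sp1 == sp2 else among
        target.append(p_distance(seq1, seq2))
    return max(within, default=0.0), min(among, default=float("inf"))

# Hypothetical labeled sequences, invented for illustration only.
records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "TCGAACGTGG"),
]
max_within, min_among = within_and_among(records)
print(max_within, min_among)
```

When the largest within-species distance approaches or exceeds the smallest among-species distance, a single divergence threshold cannot separate the species, which is the situation that motivates the concerns raised about barcoding in some groups.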

The relationship between DNA signatures and taxonomic description, as well as the dependence of one upon the other, is not entirely clear. There is an ongoing and extensive dialogue in this area (Cognato and Caesar 2006); some advocates question the utility of DNA signatures without linkage to a formal taxonomic assessment (e.g., Will and Rubinoff 2004), while others point out the lack of established taxonomy for many groups of organisms and question the utility of groupings based solely on DNA sequences (e.g., Blaxter 2004). Advances in this area will require both an extensive research commitment to working through rigorous alpha-level taxonomy for benthic invertebrates and an exploration of bioinformatic approaches to link DNA data to this taxonomic framework (Bertolazzi et al. 2009).

We take a pragmatic view of this issue, acknowledging that formal taxonomy plays a critical role in biodiversity assessment, but also realizing that major advances can be quickly gained in the absence of a well-resolved taxonomy linked to DNA sequence variants. In a real sense, these two endeavors are complementary, with DNA data informing both gaps and problematic areas in taxonomy, and should be applied in a synergistic and iterative fashion (Carew et al. 2005; Caesar et al. 2006). In practice, benthic invertebrate diversity estimates generally rely on the classification of juvenile individuals or of partial/damaged specimens in a sample that are notoriously difficult to assign to species-level taxonomy due to a lack of informative characters. To further complicate the issue, the taxonomy in many of these groups is largely based on adult morphology, often with no established link to juvenile forms. Here, diagnostic DNA markers serve the dual purpose of providing unambiguous identification of juvenile forms and providing the data necessary to establish the link between immature and adult forms that is critical for the development of a completely resolved taxonomic framework.

The integration of DNA information and taxonomy will require close collaboration among taxonomists, molecular biologists, bioinformatics specialists, freshwater ecologists, and resource managers/agencies. The primary focus of this work should be to establish a library of DNA signatures for recognized species. Also, a properly curated collection of voucher specimens that can be cross-referenced with the DNA signatures should be linked with this library, and revised according to changes in taxonomy. This should be viewed as a benefit to taxonomists, as the process of producing and analyzing sequences will provide them with data that can aid in the discovery of previously undescribed species and patterns of phylogeographic diversity, as well as in the linking of juvenile and adult forms.

Producing these sequence data requires access to physical specimens. Previously collected and archived material might provide a ready source for some species, assuming that there is intact DNA available for sequencing, but material for many other species will need to be collected, identified by experts, processed for sequence data, curated, and archived. In either case, the compilation and organization of material will require a nontrivial expenditure of time and funds, and must be viewed as a critical research activity. We suspect that material on the ~15,000 species of North American freshwater invertebrates could be collected within 5 years, given a successfully coordinated effort.

The first short-term goal is the large-scale sequencing of invertebrates from a geographically distributed set of samples. A parallel effort can be made by associating DNA signatures with described taxa, but this association is a longer-term objective, as it will be both time-consuming and labor-intensive, and will require the direct involvement of expert taxonomists and, importantly for many taxa, the collection of adult life stages. During this effort, genetic data and taxonomy are synergistic. Novel genetic signatures will focus taxonomists on cryptic variation within species, and the linkages between described adults and larval forms will verify the identity of the DNA signatures. The process of collecting genetic data can begin immediately, and, given the requirement of 2000–3000 target species, can be completed within a reasonable time frame. The major constraint will be the collection of appropriate representative samples; making linkages with taxonomy will be an ongoing effort with a duration directly related to the intensity of that effort. Concurrently, an intense sequencing effort of invertebrate samples from a geographically limited set of stream samples would allow protocol development, provide a complete set of specimens, albeit for a geographically limited area, and provide proof of concept for the use of genetic tools in freshwater assessment.

technical issues for DNA-based assessment methods

The second major initiative is the modification and testing of sample collection protocols, as well as array and next-generation sequencing approaches, to develop a working, cost-effective biodiversity assay tool. This phase can begin once the emerging genetic database contains a sufficient number of DNA sequences that represent target taxa. Because the appropriate technology is currently available and only requires modification to make it specific to freshwater bioassessment, this phase of development can rapidly follow the development of the genetic database. A final task will be to validate the efficacy of a genetic assay through a series of quality-control tests and pilot projects that directly compare its results with those of assessments based on standard morphological assay techniques.

Although the productive technical approaches to developing a DNA-based tool seem clear, significant challenges remain. For example, extracting high-quality DNA from an individual organism is routine, but efficiently doing the same on hundreds or even thousands of individuals in a single bulk sample may not be as straightforward. Advances in this area are a research priority, as the increase in processing time dictated by DNA extraction from individual samples could well offset the advantages of increased taxonomic resolution gained by a DNA-based approach. Methods for bulk DNA extraction need to be refined and rigorously tested, and these methods may vary depending on their particular application (e.g., microarray or next-generation sequencing) (Creer et al. 2010). Moving to a high-throughput bulk sample may also limit our ability to move beyond categorization of the presence/absence of taxa to a more quantitative assessment that includes relative abundance. In principle, it is possible to use arrays to quantify the abundance of DNA fragments; this is the underlying assumption of gene expression arrays that quantify the relative abundance of transcripts (DeSantis et al. 2005). However, estimating the relative abundance of DNA fragments in a pool of DNA extracted from a bulk sample of benthic invertebrates that vary by orders of magnitude in body size will likely be quite complicated. Some form of normalization, either in the DNA extraction or data processing phases, will be required.
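One way to picture normalization in the data processing phase is sketched below. It is only an illustration under strong assumptions that do not come from the text: that the measured signal for a taxon scales linearly with the DNA it contributed, and that a per-individual DNA yield can be estimated for each taxon (for example from body mass). The function name, taxon labels, and all numbers are hypothetical.

```python
def relative_abundance(signal, dna_per_individual):
    """Convert per-taxon DNA signal into estimated proportions of individuals.

    signal: taxon -> measured DNA signal (array intensity or read count).
    dna_per_individual: taxon -> assumed DNA yield of one individual,
        in the same arbitrary units (a hypothetical calibration table).
    """
    corrected = {}
    for taxon, value in signal.items():
        yield_per_animal = dna_per_individual.get(taxon)
        if yield_per_animal is None or yield_per_animal <= 0:
            raise ValueError(f"no usable DNA-yield estimate for {taxon}")
        # Dividing by the per-individual yield turns "amount of DNA"
        # into an estimate of "number of individuals".
        corrected[taxon] = value / yield_per_animal
    total = sum(corrected.values())
    if total == 0:
        return {taxon: 0.0 for taxon in corrected}
    return {taxon: value / total for taxon, value in corrected.items()}


# Made-up example: one large-bodied taxon yields 100 times more DNA
# per individual than one small-bodied taxon.
signal = {"small_mayfly": 10.0, "large_stonefly": 100.0}
dna_per_individual = {"small_mayfly": 1.0, "large_stonefly": 100.0}
estimate = relative_abundance(signal, dna_per_individual)
```

In this made-up case the uncorrected signal is dominated by the large-bodied taxon (100 of 110 units), whereas the corrected estimate attributes 10 of every 11 individuals to the small-bodied taxon. Whether a linear correction of this kind is adequate in practice is exactly the sort of question the text says needs testing.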

Few studies have involved large quantities of extraction material and large numbers of samples. In a DNA-array study with large sample size, Robideau et al. (2008) examined 2000 fruit samples for evidence of fungal pathogens. They showed that as the sample size increased, the number of false negatives in the molecular assay increased up to 20% as compared with direct detection via plating, because calyx colonists grow readily on plates, but have low biomasses in fruit samples. The results of this study suggest that detection is a probability issue dependent on inclusion of DNA from low-abundance colonists in the PCR template. A similar issue would likely affect detection of small-bodied and/or rare invertebrates in benthic samples, and PCR instruments that work with increasingly smaller volumes to increase speed and reduce cost will compound this problem. The potential pitfalls of using a high-throughput DNA-based strategy for whole community diversity studies can be evaluated and resolved through controlled and replicated experimental studies. These studies should focus on issues of DNA isolation, detection limits, and error rates in a controlled laboratory setting. Finally, the efficacy of a particular platform will need validation in a field context by comparison to a detailed morphological assessment of species diversity.
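The description of detection as a probability issue can be made concrete with a simple sampling model. The sketch below is an assumption of ours, not a model from Robideau et al. (2008): it treats each template molecule drawn into a reaction as an independent draw, so a taxon that makes up a fraction f of the template is included at least once among n molecules with probability 1 - (1 - f)^n. The function name and the example numbers are hypothetical.

```python
def detection_probability(fraction, n_molecules):
    """Chance that at least one template molecule of a taxon enters a reaction.

    fraction: share of all template molecules that belong to the taxon.
    n_molecules: number of template molecules drawn into the reaction.
    Assumes independent draws, which ignores extraction and
    amplification biases.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if n_molecules < 0:
        raise ValueError("n_molecules must be non-negative")
    return 1.0 - (1.0 - fraction) ** n_molecules


# Hypothetical rare taxon contributing 0.1% of template molecules,
# evaluated for progressively smaller reactions.
rare_fraction = 0.001
by_template_size = {
    n: detection_probability(rare_fraction, n) for n in (10000, 1000, 100)
}
```

Under these assumptions the chance of including the rare taxon falls steadily as fewer template molecules are drawn, which is the sense in which smaller reaction volumes compound the problem. Controlled experiments of the kind proposed above would be needed to learn how far real assays depart from this idealized model.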

An important point to consider is the distinct possibility that the “best” genetic assay platform may change rapidly in the next 5–10 years. Currently, microarray and next-generation sequencing approaches are the most promising avenues to pursue. However, given the rapid pace of technology in this area, which is driven largely by the desire to characterize genomic level genetic variation in individuals and populations of humans, the most accurate and cost-effective assay platform will almost certainly change repeatedly in the near future. This realization reinforces the critical and central importance of a comprehensive DNA database with complementary vouchered collections that link genetic variation to taxonomy in benthic invertebrates. Given the cost and effort required to compile the necessary biological collections, the serious consideration of long-term maintenance of archived community DNA and RNA is warranted. As the technology becomes increasingly more cost-effective and comprehensive, it may be possible to mine these collections for additional taxonomic and functional diversity. Importantly, archived collections will form the basis for critical examinations of the effect of climate change on aquatic biotas over the next decades. This database and specimen collection are the lynchpins for assay tools that can be constructed now using current technology, and for those that will be developed with new technologies in the near future.

334 Volume 85 THE QUARTERLY REVIEW OF BIOLOGY

managing and interpreting sequence data: the bioinformatics challenge

For a DNA-based assay to be useful in an ecological context, we must be able to quickly translate the DNA data produced from assay instruments into a form used by ecologists (i.e., lists of species names or their codes) and into a file format that can be easily used by existing ecological software. The technology for handling such large arrays of data is generally available, but work will be needed to develop the most efficient ways of parsing the sequence information in order to unambiguously discriminate between taxonomic units and to then output that information in a format usable by ecologists. If they are to be truly useful, these sequence-taxa databases must be designed so that they will be able to communicate directly with existing taxa-ecology databases that house information regarding the ecological requirements and distributional records of species.
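The translation step described above, from sequences to a taxon list in a file format that ecological software can read, might look roughly like the following. This is a toy sketch under assumptions that are ours alone: the reference library is a small in-memory dictionary, all sequences are pre-aligned and of equal length, and assignment uses simple percent identity with an arbitrary threshold. Real sequence-taxa databases and matching methods are far more involved, and every name and sequence here is hypothetical.

```python
import csv
import io
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def assign_taxon(query, reference, min_identity=0.97):
    """Return the best-matching reference taxon, or 'unassigned'."""
    best_name, best_score = None, 0.0
    for name, seq in reference.items():
        score = identity(query, seq)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_identity else "unassigned"


def taxa_table(queries, reference, min_identity=0.97):
    """Summarize assignments as CSV text: one row per taxon with a count."""
    counts = Counter(assign_taxon(q, reference, min_identity) for q in queries)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["taxon", "n_sequences"])
    for name in sorted(counts):
        writer.writerow([name, counts[name]])
    return buffer.getvalue()


# Toy data: two made-up reference barcodes and four made-up query sequences.
reference = {"Taxon_A": "ACGTACGTAC", "Taxon_B": "TTGTACCTAA"}
queries = ["ACGTACGTAC", "ACGTACGTAC", "TTGTACCTAA", "GGGGGGGGGG"]
table = taxa_table(queries, reference, min_identity=0.9)
```

Comma-separated text is used here only because most ecological software can import it. Sequences that fall below the threshold are reported as "unassigned" rather than forced into the nearest taxon, which keeps ambiguous matches visible to the analyst.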

In summary, the establishment of a rapid genetic biodiversity assay will require a series of coordinated data collection efforts, as well as assay platform development. The primary requirement is a large data set of unique DNA signatures. These signatures can be coupled to described invertebrate taxa, and will form the basis for design of genetic assessment tools. A coordinated effort between taxonomists, molecular biologists, bioinformatics specialists, freshwater ecologists, and resource managers, focused by support from the appropriate state and federal agencies, should be able to effectively produce a viable toolset for DNA-based assessment of freshwater systems within the next 5–10 years.

funding and research coordination

In our view, a priority for realizing the potential of DNA-based surveys for freshwater bioassessments is the establishment of a coordinated program of research support among those federal institutions that have some interest in either the development or application of biodiversity surveys. In the USA, the National Science Foundation (NSF) holds primary responsibility for funding the development of new science, while programs within the Environmental Protection Agency (EPA), the U.S. Geological Survey (USGS), the U.S. Department of Agriculture (USDA), and the National Institute of Environmental Health Sciences (NIEHS) may be interested in supporting the application of this science to environmental and human health issues. Currently, no coordinated effort exists among these agencies to promote the development or refinement of the science supporting DNA-based surveys. Two critical issues need to be addressed. First, the appropriate agencies need to identify taxonomic and DNA-based bioassessment work as a high-priority research need when developing funding budgets. Second, these agencies must take primary responsibility for coordinating the multiple avenues of research that need to be pursued in parallel. This coordination could be achieved by tasking an appropriate federal research lab with this responsibility, or by funding a consortium of universities to oversee these efforts.

September 2010 335DIAGNOSTIC DNA MARKERS AND BIOASSESSMENT


acknowledgments

The ideas and synthesis in this article were fostered by a workshop organized by MEP and CPH at Utah State University. The workshop was supported by funds from the USU Center for Integrated BioSystems, the USU Ecology Center, and the USU Office of Research. Although this work was reviewed by EPA and approved for publication, it may not necessarily reflect official Agency policy.

REFERENCES

Angly F. E., Felts B., Breitbart M., Salamon P., Edwards R. A., Carlson C., Chan A. M., Haynes M., Kelley S., Liu H., Mahaffy J. M., Mueller J. E., Nulton J., Olson R., Parsons R., Rayhawk S., Suttle C. A., Rohwer F. 2006. The marine viromes of four oceanic regions. PLoS Biology 4(11):e368.

Armstrong K. F., Ball S. L. 2005. DNA barcodes for biosecurity: invasive species identification. Philosophical Transactions of the Royal Society, Series B: Biological Sciences 360(1462):1813–1823.

Arscott D. B., Jackson J. K., Kratzer E. B. 2006. Role of rarity and taxonomic resolution in a regional and spatial analysis of stream macroinvertebrates. Journal of the North American Benthological Society 25(4):977–997.

[ASIWPCA] Association of State and Interstate Water Pollution Control Administrators. 2002. Water Quality Monitoring Programs Survey Report: Status and Future of State Ambient Water Quality Monitoring Programs. Washington (DC): ASIWPCA.

Balian E. V., Levesque C., Segers H., Martens K., editors. 2008. Freshwater animal diversity assessment. Hydrobiologia 595:1–637.

Ball S. L., Hebert P. D. N., Burian S. K., Webb J. M. 2005. Biological identifications of mayflies (Ephemeroptera) using DNA barcodes. Journal of the North American Benthological Society 24:508–524.

Barrett R. D. H., Hebert P. D. N. 2005. Identifying spiders through DNA barcodes. Canadian Journal of Zoology 83(3):481–491.

Bertolazzi P., Felici G., Weitschek E. 2009. Learning to classify species with barcodes. BMC Bioinformatics 10(supplement 14):S7.

Blaxter M. 2004. The promise of a DNA taxonomy. Philosophical Transactions of the Royal Society, Series B: Biological Sciences 359:669–679.

Blaxter M., Mann J., Chapman T., Thomas F., Whitton C., Floyd R., Abebe E. 2005. Defining operational taxonomic units using DNA barcode data. Philosophical Transactions of the Royal Society, Series B: Biological Sciences 360(1462):1935–1943.

Brodie E. L., DeSantis T. Z., Moberg Parker J. P., Zubieta I. X., Piceno Y. M., Andersen G. L. 2007. Urban aerosols harbor diverse and dynamic bacterial populations. Proceedings of the National Academy of Sciences USA 104(1):299–304.

Caesar R. M., Sörensson M., Cognato A. I. 2006. Integrating DNA data and traditional taxonomy to streamline biodiversity assessment: an example from edaphic beetles in the Klamath ecoregion, California, USA. Diversity and Distributions 12(5):483–489.

Call D. R., Borucki M. K., Loge F. J. 2003. Detection of bacterial pathogens in environmental samples using DNA microarrays. Journal of Microbiological Methods 53(2):235–243.

Cannon C. H., Kua C. S., Lobenhofer E. K., Hurban P. 2006. Capturing genomic signatures of DNA sequence variation using a standard anonymous microarray platform. Nucleic Acids Research 34(18):e121.

Cao Y., Hawkins C. P. 2005. Simulating biological impairment to evaluate the accuracy of ecological indicators. Journal of Applied Ecology 42:954–965.

Cao Y., Hawkins C. P., Larsen D. P., Van Sickle J. 2007. Effects of sample standardization on mean species detectabilities and estimates of relative differences in species richness among assemblages. American Naturalist 170(3):381–395.

Cao Y., Larsen D. P., Hughes R. M., Angermeier P. L., Patton T. M. 2002a. Sampling effort affects multivariate comparisons of stream assemblages. Journal of the North American Benthological Society 21(4):701–714.

Cao Y., Williams D. D., Larsen D. P. 2002b. Comparison of ecological communities: the problem of sample representativeness. Ecological Monographs 72:41–56.

Carew M. E., Pettigrove V., Cox R. L., Hoffmann A. A. 2007a. DNA identification of urban Tanytarsini chironomids (Diptera: Chironomidae). Journal of the North American Benthological Society 26(4):587–600.

Carew M. E., Pettigrove V., Cox R. L., Hoffmann A. A. 2007b. The response of Chironomidae to sediment pollution and other environmental characteristics in urban wetlands. Freshwater Biology 52(12):2444–2462.

Carew M. E., Pettigrove V., Hoffmann A. A. 2003. Identifying chironomids (Diptera: Chironomidae) for biological monitoring with PCR-RFLP. Bulletin of Entomological Research 93:483–490.

Carew M. E., Pettigrove V., Hoffmann A. A. 2005. The utility of DNA markers in classical taxonomy: using Cytochrome Oxidase I markers to differentiate Australian Cladopelma (Diptera: Chironomidae) midges. Annals of the Entomological Society of America 98(4):587–594.


Carter J. L., Resh V. H. 2001. After site selection and before data analysis: sampling, sorting, and laboratory procedures used in stream benthic macroinvertebrate monitoring programs by USA state agencies. Journal of the North American Benthological Society 20(4):658–682.

Castiglioni B., Rizzi E., Frosini A., Mugnai M. A., Ventura S., Sivonen K., Rajaniemi P., Rantala A., Wilmotte A., Boutte C., Consolandi C., Borboni R., Mezzelani A., Busti E., Rossi Bernardi L., Battaglia C., De Bellis G. 2002. Application of an universal DNA microarray to cyanobacterial diversity assessment. Minerva Biotech 14(3–4):253–257.

Chen W., Seifert K. A., Levesque C. A. 2009. A high density COX1 barcode oligonucleotide array for identification and detection of species of Penicillium subgenus Penicillium. Molecular Ecology Resources 9(supplement 1):114–129.

Clarke R. T., Lorenz A., Sandin L., Schmidt-Kloiber A., Strackbein J., Kneebone N. T., Haase P. 2006. Effects of sampling and sub-sampling variation using the STAR-AQEM sampling protocol on the precision of macroinvertebrate metrics. Hydrobiologia 566:441–459.

Cognato A. I., Caesar R. M. 2006. Will DNA Barcoding advance efforts to conserve biodiversity more efficiently than traditional taxonomic efforts. Frontiers in Ecology and the Environment 4(5):268–270.

Comai L., Young K., Till B. J., Reynolds S. H., Greene E. A., Codomo C. A., Enns L. C., Johnson J. E., Burtner C., Oden A. R., Henikoff S. 2004. Efficient discovery of DNA polymorphisms in natural populations by Ecotilling. Plant Journal 37(5):778–786.

Cranston P. S. 1995. Introduction. Pages 1–7 in The Chironomidae: The Biology and Ecology of Non-Biting Midges, edited by P. D. Armitage et al. London (UK): Chapman and Hall.

Creer S., Fonseca V. G., Porazinska D. L., Giblin-Davis R. M., Sung W., Power D. M., Packer M., Carvalho G. R., Blaxter M. L., Lambshead P. J. D., Thomas W. K. 2010. Ultrasequencing of the meiofaunal biosphere: practice, pitfalls and promises. Molecular Ecology 19(s1):4–20.

DeSantis T. Z., Stone C. E., Murray S. R., Moberg J. P., Andersen G. L. 2005. Rapid quantification and taxonomic classification of environmental DNA from prokaryotic and eukaryotic origins using a microarray. FEMS Microbiology Letters 245(2):271–278.

Diaz M. R., Boekhout T., Theelen B., Bovers M., Cabanes F. J., Fell J. W. 2006. Microcoding and flow cytometry as a high-throughput fungal identification system for Malassezia species. Journal of Medical Microbiology 55:1197–1209.

Dinsdale E. A., Edwards R. A., Hall D., Angly F., Breitbart M., Brulc J. M., Furlan M., Desnues C., Haynes M., Li L., McDaniel L., Moran M. A., Nelson K. E., Nilsson C., Olson R., Paul J., Brito B. R., Ruan Y., Swan B. K., Stevens R., Valentine D. L., Thurber R. V., Wegley L., White B. A., Rohwer F. 2008. Functional metagenomic profiling of nine biomes. Nature 452:629–632.

Engelmann J. C., Rahmann S., Wolf M., Schultz J., Fritzilas E., Kneitz S., Dandekar T., Muller T. 2009. Modelling cross-hybridization on phylogenetic DNA microarrays increases the detection power of closely related species. Molecular Ecology Resources 9(1):83–93.

Epler J. H. 2001. Identification Manual for the Larval Chironomidae (Diptera) of North and South Carolina: A Guide to the Taxonomy of the Midges of the Southeastern United States, including Florida. Special Publication SJ2001-SP13. Raleigh (NC) and Palatka (FL): North Carolina Department of Environment and Natural Resources, and St. Johns River Water Management District.

Fessehaie A., De Boer S. H., Levesque C. A. 2003. An oligonucleotide array for the identification and differentiation of bacteria pathogenic on potato. Phytopathology 93:262–269.

Floyd R., Abebe E., Papert A., Blaxter M. 2002. Molecular barcodes for soil nematode identification. Molecular Ecology 11:839–850.

Gentry T. J., Wickham G. S., Schadt C. W., He Z., Zhou J. 2006. Microarray applications in microbial ecology research. Microbial Ecology 52(2):159–175.

Gilchrist E. J., Haughn G. W., Ying C. C., Otto S. P., Zhuang J., Cheung D., Hamberger B., Aboutorabi F., Kalynyak T., Johnson L., Bohlmann J., Ellis B. E., Douglas C. J., Cronk Q. C. B. 2006. Use of ecotilling as an efficient SNP discovery tool to survey genetic variation in wild populations of Populus trichocarpa. Molecular Ecology 15(5):1367–1378.

Gracey A. Y., Cossins A. R. 2003. Application of microarray technology in environmental and comparative physiology. Annual Review of Physiology 65:231–259.

Gresham D., Ruderfer D. M., Pratt S. C., Schacherer J., Dunham M. J., Botstein D., Kruglyak L. 2006. Genome-wide detection of polymorphism at nucleotide resolution with a single DNA microarray. Science 311(5769):1932–1936.

Guerold F. 2000. Influence of taxonomic determination level on several community indices. Water Research 34:487–492.

Hajibabaei M., Singer G. A. C., Clare E. L., Hebert P. D. N. 2007. Design and applicability of DNA arrays and DNA barcodes in biodiversity monitoring. BMC Biology 5:24.

Harvey J. B. J., Hoy M. S., Rodriguez R. J. 2009. Molecular detection of native and invasive marine invertebrate larvae present in ballast and open water environmental samples collected in Puget Sound. Journal of Experimental Marine Biology and Ecology 369:93–99.

Hawkins C. P. 2006. Quantifying biological integrity by taxonomic completeness: its utility in regional and global assessments. Ecological Applications 16(4):1277–1294.

Hawkins C. P., Norris R. H., Hogue J. N., Feminella J. W. 2000. Development and evaluation of predictive models for measuring biological integrity in streams. Ecological Applications 10:1456–1477.

Hawkins C. P., Olson J. R., Hill R. A. 2010. The reference condition: predicting benchmarks for ecological and water-quality assessments. Journal of the North American Benthological Society 29(1):312–343.

Hebert P. D. N., Penton E. H., Burns J. M., Janzen D. H., Hallwachs W. 2004. Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proceedings of the National Academy of Sciences USA 101(41):14812–14817.

He Z., Gentry T. J., Schadt C. W., Wu L., Liebich J., Chong S. C., Huang Z., Wu W., Gu B., Jardine P., Criddle C., Zhou J. 2007. GeoChip: a comprehensive microarray for investigating biogeochemical, ecological, and environmental processes. ISME Journal 1:67–77.

Hughes R. M., Larsen D. P., Omernik J. M. 1986. Regional reference sites: a method for assessing stream potentials. Environmental Management 10:629–635.

Jones F. C. 2008. Taxonomic sufficiency: the influence of taxonomic resolution on freshwater bioassessments using benthic macroinvertebrates. Environmental Reviews 16:45–69.

King R. S., Richardson C. J. 2002. Evaluating subsampling approaches and macroinvertebrate taxonomic resolution for wetland bioassessment. Journal of the North American Benthological Society 21(1):150–171.

Lamche G., Fukuda Y. 2008. Comparison of Genus and Family Level AUSRIVAS Models for the Darwin-Daly Region in Relation to Land Use. Report 01/20008D. Northern Territory (Australia): Aquatic Health Unit, Department of Natural Resources, Environment and the Arts, Northern Territory Government.

Lenat D. R., Resh V. H. 2001. Taxonomy and stream ecology—the benefits of genus and species level identifications. Journal of the North American Benthological Society 20(2):287–298.

Levesque C. A., Harlton C. E., de Cock A. W. A. M. 1998. Identification of some oomycetes by reverse dot blot hybridization. Phytopathology 88(3):213–222.

Lorenz A., Kirchner L., Hering D. 2004. ‘Electronic subsampling’ of macrobenthic samples: how many individuals are needed for a valid assessment result? Hydrobiologia 516(1):299–312.

Loy A., Schultz C., Lucker S., Schöpfer-Wendels A., Stoecker K., Baranyi C., Lehner A., Wagner M. 2005. 16S rRNA gene-based oligonucleotide microarray for environmental monitoring of the betaproteobacterial Order “Rhodocyclales.” Applied and Environmental Microbiology 71(3):1373–1386.

Lozupone C. A., Knight R. 2007. Global patterns in bacterial diversity. Proceedings of the National Academy of Sciences USA 104(27):11436–11440.

Margulies M., Egholm M., Altman W. E., Attiya S., Bader J. S., Bemben L. A., Berka J. et al. 2005. Genome sequencing in open microfabricated high-density picoliter reactors. Nature 437(7057):376–380.

Markmann M., Tautz D. 2005. Reverse taxonomy: an approach towards determining the diversity of meiobenthic organisms based on ribosomal RNA signature sequences. Philosophical Transactions of the Royal Society, Series B: Biological Sciences 360(1462):1917–1924.

Meier R., Shiyang K., Vaidya G., Ng P. K. L. 2006. DNA barcoding and taxonomy in Diptera: a tale of high intraspecific variability and low identification success. Systematic Biology 55(5):715–728.

Meyer M., Briggs A. W., Maricic T., Höber B., Höffner B., Krause J., Weihmann A., Pääbo S., Hofreiter M. 2008. From micrograms to picograms: quantitative PCR reduces the material demands of high-throughput sequencing. Nucleic Acids Research 36:e5.

Meyer M., Stenzel U., Myles S., Prufer K., Hofreiter M. 2007. Targeted high-throughput sequencing of tagged nucleic acid samples. Nucleic Acids Research 35:e96.

Neigel J., Domingo A., Stake J. 2007. DNA barcoding as a tool for reef conservation. Coral Reefs 26:487–499.

Nichols S. J., Robinson W. A., Norris R. H. 2006. Sample variability influences on the precision of predictive bioassessment. Hydrobiologia 572(1):215–233.

Ostermiller J. D., Hawkins C. P. 2004. Effects of sampling error on bioassessments of stream ecosystems: application to RIVPACS-type models. Journal of the North American Benthological Society 23:363–382.

Parameswaran P., Jalili R., Tao L., Shokralla S., Gharizadeh B., Ronaghi M., Fire A. Z. 2007. A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing. Nucleic Acids Research 35(19):e130.

Pfenninger M., Nowak C., Kley C., Steinke D., Streit B. 2007. Utility of DNA taxonomy and barcoding for the inference of larval community structure in morphologically cryptic Chironomus (Diptera) species. Molecular Ecology 16:1957–1968.

Pfunder M., Holzgang O., Frey J. E. 2004. Development of a microarray-based diagnostics of voles and shrews for use in biodiversity monitoring studies, and evaluation of mitochondrial cytochrome oxidase I vs. cytochrome b as genetic markers. Molecular Ecology 13(5):1277–1286.

Pond G. J., Passmore M. E., Borsuk F. A., Reynolds L., Rose C. J. 2008. Downstream effects of mountaintop coal mining: comparing biological conditions using family- and genus-level macroinvertebrate bioassessment tools. Journal of the North American Benthological Society 27(3):717–737.

Porazinska D. L., Giblin-Davis R. M., Faller L., Farmerie W., Kanzaki N., Morris K., Powers T. O., Tucker A. E., Sung W., Thomas W. K. 2009. Evaluating high-throughput sequencing as a method for metagenomic analysis of nematode diversity. Molecular Ecology Resources 9(6):1439–1450.

Porazinska D. L., Sung W., Giblin-Davis R. M., Thomas W. K. 2010. Evaluating high-throughput sequencing as a method for metagenomic analysis of nematode diversity. Molecular Ecology Resources 9(6):1439–1450.

Ratnasingham S., Hebert P. D. N. 2007. BOLD: the barcode of life data system: barcoding. Molecular Ecology Notes 7:355–364.

Reynoldson T. B., Wright J. F. 2000. The reference condition: problems and solutions. Pages 293–303 in Assessing the Biological Quality of Fresh Waters: RIVPACS and Other Techniques, edited by J. F. Wright et al. Ambleside (UK): Freshwater Biological Association, The Ferry House.

Robideau G. P., Caruso F. L., Oudemans P. V., McManus P. S., Renaud M. A., Auclair M. E., Bilodeau G. J., Yee D., Desaulniers N. L., DeVerna J. W., Levesque C. A. 2008. Detection of cranberry fruit rot fungi using DNA array hybridization. Canadian Journal of Plant Pathology-Revue Canadienne de Phytopathologie 30:226–240.

Rosenberg D. M., Resh V. H. 1993. Freshwater Biomonitoring and Benthic Macroinvertebrates. New York: Chapman & Hall.

Schmidt-Kloiber A., Nijboer R. C. 2004. The effect of taxonomic resolution on the assessment of ecological water quality classes. Hydrobiologia 516(1):269–283.

Sharley D. J., Pettigrove V., Parsons Y. M. 2004. Molecular identification of Chironomus spp. (Diptera) for biomonitoring of aquatic ecosystems. Australian Journal of Entomology 43:359–365.

Seifert K. A., Levesque C. A. 2004. Phylogeny and molecular diagnostics of mycotoxigenic fungi. European Journal of Plant Pathology 110:449–471.

Sinclair C. S., Gresens S. E. 2008. Discrimination of Cricotopus species (Diptera: Chironomidae) by DNA barcoding. Bulletin of Entomological Research 1:1–9.

Skevington J. H., Kehlmaier C., Stahls G. 2007. DNA barcoding: mixed results for big-headed flies (Diptera: Pipunculidae). Zootaxa 1423:1–26.

Song H., Buhay J. E., Whiting M. F., Crandall K. A. 2008. Many species in one: DNA barcoding overestimates the number of species when nuclear mitochondrial pseudogenes are coamplified. Proceedings of the National Academy of Sciences USA 105(36):13486–13491.

Stoddard J. L., Larsen D. P., Hawkins C. P., Johnson R. K., Norris R. H. 2006. Setting expectations for the ecological condition of streams: the concept of reference condition. Ecological Applications 16(4):1267–1276.

Stoughton R. B. 2005. Applications of DNA microarrays in biology. Annual Review of Biochemistry 74:53–82.

Stribling J. B., Pavlik K. L., Holdsworth S. M., Leppo E. W. 2008. Data quality, performance, and uncertainty in taxonomic identification for biological assessments. Journal of the North American Benthological Society 27(4):906–919.

Tambong J. T., de Cock A. W. A. M., Tinker N. A., Levesque C. A. 2006. Oligonucleotide array for identification and detection of Pythium species. Applied and Environmental Microbiology 72(4):2691–2706.

Teletchea F., Bernillon J., Duffraisse M., Laudet V., Hanni C. 2008. Molecular identification of vertebrate species by oligonucleotide microarray in food and forensic samples. Journal of Applied Ecology 45:967–975.

Thomas M. A., Klaper R. 2004. Genomics for the ecological toolbox. Trends in Ecology and Evolution 19(8):439–445.

Thorp J. H., Covich A. P., editors. 2001. Ecology and Classification of North American Freshwater Invertebrates. Second Edition. San Diego (CA): Academic Press.

[USEPA] United States Environmental Protection Agency. 2002. Summary of Biological Assessment Programs and Biocriteria Development for States, Tribes, Territories, and Interstate Commissions: Streams and Wadeable Rivers. EPA-822-R-02-048. Washington (DC): U.S. Environmental Protection Agency, Office of Environmental Information and Office of Water.

[USEPA] United States Environmental Protection Agency. 2006. Wadeable Streams Assessment: A Collaborative Survey of the Nation’s Streams. EPA-841-B-06-002. Washington (DC): U.S. Environmental Protection Agency, Office of Research and Development and Office of Water.

Wagner M., Smidt H., Loy A., Zhou J. Z. 2007. Unravelling microbial communities with DNA-microarrays: challenges and future directions. Microbial Ecology 53(3):498–506.

Will K. W., Rubinoff D. 2004. Myth of the molecule: DNA barcodes for species cannot replace morphology for identification and classification. Cladistics 20(1):47–55.

Zahariev M., Dahl V., Chen W., Levesque C. A. 2009. Efficient algorithms for the discovery of DNA oligonucleotide barcodes from sequence databases. Molecular Ecology Notes 9:58–64.

Zhou J., Kang S., Schadt C. W., Garten C. T., Jr. 2008. Spatial scaling of functional gene diversity across various microbial taxa. Proceedings of the National Academy of Sciences USA 105(22):7768–7773.

Associate Editor: Kent E. Holsinger
