MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus...

49
MURDOCH RESEARCH REPOSITORY This is the author’s final version of the work, as accepted for publication following peer review but without the publisher’s layout or pagination. The definitive version is available at http://dx.doi.org/10.1016/j.ijpara.2011.11.010 Nicol, P., Gill, Reetinder, Fosu-Nyarko, J. and Jones, M.G.K. (2012) De novo analysis and functional classification of the transcriptome of the root lesion nematode, Pratylenchus thornei, after 454 GS FLX sequencing. International Journal for Parasitology, 42 (3). pp. 225-237. http://researchrepository.murdoch.edu.au/7701/ Copyright: © 2012 Australian Society for Parasitology Inc. It is posted here for your personal use. No further distribution is permitted.

Transcript of MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus...

Page 1: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

MURDOCH RESEARCH REPOSITORY

This is the author’s final version of the work, as accepted for publication following peer review but without the publisher’s layout or pagination.

The definitive version is available at http://dx.doi.org/10.1016/j.ijpara.2011.11.010

Nicol, P., Gill, Reetinder, Fosu-Nyarko, J. and Jones, M.G.K. (2012) De novo analysis and functional classification of the

transcriptome of the root lesion nematode, Pratylenchus thornei, after 454 GS FLX sequencing. International Journal for

Parasitology, 42 (3). pp. 225-237.

http://researchrepository.murdoch.edu.au/7701/

Copyright: © 2012 Australian Society for Parasitology Inc.

It is posted here for your personal use. No further distribution is permitted.

Page 2: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Accepted Manuscript

de novo analysis and functional classification of the transcriptome of the root

lesion nematode, Pratylenchusthornei, after 454 GS FLX sequencing

Paul Nicol, Reetinder Gill, John Fosu-Nyarko, Michael G.K. Jones

PII: S0020-7519(12)00005-7

DOI: 10.1016/j.ijpara.2011.11.010

Reference: PARA 3353

To appear in: International Journal for Parasitology

Received Date: 12 September 2011

Revised Date: 18 November 2011

Accepted Date: 21 November 2011

Please cite this article as: Nicol, P., Gill, R., Fosu-Nyarko, J., Jones, M.G.K., de novo analysis and functional

classification of the transcriptome of the root lesion nematode, Pratylenchusthornei, after 454 GS FLX sequencing,

International Journal for Parasitology (2012), doi: 10.1016/j.ijpara.2011.11.010

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers

we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and

review of the resulting proof before it is published in its final form. Please note that during the production process

errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Page 3: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

1

de novo analysis and functional classification of the transcriptome of the root lesion nematode, 1

Pratylenchus thornei, after 454 GS FLX sequencing 2

Paul Nicol1, Reetinder Gill1, John Fosu-Nyarko1, Michael G. K. Jones1,* 3

4

Plant Biotechnology Research Group, WA State Agricultural Biotechnology Centre, School of 5

Biological Sciences and Biotechnology, Murdoch University, Perth, WA 6150, Australia 6

7

*Corresponding author. 8

Tel.: +61-8-9360-2424; fax: +61-8-9360-2502 9

10

E-mail address: [email protected] 11

12

1 All authors contributed equally to the work. 13

14

Note: Nucleotide sequence data reported in this paper are available in the GenBank™ database 15

under the accession numbers JO845319-JO845338 and JL859810-JL866456. 16

17

18

Note: Supplementary data associated with this article. 19

20

Page 4: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

2

ABSTRACT 21

The migratory endoparasitic root lesion nematode Pratylenchus thornei is a major pest of the 22

cereals wheat and barley. In what we believe to be the first global transcriptome analysis for P. 23

thornei, using Roche GS FLX sequencing, 787,275 reads were assembled into 34,312 contigs using 24

two assembly programs, to yield 6,989 contigs common to both. These contigs were annotated, 25

resulting in functional assignments for 3,048. Specific transcripts studied in more detail included 26

carbohydrate active enzymes potentially involved in cell wall degradation, neuropeptides, putative 27

plant nematode parasitism genes, and transcripts that could be secreted by the nematode. 28

Transcripts for cell wall degrading enzymes were similar to bacterial genes, suggesting that they 29

were acquired by horizontal gene transfer. Contigs matching 14 parasitism genes found in 30

sedentary endoparasitic nematodes were identified. These genes are thought to function in 31

suppression of host defenses and in feeding site development, but their function in P. thornei may 32

differ. Comparison of the common contigs from P. thornei with other nematodes showed that 33

2,039 were common to sequences of the Heteroderidae, 1,947 to the Meloidogynidae, 1,218 to 34

Radopholus similis, 1,209 matched expressed sequence tags (ESTs) of Pratylenchus penetrans and 35

Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs 36

common to Caenarhabditis elegans, with 15.9 % being common to all three groups. Twelve 37

percent of contigs with matches to the Heteroderidae and the Meloidogynidae had no homology 38

to any C. elegans protein. Fifty-seven percent of the contigs did not match known sequences and 39

some could be unique to P. thornei. These data provide substantial new information on the 40

transcriptome of P. thornei, those genes common to migratory and sedentary endoparasitic 41

nematodes, and provide additional understanding of genes required for different forms of 42

Page 5: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

3

parasitism. The data can also be used to identify potential genes to study host interactions and for 43

crop protection. 44

45

Keywords: Root lesion nematode, Pratylenchus thornei, 454 GS FLX sequencing, Transcriptome, 46

Annotation, Functional classification, Parasitism genes 47

48

Page 6: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

4

1. Introduction 49

Plant parasitic nematodes cause annual losses in crop production estimated at US $125 50

billion (Chitwood, 2003). Root lesion nematodes (RLNs), genus Pratylenchus, are economically one 51

of the most important groups of plant parasites and are widely distributed with approximately 68 52

nominated species. From their wide host range and migratory nature, it has been suggested that 53

Pratylenchus spp. may be less specialised plant parasites, possibly representing an evolutionary 54

intermediate between free-living forms and the highly specialised sedentary endoparasites such as 55

the root-knot and cyst nematodes (Mitreva et al., 2004). Nevertheless RLNs can cause significant 56

crop losses of 7 to 15%, although often unrecognised (Vanstone et al., 2007). For example, field 57

trials in Western Australia have identified seven spp., including Pratylenchus thornei, that cause 58

yield losses of 5 to 10% in wheat and 5 to 20% in barley, principally from infestation of P. thornei 59

and Pratylenchus neglectus (Vanstone et al., 2007). 60

RLNs are classified as migratory intercellular root endoparasites, that is, they live mainly in 61

plant roots but are able to move and feed from different host plant cells. They are polyphagous 62

and can infect a wide range of host plants including most of the important crop species. Their life 63

cycle takes 3-8 weeks depending on species and conditions (Castillo and Vovlas, 2007). After 64

embryogenesis and development of first stage juvenile (J1), the nematode moults to the second 65

stage juvenile (J2) which then hatches from the egg. Except for the egg and J1 stages, all juvenile 66

and adult stages of RLNs are vermiform and motile, and can infect host plants. The J2 nematode 67

grows and moults three more times to become a mature male or female. Adults and juveniles can 68

leave and enter roots; adult females lay eggs singly or in small groups inside roots. Males are 69

common in some species and rare in others, although reproduction is usually by parthenogenesis. 70

Longer term survival under adverse conditions can occur at the egg stage or through suspended 71

Page 7: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

5

metabolic activity. Infection causes root growth reduction, accompanied with the formation of 72

lesions, necrotic areas, browning and cell death, and root-rotting from secondary attack by fungi 73

or bacteria. When a RLN feeds, it both secretes materials from the single dorsal and two sub-74

ventral pharyngeal gland cells and ingests cell cytoplasm using its hollow mouth stylet, damaging 75

host cells in the process. The secretions from the pharyngeal glands that are injected into cells via 76

the mouth stylet while feeding and migrating probably contain a mixture of enzymes, including cell 77

wall-degrading enzymes. The feeding behaviour involves phases of cell probing, stylet 78

penetration, salivation and ingestion. Relative to their economic importance very little research 79

has been undertaken to study the molecular basis of RLN parasitism. 80

To date, molecular characterisation of Pratylenchus genes has been limited (e.g. 81

transcriptome analysis of Pratylenchus coffeae, Haegman et al., 2011), partly because their 82

migratory behaviour makes them more difficult to study than sedentary endoparasites. Genome 83

sequencing and annotation of the free-living nematodes; Caenorhabditis elegans (Consortium, 84

1998), Caenorhabditis briggsae (Stein et al., 2003) and Pristionchus pacificus (Dieterich et al., 85

2008), and the parasitic root-knot nematodes; Meloidogyne hapla (Opperman et al., 2008) and 86

Meloidogyne incognita (Abad et al., 2008) provide reference genomes for comparison with those 87

of RLNs. ‘Next generation’ sequencing technologies have revolutionised transcriptomics where 88

high throughput expression data for transcriptomes can be generated at a single-base resolution 89

(Morozova et al., 2009). The ‘next generation’ sequencing technology offered by Roche 454 GS-90

FLX was used to analyse the transcriptome of a mixed stage population of P. thornei. In this report 91

we present a functional classification of 6,989 contigs based on assignments to publicly available 92

datasets. 93

Page 8: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

6

2. Material and methods 94

2.1. Source materials and RNA extraction 95

A population of P. thornei, derived from a single female isolated from an infected wheat 96

plant in Western Australia, was maintained by serial subculture on aseptic carrot discs in sterile 97

containers under sterile conditions and kept in the dark at 25⁰C (Jones et al., 2009). To harvest the 98

nematodes, the carrot discs were cut into 5 mm thick pieces and extracted using a mist apparatus 99

to collect active nematodes. Nematodes freshly extracted from carrot pieces were used for RNA 100

extraction. Eggs, juveniles stages (J2-J4) and adults were mixed together at the ratio of 1:2:3. 101

Total RNA was extracted from these nematodes using Trizol (Invitrogen Life Technologies, 102

Carlsbad, USA) and then cleaned using an RNAeasy Mini kit column (Qiagen Inc, California, USA), 103

following the manufacturer’s instructions. The extracted RNA was assessed for quality and 104

quantified using an Agilent 2100 Bioanalyzer (Agilent Technologies, Mississauga, Canada), giving 105

an RNA integration score (RIN) of 7.0. A cDNA library was prepared from 2,592 ng of total RNA 106

using the Ovation RNA-Seq system (NuGEN Technologies, Inc., CA, USA), which uses both Oligo-dT 107

and random primers, and checked for removal of rRNA using an Agilent 2100 Bioanalyzer. 108

Sequencing was carried out using a Roche 454 GS FLX DNA platform at the Institute of 109

Immunology and Infectious Diseases, Murdoch University, Perth, Australia. 110

111

2.2. Sequence data analysis and assembly 112

One full picotitre plate was used to generate 454 GS FLX datasets, with results from each 113

half; Lane 1 (L1) and Lane 2 (L2) were analysed separately as shown in Fig. 1. Following removal of 114

the sequencing linkers, the reads (from L1 and L2) were assembled separately with CAP3 (Huang 115

Page 9: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

7

et al., 1999) and MIRA (Chevreux et al., 2004) assembly programs to generate contigs for each of 116

the two data sets. Two assembly programs were used in order to minimise assembly errors 117

(Kumar and Blaxter, 2010). In the next step, CAP3 was used to reassemble the four contig sets 118

into one. Finally, only those contigs common to each of the initial four assemblies were selected 119

for further analysis. The selected contigs were at least 90% similar over a minimum alignment 120

length of 40 bases. Although this assembly procedure reduced the total number of contigs, these 121

were expected to be of high quality and to better represent the original data. After assembly and 122

before annotating the transcripts, CD HIT-EST (Li and Godzik, 2006) was used to determine 123

whether there was any redundancy in the final data set. These were submitted to the National 124

Center for Biotechnology Information (NCBI) with the following accession numbers JO845319-125

JO845338 and JL859810-JL866456. 126

127

2.3. Transcript annotation 128

Protein identity, protein domains and Gene Ontology (GO) terms were assigned to 129

assembled contigs homologous to previously identified genes/proteins using the BLAST+ suite of 130

programs and Interproscan (Ashburner et al., 2000; Camacho et al., 2009; Hunter et al., 2009). 131

The BLAST programs were run in threaded mode to enable many alignments to be done in a short 132

time. BLAST searches were carried out consistently at a BLAST expected value (E) of E=1/6,989 133

(1.4e-04) unless otherwise stated. BLASTX was used to annotate the contigs with data available in 134

the nucleotide and amino acid datasets (non-redundant (nr) and nucleotide (nt)) curated by the 135

NCBI. Datasets containing 12,346,081 nt and 14,185,994 amino acid sequences were searched for 136

homology with the P. thornei contigs. Additional BLAST analyses were carried out using other 137

databases including the Whole Genome Shotgun , High Throughput Genomic Sequences , Genome 138

Page 10: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

8

Survey Sequence and Expressed Sequence Tags (ESTs) of mouse, human and others, all from NCBI. 139

The biopython (Cock et al., 2009) libraries were used for parsing the XML output file from the 140

BLAST analysis. The results of annotation were stored in a SQLite database that contained 141

annotation and alignment information. The database allowed efficient mining of data through the 142

python SQLite wrapper. SQL queries were performed against the database to extract items of 143

interest. 144

145

2.4. Functional classification of transcripts 146

Pratylenchus thornei contigs were assigned to metabolic pathways using the Kyoto 147

Encyclopedia of Genes and Genomes (KEGG2, Kanehisa et al., 2010) database 148

(http://www.genome.jp/KEGG/) as an alternative approach to annotate transcripts and to 149

determine their biochemical functions. KEGG enzyme commission (EC) and orthology (KO) 150

pathways were used as the basis of this assignment. BLAST queries of the contigs against KEGG 151

databases (nucleotide and amino acid) were carried out to obtain information on biochemical 152

pathways. The results were parsed to obtain KEGG identifiers, EC numbers and KO numbers which 153

were then assigned to particular KEGG metabolic pathways. Python was used to extract KEGG 154

identifiers in every path for comparison with the annotated contigs. InterPro Scan was also used 155

to match P. thornei contigs to characterised protein domains in the InterPro database. Mapping of 156

InterPro domains allowed placement of contigs into the GO hierarchy. The functional 157

classification focused on three main aspects: genes potentially involved in plant parasitism 158

including cell wall-degrading enzymes, neuropeptides and candidate sequences for nematode 159

control via RNA interference (RNAi). Carbohydrate active enzymes (CAZymes) encoded by P. 160

thornei were identified using the Carbohydrate Active Enzymes (CAZy) database 161

Page 11: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

9

(http://www.cazy.org/). The contigs were also compared with all ESTs downloaded from the NCBI 162

database using local TBLASTX. In addition, BLASTX queries against the C. elegans Wormbase 163

datasets (http://www.wormbase.org, release WS215, 2011) were carried out to further annotate 164

the transcriptome and to assign RNAi phenotypes to the contigs (Stein et al., 2001). The presence 165

of signal peptides in any of the contigs was predicted using SignalP 4.0 (Petersen et al., 2011). The 166

hidden Markov model was used to predict transmembrane helices (TMH) in the contigs 167

(Sonnhammer et al., 1998). Because the project focuses on the parasitic behaviour of P. thornei, 168

in all analyses involving published parasitism genes, CAZymes and prediction of new parasitism 169

genes using signal peptides and TMH, the whole reassembled contig set of 34,312 was used. 170

171

2.5. Analysis of unannotated transcripts 172

Transcripts with no known protein homologues in any database were further analysed to 173

assign possible functions to them. For those contigs, a database (http://sirna.sbc.su.se/) search 174

for contigs matching small interfering RNA’s (siRNAs) was performed using a local BLASTN from 175

NCBI (Chalk et al., 2005). The BLAST word size parameter was set to 15 and only those matches 176

with a percentage identity greater than 80% were retained. The presence of micro RNAs (miRNAs) 177

was also determined by searching a miRNA database (http://www.mirbase.org/), with BLASTN 178

using the accurate E-value condition (Griffiths et al., 2006). A further search was conducted to 179

match unannotated contigs to non-coding RNAs (ncRNAs) using the FANTOM 3 database 180

(http://research.imb.uq.edu.au/rnadb). This database contains ncRNAs previously identified in 181

functional annotation of mouse. BLASTN with word size set to 15 and score 50 was used to 182

determine matches. The number of significant matches for contigs in both the annotated and 183

unannotated data sets was determined. In addition, the number and distribution of open reading 184

Page 12: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

10

frames (ORFs) of both the annotated and unannotated contigs were compared to determine 185

whether the latter set of contigs were protein coding sequences. 186

3. Results 187

3.1. 454 GS FLX sequencing and assembly 188

Total RNA extracted from mixed stages of P. thornei was converted to cDNA, rRNA 189

removed and samples sequenced on two full picotitre plates using the Roche 454 GS FLX platform. 190

After filtering the two data sets (L1 and L2), 390,417 reads of N50 of 268 bases were obtained in 191

L1 and 396,858 reads of N50 of 264 bases in L2 (Table 1). The reads were divided into groups 192

based on their length (read length groups). A distribution of the read length (intervals of 20 bases) 193

was bimodal with peaks at 70 and 230 (Supplementary Fig. S1). The first of these may be due to 194

the presence of characteristically short transcripts. The reduction in transcripts of less than 60 195

bases resulted from filtering out short reads. Both data sets have a similar distribution, with L2 196

showing slightly shorter lengths. Fig. 2 shows a parallel base distribution for both datasets L1 and 197

L2, expressed as the distribution of read lengths as a percentage of the total number of bases 198

sequenced. Transcripts of length close to the maximum 280 bases constituted the bulk of the 199

sequenced bases. Also shown is the reduction in numbers of transcripts in the range from 350 to 200

570 bases. 201

A comparison of each of the four initial assemblies was done to assess the quality of the 202

assemblies. The two datasets L1 and L2 were each assembled with CAP3 and MIRA assemblers 203

separately, resulting in 32,897 and 43,569 contigs, respectively. Fig. 3 shows the number and 204

distribution of contigs of a given length for each of the initial assemblies and results of the 205

reassembled data using CAP3. The distribution of the contigs in the size ranges for all of the 206

Page 13: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

11

assemblies are similar, indicating no loss of quality in the process. Assemblies for L1 and L2 show 207

the most similarity, whereas those from different assemblers show the least. The MIRA 208

assemblies for reads L1 and L2 differed in N50 by only four bases, similarly the N50 for reads L1 209

and L2 differed by only six for the CAP3 assemblies. The largest difference in N50 of 22 bases 210

occurred between the MIRA L1 and CAP3 L2 assemblies. A final assembly of these four sets was 211

carried out using CAP3, resulting in a reassembled dataset of 34,312 contigs with a N50 of 707 212

bases. This represents an increase of approximately 200 bases from each of the sub-assemblies, 213

as shown in Table 1. The final step in the assembly process involved selecting those contigs from 214

the combined assembly that had components in each of the four initial assemblies. This 215

requirement was satisfied by a contig in the final assembly if one or more contigs from each of the 216

four sub-assemblies was used in assembling that contig. Subsequent filtering substantially 217

reduced the number of contigs in the final set to 6,989 (Table 1). 218

Approximately 50% of the contigs in the individual assemblies were in the size range of 219

100-500 bases, but in the combined assembly approximately 50% of the contigs were in size range 220

of 500-1,000 bases. In addition, approximately 75% of the contigs were at least 500 bases long in 221

the reassembled and common contig sets compared with 50% for the initial assemblies 222

(Supplementary Table S1). After assembly, CDHIT-EST was used to determine whether or not 223

there was redundancy in the final data set before annotation. There was no significant reduction 224

in the number of transcripts with more than 99% of the contigs being single copies. Also, the 225

reassembled and common contig sets contained contigs substantially longer than those found in 226

the individual assemblies (Fig. 3). This indicates that the final reassembly of the contigs using 227

CAP3 removed redundancy in the data, thereby removing the need to cluster the contigs. 228

229

Page 14: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

12

3.2. Functional classification based on GO assignments 230

To categorise transcripts by putative function, the GO classification scheme was used. The 231

results of this analysis are presented in Fig. 4, which shows the percentages of transcripts 232

classified in terms of their biological, molecular and cellular functions. The y-axis indicates the 233

percentage of contigs assigned a particular GO term in the x-axis- with 1,368 contigs assigned GO 234

terms for multiple molecular functions, 831 mapping to multiple biological processes and 670 235

contigs to cellular components. A complete list of mapping is available in Supplementary Tables 236

S2-S4. Metabolism mappings (under Biological Process, Supplementary Table S2) were the most 237

highly represented GO category with 76% of the contigs encoding metabolism genes. The next 238

most highly represented category (55%) was binding/carrier proteins (in Molecular Function, 239

Supplementary Table S3), with 184 contigs matching ATP binding and over 100 contigs matching to 240

DNA and ion binding (including calcium, iron, metal and magnesium ion binding) genes. 241

Pratylenchus thornei ‘extracellular’ mappings (23 contigs) (under Cellular Component, 242

Supplementary Table S4) were mainly associated with transport of serum albumin which is 243

primarily a carrier protein for steroids, fatty acids and thyroid hormones, hormone activity related 244

to insulin/relaxin and chitin binding (Fig. 4). 245

Because nematodes are sterol-auxotrophs, they rely on dietary sources of sterol for growth 246

and development (Svaboda and Chitwood, 1992). Nematodes can modify ingested sterols, for 247

example they may synthesise 4α-methylated sterols and steroid hormones, using the latter to 248

regulate larval development and moulting (Entchev and Kurzchalia, 2005), and to synthesise 249

cholesterol. Four contigs involved in steroid metabolic process were found, for example oxysterol 250

binding proteins, which bind oxygenated derivatives of cholesterol. 251

252

Page 15: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

13

3.3. Enzymes and orthologues involved in metabolic pathways 253

Pratylenchus thornei contigs were assigned to metabolic pathways using the KEGG 254

database as an alternative approach to determine biochemical functions. EC and KO pathways 255

were used as the basis of this assignment. EC numbers were assigned to 250 contigs of which 165 256

mapped to KO (see Supplementary Tables S5 and S6). All of the 11 major metabolic EC pathways 257

were represented in the contigs (Table 2). Pathways well represented included: purine 258

metabolism (12 enzymes represented), citrate cycle (nine), alanine/aspartate/glutamate 259

metabolism (seven), glycolysis/gluconeogenesis (six), butanoate metabolism (six), nitrogen 260

metabolism (six), pyrimidine metabolism (six) and lysine degradation (six). 261

In general, nematodes store fatty acids and use the glyoxylate cycle to generate 262

carbohydrates from beta-oxidation of such fatty acids, making them unique among animals 263

(Barrett and Wright, 1998). The glyoxylate pathway relies on two enzymes, malate synthase and 264

isocitrate lyase, to bypass two decarboxylation steps. Only two contigs of the mixed stage 265

population of P. thornei mapped to two glyoxylate pathway enzymes, malate synthase (EC: 266

2.3.3.9) and aconitate hydratase (EC: 4.2.1.3), which are shared with the citric acid cycle. 267

Isocitrate lyase was not found in this analysis. 268

In relation to KEGG assignments of transcripts to metabolic pathways (Table 2), we noted 269

apparent anomalies such as transcripts assigned to photosynthesis and carbon fixation – on closer 270

examination these transcripts are subunits α and β of an H+ ATPase common to electron transport 271

pathways in photosynthesis and oxidative phosphorylation, and the transcript assigned to carbon 272

fixation is triose phosphate isomerase, an enzyme of glycolysis and gluconeogenesis. Other 273

apparent unexpected results were observed for activities common to a number of pathways 274

including the assignment of a contig to hexose 6-phosphate transferase, which is involved in the 275

Page 16: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

14

pathway for streptomycin biosynthesis, also a key enzyme in glycolysis. Full details of the EC 276

numbers of all the transcripts are provided in the Supplementary Tables S5 and S6. 277

In addition to an overall analysis of transcripts obtained from the P. thornei transcriptome, 278

a more detailed analysis of putative nematode parasitism genes, including secreted peptides and 279

CAZymes, neuropeptides and sequences orthologous to C. elegans genes, analysed for RNAi is 280

presented. 281

282

3.4. Pratylenchus thornei putative parasitism genes 283

Searches were undertaken to identify contigs in the P. thornei transcriptome that may 284

have homologies or functional equivalents to published parasitism genes or gene products of 285

sedentary and migratory endoparasitic plant nematodes. Twelve gene products known to be 286

involved in parasitism of root knot and cyst nematodes were identified at E-values ranging from 287

10E-6 for Mi-msp-1, a venom allergen-like protein to 10E-131 for 14-3-3(b) gene products, both of 288

M. incognita (Table 3). Contigs matching all gene products ranged from one (for a Heterodera 289

schachtii Annexin (Hs4F01), Mi-msp-1 and Heterodera glycines Vap-1, and a Globodera 290

rostochiensis Peroxiredoxin (Gr-TpX) to 31 for transthyretin-like proteins and precursors closely 291

related to H. glycines and Radopholus similis. Other gene products with high amino acid identities 292

to those of M. incognita were calreticulin (Mi-crt-1), cysteine proteases (Mi-cpl) and the 293

Glutathatione-transferase, Mi-gsts-1. Pratylenchus thornei gene products identical to parasitism 294

genes of cyst nematodes included Gp-far-1, a fatty acid retinoid binding protein and a RAN-BP-like 295

protein (Gp-rbp-1), both of Globodera pallida, and two products from G. rostochiensis: a RAN-BP-296

like protein, Gr-A18 and Gluthathione peroxidase. Twenty-nine contigs in the P. thornei 297

Page 17: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

15

transcriptome were identified as transthyretin-like proteins of the migratory plant-parasitic 298

nematode, R. similis, specifically homologous to Rs-ttl 1-4. These proteins are members of an 299

extensive nematode-specific family of proteins highly expressed in the migratory nematode R. 300

similis. Two additional contigs homologous to a transthyretin-like protein precursor were closely 301

related to sequences of H. glycines. These proteins, as well as Mi-msp-1, are known to be 302

predominantly expressed in parasitic stages of the respective nematodes. 303

Chorismate mutase, cm-1, a gene whose product is known to be involved in the parasitism 304

of both root knot and cyst nematodes, was not identified in the P. thornei transcriptome at the 305

accurate E-value. However, five contigs matched cm-1 of G. rostochiensis (Gr-cm-1) and seven 306

matched cm-1 of H. schachtii (Hs-cm-1) with amino acid identities ranging from 31.6–42.3% and 307

28.6–47.6%, respectively. In addition, four contigs were only 47.6% identical to the isoform of Gr-308

cm-1 gene: Gr-cm-1-IRII (GenBank accession no. ABR19890.1), an alternative splicing product. 309

BLASTX searches also resulted in sequences matching cm-1 of Erysipelotrichaceae bacterium 310

(GenBank accession no. ZP06644729) and chorismate binding enzyme of the bacterium 311

Burkholderia mallei with 30% identity (GenBank accession no.YP103883.1). 312

313

3.5. Secreted peptides and CAZymes 314

CAZymes are primarily involved in carbohydrate metabolism and can be grouped into 315

families of structurally-related catalytic and carbohydrate-binding modules (or functional domains) 316

that degrade, modify or create glycosidic bonds. These families are glycoside hydrolases (GHs), 317

glycosyltransferases (GTs), polysaccharide lyases (PLs), carbohydrate esterases (CEs) and 318

carbohydrate binding modules (CBMs) (www.cazy.org). Some CAZymes are known to be secreted 319

by sedentary as well as migratory plant parasitic nematodes and have been predicted to be 320

Page 18: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

16

involved in nematode migration in the root and feeding from host cells (Bakhetia et al., 2007). In 321

all, 160 contigs matched to all five families of CAZymes; 51 contigs to 19 subfamilies of GHs, 46 322

contigs to 24 subfamilies of GTs, one contig to the PL3 subfamily, eight contigs to three CE 323

subfamilies and 32 contigs matching to eight CBM subfamilies (Table 4). These gene products are 324

most homologous to bacterial and fungal homologues with a few similar to those of C. elegans, 325

insects, fishes and protozoa, except for the carbohydrate esterases, where all four are of non-326

pathogenic origins (Table 4). Contigs identified to encode peptides of the following CAZyme 327

subfamilies were closely related to those identified for three plants: CBM43 (Zea mays), GH17 and 328

GT31 (Oryza sativa), and GH36, GH47, GT47, GT48, GT50 and CBM50 (Vitis vinifera) (not included 329

in Table 4). 330

Endoglucanases in plant nematodes mostly belong to the GH5 subfamily which has been 331

well studied in sedentary nematode genera Heterodera, Globodera and Meloidogyne. One contig 332

of the P. thornei transcriptome was identified as a peptide of the GH5 subfamily: homologous to 333

beta 1, 4-endoglucanase of M. hapla. However, absent from the GH family of CAZymes identified 334

in the transcriptome is that of GH45 previously identified in Bursaphelenchus xylophilus. Of 335

interest are three contigs that show homology to xylanase (GH10) of Neocallimastix patriciarum 336

and one to a GH12 endoglucanase closely related to Sinorhizorbium meliloti, all of non-nematode 337

organisms. Previously xylanase and a GH12 endoglucanase were identified in R. similis and 338

Xiphinema index (Jones et al., 2005). Two contigs were identified as homologous to pectate lyases 339

of M. incognita and M. hapla, a group of enzymes also identified in several sedentary 340

phytoparasitic as well as non-pathogenic and fungivorous nematodes (Doyle and Lambert, 2002; 341

Kikuchi et al., 2006; Karim et al., 2009). 342

Page 19: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

17

Further analysis to predict secreted proteins with possible roles in parasitism of P. thornei 343

was conducted using a combination of three approaches. First we looked for those contigs with 344

predicted signal peptides. We then removed contigs with TMH from this set using the hidden 345

Markov model. Lastly, contigs with carbohydrate active domains were retained. The final set 346

could be proteins secreted by parasitic stages of P. thornei, directed to host cells and could have a 347

direct role in the nematode’s feeding. This analysis revealed 1,416 of the contigs had no TMH 348

whereas 970 had signal peptides. Of these 194 had signal peptides but no TMH. There were 15 349

CAZymes with signal peptides but no TMH (Table 4). Seven of these were identical to GH 350

subfamily enzymes including the GH5 protein of M. hapla and Rotylenchulus reniformis, a GH45 351

enzyme identical to Brugia malayi and a GH47 protein of the nematode Schistosoma mansoni. 352

Other contigs of P. thornei that satisfied the above criteria included proteins from four GT 353

subfamilies (GT2, GT4, GT55 and GT66) mostly identical to those of bacteria and fungi, a CE11 354

protein (Ostreococcus tauri), and three with carbohydrate binding modules (CBM14, CBM33 and 355

CBM48). Examples of eight contigs that matched proteins known to be secreted by other 356

organism included: a glycosyl transferase from Caulobcater segnis and a putative trehalose 357

synthase (Cryptococcus neoformans). Also glycogen synthase (Steinernema feltiae), the C. elegans 358

protein, C08H9.6, and two hypothetical proteins from plant sources (rice and grapes) were 359

identified. One contig matched the Muc4B gene in Drosophila melanogaster, found in secretions 360

of the fly, and loosely related to the salivary mucin MG1 gene in Canis familiaris (E-value 2E-03). 361

362

3.6. Neuropeptides 363

Neuropeptides have a diverse role in the function and development of the nervous system 364

(Li et al., 1999). They not only function as neuromodulators, causing long-term changes in the 365

Page 20: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

18

responding cell through G-protein-coupled receptors, but also as primary transmitters in 366

invertebrate nervous systems (Li et al., 1999; Li, 2005). To date 109 neuropeptide genes have 367

been identified in C. elegans by bioinformatic and peptidomic approaches (Li et al., 1999; Nathoo 368

et al., 2001; Pierce et al., 2001; McVeigh et al., 2005). Based on their conserved motifs, these are 369

divided into three classes: the FMRFamide-like peptide (flp) gene family, ins genes that encode 370

insulin-like peptides and peptides derived from neuropeptide-like protein (nlp) genes which have 371

no sequence similarity to the other two classes (Husson et al., 2007). 372

Searches for P. thornei contigs homologous to the wide variety of bioactive neuropeptides 373

of C. elegans revealed four matches to nlp genes (nlp 15, nlp 24, nlp 26 and nlp 37). Three contigs 374

homologous to two ins genes, ins-17 and ins-18 (EMBL gene names Ceinsulin-2 and Ceinsulin-1, 375

respectively) were identified in the P. thornei transcriptome. In C. elegans, these may be involved 376

in reproductive growth (Malone and Thomas, 1994; Li et al., 2003), in determining life span 377

(Kenyon et al., 1993) and to limit body size (McCulloch and Gems, 2003). Flp 17 was the only 378

FMRFamide-like peptide identified in the transcriptome, matching one contig. It is worth 379

mentioning that at E-values greater than 10E-3, 209 contigs matched to 40 of the 47 known nlps of 380

C. elegans, 31 to five flp gene families and one contig also matched to ins-2. 381

382

3.7. The P. thornei transcriptome and other nematode families 383

The annotated P. thornei contigs were compared with ESTs of plant parasitic nematode 384

families: Meloidogynidae (73343 ESTs, NCBI), Heteroderidae (48145 ESTs, NCBI), three groups of 385

Pratylenchidae and to C. elegans proteins (Wormbase release 225 with 25,030 proteins) using 386

BLASTX and tBLASTX (Fig. 5). In all, 1,947 and 2,039 of the contigs, respectively, matched to 387

Page 21: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

19

Meloidogynidae and Heteroderidae families whereas 2,014 of the contigs matched C. elegans 388

proteins. A comparative analysis with available sequences of the Pratylenchidae revealed that 389

18.5% matched to the combined data of Pratylenchus penetrans and Pratylenchus vulnus, and 390

19.7% of sequences of R. similis have significant matches to the P. thornei common contigs (Fig. 5). 391

Of the 2,940 P. thornei common contigs which have significant similarity to 5,504 P. coffeae 392

contigs, only 584 (8.4%) were unannotated. Approximately 18% of the contigs matched 393

sequences of all of the groups compared. Sequences of some contigs were exclusively 394

homologous to some groups: 3.7% to ESTs of Meloidogynidae, 4.6% to ESTs of Heteroderidae and 395

5.4% to C. elegans proteins of which the 20 most highly conserved gene products are provided in 396

Table 5. These include gene products of structural components (actin, paramyosin), enzymes 397

(ubiquitin activating, protein kinase, glutamic acid decarboxylase), proteins with binding 398

properties (DNA binding-topoisomerase, calcium binding), genes of chromosomal loci important 399

for development such as structural maintenance of chromosomes (smc-3) and 400

embryogenesis/moulting (gex-2), and Argonaute-Like genes (alg-1) (Table 5). Twelve percent of 401

the contigs that matched the Meloidogynidae and Heteroderidae families had no match to any C. 402

elegans protein. Approximately 57% percent of the contigs did not match any sequence of the 403

sedentary plant parasites nor the free–living C. elegans. They also did not match any other 404

sequence on gene and protein sequence databases. 405

406

3.8. RNAi phenotype matches to C. elegans genes 407

The majority of C. elegans genes have been surveyed for knock-out phenotypes using RNAi, 408

where matching mRNA is degraded by the introduction of sequence-specific double-stranded RNA 409

(dsRNA) (Fire et al., 1998). A comparative analysis was done with the P. thornei contigs and C. 410

Page 22: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

20

elegans genes to look for orthologues with high identity to those of C. elegans with RNAi 411

phenotypes. The 2,014 P. thornei contigs homologous to C. elegans matched 1,484 genes of which 412

1,276 have documented RNAi phenotypes. These genes are expressed in all stages of the 413

nematode’s development and are involved in cellular and molecular functions that directly affect 414

biological processes such as germ cell development, embryogenesis, larval development, 415

moulting, positive and negative regulation of growth, digestion, reproduction, excretion and 416

defense response. Most of the RNAi phenotypes indicate a knock-out of these genes will markedly 417

affect nematode development, and for many of these genes RNAi phenotypes include arrested 418

development at the embryonic, larval and adult stages. RNAi of approximately 35% of these genes 419

results in some form of lethality at the embryonic, larval or adult stages in C. elegans. 420

421

3.9. Unannotated contigs 422

A significant number of the contigs (approximately 57%) had no homology in the NCBI 423

databases after searches at E-values less than 10E-04 . This may be because these sequences are 424

specific to P. thornei or to Pratylenchus spp. in general. They may also be non-protein-coding 425

sequences, or may have had coding sequences that were too short to be matched by BLASTX. 426

Further studies were performed to analyse these sequences, their relevance or possible functions 427

in the transcriptome. ORFfinder was used to predict the number of non-protein coding sequences 428

in the un-annotated contigs and the distribution compared with that of the annotated dataset, 429

where an ORF was defined as a sequence with both start and stop codons using the standard 430

code. The results indicate 43% had ORFs. A similar percentage (43.2%) of the unannotated 431

contigs exhibited a similar pattern in ORF size distribution to that of annotated contigs, indicating 432

Page 23: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

21

that they could be functional sequences. For both groups, the frequency of the contigs was 433

highest when the longest ORF ranged from 160-180 nt long (Fig. 6). 434

For functional predictions, the non-coding sequences were searched for matches to 435

regulatory RNAs, such as miRNAs and siRNAs, from a bulk of all of the regulatory RNAs (Backofen 436

et al., 2010). Six non-coding sequences were found to be miRNAs whereas 99 matched to 437

previously published siRNAs. A further search in the contigs for sequences matching non-coding 438

functional RNAs of the mouse genome indicates that the ratio of contigs with significant similarity 439

with the ncRNAs for the annotated and unannotated sets is similar; 18.9 and 23.4, respectively. 440

Considering a large percentage of the unannotated contigs with ORFs do not have any association 441

with ncRNAs, they could be genuine P. thornei coding sequences with no homology to previously 442

identified genes. 443

4. Discussion 444

This study describes the first transcriptome of mixed stages of the migratory root lesion 445

nematode, P. thornei, sequenced with Roche 454 GS FLX technology. Following assembly using 446

MIRA and CAP3 assemblers to obtain 34,312 contigs with a N50 of 707 bases, 6,989 contigs 447

common to both assemblers were selected for further study. Using different annotation 448

procedures, a total of 43% of the contigs were annotated. These were matched to known protein 449

products of which there were representative enzymes involved in all major metabolic pathways 450

essential for growth and development. Approximately 57% of the contigs did not match 451

sequences of any of the three groups nor any known organism and could be specific to P. thornei. 452

A comparison (BLASTX and tBLASTX) of the contigs with peptides of the free-living nematode C. 453

elegans and ESTs of Meloidogynidae and Heteroderidae, reveals that 12% were common to all 454

three groups. These may represent genes involved in nematode metabolism, structure and 455

Page 24: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

22

development. Of the 12% of the contigs which matched the plant parasitic groups but were not 456

present in C. elegans, 3.7% and 4.6% of the contigs only matched sequences of Meloidogynidae 457

and Heteroderidae, respectively, with 4.7% common to both. The 12% of contigs common only to 458

plant parasites will include genes specifically involved in plant parasitism. 459

Our analysis revealed that the P. thornei transcriptome encodes messages similar to those 460

known to be involved in parasitism by sedentary endoparasitic nematodes. These include 461

orthologues to 14-3-3 and 14-3-3b proteins, calreticulin (Mi-crt-1), cysteine proteases (Mi-cpl-1), 462

Glutathione-S-transferase (Mi-gsts-1) and the Venom allergen-like protein (Mi-mps-1), all closely 463

relate to M. incognita and Annexin (Hs4F01), Transthyretin-like protein and Ubiquitin extension 464

protein (Hs-ubi-1) of H. glycines and H. schachtii, as well as five gene products homologous to 465

Gobodera spp. The proposed function of parasitism genes in sedentary endoparasitic nematodes 466

includes suppression of host defense e.g. chorismate mutase (Lambert et al., 1999; Bekal et al., 467

2003; Jones et al., 2003; Abad et al., 2008; Lu et al., 2008; Vanholme et al., 2009) and induction 468

and maintenance of host feeding cells e.g. SPRYSEC RANBP-like (Sacco et al., 2009). The presence 469

of genes found both in sedentary endoparasites and migratory endoparasites is consistent with a 470

function of suppression of host defense, but since root lesion nematodes do not induce specific 471

feeding cells, the roles of genes found in the RLNs previously assumed to be involved in feeding 472

cell induction, may indicate a different function in the migratory endoparasites, or that the role 473

proposed for sedentary endoparasites needs to be revised. 474

Thirty-one contigs were orthologous to transthyretin-like proteins and precursors of H. 475

glycines and R. similis. Although no conclusive data supports the involvement of these proteins in 476

parasitism by plant parasitic nematodes, these nematode-specific families of proteins are 477

abundantly expressed both in juveniles and adults, the motile and parasitic stages of R. similis, but 478

Page 25: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

23

not in embryos (Jacob et al., 2007). Coupled with the presence of a putative signal peptide for 479

secretion, their mature proteins could have extracellular functions and may well constitute part of 480

the nematode’s secretome, which includes proteins for modifying plant cell wall polysaccharides 481

to aid migration and feeding. In the recently published transcriptome of P. coffeae (Haegeman et 482

al., 2011), one contig matched chorismate mutase with a ‘bit score of only 54.3’. This gene is 483

present in sedentary endoparasites where it may influence differentiation of root cells by altering 484

the synthesis of chorismate-derived products, resulting in a down-regulation of plant defense 485

against invading nematodes (Lambert et al., 1999; Bekal et al., 2003; Jones et al., 2003; Abad et al., 486

2008; Lu et al., 2008; Vanholme et al., 2009). Secreted chorismate mutases have also been 487

identified in plants, fungi and bacteria where they might serve as general tools for host 488

manipulation (Romero et al., 1995). Recently Djamei et al. (2011) reported that a chorismate 489

mutase, Cmu1, secreted by Ustilago maydis, a biotrophic fungal pathogen, acts as a virulence 490

factor in its interaction with host cells. They showed that during the host-pathogen interaction, 491

Cmu1 is taken up by plant cells, moves between cells and changes their metabolic status through 492

metabolic priming. However, at only 46.7% identity of 12 P. thornei contigs to chorismate mutase 493

of G. rostochiensis (Gr-cm-1) and H. schachtii (Hs-cm-1) identified in this study, it is still not clear 494

whether or how migratory plant parasitic nematodes of the Pratylenchus spp. make use of this 495

gene in their interaction with plant hosts. 496

Feeding habits and life styles of both sedentary endoparasitic and migratory ectoparasitic 497

plant nematodes indicate that secretions from their gland cells include CAZymes, which can act 498

directly on cell wall polysaccharides of host plants (Davis et al., 2004; Vanholme et al., 2004; Baum 499

et al., 2007; Mitchum et al., 2007). CAZymes identified in the P. thornei transcriptome belonged to 500

all five families including contigs matching 18 glycosyl hydrolase subfamilies of which most appear 501

to be of bacterial and fungal origin, supporting the theory that most of these have probably been 502

Page 26: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

24

acquired from bacteria by horizontal gene transfer (Davis et al., 2000; Jasmer et al., 2003; Baldwin 503

et al., 2004; Ledger et al., 2006). Cellulases, in the CAZy database (http://www.cazy.org/), can be 504

found in 13 GH families: GH5, 6, 7, 8, 9, 10, 12, 26, 44, 45, 51, 61 and 74, most of which are 505

typically produced by plant pathogens and plant parasites (Lynd et al., 2002). Of the GH family of 506

enzymes identified in the P. thornei transcriptome, four (GH5, 9, 10 and 12) can be classified as 507

cellulases. These included a GH5 endoglucanase (M. hapla); also a xylanase (GH10) identical to 508

Neocallimastix patriciarum and a GH12 endoglucanase closely related to Sinorhizorbium meliloti, 509

orthologues of which have been shown to be produced by R. similis, M. incognita, P. coffeae and 510

Xiphinema index (Jones et al., 2005; Mitreva-Dautova et al., 2006; Haegeman et al., 2009, 2011). 511

Less surprisingly, the P. thornei glycosyl hydrolases orthologous to C. elegans are not cellulases. 512

They are α-glucocerebrodiase (GH30), β-glucosidase (GH31), α-amylase (GH13) and β-513

hexosiminidase (GH20), which are not known to have cell wall-degrading properties. The only 514

polysaccharide lyase in the transcriptome significantly matched PL3 of M. incognita and 515

Meloidogyne javanica, an enzyme which has been well characterised in cyst and root knot 516

nematodes (Bird and Koltai, 2000; Popeijus, H., 2002. The identification of cell-wall degrading 517

enzymes in Globodera rostochiensis. PhD thesis. Wageningen University and Research Centre, The 518

Netherlands), but was not known to be part of the arsenal of cell wall degrading enzymes 519

employed by nematodes of the family Pratylenchidae. The many other CAZymes identified in the 520

P. thornei transcriptome (Table 4) may have undetermined role(s) in nematode migration and 521

feeding but this remains to be determined. 522

To predict new or putative secreted proteins encoded by P. thornei we identified contigs 523

with signal peptides and carbohydrate enzymatic activity or binding modules but with no 524

transmembrane helices/domain. The most interesting contigs with these characteristics, which 525

had no matches to any C. elegans protein, could be involved in P. thornei-host interaction. These 526

Page 27: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

25

included a beta-1,4-endoglucanase that had significant identity to those of the sedentary 527

endoparasitic M. hapla and the sedentary semi-endoparasitic R. reniformis. Beta 1, 4-528

endoglucanases are cellulases generally known to be employed by sedentary endoparasitic 529

nematodes in degrading cellulose by hydrolysis. In addition, some contigs matched to three other 530

putative secreted proteins: these were a 1,6-glucosidase, a pullulanase-type peptide with high 531

identity to that of Chloroflexus aggregans, a beta-mannase (Roseburia intestinalis), and a glycoside 532

hydrolase (GH45, B. malayi). These would need further investigation to confirm their possible 533

involvement in P. thornei parasitism. In this study, 57% of the contigs analysed did not match 534

proteins or hypothetical proteins in any database. By comparing these sequences with contigs of 535

the recently published P. coffeae transcriptome (where 75% were not annotated), the number of 536

unannotated P. thornei contigs did not change. Further analysis of these unannotated contigs 537

indicates that more than half did not have ORFs and are probably ncRNAs. However, two-thirds of 538

them exhibited the same distribution of ORF sizes as the annotated sequences. Furthermore, only 539

3.5% of these sequences matched publicly available miRNAs and siRNAs, suggesting that a 540

majority of these unidentified sequences could be genes unique to P. thornei. Further work is 541

needed to characterise them and to identify possible new functional sequences. Recent 542

transcriptome analyses have revealed extensive expression of ncRNAs (Mercer et al., 2009; 543

Backofen et al., 2010; Langenberger et al., 2010). A study by Kapranov et al. (2007) revealed that 544

transcription of non-coding sequences can be up to four times higher than those of coding 545

sequences. Although previously thought to be transcriptional ‘noise’ (Struhl, 2007), ncRNAs are 546

now known to have a broad range of functional roles in subcellular structural organization and 547

development (Amaral and Mattick, 2008). For example, because some nematode genes are 548

polycistronic, ncRNAs could have significant roles in regulation of expression via cis/trans splicing. 549

Page 28: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

26

Comprehensive analysis of the transcriptome revealed high identity of some proteins to 550

those of the fully annotated C. elegans for which RNAi phenotypes indicate disruptions to 551

morphology and development at all stages of development. This analysis and reports of successful 552

in-planta RNAi against cyst and root knot nematodes suggests (Steeves et al., 2006; Yadav et al., 553

2006) that this approach should also be successful in conferring transgenic resistance to root 554

lesion nematodes. 555

In conclusion, the transcriptome for P. thornei has been sequenced and 6,989 unique 556

contigs common to two contig assemblers were annotated. Of these, 43% were identified and 557

provide new information for this species. The unannotated sequences include functional 558

sequences specific to this genus and require further characterisation. The availability of such data 559

will help to differentiate genes specific to sedentary and migratory endoparasites, and to 560

characterise those genes involved in migration and feeding. This information will also be valuable 561

for further development of diagnostic tests to identify specific RLN species and to identify new 562

gene targets for their control, for example via RNAi. 563

Acknowledgements 564

This work was supported by the Australian India Strategic Research Fund (BF030027), 565

Murdoch University, Australia (PhD studentship to Paul Nicol), iVEC, Australia through the use of 566

advanced computing resources located at Murdoch University and NemGenix Pty Ltd, Australia. 567

568

Page 29: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

27

References 569

Abad, P., Gouzy, J., Aury, J.-M., Castagnone-Sereno, P., Danchin, E.G.J., Deleury, E., Perfus-570

Barbeoch, L., Anthouard, V., Artiguenave, F., Blok, V.C., Caillaud, M.-C., Coutinho, P.M., 571

Dasilva, C., De Luca, F., Deau, F., Esquibet, M., Flutre, T., Goldstone, J.V., Hamamouch, N., 572

Hewezi, T., Jaillon, O., Jubin, C., Leonetti, P., Magliano, M., Maier, T.R., Markov, G.V., 573

McVeigh, P., Pesole, G., Poulain, J., Robinson-Rechavi, M., Sallet, E., Segurens, B., 574

Steinbach, D., Tytgat, T., Ugarte, E., van Ghelder, C., Veronico, P., Baum, T.J., Blaxter, M., 575

Bleve-Zacheo, T., Davis, E.L., Ewbank, J.J., Favery, B., Grenier, E., Henrissat, B., Jones, J.T., 576

Laudet, V., Maule, A.G., Quesneville, H., Rosso, M.-N., Schiex, T., Smant, G., Weissenbach, 577

J., Wincker, P., 2008. Genome sequence of the metazoan plant-parasitic nematode 578

Meloidogyne incognita. Nat. Biotech. 26, 909-915. 579

Amaral, P.P., Mattick, J.S., 2008. Noncoding RNA in development. Mamm. Genome 19, 454-492. 580

Ashburner,M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., 581

Dwight, S.S., Eppig, J.T., Harris, A.M., Hill, D.P., Issel-Tarver, L., Kasarskis, A., Lewis, S., 582

Matese, J.C., Richardson, J.E., Ringwald, M., Rubin, G.M., Sherlock, G., 2000. Gene 583

Ontology: tool for the unification of biology. Nat. Genet. 25, 25-29. 584

Backofen, R., Chitsaz, H., Hofacker, I., Sahinalp, S., Stadler, P., 2010. Computational studies of non-585

coding RNAs. Pac. Symp. Biocomput. 15, 54-56. 586

Bakhetia, M., Urwin, P.E., Atkinson, H.J., 2007. qPCR analysis and RNAi define pharyngeal gland 587

cell-expressed genes of Heterodera glycines required for initial interactions with the host. 588

Mol. Plant-Microbe Interact. 20, 306-312. 589

Baldwin, J.G., Nadler, S.A., Adams, B.J., 2004. Evolution of plant parasitism among nematodes. 590

Ann. Rev. Phytopathol. 42, 83-105. 591

Barrett, J., Wright, D.J., 1998. Intermediary metabolism. In: Perry, R.N., Wright, D.J. (Ed.), The 592

physiology and biochemistry of free-living and plant-parasitic nematodes. CABI Publishing, 593

Oxford, pp 331-353. 594

Baum, T.J., Hussey, R.S., Davis, E.L., , 2007. Root-knot and cyst nematode parasitism genes: the 595

molecular basis of plant parasitism. In: Setlow, J.K. (Ed.), Genetic Engineering, Landscape 596

Series. Springer, New York USA, 28, pp. 17-43. 597

Bekal, S., Niblack, T.L., Lambert, K.N., 2003. A chorismate mutase from the soybean cyst nematode 598

Heterodera glycines shows polymorphisms that correlate with virulence. Mol. Plant-599

Microbe Interact. 16, 439-446. 600

Bird, D.M.K., Koltai, H., 2000. Plant parasitic nematodes: habitats, hormones, and horizontally-601

acquired genes. J. Plant Growth Regul. 19, 183-194. 602

Blanchard, A., Esquibet, M., Fouville, D., Grenier, E., 2005. Ranbpm homologue genes 603

characterised in the cyst nematodes Globodera pallida and Globodera 'mexicana'. Physiol. 604

Mol. Plant Pathol. 67, 15-22. 605

Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., Madden, T.L., 2009. 606

BLAST+: architecture and applications. BMC Bioinf. 10, 421. 607

Castillo, P. and Vovlas, N., 2008. Pratylenchus (Nematoda: Pratylenchidae): Diagnosis, Biology, 608

Pathogenicity and Management. Nematology Monographs and Perspectives, Vol. 6. Hunt, 609

D.J., Perry, R.N., series eds. Brill, Leiden, The Netherlands. 610

Chalk, A.M., Warfvinge, R., Georgii-Hemming, P., Sonnhammer, E.L.L., 2005. The siRNA database. 611

Nucl. Acids Res. 33, D131-134. 612

Page 30: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

28

Chevreux, B., Pfisterer, T., Drescher, B., Driesel, A.J., Muller, W.E., Wetter, T., Suhai, S., 2004. Using 613

the miraEST assembler for reliable and automated mRNA transcript assembly and SNP 614

detection in sequenced ESTs. Genome Res. 14, 1147-1159. 615

Chitwood, D. J., 2003. Research on plant-parasitic nematode biology conducted by the United 616

States Department of Agriculture - Agricultural Research Service. Pest Manag. Sci. 59, 748-617

753. 618

Cock, P.J., Antao, T., Chang, J.T., Chapman, B.A., Cox, C.J., Dalke, A., Friedberg, I., Hamelryck, T., 619

Kauff, F., Wilczynski, B., de Hoon, M.J., 2009. Biopython: freely available Python tools for 620

computational molecular biology and bioinformatics. Bioinformatics 25, 1422-1423. 621

Davis, E.L., Hussey, R.S., Baum, T.J., Bakker, J., Shcots, A., Rosso, M.N., Abad, P., 2000. Nematode 622

parasitism genes. Annu. Rev. Phytopathol. 38, 365-396. 623

Davis, E.L., Hussey, R.S., Baum, T.J., 2004. Getting to the roots of parasitism by nematodes. Trends 624

Parasitol. 20, 134-141. 625

Dieterich, C., Clifton, S.W., Schuster, L.N., Chinwalla, A., Delehaunty, K., Dinkelacker, I., Fulton, L., 626

Fulton, R., Godfrey, J., Minx, P., 2008. The Pristionchus pacificus genome provides a unique 627

perspective on nematode lifestyle and parasitism. Nat. Genet. 40, 1193-1198. 628

Ding, X., Shields, J., Allen, R., Hussey, R., 2000. Molecular cloning and characterisation of a venom 629

allergen AG5-like cDNA from Meloidogyne incognita1. Int. J. Parasitol. 30, 77-81. 630

Djamei, A., Schipper, K., Rabe, F., Ghosh, A., Vincon, V., Kahnt, J., Osorio, S., Tohge, T., Fernie, A.R., 631

Feussner, I., Feussner, K., Meinicke, P., Stierhof, Y.D., Schwarz, H., Macek, B., Mann, M., 632

Kahmann, R., 2011. Metabolic priming by a secreted fungal effector. Nature 478, 395–398 633

Doyle, E.A., Lambert, K.N., 2003. Meloidogyne javanica chorismate mutase 1 alters plant cell 634

development. Mol. Plant-Microbe Interact. 16, 123-131. 635

Entchev, E.V., Kurzchalia,T., 2005. Requirement of sterols in the life cycle of the nematode 636

Caenorhabditis elegans. Semin. Cell Dev. Biol. 16, 175 - 182. 637

Fire, A., Xu, S., Montgomery, M.K., Kostas, S.A., Driver, S.E. and Mello, C.C., 1998. Potent and 638

specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 639

391, 806-811. 640

Gao, B., Allen, R., Davis, E.L., Baum, T.J., Hussey, R.S., 2004. Molecular characterisation and 641

developmental expression of a cellulose-binding protein gene in the soybean cyst 642

nematode Heterodera glycines. Int. J. Parasitol. 34, 1377-1383. 643

Gao, B., Allen, R., Maier, T., Davis, E.L., Baum, T.J., Hussey, R.S., 2001. Molecular characterisation 644

and expression of two venom allergen-like protein genes in Heterodera glycines. Int. J. 645

Parasitol. 31, 1617-1625. 646

Gao, B., Allen, R., Maier, T., Davis, E., Baum, T., Hussey, R., 2002. Identification of a New β-1, 4-647

endoglucanase Gene Expressed in the Esophageal Subventral Gland Cells of Heterodera 648

glycines. J. Nematol. 34, 12. 649

Gao, B., Allen, R., Maier, T., Davis, E.L., Baum, T.J., Hussey, R.S., 2003. The Parasitome of the 650

Phytonematode Heterodera glycines. Mol. Plant-Microbe Interact. 16, 720-726. 651

Griffiths-Jones, S., Grocock, R.J., van Dongen, S., Bateman, A., Enright, A.J., 2006. miRBase: 652

microRNA sequences, targets and gene nomenclature. Nucl. Acids Res. 34, 140-144 653

Haegeman, A., Joseph, S., Gheysen, G., 2011. Analysis of the transcriptome of the root lesion 654

nematode Pratylenchus coffeae generated by 454 sequencing technology. Mol. Biochem. 655

Parasitol. 178, 7-14. 656

Haegeman, A., Vanholme, B., Gheysen, G., 2009. Charachterization of a putative endoxylanase in 657

the migratory plant-parasitic nematode Radopholus similis. Mol. Plant Pathol. 10, 389 - 658

401. 659

Page 31: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

29

Huang, X., Madan, A., 1999. CAP3: A DNA sequence assembly program. Genome Res. 9, 868-887. 660

Hunter, S., Apweiler, R., Attwood, T.K., Bairoch, A., Bateman, A., Binns, D., Bork, P., Das, U., 661

Daugherty, L., Duquenne, L., Finn, R.D., Gough, J., Haft, D., Hulo, N., Kahn, D., Kelly, E., 662

Laugraud, A., Letunic, I., Lonsdale, D., Lopez, R., Madera, M., Maslen, J., McAnulla, C., 663

McDowall, J., Mistry, J., Mitchell, A., Mulder, N., Natale, D., Orengo, C., Quinn, A.F., 664

Selengut, J.D., Sigrist, C.J.A., Thimma, M., Thomas, P.D., Valentin, F., Wilson, D., Wu, C.H., 665

Yeats, C., 2009. InterPro: the integrative protein signature database. Nucl. Acids Res. 37, 666

D211-215. 667

Husson, S.J., Mertens, I., Janssen, T., Lindemans, M., Schoofs, L., 2007. Neuropeptidergic signaling 668

in the nematode Caenorhabditis elegans. Prog. Neurobiol. 82, 33-55. 669

Jacob, J., Vanholme, B., Haegeman, A., Gheysen, G., 2007. Four transthyretin-like genes of the 670

migratory plant-parasitic nematode Radopholus similis: members of an extensive 671

nematode-specific family. Gene 402, 9-19. 672

Jasmer, D.P., Goverse, A., Smant, G., 2003. Parasitic nematode interactions with mammals and 673

plants. Ann. Rev. Phytopathol. 41, 245-270. 674

Jaubert, S., Laffaire, J.B., Ledger, T.N., Escoubas, P., Amri, E.Z., Abad, P., Rosso, M.N., 2004. 675

Comparative analysis of two 14-3-3 homologues and their expression pattern in the root-676

knot nematode Meloidogyne incognita. Int. J. Parasitol. 34, 873-880. 677

Jaubert, S., Laffaire, J.B., Piotte, C., Abad, P., Rosso, M.N., 2002. Direct identification of stylet 678

secreted proteins from root-knot nematodes by a proteomic approach. Mol. Biochem. 679

Parasitol. 121, 205-211. 680

Jaubert, S., Milac, A.L., Petrescu, A.J., de Almeida-Engler, J., Abad, P., Rosso, M.N., 2005. In planta 681

secretion of a calreticulin by migratory and sedentary stages of root-knot nematode. Mol. 682

Plant-Microbe Interact. 18, 1277-1284. 683

Jones, J.T., Furlanetto, C., Bakker, E., Banks, B., Blok, V., Chen, Q., Phillips, M., Prior, A., 2003. 684

Characterization of a chorismate mutase from the potato cyst nematode Globodera 685

pallida. Mol. Plant Pathol. 4, 43-50. 686

Jones, J.T., Furlanetto, C., Kikuchi, T., 2005. Horizontal gene transfer from bacteria and fungi as a 687

driving force in the evolution of plant parasitism in nematodes. Nematology 7, 641-646. 688

Jones, J.T., Reavy, B., Smant, G., Prior, A.E., 2004. Glutathione peroxidases of the potato cyst 689

nematode Globodera Rostochiensis. Gene 324, 47-54. 690

Jones, M.G.K., Rana, J., Fosu-Nyarko, J., Zhang, J.J., Perera, M. and Chamberlain, D. 2009. Towards 691

developing transgenic resistance to nematodes in wheat. In Cereal cust nematodes: status, 692

research and outlook, Eds Riley, I.T., Nicol, J.M. and Dababat, A.A., Proceedings First 693

Workshop of the International Cereal Cyst Nematode Initative, Oct 2009, Anatolya, Turkey: 694

CIMMYT, pp 191-194.Kikuchi, T., Shibuya, H., Aikawa, T., Jones, J.T., 2006. Cloning and 695

characterization of pectate lyases expressed in the esophageal gland of the pine wood 696

nematode Bursaphelenchus xylophilus. Mol. Plant-Microbe Interact. 19, 280-287. 697

Kanehisa, M., Goto, S., Furumichi, M., Tanabe, M., Hirakawa, M., 2010. KEGG for representation 698

and analysis of molecular networks involving diseases and drugs. Nucl. Acids Res. 38, D355-699

360. 700

Kapranov, P., Cheng, J., Dike, S., Nix, D.A., Duttagupta, R., Willingham, A.T., Stadler, P.F., Hertel, J., 701

Hackermüller, J., Hofacker, I.L., 2007. RNA maps reveal new RNA classes and a possible 702

function for pervasive transcription. Science 316, 1484. 703

Karim, N., Jones, J., Okada, H., Kikuchi, T., 2009. Analysis of expressed sequence tags and 704

identification of genes encoding cell-wall-degrading enzymes from the fungivorous 705

nematode Aphelenchus avenae. BMC Genomics 10, 525. 706

Page 32: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

30

Kenyon, C., Chang, J., Gensch, E., Rudner, A., Ruvkun, G., Tabtiang, R., 1993. A C. elegans mutant 707

that lives twice as long as wild type. Nature 366, 461-464. 708

Kumar, S., Blaxter, M., 2010. Comparing de novo assemblers for 454 transcriptome data. BMC 709

Genomics 11, 571. 710

Lambert, K.N., Allen, K.D., Sussex, I.M., 1999. Cloning and characterization of an esophageal-gland-711

specific chorismate mutase from the phytoparasitic nematode Meloidogyne javanica. Mol. 712

Plant-Microbe Interact. 12, 328-336. 713

Langenberger, D., Bermudez-Santana, C., Stadler, P., Hoffmann, S., 2010. Identification and 714

classification of small RNAs in transcriptome sequence data. Pac. Symp. Biocomput. 15, 80–715

87. 716

Ledger, T.N., Jaubert, S., Bosselut, N., Abad, P., Rosso, M.N., 2006. Characterization of a new 717

[beta]-1, 4-endoglucanase gene from the root-knot nematode Meloidogyne incognita and 718

evolutionary scheme for phytonematode family 5 glycosyl hydrolases. Gene 382, 121-128. 719

Li, C., 2005. The ever-expanding neuropeptide gene families in the nematode Caenorhabditis 720

elegans. Parasitology 131, 109-127. 721

Li, W. and GODZIK, A., 2006. Cd-hit: a fast program for clustering and comparing large sets of 722

protein or nucleotide sequences. Bioinformatics 22, 1658-1659. 723

Li, C., Nelson, L.S., Kim, K., Nathoo, A., Hart, A.C., 1999. Neuropeptide gene families in the 724

nematode Caenorhabditis elegans. Ann. NY Acad. Sci. 897, 239-252. 725

Li, W., Kennedy, S.G., Ruvkun, G., 2003. daf-28 encodes a C. elegans insulin superfamily member 726

that is regulated by environmental cues and acts in the DAF-2 signaling pathway. Genes 727

Dev. 17, 844-858. 728

Lu, S.W., Tian, D., Borchardt-Wier, H.B., Wang, X., 2008. Alternative splicing: A novel mechanism of 729

regulation identified in the chorismate mutase gene of the potato cyst nematode 730

Globodera rostochiensis. Mol. Biochem. Parasitol. 162, 1-15. 731

Lynd, L.R., Weimer, P.J., van Zyl, W.H., Pretorius, I.S., 2002. Microbial Cellulose Utilization: 732

Fundamentals and Biotechnology. Microbiol. Mol. Biol. Rev. 66, 506-577. 733

Malone, E.A., Thomas, J.H., 1994. A screen for nonconditional dauer-constitutive mutations in 734

Caenorhabditis elegans. Genetics 136, 879-886. 735

McCulloch, D., Gems, D., 2003. Body size, insulin/IGF signaling and aging in the nematode 736

Caenorhabditis elegans. Exp. Gerontol. 38, 129-136. 737

McVeigh, P., Leech, S., Mair, J.R., Marks, N.J., Geary, T.J., Maule, A.G., 2005. Analysis of 738

FMRFamide-like peptide (FLP) diversity in phylum Nematoda. Int. J. Parasitol. 35, 1043-739

1060. 740

Mercer, T.R., Dinger, M.E., Mattick, J.S., 2009. Long non-coding RNAs:insight into functions. Nat. 741

Rev. 10, 155-159. 742

Mitchum, M., Hussey, R., Davis, E., Baum, T., Punja, Z., Boer, S., Sanfaçon, H., 2007. Application of 743

biotechnology to understand pathogenesis in nematode plant pathogens. In: Punja, Z. K., 744

de Boer, S. H., Sanfaçon, H. (Ed.), Biotechnology and Plant Disease Managment. CABI 745

International, Wallingford, UK, pp 58-86. 746

Mitreva, M., Elling, A., Dante, M., Kloek, A., Kalyanaraman, A., Aluru, S., Clifton, S., Bird, D.M.K., 747

Baum, T., McCarter, J., 2004. A survey of SL1-spliced transcripts from the root-lesion 748

nematode Pratylenchus penetrans. Mol. Genet. Genomics 272, 138-148. 749

Mitreva-Dautova, M., Roze, E., Overmars, H., de Graaff, L., Schots, A., Helder, J., Goverse, A., 750

Bakker, J., Smant, G., 2006. A symbiont-independent endo-1, 4-β-xylanase from the plant-751

parasitic nematode Meloidogyne incognita. Mol. Plant-Microbe Interact. 19, 521-529. 752

Page 33: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

31

Morozova, O., Hirst, M., Marra, M.A., 2009. Applications of new sequencing technologies for 753

transcriptome analysis. Annu. Rev. Genomics Hum. Genet. 10, 135-151. 754

Nathoo, A.N., Moeller, R.A., Westlund, B.A., Hart, A.C. 2001. Identification of neuropeptide-like 755

protein gene families in Caenorhabditis elegans and other species. Proc. Natl. Acad. Sci. 756

USA 98, 14000-14005. 757

Opperman, C.H., Bird, D.M., Williamson, V.M., Rokhsar, D.S., Burke, M., Cohn, J., Cromer, J., 758

Diener, S., Gajan, J., Graham, S., Houfek, T.D., Liu, Q., Mitros, T., Schaff, J., Schaffer, R., 759

Scholl, E., Sosinski, B.R., Thomas, V.P., Windham, E., 2008. Sequence and genetic map of 760

Meloidogyne hapla: A compact nematode genome for plant parasitism. Proc. Natl. Acad. 761

Sci. USA 105, 14802-14807. 762

Petersen, T.N., Brunak, S., von Heijne, G., Nielsen, H., 2011. SignalP 4.0: discriminating signal 763

peptides from transmembrane regions. Nature Meth. 8, 785-786. 764

Pierce, S.B., Costa, M., Wisotzkey, R., Devadhar, S., Homburger, S.A., Buchman, A.R., Ferguson, 765

K.C., Heller, J., Platt, D.M., Pasquinelli, A.A., Liu, L.X., Doberstein, S.K., Ruvkun, G., 2001. 766

Regulation of DAF-2 receptor signaling by human insulin and ins-1, a member of the 767

unusually large and diverse C. elegans insulin gene family. Genes Dev. 15, 672-686. 768

Prior, A., Jones, J.T., Blok, V.C., Beauchamp, J., McDermott, L., Cooper, A., Kennedy, M.W., 2001. A 769

surface-associated retinol-and fatty acid-binding protein (Gp-FAR-1) from the potato cyst 770

nematode Globodera pallida: lipid binding activities, structural analysis and expression 771

pattern. Biochem. J. 356, 387. 772

Romero, R., Roberts, M., Phillipson, J., 1995. Chorismate mutase in microorganisms and plants. 773

Phytochemistry. 40, 1015-1025. 774

Sacco, M.A., Koropacka, K., Grenier, E., Jaubert, M.J., Blanchard, A., Goverse, A., Smant, G., 775

Moffett, P., 2009. The cyst nematode SPRYSEC protein RBP-1 elicits Gpa2- and RanGAP2-776

dependent plant cell death. PLoS Pathog. 5, e1000564. 777

Shingles, J., Lilley, C., Atkinson, H., Urwin, P., 2007. Meloidogyne incognita: molecular and 778

biochemical characterisation of a cathepsin L cysteine proteinase and the effect on 779

parasitism following RNAi. Exp. Parasitol. 115, 114-120. 780

Sonnhammer, E., Von Heijne, G., Krogh, A., 1998. A hidden Markov model for predicting 781

transmembrane helices in protein sequences. Proc. Int. Conf. Intell. Syst. 6, 175-182. 782

Steeves, R.M., Todd, T.C., Essig, J.S., Trick, H.N., (2006). Transgenic soybeans expressing siRNAs 783

specific to a major sperm protein gene suppress Heterodera glycines reproduction. Funct. 784

Plant Biol. 33, 991–999 785

Stein, L.D., Bao, Z., Blasiar, D., Blumenthal, T., Brent, M.R., Chen, N., Chinwalla, A., Clarke, L., Clee, 786

C., Coghlan, A., Coulson, A., D'Eustachio, P., Fitch, D.H.A., Fulton, L.A., Fulton, R.E., Griffiths-787

Jones, S., Harris, T.W., Hillier, L.W., Kamath, R., Kuwabara, P.E., Mardis, E.R., Marra, M.A., 788

Miner, T.L., Minx, P., Mullikin, J.C., Plumb, R.W., Rogers, J., Schein, J.E., Sohemann, M., 789

Spieth, J., Stajich, J.E., Wei, C., Willey, D., Wilson, R.K., Durbin, R., Waterston, R.H., 2003. 790

The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics. 791

PLoS Biol. 1(2), e45. 792

Struhl, K., 2007. Transcriptional noise and the fidelity of initiation by RNA polymerase II. Nat. 793

Struct. Mol. Biol. 14, 103-105. 794

Svoboda, J.A., Chitwood, D.J., 1992. Inhibition of sterol metabolism in insects and nematodes, 795

regulation of isopentenoid metabolism. In: Nes, W.D., Parish, E.J., Trzaskos, J.M. (Ed.), 796

Regulation of Isopentenoid Metabolism. ACS Symposium Series Vol. 497, American 797

Chemical Society Press, Washington pp. 205-218. 798

Page 34: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

32

The C. elegans Sequencing Consortium, 1998. Genome sequence of the nematode C. elegans: a 799

platform for investigating biology. Science 282, 2012-2018. 800

Tytgat, T., Vanholme, B., De Meutter, J., Claeys, M., Couvreur, M., Vanhoutte, I., Gheysen, G., Van 801

Criekinge, W., Borgonie, G., Coomans, A., 2004. A new class of ubiquitin extension proteins 802

secreted by the dorsal pharyngeal gland in plant parasitic cyst nematodes. Mol. Plant-803

Microbe Interact. 17, 846-852. 804

Vanholme, B., Kast, P., Haegeman, A., Jacob, J., Grunewald, W.I.M., Gheysen, G., 2009. Structural 805

and functional investigation of a secreted chorismate mutase from the plant-parasitic 806

nematode Heterodera schachtii in the context of related enzymes from diverse origins. 807

Mol. Plant Pathol. 10, 189-200 808

Vanholme, B., De Meutter, J., Tytgat, T., Van Montagu, M., Coomans, A., Gheysen, G., 2004. 809

Secretions of plant-parasitic nematodes: a molecular update. Gene 332, 13-27. 810

Vanstone, V., 2007. Root Lesion and Burrowing Nematodes in Western Australian cropping 811

systems. Bulletin 4698, Department of Agriculture and Food Western Australia, Perth, 812

Australia . 813

Yadav, B.C., Veluthambi, K., Subramaniam, K., (2006). Host-generated double stranded RNA 814

induces RNAi in plant-parasitic nematodes and protects the host from infection. Mol. 815

Biochem. Parasitol. 148, 219–222 816

817

Page 35: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

33

Legends to Figures 818

Fig. 1. Overview of the steps used to assemble and annotate the Pratylenchus thornei 819

transcriptome. 820

Fig. 2. The percentage each read group contributed to the total sequenced length. 821

Fig. 3. The distribution of two sets of Pratylenchus thornei contigs in different size ranges, 822

assembled with CAP3 and MIRA separately, compared with the combined data generated with 823

CAP3. 824

Fig. 4. Functional classification of the Pratylenchus thornei contigs using Gene Ontology (GO) 825

terms. 826

Fig. 5. Comparison of Pratylenchus thornei contigs with expressed sequence tags (ESTs) of 827

Pratylenchidae, Heteroderidae, Meloidogynidae and Caenarhabditis elegans. Numbers in 828

parentheses indicate total numbers of contigs. 829

Fig. 6. Frequency of open reading frames for annotated and unannotated contigs. 830

831

Supplementary Figure legend 832

833

Supplementary Fig. S1. Distribution of reads corresponding to different size ranges of Pratylenchus 834

thornei transcriptome. 835

Page 36: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Table 1. Number of reads of Pratylenchus thornei transcriptome and contigs after assembly with

CAP3 and MIRA

Number of Reads Assembly of Contigs

Lane 1 (L1) Lane 2 (L2) CAP3 L1 MIRA L1 CAP3 L2 MIRA L2 Combined Final

Number of

Reads/Contigs 390,417 396,858

32,897 43.241 33,150 43,569 34,312 6,989

Total

Length

(bases)

78,549,690 77,704,413

14,467,885 19,441,646 14,348,384 19,205,118 20,849,165 4,461,906

Average

Length

(bases)

201.2 195.8

439.8 449.6 432.8 440.8 607.7 638.4

N50 268 at

39,274,947

264 at

38,852,391

495 513 491 507 707 740

Size Range

(bases) 40 - 656 40 – 667

42-3376 41-4602 42-3739 41-3928 45-6602 51-4101

Page 37: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Table 2. Assignment of KEGG Enzyme Commission (EC) numbers to Pratylenchus thornei contigs.

KEGG categories representeda

Contigs Enzymesb

KEGG categories representeda

Contigs Enzymesb

1 Carbohydrate metabolism 1.1 Glycolysis / Gluconeogenesis 1.2 Citrate cycle (TCA cycle) 1.3 Pentose phosphate pathway 1.4 Pentose and glucuronate interconversions 1.5 Fructose and mannose metabolism 1.6 Galactose metabolism 1.7 Ascorbate and aldarate metabolism 1.8 Starch and sucrose metabolism 1.9 Amino and nucleotide sugar metabolism 1.10 Pyruvate metabolism 1.11 Glyoxylate/decarboxylate metabolism 1.12 Propanoate metabolism 1.13 Butanoate metabolism 1.14 C5-Branched dibasic acid metabolism 1.15 Inositol phosphate metabolism 2 Energy metabolism 2.1 Oxidative phosphorylation 2.2 Photosynthesis 2.3 Carbon fixation 2.4 Reductive carboxylate cycle 2.5 Methane metabolism 2.6 Nitrogen metabolism 2.7 Sulphur metabolism 3 Lipid metabolism 3.1 Fatty acid biosynthesis 3.2 Fatty acid elongation in mitochondria 3.3 Fatty acid metabolism 3.4 Synthesis/degradation of ketone bodies 3.5 Steroid hormone biosynthesis 3.6 Glycerolipid metabolism 3.7 Glycerophospholipid metabolism 3.8 Sphingolipid metabolism 3.9 alpha-Linolenic acid metabolism 3.10 Biosynthesis of unsaturated fatty acids 4 Nucleotide metabolism 4.1 Purine metabolism 4.2 Pyrimidine metabolism 5 Amino acid metabolism 5.1 Alanine/aspartate/glutamate metabolism

8 11 4 3 3 1 4

12 4 6 2 6 9 1 6

28 12 1 6

14 11 2

1 1 6 1 2 3 2 3 1 2

31 16

14

6 9 2 2 3 1 2 5 3 5 2 5 6 1 4

4 1 1 5 3 6 1

1 1 4 1 1 2 2 1 1 2

12 6

7

5.2 Glycine/serine/threonine metabolism 5.4 Valine/leucine/isoleucine degradation 5.5 Valine, leucine and isoleucine biosynthesis 5.7 Lysine degradation 5.8 Arginine and proline metabolism 5.9 Histidine metabolism 5.10 Tyrosine metabolism 5.11 Phenylalanine metabolism 5.12 Tryptophan metabolism 6 Metabolism of other amino acids 6.1 beta-Alanine metabolism 6.2 Taurine/hypotaurine metabolism 6.3 Phosphonate and phosphinate metabolism 6.4 Selenoamino acid metabolism 6.6 D-Glutamine and D-glutamate metabolism 6.9 Glutathione metabolism 7 Glycan Biosynthesis and metabolism 7.1 N-Glycan biosynthesis - 11 7.2 Lipo-polysaccharide biosynthesis 8 Metabolism of cofactors and vitamins 8.1 Folate biosynthesis 8.2 Retinol metabolism 8.3 Porphyrin metabolism 9 Metabolism of terpenoids and polyketides 9.1 Terpenoid backbone biosynthesis 9.2 Limonene and pinene degradation 10 Biosynthesis of other secondary metabolites 10.1 Phenylpropanoid biosynthesis 10.2 Streptomycin biosynthesis 11 Xenobiotics biodegradation/metabolism 11.1 Caprolactam degradation 11.2 3-Chloroacrylic acid degradation 11.3 1,2-Dichloroethane degradation 11.4 Benzoate degradation via CoA ligation 11.5 Trinitrotoluene degradation 11.6 Metabolism by cytochrome P450 11.7 Drug metabolism - cytochrome P450 11.8 Drug metabolism - other enzymes

1 6 3

15 8 2 1 1 5

7 4 1 3 2 6

3 1

1 2 3

2 3

1 1

2 2 2 3 1 2 2 2

1 4 3 6 5 1 1 1 4

3 1 1 2 1 4

1 1

1 1 2

2 2

1 1

2 1 1 3 1 1 1 1

aSee KEGG database (http://www.genome.jp/kegg/) for categories bThe number of KEGG enzymes that match P. thornei contigs.

Page 38: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Table 3.Pratylenchus thornei orthologues to published nematode parasitism genes.

Accession

No

Species Gene Product and

Symbol

No of

contigs

Best E-

value

Best %

Identity

Alignment

length

(bp)

%

Coverage Reference

AAL40719/ AAR85527

Meloidogyne incognita

14-3-3/14-3-3b product

13 5E-131 96 261 99 Jaubert et al., 2002,

2004

AAN32888 Heterodera glycines

Annexin 4C10 2 3E-26 64 50 84 Gao et al., 2003

AAN32888 Heterodera schachtii

Annexin - Hs4F01 1 5E-17 38 129 58 Patel et al., 2010

AAL40720 M. incognita Calreticulin - Mi-crt-1 8 3E-67 85 189 94 Jaubert et al., 2005

CAD89795 M. incognita Cysteine protease - Mi-cpl-1

13 8E-64 84 82 93 Shingles et al., 2007

CAA70477 Globodera pallida

SEC-2/Fatty acid retinoid binding , Gp-

far-1

7 3E-60 78 111 87 Prior et al., 2001

ABN64198 M. incognita Glutathione-S-transferase -Mi-gsts-1

2 5E-37 62 108 76 Dubreuil et al., 2007

CAB48391 Globodera rostochiensis

Peroxiredoxin - Gr-TpX

1 7E-41 82 91 91 Robertson et al.,

2000

CAC21848 G. rostochiensis RAN-BP-like - Gr-A18 2 2E-07 41 27 56 Vanholme et al.,

2004

AAV34698 G. pallida RAN-BP-like - Gp-rbp-1

4 2E-07 48 27 63 Blanchard et al.,

2005

CAD38523 G. rostochiensis Glutathione peroxidase

4 2E-110 81 228 91 Jones et al., 2004

CAM84510 Radopholus similis

Transthyretin-like protein (Rs-ttl-1, Rs-

ttl-2, Rs-ttl-3, Rs-ttl-4)

29 1E-42 80 55 87 Jacob et al., 2007

CB374976 H. glycines transthyretin-like protein precursor

2 1E-52 64 120 80 Unpublished

AAP30081 H. schachtii Ubiquitin extension protein - Hs-ubi-1

10 4E-37 96 75 98 Tytgal et al., 2004

AAK60209/ AAK55116

H. glycines Venom allergen-like protein – Vap-1

1 3E-19 61 68 79 Gao et al., 2001

AAD01511 M. incognita Venom allergen-like protein – Mi-mps-1

1 E-06 39 28 61 Ding et al., 2000

Page 39: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Table 4.Carbohydrate active enzymes identified in Pratylenchus thornei; including candidates with signal peptides for secretion CAZy enzyme classes CAZy family P.thornei

contigs E-value range Species

Glycoside Hydrolases GH4 1 E-04 uncultured marine bacterium

GH5 a 2 E-05 - E-06 Meloidogyne hapla, Rotylenchulus reniformis

GH9 1 E-04 Herpetosiphon aurantiacus

GH10 3 E-04 – 5.77E-05 Neocallimastix patriciarum

GH12 1 6.28E-06 Sinorhizobium meliloti

GH13 a 7 E-04 – 1.59E-18 Drosophila melanogaster, Chloroflexus aggregans,

Caenarhabditis elegans GH18

a 11 4.45E-05 – 1.87E-37 Leptosphaeria maculans

GH19 2 E-04 Saussurea involucrata

GH20 3 E-04 – 7.5E-27 Spodoptera frugiperda, C. elegans , Catenibacterium mitsuokai

GH23 3 E-04 – 5.77E-05 Staphylococcus aureus, Clostridium ljungdahlii

GH26 a 1 2.5E-05 Roseburia intestinalis

GH30-31 2 1.38E-06- 9.77E-12 C. elegans

GH33 5 E-04 – 4.56E-14 Trypanosoma cruzi

GH38 2 9.96E-10 – 2.92E-11 Magnaporthe grisea

GH43 a 1 2.6E-05 M.grisea

GH 45 a 1 E-04 Brugia malayi

GH47 a 3 8.58E-06 – 8.93E-07 Schistosoma mansoni , M.grisea

GH63 2 2.78E-16 – 3.64E-17 Danio rerio

Glycosyl Trasferases GT2 a 1 E-06 Rhodobacter capsulatus

GT3 1 8.44E-06 Steinernema feltiae

GT4 a 8 E-04 – 2.70E-37 Methylbacteriun spp., Caulobacter segnis, Rhizomucor pusillus, Trichodesmium erythraeum, Nostoc punctiforme , Rhodobacter capsulatus, C. elegans

GT7 5 2.76E-06 - 1.43E-39 S. mansoni , C. elegans

GT27 2 4.70E-21 - 1.21E-47 C. elegans

GT34 1 E-04 Scheffersomyces stipitis

GT35 2 9.7E-63 – 1.63E-79 C. elegans

GT39 3 E-04 – 2.42E-13 Neurospora crassa

GT41 1 6.40E-05 Sideroxydans lithotrophicus

GT49 1 4.55E-10 Dictyostelium discoideum

GT51 10 E-04 – 8.82E-06 Bacillus spp., Eubacterium spp., Roseburia intestinalis

GT55 a 4 E-04 – 6.03E-08 Magnaporthe grisea

GT57 1 E-04 Leptosphaeria maculans

GT64 1 1.26E-26 Danio rerio

GT66 a 2 1.8E-20 – 2.32E-63 C. elegans, Neurospora crassa

GT67 1 7.45E-07 Leishmania braziliensis

GT91 2 E-04 Candida spp.

Carbohydrate Esterases CE4 1 E-04 Tribolium castaneum

CE10 2 4.21E-13 – 8.63E-21 Mus musculus , Homo sapiens

CE11 a 5 E-04 – 1.40E-17 Ostreococcus tauri

Polysaccharide Lyases PL3 2 5E-04 – 1.2E-37 Meloidogyne incognita, M. javanica

Carbohydrate Binding Modules CBM2 5 6.38E-05 - 3.49E-08 Amycolatopsis mediterranei

CBM6 1 2.21E-05 Salinispora tropica

CBM13 5 E-04 - 4.97E-08 Frankia alni

CBM14 a 5 E-04 - 1.98E-13 C. elegans , Drosophila melanogaster

CBM20 2 9.00E-05 - 7.14E-07 Micromonas sp.

CBM33 a 10 9.60E-5 - 9.28E-31 Spodoptera litura

CBM48 a 3 2.14E-07 - 1.71E-08 Chloroflexus aggregans, C. elegans

CBM51 1 6.16E-06 Streptomyces griseus

Page 40: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

CAZy, Carbohydrate active enzymes

aIndicates the presence of a signal peptide without a trans-membrane helix.

Page 41: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Table 5. The 20 most conserved genes between the Pratylenchus thornei transcriptome and

Caenarhabditis elegans.

Genbank

Accession No

Genebank

Protein ID

Gene E-value

E41472

CE43278

CE05449

CE09197

CE21713

CE06519

CE30306

CE43299

CE13150

CE17477

CE06184

CE02626

CE31351

CE32946

CE42002

CE12368

CE39951

CE40348

CE23997

CE20226

Y47D3A.26

W07E11.1

C47E12.5a

F07A5.7a

Y39E4B.1

T27F2.1

T25G3.3

F48F7.1b

T04C12.5

C27B7.4

K12D12.1

F10G7.2

F56A11.1

W02B3.2

F28B3.7a

M03F4.7a

M01G5.5

C02F4.2c

T21E12.4

Y37D8A.23a

smc-3, structural maintenance of chromosomes

glutamate synthase

uba-1, Ubiquitin activating enzyme

unc-15, paramyosin

abce-1, ABC transporter, class E

skp-1, transcriptional cofactor

Required for viability, growth, and fertility

alg-1, affects developmental timing

act-2, actin 2

rad-26, radiation sensitive abnormal

top-2, topoisomerase

tsn-1, component of the RNA-induced silencing complex

gex-2, tissue morphogenesis and cell migrations

grk-2, serine/threonine protein kinase

him-1, structural maintenance of chromosomes

calu-1, calcium binding protein

snf-6, sodium:neurotransmitter symporter

tax-6, calcineurin A

dhc-1, cytoplasmic dynein heavy chain

unc-25, glutamic acid decarboxylase (GAD)

1.94E-175

3.98E-162

4.73E-161

1.56E-149

4.50E-134

4.72E-134

1.57E-131

1.64E-127

2.07E-126

2.93E-126

1.49E-125

2.53E-124

4.76E-121

7.31E-119

9.93E-116

2.12E-113

3.68E-113

1.02E-111

2.19E-111

5.50E-111

Page 42: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Transcriptome Assembly

Roche 454 GS FLX

Dataset Lane 1 – 390,417

Dataset Lane 2 – 396,858

CAP3

Lane 1 + Lane 2

32,897

Annotations

Parasitism genes, signal peptides

and CAZy (34,312 contigs used)

NCBI BLAST searches

eg. C. elegans

Plant parasitic nematodes

Reassembly Step 2

Contig Assembly Step 1

CAP3

34,312

MIRA

Lane 1 + Lane 2

43,569

Step 3 Selection

Common Contigs

6,989

Neuropeptides C. elegans RNAi

genes

Gene Ontology (GO) terms and KEGG (EC)

Figure 1

Page 43: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Fig. 2

Read length Dataset L1 Dataset L2

0-20 0.00 0.00

20-40 0.34 0.40

40-60 27.90 30.67

60-80 43.09 47.70

80-100 49.05 54.03

100-120 51.29 56.24

120-140 51.38 55.67

140-160 53.55 55.88

160-180 55.07 57.74

180-200 60.98 62.07

200-220 70.77 71.85

220-240 82.75 83.29

240-260 96.66 93.91

260-280 100.00 100.00

280-300 99.85 95.26

300-320 91.91 88.86

320-340 82.06 80.71

340-360 70.71 68.94

360-380 59.82 59.52

380-400 50.10 49.33

400-420 41.43 39.93

420-440 32.43 32.26

440-460 27.21 26.74

460-480 22.41 21.83

480-500 19.53 19.32

500-520 13.78 13.62

520-540 7.22 7.00

540-560 2.83 2.52

560-580 0.75 0.61

580-600 0.30 0.18

600-620 0.06 0.03

620-640 0.03 0.00

640-660 0.02 0.01

660-680 0.00 0.01

680-700 0.00 0.00

0.00

20.00

40.00

60.00

80.00

100.00

120.00

0-2

0

20-4

0

40-6

0

60-8

0

80-1

00

100-1

20

120-1

40

140-1

60

160-1

80

180-2

00

200-2

20

220-2

40

240-2

60

260-2

80

280-3

00

300-3

20

320-3

40

340-3

60

Perc

en

tag

e o

f b

ases

Read Length (bp)

Dataset L1 Dataset L2

Page 44: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Fig. 3

Size Range (bp) CAP Dataset L1 MIRA Dataset L1 CAP Dataset L2 MIRA Dataset L2 Reassembled Data

0-50 20 30 22 37 71

51-99 680 1,156 744 1,296 2,690

100-150 1,204 1,790 1,430 2,002 3,952

151-199 1,576 2,163 1,701 2,404 4,511

200-250 2,377 2,907 2,459 3,089 5,768

251-299 3,430 4,141 3,520 4,179 7,702

300-350 3,936 4,768 3,874 4,737 8,667

351-399 3,670 4,600 3,676 4,686 8,522

400-450 3,339 4,215 3,250 4,073 7,706

451-499 2,754 3,663 2,819 3,614 6,820

500-550 2,202 2,940 2,165 2,866 5,678

551-599 1,617 2,280 1,609 2,275 4,395

600-650 1,317 1,709 1,296 1,778 3,430

651-699 976 1,405 959 1,308 2,634

700-750 751 1,122 756 1,019 2,156

751-799 609 818 562 828 1,652

800-850 509 655 461 635 1,441

851-899 348 490 331 481 1,115

900-950 315 388 276 360 895

951-999 249 331 221 333 729

1000-1050 184 265 177 251 590

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

Nu

mb

er

of

Co

nti

gs

Contig Size Range (bp)

CAP Dataset L1 MIRA Dataset L1 CAP Dataset L2 MIRA Dataset L2 Reassembled Data

Page 45: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Fig. 4

GO Terms Percentage

Other enzyme activity 15.2

ATP binding 13.5

DNA binding 8.0

Ion binding 8.0

Transcription 6.6

Ligase activity 6.1

other binding 6.1

Transporter 6.0

Structural 4.2

Nucleotide binding 3.9

Protein binding 3.8

RNA binding 3.4

Kinase activity 3.3

Nucleic acid binding 3.2

Unspecified molecular function 2.6

Electron transport 2.3

Signal transducer 1.4

Motor 0.6

Hormone 0.4

Enzyme regulator 0.3

Growth 0.1

Nucleus 27.0

Membrane 27.0

Cytoplasm 23.0

Unspecified intracellular 18.0

Vesicle coat 2.0

Extracellular 3.0

Protein metabolism &modifications 19.2

Transport 19.0

Nucleic acid metabolism 13.5

Biosynthesis 5.6

Cell comunication 3.9

Catabolism 2.6

Carbohydrate metabolism 2.4

Lipid metabolism 1.3

Chitin metabolism 0.4

Viral reproduction 0.4

Superoxide metabolic process 0.2

Cell cycle 0.2

0.0

5.0

10.0

15.0

20.0

25.0

30.0

Oth

er

enzym

e a

ctivity

AT

P b

indin

g

DN

A b

indin

g

Ion b

indin

g

Tra

nscription

Lig

ase a

ctivity

oth

er

bin

din

g

Tra

nsport

er

Str

uctu

ral

Nucle

otide b

indin

g

Pro

tein

bin

din

g

RN

A b

indin

g

Kin

ase a

ctivity

Nucle

ic a

cid

bin

din

g

Unspecifie

d m

ole

cula

r fu

nction

Ele

ctr

on t

ransport

Sig

nal tr

ansducer

Moto

r

Horm

one

Enzym

e r

egula

tor

Gro

wth

Perc

en

tag

e

Molecular Function Cellular Component

GO Categories

Page 46: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Meloidogynidae (73,343)

Heteroderidae (47,338)

2,039

Radopholus similis (7,382)

Caenorhabditis elegans (25,030)

Pratylenchus thornei (6,989)

1,218

1,209

1,947

2,014

719,316

45,299 Pratylenchus penetrans and Pratylenchus vulnus (7,728)

6,510

6164

23,016

Figure 5

Pratylenchus coffeae (25,987)

5,504 20,483

Page 47: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Fig. 6

Size (bp) Unannotated Annotated

0-20 8 4

20-40 42 17

40-60 90 63

60-80 142 56

80-100 143 82

100-120 184 102

120-140 183 122

140-160 189 115

160-180 218 126

180-200 199 122

200-220 175 73

220-240 171 87

240-260 129 113

260-280 101 73

280-300 127 85

300-320 110 64

320-340 86 42

340-360 98 67

360-380 75 40

380-400 56 51

400-420 56 26

420-440 45 23

440-460 37 19

460-480 38 27

480-500 24 14

500-520 29 13

520-540 14 9

540-560 23 9

560-580 12 16

580-600 13 6

600-620 18 7

620-640 16 8

640-660 5 4

660-680 9 6

680-700 6 1

700-720 6 3

720-740 6 3

740-760 4 1

760-780 6 5

780-800 8 1

800-820 0 3

820-840 2 1

840-860 2 2

860-880 1 1

880-900 0 1

900-920 0 1

920-940 3 0

940-960 1 1

960-980 1 1

980-1000 2 0

0

50

100

150

200

250

Nu

mb

er

of

Op

en

Read

ing

Fra

mes

Open Reading Frame Size (bp)

Unannotated Annotated

Page 48: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Annotations (contigs)

Analysis of P. thornei transcriptome

Roche 454 GS FLX

Reads L1 – 390,417

Reads L2 – 396,858

CAP3 - 32,897

Reassembly

Contig Assembly

CAP3 - 34,312 contigs

MIRA - 43,569

Selection

Common Contigs - 6,989

Homologues to plant parasitic nematodes

Plant parasitism genes

CAZymes, Neuropeptides

C. elegans RNAi genes

Graphical Abstract

Page 49: MURDOCH RESEARCH REPOSITORY · 36 Pratylenchus vulnus, and 2,940 to contigs of Pratylenchus coffeae. There were 2,014 contigs common to ... (Castillo and Vovlas, 2007). After 65 embryogenesis

Highlights

First known transcriptome of the nematode Pratylenchus thornei using Roche GSFLX

sequencing

787,275 reads assembled into 34,312 contigs using two assemblers, yielding 6,989 contigs

Forty-two percent of the contigs were annotated using public databases

One hundred contigs matched parasitism genes of sedentary/migratory endoparasites

Data provides new information on genes involved in P. thornei-host interaction