1 Real-time tracking of Tomato brown rugose
2 fruit virus (ToBRFV) outbreaks in the
3 Netherlands using Nextstrain
4
5 Bart T.L.H. van de Vossenberg1*, Michael Visser1, Maaike Bruinsma2, Harrie M.S. Koenraadt2,
6 Marcel Westenberg1, Marleen Botermans1
7
8 1. National Reference Centre of plant health, Dutch National Plant Protection Organization,
9 Geertjesweg 15, 6706EA, Wageningen, the Netherlands
10 2. Naktuinbouw, Sotaweg 22, 2371GD, Roelofarendsveen, the Netherlands
11
12 * corresponding author: [email protected]
13
bioRxiv preprint doi: https://doi.org/10.1101/2020.06.02.129395; this version posted June 2, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
14 Abstract
15 Tomato brown rugose fruit virus (ToBRFV) is a Tobamovirus that was first observed in 2014 and
16 2015 on tomato plants in Israel and Jordan causing discolorations and deformation of leaves and
17 fruits. Apart from tomato, damage in pepper fruits has been reported. Since the first description,
18 the virus has been found on all continents except Oceania and Antarctica.
19
20 In October 2019, the Dutch National Plant Protection Organization received a tomato sample
21 suspected to be infected with ToBRFV as part of an official specific survey of tomato fruit growers in
22 the Netherlands carried out from July to October 2019. During the survey, 124 companies were visited
23 and inspected for possible ToBRFV symptoms. Of the 47 samples tested, one sample tested
24 positive with ELISA and test plants, which was verified using real-time RT-PCR and Illumina
25 RNAseq data. A follow-up survey was initiated to determine the extent of ToBRFV presence in the
26 Dutch tomato horticulture and identify possible linkages between ToBRFV genotypes, companies
27 and epidemiological traits. We used Nextstrain to visualize these potential connections.
28
29 ToBRFV isolates found in the Netherlands group into three main genomic clusters, which
30 are hypothesized to represent three sources. No correlation was found between genotypes,
31 companies and epidemiological traits such as rootstock and scion varieties, seed batches and
32 nurseries, and the source(s) of the Dutch outbreak remain unknown.
33
34 This paper describes a Nextstrain build containing ToBRFV genomes up to and including November
35 2019. The NPPO-NL has committed itself to maintain and improve the build. Sharing data with this
36 interactive online tool will enable the regulatory plant virology field to better understand and
37 communicate the diversity and spread of this new virus. Other organizations are encouraged to
38 share data or materials for inclusion in the Nextstrain build, which can be accessed at
39 //40.91.255.14/ToBRFV/20191231.
40
41
42 Introduction
43 Tomato brown rugose fruit virus (ToBRFV) belongs to the genus Tobamovirus and was first
44 described by Salem et al. in 2016 after a finding in tomato (Solanum lycopersicum) plants in
45 Jordan (2). This appeared to be the same virus that had been causing problems since 2014 in Israel on
46 tomato plants showing mosaic patterns on leaves, and occasionally narrowing of leaves and yellow
47 spotted fruits. The tomato Tm-2² R-gene, which confers resistance to Tobacco mosaic virus (TMV)
48 and Tomato mosaic virus (ToMV), offers no protection against ToBRFV. Hence the virus is regarded
49 as a major threat to tomato crops (1, 2). Pepper crops (Capsicum spp.) can also be affected by the
50 virus, producing symptoms similar to those observed on tomato (3). However, outbreaks in pepper
51 have only been reported from Mexico, Jordan and Italy (3-5). Several other plant species, including
52 Petunia spp. and Solanum nigrum could be inoculated under laboratory conditions (1). As with
53 other tobamoviruses, ToBRFV is stable and can easily be transmitted by contact (e.g. farming
54 equipment, hands, clothing and plant-to-plant contact) (6), and propagation material (grafts,
55 cuttings). Seed transmission of ToBRFV is presumed (7), and bumblebees have been reported to
56 serve as a vector for ToBRFV and potentially contribute to the spread of the disease (8).
57
58 ToBRFV has a genome architecture similar to that of other tobamoviruses: a
59 ~6.4 kb monopartite linear single-stranded RNA (ssRNA) encoding four proteins: the small
60 replicase subunit, an RNA-dependent RNA polymerase (RdRp) which is translated through
61 suppression of termination at the end of ORF1, a movement protein (MP), and a coat protein (CP)
62 (2). To date, nine complete ToBRFV genome sequences have been deposited in NCBI GenBank
63 (MK133095, MN167466, MK319944, KX619418, MK165457, MN182533, MK648157, MK133093,
64 KT383474) representing outbreak locations in the Middle East, North America and Europe.
65
66 Following the initial findings in Jordan and Israel, the virus was reported from the State of Palestine
67 (9), Mexico (10), the United States of America (11), Germany (12), and Italy (13) in 2018, and
68 Turkey (14), China (15), the United Kingdom (16), the Netherlands (17), Greece (18), and Spain
69 (19) in 2019. In 2020, ToBRFV outbreaks were reported from France (20) and Egypt (21). In early
70 October 2019, the National Reference Centre for plant health of the Dutch National Plant Protection
71 Organization (NPPO-NL) received a tomato sample as part of an official specific survey of tomato
72 fruit growers, that was suspected to be infected with ToBRFV. Testing of the sample by DAS-ELISA,
73 bio-assay and real-time RT-PCR (22) further strengthened the initial suspicion. The sample was
74 subjected to Illumina RNAseq analysis for final confirmation. Comparison of the complete viral
75 genomes present in the sample with public NCBI accessions verified the presence of ToBRFV.
76 ToBRFV has been regulated under Commission Implementing Decision (EU) 2019/1615 since 1 November
77 2019 (23). Therefore, the initial finding triggered a follow-up survey targeting the nursery and
78 the seed lots of the affected plants as well as intensified specific surveillance by the NPPO-NL to
79 determine the extent of ToBRFV presence in the Dutch tomato horticulture. This resulted in new
80 ToBRFV findings at tomato fruit companies and the enforcement of measures to prevent spread of
81 the virus.
82
83 The aim of the current study was to determine the genomic diversity that was represented within
84 and between the Dutch outbreak locations to identify possible linkages with epidemiological factors
85 such as scion and rootstock varieties. We used Nextstrain (24), a collection of open-source
86 bioinformatic tools, to create an interactive view of the diversity and spread of the virus. Real-time
87 virus genome sequencing and rapid dissemination of epidemiologically relevant results, which are
88 central to the Nextstrain philosophy, have been mainly applied to viruses causing disease in humans,
89 such as Zika and Ebola (25). By sharing the Dutch outbreak sequences in the context of geography
90 and epidemiological traits at an early stage, we aim to empower NPPOs and the research
91 community to better understand and battle this important viral disease that threatens tomato and
92 pepper production worldwide.
93
94 At the time of preparation of this publication, public ToBRFV sequences and sequences generated in
95 this study dating from November 2014 to November 2019 were included in the ToBRFV Nextstrain
96 build. The Dutch NPPO has committed itself to keep adding sequence data and improving the build.
97 This paper also serves as an invitation to other NPPOs to contribute their data to improve the
98 predictive power of the tool. The ToBRFV Nextstrain build can be accessed via
99 //40.91.255.14/ToBRFV/20191231 (last accessed 13 May 2020).
100
101 Material and methods
102 Sampling
103 Following the initial ToBRFV finding, tomato greenhouses were selected based on an assessment
104 including potential linkages to the initial outbreak location and data collected from private
105 laboratories, and were inspected by phytosanitary inspectors of NPPO-NL. Typically, 50 young leaf
106 parts were sampled from symptomatic tomato plants per greenhouse. When no symptomatic plant
107 parts were found, asymptomatic material was sampled. Per company, one or more samples
108 consisting of 50 young leaf parts were taken. In cases where there was a strong suspicion of
109 ToBRFV but no young leaf parts available, additional material was sampled such as tomato roots
110 and/or water from rock wool. In addition to the survey samples taken from Dutch greenhouses,
111 samples from regular import inspections suspected to contain ToBRFV were included in this study.
112
113 Leaf samples were split into two subsamples of 25 leaves each. From each subsample, a leaf disc
114 was obtained using a 6 mm diameter leaf punch. Subsample punches were added to 5 mL GH+
115 buffer (26) (6 M guanidine hydrochloride, 0.2 M sodium acetate pH 5.2, 25 mM EDTA, and 2.5%
116 PVP-10) spiked with Bacopa chlorosis virus (BaCV) to a Cq value in the range of 25-30, which is
117 used as an RNA extraction efficiency control. Punch outs were homogenized with a Homex 6
118 (Bioreba, Switzerland) prior to RNA extraction with the Sbeadex Maxi Plant Kit (LGC Genomics,
119 United Kingdom) on the automated KingFisher Flex 96 platform (ThermoFisher, MA, USA).
120 Extracted RNA was tested directly, or stored at -20 °C until use. Real-time RT-PCR reactions were
121 performed based on the ISHI-VEG protocol for ToBRFV detection in tomato and pepper seeds (22).
122 Reaction mixes contained two primer-probe pairs targeting the ToBRFV genome, and one targeting
123 the BaCV spike-in (Table 1). Reaction mixes consisted of 1x UltraPlex 1-Step ToughMix (Quanta
124 Biosciences, MD, USA), 300 nM of each forward and reverse primer, 200 nM of each probe, and 3
125 µL of RNA template. Molecular grade water was added to reach a final volume of 25 µL. Real-time
126 RT-PCR reactions were performed in a CFX96 thermal cycler (BioRad, CA, USA) using the following
127 conditions: 50°C for 10 min, 95°C for 3 min, followed by 40 cycles of 95°C for 10 sec and 60°C for
128 1 min and plate read at 60°C. When one or both ToBRFV primer-probe combinations produced a Cq
129 value of ≤30, the virus was regarded as detected. When both ToBRFV combinations failed to
130 produce a Cq value ≤30 and the spike-in resulted in a Cq value of ≤32, the virus was regarded as not detected.
131 When both ToBRFV combinations resulted in Cq values >30, and the spike-in resulted in a Cq value
132 of >32, the result was undetermined, and testing was repeated.
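The interpretation rules above can be sketched as a small decision function. This is a minimal illustration, not the authors' software: the function and constant names are hypothetical, and representing "no amplification" as None is an assumption.

```python
from typing import Optional

TOBRFV_CUTOFF = 30.0  # Cq cut-off for the CP and MP assays (from the text)
SPIKE_CUTOFF = 32.0   # Cq cut-off for the BaCV spike-in control (from the text)


def passes(cq: Optional[float], cutoff: float) -> bool:
    """True when a Cq value was obtained and is at or below the cut-off."""
    return cq is not None and cq <= cutoff


def interpret(cp_cq: Optional[float], mp_cq: Optional[float],
              bacv_cq: Optional[float]) -> str:
    """Classify one subsample from its CP, MP and spike-in Cq values."""
    # One or both ToBRFV assays at or below the cut-off: virus detected.
    if passes(cp_cq, TOBRFV_CUTOFF) or passes(mp_cq, TOBRFV_CUTOFF):
        return "detected"
    # Both ToBRFV assays failed, but the extraction control worked.
    if passes(bacv_cq, SPIKE_CUTOFF):
        return "not detected"
    # Both ToBRFV assays and the extraction control failed: repeat the test.
    return "undetermined"
```

For example, interpret(24.1, None, 27.0) returns "detected", whereas interpret(None, None, 35.0) returns "undetermined" and would trigger a repeat test.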
133
134 Table 1. Oligonucleotides used in this study
Name        Sequence 5’ - 3’                          Target
CSP1325-F   CATTTGAAAGTGCATCCGGTTT                    Coat protein (CP) and 3’ tRNA-like end
CSP1325-R   GTACCACGTGTGTTTGCAGACA                    Coat protein (CP) and 3’ tRNA-like end
CSP1325-P   FAM-ATGGTCCTCTGCACCTGCATCTTGAGA-BHQ1      Coat protein (CP) and 3’ tRNA-like end
CaTa28-F    GGTGGTGTCAGTGTCTGTTT                      Movement protein (MP)
CaTa28-R    GCGTCCTTGGTAGTGATGTT                      Movement protein (MP)
CaTa28-P    ATTO532-AGAGAATGGAGAGAGCGGACGAGG-BHQ1     Movement protein (MP)
BaCV-F      CGATGGGAATTCACTTTCGT                      Spike-in control
BaCV-R      AATCCACATCGCACACAAGA                      Spike-in control
BaCV-P      TxR-CAATCCTCACATGATGAGATGCCG-BHQ2         Spike-in control
135
136 Illumina sequencing
137 RNA extracts in which the virus was detected were DNase I treated (Qiagen, Germany) for 15 min
138 at room temperature prior to sequencing. Samples were sent to GenomeScan (Leiden, the
139 Netherlands) for generation of 2 Gb Illumina RNAseq 150PE (paired-end) data per sample. The
140 Ultra II Directional RNA Library Prep Kit for Illumina (New England Biolabs, MA, USA) was used to
141 process the samples according to the protocol "NEBNext Ultra II Directional RNA Library Prep Kit
142 for Illumina". Briefly, rRNA was depleted from total RNA using the rRNA removal kit Ribo-Zero Plant
143 (Illumina, CA, USA). After fragmentation of the rRNA-depleted RNA, cDNA was
144 synthesized and used for ligation with the sequencing adapters and PCR amplification of the
145 resulting product. Quality and yield after sample preparation were measured with a Fragment
146 Analyzer (Agilent, CA, USA) prior to pooling for sequencing on an Illumina NovaSeq (Illumina, CA,
147 USA).
148
149 Verification pipeline
150 RNAseq data were uploaded into CLC Genomics Workbench v11.0.1 (Qiagen, Germany) and run in
151 a custom workflow built for detection of de novo assembled viral contigs (S1 Fig). First, a quality
152 trim (quality limit = 0.05; ambiguous limit = 2) was performed, followed by a de novo assembly
153 (map reads back to contigs = on; length fraction = 0.8; similarity fraction = 0.8, minimum contig
154 length = 200) and consensus sequence extraction (low coverage threshold = 10; remove regions
155 with low coverage = on; post-remove action = join). The de novo assembled contigs were analyzed
156 using the Basic Local Alignment Search Tool (BLAST, maximum alignments per database sequence
157 = 5; maximum E-value = 1e-6, minimum identity = 70%) with a local installation of the NCBI
158 nr/nt database (27). Blast results were split into plant and non-plant hits, which were visualized in
159 Krona (bit score threshold = 25) (28) (S2 Fig). The same pipeline was repeated with 1% of all
160 reads, because de novo assembly of high-coverage contigs can be problematic and result in fragmented
161 assemblies. CLC external applications that were used in the pipeline are deposited on GitHub
162 (https://github.com/NPPO-NL/CLC-External-Applications). Putative ToBRFV sequences were
163 manually extracted from the assemblies, and the reliability of the assembled putative ToBRFV contigs
164 was determined by visually inspecting read mappings. Consensus sequences were annotated in
165 Geneious R11 (Biomatters, New Zealand) based on the publicly available ToBRFV Tom1-Jo genome
166 KT383474 (2) (minimal sequence similarity = 90%) and the Geneious ORF finder tool (genetic code
167 = table 1, minimum size = 200), and submitted to NCBI GenBank under accessions MN882011 to
168 MN882064. The corresponding SRAs were submitted under accessions ERS4296142 to
169 ERS4296201.
170
171 Frequency of major genotypes in the individual samples was determined with the “Basic Variant
172 Calling” tool in CLC Genomics Workbench v11.0.1 (ploidy = 0, minimum coverage = 20, minimum
173 count = 2, minimum frequency = 10%, base quality filter = default) using read-mappings to
174 ToBRFV genome KX619418.
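To make the three numeric thresholds concrete, the sketch below applies them to a single position of a read mapping. It is an illustration of the stated settings only: the function name is hypothetical, and the base quality filtering applied by the CLC tool is deliberately left out.

```python
MIN_COVERAGE = 20      # minimum read depth at the position (from the text)
MIN_COUNT = 2          # minimum number of reads carrying the variant (from the text)
MIN_FREQUENCY = 0.10   # minimum variant frequency, i.e. 10% (from the text)


def is_called(coverage: int, variant_count: int) -> bool:
    """Return True when a variant passes all three numeric thresholds."""
    if coverage < MIN_COVERAGE or variant_count < MIN_COUNT:
        return False
    # Frequency is the share of reads at this position supporting the variant.
    return variant_count / coverage >= MIN_FREQUENCY
```

With these settings a variant supported by 33 of 100 reads is called, whereas one supported by 50 of 1,000 reads (5%) is not.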
175
176 A phylogenetic tree was constructed with the ToBRFV genome sequences generated in this study,
177 together with publicly available ToBRFV sequences. Models for nucleotide substitution of the MAFFT
178 (29) aligned sequences were obtained using the “model testing” tool in CLC Genomics Workbench
179 v11.0.1 which was run with default settings for the Hierarchical likelihood ratio test (hLRT),
180 Bayesian information criterion (BIC), Minimum theoretical information criterion (AIC), and
181 Minimum corrected theoretical information criterion (AICc). The clustering analysis was performed
182 with MrBayes v3.2.6 (30) implemented in Geneious R11 under the GTR model with gamma
183 distribution and estimation of invariable sites (GTR + G + I) with a random starting tree and four
184 Markov chain Monte Carlo chains for 10⁶ generations. Trees were sampled every 200 generations, and
185 the first 10⁵ generations were discarded as burn-in. Remaining trees were combined to generate a
186 50% majority rule consensus tree with posterior probabilities.
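As a quick check on what these settings imply, the arithmetic below counts the trees sampled and retained per run. It assumes a chain length of 10⁶ and a burn-in of 10⁵ generations, and it ignores the starting tree that MrBayes also records; the number of independent runs is not stated in the text.

```python
generations = 10**6   # chain length (from the text)
sample_every = 200    # sampling interval in generations (from the text)
burn_in = 10**5       # generations discarded as burn-in (from the text)

sampled = generations // sample_every   # trees written per run, ignoring the starting tree
discarded = burn_in // sample_every     # sampled trees that fall inside the burn-in
retained = sampled - discarded          # trees contributing to the consensus, per run

print(sampled, discarded, retained)     # 5000 500 4500
```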
187
188 Nextstrain implementation
189 Nextstrain is a bioinformatics pipeline that uses two tools, Augur and Auspice (24). Augur
190 (github.com/Nextstrain/augur) is a bioinformatics toolkit for phylogenetic analysis consisting of a
191 series of modules whose output can subsequently be visualized by Auspice
192 (github.com/Nextstrain/auspice). Using the Augur modules, ToBRFV sequences were aligned with
193 MAFFT (29) and a RAxML clustering (31) was performed. The initial tree was refined with metadata
194 and a time tree was built using TreeTime (32) with optimization of scalar coalescent time. Internal
195 nodes were assigned to their marginally most likely dates, and confidence intervals for node dates
196 were estimated. Amino acid changes were determined based on the functionally annotated Tom1-
197 Jo genome (KT383474 = NC_028478). Additional features had to be added to the NCBI Genbank
198 file (.gb) to enable the use of this sequence as reference, i.e. gene names and translation tables for
199 the four CDS annotations. Next, Augur output was exported and visualized in Auspice. The ToBRFV
200 Nextstrain build is deposited on GitHub (https://github.com/NPPO-NL/nextstrain-ToBRFV).
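The sequence of Augur steps described above can be sketched as a list of command lines. This is an approximation under stated assumptions, not the published build: all file names are hypothetical, the flags are those of recent Augur releases and may differ from the version used in 2019 (for example, the export step), and the ancestral reconstruction step is not named in the text but is included because the translation step needs ancestral sequences. The workflow in the GitHub repository remains the authoritative description.

```python
import shlex

# Hypothetical file names, chosen for illustration only.
SEQUENCES = "data/tobrfv_genomes.fasta"
METADATA = "data/metadata.tsv"
REFERENCE = "config/tom1-jo_reference.gb"  # annotated KT383474 GenBank file
ALIGNED = "results/aligned.fasta"
TREE_RAW = "results/tree_raw.nwk"
TREE = "results/tree.nwk"
BRANCH_LENGTHS = "results/branch_lengths.json"
NT_MUTS = "results/nt_muts.json"
AA_MUTS = "results/aa_muts.json"


def build_commands():
    """Return the Augur steps as argument lists, in execution order."""
    return [
        # Alignment with MAFFT (Augur's default aligner).
        ["augur", "align", "--sequences", SEQUENCES, "--output", ALIGNED],
        # Initial tree with RAxML.
        ["augur", "tree", "--alignment", ALIGNED, "--method", "raxml",
         "--output", TREE_RAW],
        # Time tree via TreeTime: optimized coalescent time scale,
        # marginal date inference and confidence intervals on node dates.
        ["augur", "refine", "--tree", TREE_RAW, "--alignment", ALIGNED,
         "--metadata", METADATA, "--timetree", "--coalescent", "opt",
         "--date-inference", "marginal", "--date-confidence",
         "--output-tree", TREE, "--output-node-data", BRANCH_LENGTHS],
        # Ancestral sequences (assumed step, required by translate).
        ["augur", "ancestral", "--tree", TREE, "--alignment", ALIGNED,
         "--output-node-data", NT_MUTS],
        # Amino acid changes relative to the annotated reference.
        ["augur", "translate", "--tree", TREE, "--ancestral-sequences", NT_MUTS,
         "--reference-sequence", REFERENCE, "--output-node-data", AA_MUTS],
        # Export for visualization in Auspice.
        ["augur", "export", "v2", "--tree", TREE, "--metadata", METADATA,
         "--node-data", BRANCH_LENGTHS, NT_MUTS, AA_MUTS,
         "--output", "auspice/ToBRFV.json"],
    ]


if __name__ == "__main__":
    # Print the commands rather than run them, so the sketch has no side effects.
    for command in build_commands():
        print(shlex.join(command))
```

Printing the commands instead of executing them keeps the sketch independent of a local Augur installation; each printed line could be run in a shell once real input files are in place.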
201
202 Results
203 Sampling and ToBRFV detection
204 Following the initial specific survey carried out amongst tomato fruit growers from July to October
205 2019, greenhouses of an additional 68 companies had been inspected by November 2019.
206 Symptoms were occasionally observed, especially young leaves showing chlorotic mottle, mosaic or
207 leaf narrowing (Fig 1). In three cases growers reported delayed ripening or brown necrotic spots on
208 green fruits. From the visited greenhouses a total of 135 samples were taken and split into two or
209 more subsamples that were individually analyzed with real-time RT-PCR. In addition, two fruit
210 samples (39070022 and 39070030) from regular import inspections (which were split into four
211 subsamples) were included for testing. Of these, 70 subsamples representing 21 companies and
212 the two intercepted samples produced positive real-time RT-PCR results in one or both assays with
213 Cq values ranging from 5.0 to 30.1 in the CP assay and from 4.6 to 31.2 in the MP assay. An
214 overview of samples that tested positive with the real-time RT-PCRs is provided in S1 Table.
215
216 Figure 1. Symptoms observed during phytosanitary inspections. A Chlorotic mosaic on
217 young leaves of ToBRFV and PepMV infected tomato plants; B Chlorotic mottle on leaf bases of
218 young leaves; and C narrowing of and chlorotic mottle on young leaves
219
220 ToBRFV verification using Illumina RNAseq
221 Plant rRNA-depleted Illumina RNAseq data were generated for 70 RNA extracts representing 35
222 samples taken at phytosanitary inspections of 21 companies that were growing tomato (or, in two cases, had
223 grown tomato in the past) in the Netherlands, and 2 samples taken at regular import
224 inspections of tomato fruits. On average, 3.7 Gb of sequence data were generated per sample, and
225 de novo assembly resulted in an average of 19,645 scaffolds per dataset when using all data.
226 When the analysis with all data failed to produce a single contiguous sequence because too
227 many reads were used in the assembly, this was achieved with the 1% sampled analysis.
228 Contigs producing blast hits for ToBRFV were obtained from 52 datasets representing 17
229 companies. Visual inspection of the de novo assemblies and structural annotation of these contigs
230 verified the identity of the virus in all 52 cases. In two datasets obtained from sample 39941641,
231 two different genotypes were assembled resulting in a total of 54 assembled and verified ToBRFV
232 genomes. The obtained genome size ranged from 6,301 to 6,397 nt with average read coverage
233 ranging from 13 to 579,611x (median coverage = 54,330x). Subsample 39941596_B was
234 sequenced twice (once with and once without DNase I treatment) in independent sequencing runs
235 as a control. The obtained ToBRFV genome sequences were identical, and both sequences represent
236 a unique genotype in the overall dataset.
237
238 Subsamples in which ToBRFV could be verified typically had Cq values for the CP and MP assays
239 lower than 25. In only two cases could the complete ToBRFV genome be constructed from samples
240 with Cq values higher than 25. In some of the samples with Cq values higher than 25, low numbers
241 of reads mapping to ToBRFV could be obtained, but these were insufficient to verify the
242 presence of ToBRFV in the sample.
243
244 Apart from ToBRFV hits, all RNAseq datasets obtained from tomato plants also produced Pepino mosaic
245 virus (PepMV) hits.
246
247 Genomic variation
248 The obtained ToBRFV sequences were highly similar with 99.3% to 100% identical sites (up to 43
249 single nucleotide polymorphisms: SNPs). For the 160 amino acid (aa) coat protein, no non-
250 synonymous nucleotide changes were observed among the Dutch outbreak sequences, whereas the
251 1,621 aa RdRp protein, which contains the small replicase subunit, showed 100% to 99.8%
252 similarity at the protein level (up to 6 aa changes). The 267 aa movement protein displayed the
253 highest relative level of diversity at the protein level, with 100% to 98.1% aa similarity (up to 5 aa
254 changes).
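The relation between difference counts and the reported percentages can be checked with simple arithmetic. The sketch below assumes that identity is computed as the share of unchanged positions over the full length, using the 6,404 nt alignment length given for Fig 2 and the 267 aa movement protein; the function name is hypothetical.

```python
def percent_identity(differences: int, length: int) -> float:
    """Percentage of positions that are identical, given a count of differences."""
    return 100.0 * (1 - differences / length)


genome = percent_identity(43, 6404)           # up to 43 SNPs in a 6,404 nt alignment
movement_protein = percent_identity(5, 267)   # up to 5 aa changes in a 267 aa protein

print(f"{genome:.1f}%")            # 99.3%
print(f"{movement_protein:.1f}%")  # 98.1%
```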
255
256 Bayesian inference of phylogeny results in clustering of the ToBRFV genomes sequenced from the
257 phytosanitary inspections of greenhouses growing tomato in the Netherlands in three main clusters
258 (Fig 2: red, blue and green clusters). A fourth cluster was formed with sequences obtained from
259 tomato fruits from one lot intercepted during import from Egypt (Fig 2: orange cluster). Clustering
260 obtained with RAxML, which is part of the Nextstrain build, results in the same grouping as the
261 Bayesian analysis for the viral strains obtained from the Dutch outbreak locations, but some
262 rearrangements can be seen between clusters (S3 Fig).
263
264 Figure 2. Bayesian inference of phylogeny based on a 6,404 nt alignment of 63 complete
265 ToBRFV genomes. Viral strains from the Dutch outbreak are found in three separate clades which
266 have been colored red, blue and green. Four ToBRFV genome sequences generated from material
267 intercepted during an import inspection from Egypt form a separate cluster (orange). Bayesian
268 posterior probabilities (PP) are displayed at the branch nodes. The scale bar indicates the number
269 of substitutions per site.
270
271 In twenty datasets, one dominant ToBRFV genotype was detected (major genotype ≥98%), but in
272 the remaining 34 datasets intermediate-frequency SNPs were observed, suggesting the presence of multiple
273 genotypes within the sample (Fig 3). The within-sample frequency of the major genotype ranged from
274 54.5% to 89.0% in the isolates with mixed genotypes.
275
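The split between datasets with one dominant genotype and datasets with mixed genotypes amounts to a single threshold on the major genotype frequency. The sketch below applies the 98% threshold from the text; the dataset names are hypothetical, and the example frequencies are illustrative values that include the reported range limits for mixed samples (54.5% and 89.0%).

```python
SINGLE_GENOTYPE_THRESHOLD = 98.0  # percent, from the text


def classify(major_genotype_frequency: float) -> str:
    """Label a dataset by the frequency (in percent) of its major genotype."""
    if major_genotype_frequency >= SINGLE_GENOTYPE_THRESHOLD:
        return "single dominant genotype"
    return "mixed genotypes"


# Hypothetical dataset names with illustrative frequencies.
examples = {"dataset_A": 99.7, "dataset_B": 98.0, "dataset_C": 89.0, "dataset_D": 54.5}
for name, frequency in examples.items():
    print(name, frequency, classify(frequency))
```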
276 Figure 3. Within sample diversity of ToBRFV genotypes. A. Frequency (y-axis) of major
277 ToBRFV genotypes in datasets (x-axis) generated in this study. Within sample frequencies of major
278 genotypes are sorted low to high. B. Detail of RNAseq read mappings to the small replicase subunit
279 of KX619418 for ToBRFV samples 38886230_B, 38886177_B, and 39070153_B. For each sample
280 the total number of mapped reads is shown as well as the maximum read coverage. The left SNP
281 (arrow) represents a single genotype (>99.7%) that is present in all three samples. The right SNP
282 reveals the presence of a minor genotype (32.8%) in sample 39070153_B.
283
284 Nextstrain build
285 The ToBRFV Nextstrain build contains 63 complete ToBRFV genomes generated from material
286 sampled between October 2014 and November 2019. Information in the build is presented in three
287 main panels: clustering of genomic diversity, geographical origin of the samples, and diversity
288 relative to the ToBRFV Tom1-Jo genome (KT383474 = NC_028478) (Fig 4).
289
290 The associated metadata included in the build allows users to color the nodes in the tree according
291 to the host plant, and the (anonymized) rootstock and scion varieties from which the sequence was
292 obtained. In addition, the within-sample genomic diversity is a trait that can be visualized. Internal
293 node colors indicate the predicted ancestral state of a given trait, and the confidence of that state
294 is conveyed by saturation of the color of the internal node. The cladogram can be shown in
295 different styles such as rectangular, radial and unrooted. The branch lengths of the tree can be
296 shown based on divergence or as a function of time. Based on the information provided in the build,
297 Nextstrain determines the most likely transmission events, which can be animated from the
298 webpage. The genotypes represented in the tree are plotted on a map, and users can set different
299 levels of geographical resolution, i.e. continent, country, state and municipality (when this
300 information is available). This allows simultaneous interrogation of phylogenetic and geographic
301 relationships, with additional relevant metadata. Users can download metadata and tree-files from
302 the webpage, and can create screenshots of their views.
303
304 Figure 4. Genomic epidemiology of ToBRFV from October 2014 to November 2019. The
305 Nextstrain build consists of three main linked panels: a phylogenetic tree (A), geographical
306 distribution and most likely transmission events (B), and genomic diversity represented in the
307 dataset using the ToBRFV Tom1-Jo genome (KT383474 = NC_028478) as reference with the small
308 replicase subunit annotated in green, the RNA-dependent RNA polymerase in orange, the
309 movement protein in blue and the coat protein in yellow.
310
311 Discussion
312 In October 2019, the Dutch NPPO received a tomato sample that was suspected to be infected with
313 ToBRFV. The sample formed part of an official specific survey of tomato fruit growers in the
314 Netherlands carried out in the period July – October 2019. During the official survey in total 124
315 companies were visited and inspected for possible symptoms of ToBRFV. In total 47 samples were
316 tested, of which one tested positive. Following initial testing with ELISA, indicator plants
317 and real-time RT-PCR, presence of ToBRFV was verified with analysis of plant rRNA depleted
318 Illumina RNA sequence data. Since little was known about the intraspecific variation and the
319 validation status of the diagnostic tests, it was decided to use RNAseq analysis for ToBRFV
320 verification in the follow-up survey and intensified specific surveillance by the NPPO-NL.
321
322 Several primer pairs have been described for conventional RT-PCR, yielding amplicons that can be Sanger sequenced
323 for ToBRFV verification. As a result, some authors have sequenced either the 5’ end of the small
324 replicase subunit, the 3’ end of the small replicase subunit, the 3’ end of RdRp, the partial MP and
325 CP, or a combination of these loci (S4 Fig). This inevitably results in incomplete datasets. In
326 contrast to Sanger sequencing (33), RNAseq analysis with plant rRNA depleted Illumina data does
327 not rely on the specificity of primers used to amplify the partial virus genome. Furthermore,
328 instead of sequencing only a small part of the genome, Illumina RNAseq data allows the de novo
329 assembly of the (near) complete ToBRFV genome. With the complete genome, all genetic
330 information can be used to determine potential linkages between genotypes and epidemiological
331 traits.
332
333 The real-time RT-PCRs used for the detection of ToBRFV have been validated for detection of the
334 virus in tomato and pepper seeds (available on request from Naktuinbouw). When implementing
335 the test for tomato leaves, in some cases late Cq values were obtained with the healthy tomato
336 matrix (data not shown). To compensate for these late Cq values, the Cq cut-off value was lowered
337 from 32 to 30. Samples with Cq values lower than 25 could be verified with Illumina RNAseq data.
338 When Cq-values between 25 and 30 were obtained, ToBRFV could be verified in a single case only.
339 Seventy percent of the samples taken resulted in Cq values lower than 25. When ToBRFV
340 suspected samples could not be verified, companies were revisited and additional samples were
341 taken. Overall there is a good correlation between the Cq values and normalized read coverage in
342 the samples analyzed (R² for the CP assay = 85.3% and R² for the MP assay = 76.4%; S5 Fig).
343
344 Visual inspection of the read mappings revealed that multiple genotypes existed within some
345 samples. In only two subsamples was the variation between the genotypes within a single subsample
346 high enough to produce two separate contigs from the de novo assembly. In other cases, the
347 minor genotypes were not manually reconstructed from the assemblies as they could represent one
348 or more minor genotypes. When looking at the within-sample diversity, a gap of almost
349 10% between samples with one clear major genotype and samples with mixed genotypes can be
350 seen (Fig 3). Since reads were mapped to reference KX619418, variants mapping between 89.0
351 and 98.0% would have been detected if they were present. ToBRFV populations with only a single
352 major genotype could have undergone a recent genetic bottleneck event. However, no correlation
353 between within-sample genomic diversity and genotype or any of the epidemiological traits analyzed
354 was observed.
355
356 Epidemiology of Dutch ToBRFV outbreak
357 The ToBRFV Nextstrain build contains 63 complete ToBRFV genomes, 54 of which have been
358 generated in this study. The genomic sequences generated from the Dutch outbreak sites group in
359 three main clusters (Fig 2: red, blue, and green clusters). The TimeTree analysis estimates the
360 divergence of the red and blue clusters well before the first finding of ToBRFV in the Netherlands
361 (inferred date = May 2014, confidence interval = March 1980 – October 2014), and the three
362 clusters are hypothesized to represent three different sources. The blue and green clusters consist
363 of sequences that so far were only found in the Netherlands. One sequence obtained from an
364 outbreak in the United Kingdom groups in the red cluster, suggesting that these are potentially
365 linked. The genome sequences from the Dutch outbreak locations display a broad range of
366 genomic diversity. However, genotypes found at outbreak locations in the Middle East, North
367 America, Italy and Germany were not found in Dutch greenhouses. Furthermore, a genotype was
368 found in a sample obtained from an import inspection from Egypt which had not been found before
369 at any of the outbreak locations. Diversity is expected to be highest at the center of origin, and
370 increased sampling at outbreak locations and presumed origin will aid further understanding of
371 ToBRFV diversity. This will eventually allow the determination of the origin of this viral species.
372
373 In all but one of the inspected companies, ToBRFV genotypes obtained from the samples taken
374 belonged to one of the main clusters. In a single company, ToBRFV genotypes from both the red
375 and blue cluster were obtained. These two different genotypes were obtained from separate
376 greenhouses of the same company, and it is likely that they were introduced independently.
377
378 A number of epidemiological traits were analyzed in the context of genomic diversity to identify the
379 possible source(s) of the ToBRFV outbreak in the Netherlands. These included rootstock and scion
380 varieties, young plant providers, seed batches and seed providers. However, no linkages were
381 identified between these epidemiological traits and companies or genotypes. Given the distribution
382 of epidemiological traits among genotypes and outbreak locations, it is likely that the virus had
383 been present in the Netherlands for some time before the first official samples were obtained and
384 tested. Also, establishing the origin of Tobamoviruses could prove to be too challenging as they are
385 easily transmitted and can spread rapidly over short periods of time when they remain unnoticed.
386 Companies that have been found infected must take strict hygiene measures to prevent spread of
387 the virus to other companies and to eradicate the virus at the end of the cropping season.
388
389 The ToBRFV Nextstrain build currently holds ToBRFV genomes from various outbreak locations in
390 the Middle East, North America and Europe. However, there is a strong bias towards genotypes
391 obtained from the Dutch outbreak locations. Sampling bias and lack of data can hamper the
392 determination of reliable transmission links. The predictive power of the tool will improve as
393 more genomes are added. Therefore, we encourage other organizations to share data or
394 biological materials together with relevant metadata in order to improve the build. Such a
395 community effort will contribute to source attribution and to a better understanding of the diversity
396 and spread of ToBRFV worldwide. The ToBRFV Nextstrain build will be maintained by NPPO-NL.
397
398 Acknowledgements
399 Many thanks to the NPPO-NL staff members Marieke van Lent, Leontine Colon, Bram Lokker, Isabel
400 Stagg, Stephanie Rensen, Lucas van der Gouw and Eveline Metz for collecting metadata,
401 running tests and providing technical support with the generation and analysis of NGS data. The
402 support by Naktuinbouw staff members Agata Jodlowska, André van Vliet, and Ruud Barnhoorn in
403 performing the first-line detection tests is gratefully acknowledged. Thank you Annelien Roenhorst
404 and Martijn Schenk (NPPO-NL) for critically reviewing the manuscript before submission.
405
406 References
407 1. Luria N, Smith E, Reingold V, Bekelman I, Lapidot M, Levin I, et al. A New Israeli Tobamovirus
408 Isolate Infects Tomato Plants Harboring Tm-2² Resistance Genes. PloS one. 2017;12(1):e0170429.
409 2. Salem N, Mansour A, Ciuffo M, Falk BW, Turina M. A new tobamovirus infecting tomato
410 crops in Jordan. Archives of Virology. 2016;161(2):503-6.
411 3. Salem NM, Cao M, Odeh S, Turina M, Tahzima R. First report of tobacco mild green mosaic
412 virus and tomato brown rugose fruit virus infecting Capsicum annuum in Jordan. Plant Disease. 2019.
413 4. Cambron-Crisantos J, Rodríguez-Mendoza J, Valencia-Luna J, Alcasio-Rangel S, Garcia-Avila C,
414 López-Buenfil J, et al. First report of Tomato brown rugose fruit virus (ToBRFV) in Michoacan, Mexico.
415 2018;37:185-92.
416 5. Panno S, Caruso AG, Blanco G, Davino S. First report of Tomato brown rugose fruit virus
417 infecting sweet pepper in Italy.
418 New Disease Reports. 2020;41(20).
419 6. Broadbent L, Fletcher JT. The epidemiology of tomato mosaic. Annals of Applied Biology.
420 1963;52(2):233-41.
421 7. OEPP/EPPO. EPPO Alert List – Tomato brown rugose fruit virus (Tobamovirus - ToBRFV) 2019
422 [14 Jan 2020]. Available from:
423 https://www.eppo.int/ACTIVITIES/plant_quarantine/alert_list_viruses/tomato_brown_rugose_fruit_
424 virus.
425 8. Levitzky N, Smith E, Lachman O, Luria N, Mizrahi Y, Bakelman H, et al. The bumblebee
426 Bombus terrestris carries a primary inoculum of Tomato brown rugose fruit virus contributing to
427 disease spread in tomatoes. PloS one. 2019;14(1):e0210871.
428 9. Alkowni R, Alabdallah O, Fadda Z. Molecular identification of tomato brown rugose fruit virus
429 in tomato in Palestine. Journal of Plant Pathology. 2019;101(3):719-23.
430 10. Camacho-Beltrán E, Pérez-Villarreal A, Leyva-López NE, Rodríguez-Negrete EA, Ceniceros-
431 Ojeda EA, Méndez-Lozano J. Occurrence of Tomato brown rugose fruit virus Infecting Tomato Crops
432 in Mexico. Plant Disease. 2019;103(6):1440-.
433 11. Ling K-S, Tian T, Gurung S, Salati R, Gilliard A. First Report of Tomato Brown Rugose Fruit
434 Virus Infecting Greenhouse Tomato in the United States. Plant Disease. 2019;103(6):1439.
435 12. Menzel W, Knierim D, Winter S, Hamacher J, Heupel M. First report of Tomato brown rugose
436 fruit virus infecting tomato in Germany. New Disease Reports. 2019;39:1.
437 13. Panno S, Caruso AG, Davino S. First Report of Tomato Brown Rugose Fruit Virus on Tomato
438 Crops in Italy. Plant Disease. 2019;103(6):1443-.
439 14. Fidan H, Sarikaya P, Calis O. First report of Tomato brown rugose fruit virus on tomato in
440 Turkey. New Disease Reports. 2019;39:18.
441 15. Yan ZY, Ma HY, Han SL, Geng C, Tian YP, Li XD. First Report of Tomato brown rugose fruit
442 virus Infecting Tomato in China. Plant Disease. 2019;103(11):2973-.
443 16. Skelton A, Buxton-Kirk A, Ward R, Harju V, Frew L, Fowkes A, et al. First report of Tomato
444 brown rugose fruit virus in tomato in the United Kingdom. New Disease Reports. 2019;40(12).
445 17. OEPP/EPPO. First report of Tomato brown rugose fruit virus in the Netherlands. EPPO
446 reporting service. 2019;10(2019/209).
447 18. OEPP/EPPO. First report of Tomato brown rugose fruit virus in Greece. EPPO reporting
448 service. 2019;10(2019/210).
449 19. OEPP/EPPO. First report of Tomato brown rugose fruit virus in Spain. EPPO reporting service.
450 2019;11(2019/238).
451 20. OEPP/EPPO. First report of tomato brown rugose fruit virus in France. EPPO reporting
452 service. 2020;2(2020/037).
453 21. Amer MA, Mahmoud SY. First report of Tomato brown rugose fruit virus on tomato in Egypt.
454 New Disease Reports. 2020;41(24).
455 22. ISF. Detection of infectious Tomato brown rugose fruit virus (ToBRFV) in Tomato and Pepper
456 Seeds. ISHI-veg2019. p. 11.
457 23. Anonymous. Establishing emergency measures to prevent the introduction into and the
458 spread within the Union of Tomato brown rugose fruit virus (ToBRFV). Official Journal of the
459 European Union. 2019;250:91-4.
460 24. Hadfield J, Megill C, Bell SM, Huddleston J, Potter B, Callender C, et al. Nextstrain: real-time
461 tracking of pathogen evolution. Bioinformatics. 2018;34(23):4121-3.
462 25. Gardy JL, Loman NJ. Towards a genomics-informed, real-time, global pathogen surveillance
463 system. Nature reviews Genetics. 2018;19(1):9-20.
464 26. Botermans M, van de Vossenberg BT, Verhoeven JT, Roenhorst JW, Hooftman M, Dekter R,
465 et al. Development and validation of a real-time RT-PCR assay for generic detection of pospiviroids.
466 Journal of virological methods. 2013;187(1):43-50.
467 27. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool.
468 Journal of molecular biology. 1990;215(3):403-10.
469 28. Ondov BD, Bergman NH, Phillippy AM. Interactive metagenomic visualization in a Web
470 browser. BMC Bioinformatics. 2011;12(1):385.
471 29. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7:
472 improvements in performance and usability. Molecular biology and evolution. 2013;30(4):772-80.
473 30. Huelsenbeck JP, Ronquist F. MRBAYES: Bayesian inference of phylogenetic trees.
474 Bioinformatics. 2001;17(8):754-5.
475 31. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large
476 phylogenies. Bioinformatics. 2014;30(9):1312-3.
477 32. Sagulenko P, Puller V, Neher RA. TreeTime: Maximum-likelihood phylodynamic analysis. Virus
478 Evol. 2018;4(1):vex042-vex.
479 33. Rodríguez-Mendoza J, Garcia-Avila C, López-Buenfil J, Araujo-Ruiz K, Quezada A, Cambrón-
480 Crisantos J, et al. Identification of Tomato brown rugose fruit virus by RT-PCR from a coding region or
481 replicase. 2019;37:346-56.
482
483 Supporting information
484 S1 Fig. Visualization of the CLC Genomics Workbench pipeline used to identify putative
485 ToBRFV contigs. The pipeline combines a quality trim for input reads, de novo assembly, blast
486 based detection and visualization of blast output in Krona. The pipeline runs on all input data, and
487 on a random sample of 1% of all reads. Blue process steps indicate output, and custom plugins
488 created for the pipeline are named CEA (CLC external application).
489 S2 Fig. Example of Krona visualization of blast output for sample 38886230_A. When
490 using all data for the de novo assembly, and after blast-based filtering of plant contigs, 1266
491 contigs remained. The majority of these sequences (99%) did not produce a blast hit, but 3 viral
492 blast hits were obtained (left panel). When selecting the viral hits, the blast-based identity is
493 shown (right panel). In this example, one ToBRFV hit and two PepMV hits are obtained. The
494 sequence producing the ToBRFV hit is obtained via a link, which is then further analyzed to
495 determine the presence of the virus in the sample.
496 S3 Fig. Comparison of the Nextstrain RAxML tree (left) and the Bayesian inference of
497 phylogeny (right) of 63 ToBRFV genomes included in this study. Links are drawn for the
498 isolates generated in this study. The Nextstrain tree is colored based on the country of origin,
499 whereas the Bayesian tree is colored based on the main clusters identified for the viral genomes
500 sequenced in this study. Grouping within the main clusters is identical between the two analyses,
501 but some groups are placed at different positions in the overall phylogeny (e.g. the Egyptian
502 sequences: orange cluster).
503 S4 Fig. Publicly available ToBRFV sequences mapped to ToBRFV genome NC_028478 (=
504 KT383474). Coding sequences are annotated in yellow, and sequences identical to the reference
505 sequence are shown in grey. Differences relative to the reference are highlighted in black. Apart
506 from the nine complete genome sequences, 37 Sanger sequences have been submitted to NCBI
507 covering the 5’ end of the small replicase subunit, the 3’ end of the small replicase subunit, the 3’ end
508 of RdRp, and the partial MP and CP.
509 S5 Fig. Correlation between Cq values and normalized read mapping. Real-time
510 RT-PCR Cq values of the CP (top) and MP (bottom) assays are displayed on the x-axis and the
511 normalized average read-coverage is shown on the y-axis. To allow comparison between the
512 different datasets, the average read coverage of the ToBRFV genome for a given sample was
513 multiplied by the fraction of reads generated for that sample relative to the mean of reads
514 generated for all samples.
515 S1 Table. Details of real-time RT-PCR detection and RNAseq-based verification tests
516 performed on ToBRFV samples included in this study.
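
The coverage normalization described in the S5 Fig legend can be sketched as follows. This is a minimal illustration that reads the legend literally (average coverage multiplied by the sample's read count divided by the mean read count over all samples). The function name, sample names and numbers are hypothetical and are not taken from the study data.

```python
from statistics import mean
from typing import Dict, Mapping


def normalize_coverage(avg_coverage: Mapping[str, float],
                       total_reads: Mapping[str, int]) -> Dict[str, float]:
    """Scale per-sample average coverage by relative sequencing depth.

    Literal reading of the S5 Fig legend (an assumption of this sketch):
    normalized = average coverage * (reads in sample / mean reads per sample).
    """
    mean_reads = mean(list(total_reads.values()))
    return {
        sample: coverage * total_reads[sample] / mean_reads
        for sample, coverage in avg_coverage.items()
    }


if __name__ == "__main__":
    # Made-up values, for illustration only.
    coverage = {"sample_A": 100.0, "sample_B": 200.0}
    reads = {"sample_A": 1_000_000, "sample_B": 3_000_000}
    print(normalize_coverage(coverage, reads))
    # {'sample_A': 50.0, 'sample_B': 300.0}
```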
517