Metagenomic analysis of viruses associated to maize lethal ... · RESEARCH Open Access Metagenomic...

Metagenomic analysis of viruses associated to maize lethal necrosis Hernan Garcia-Ruiz Assistant Professor October 15, 2018


Page 1: Metagenomic analysis of viruses associated to maize lethal ... · RESEARCH Open Access Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya Mwathi Jane Wamaitha1*,

Metagenomic analysis of viruses associated to maize lethal necrosis

Hernan Garcia-Ruiz
Assistant Professor

October 15, 2018

Page 2:

Acknowledgements

Lab Members:

Dr. Deepti Negam
Dr. Pedro Souza
Natalie Holste
Christian Dukunde
Nicole Bacheller
Stella Uiterwaal
Aaron Knapp
Patricia Harte-Maxwell

USDA Foreign Agriculture Service

Page 6:

SUMMARY

Maize lethal necrosis was first described in Kansas/Nebraska in 1978
Causal agents are
Maize chlorotic mottle virus
Sugarcane mosaic virus

Maize lethal necrosis was first detected in Africa in 2012

Maize lethal necrosis threatens food security in sub-Saharan Africa

Diagnosis based on molecular approaches was erratic
Antibodies and RT-PCR fail to detect Sugarcane mosaic virus

Viruses causing Maize lethal necrosis?

Metagenomic analysis (RNA sequencing)
A combination of two to eight viruses
At least three genetic variants of Sugarcane mosaic virus

Identification of genetic variation in Potyviruses

Nebraska

Kenya

Page 7:

Co-infection = Maize lethal necrosis
Sugarcane mosaic virus
Maize chlorotic mottle virus

Maize lethal necrosis is caused by a synergistic viral co-infection: Potyvirus + Maize chlorotic mottle virus

Page 8:

Erratic detection of Sugarcane mosaic virus

ELISA test failed to detect SCMV
Anti-coat protein antibodies for Ohio isolate

RT-PCR is inconsistent
Primers to amplify the coat protein
Ohio isolate used as reference

Sugarcane mosaic virus genome organization
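
One way to picture why primers designed on a single reference isolate can give inconsistent RT-PCR results is to count mismatches between a primer and the matching site in a divergent variant. The sketch below is only an illustration: the sequences are invented, the function names are hypothetical, and the primer is compared to a same-strand target site of equal length, which ignores strand orientation and everything else about real primer design.

```python
def count_mismatches(primer: str, target: str) -> int:
    """Count positions where an equal-length primer and target site differ."""
    if len(primer) != len(target):
        raise ValueError("primer and target site must have equal length")
    return sum(1 for p, t in zip(primer.upper(), target.upper()) if p != t)


def three_prime_mismatches(primer: str, target: str, window: int = 5) -> int:
    """Count mismatches within the last `window` bases at the primer 3' end."""
    return count_mismatches(primer[-window:], target[-window:])


# Hypothetical sequences, invented for illustration only (not from the study).
primer = "ATGGCAGGAACAGTCGATGC"
sites = {
    "reference-like": "ATGGCAGGAACAGTCGATGC",
    "variant": "ATGGCTGGAACGGTCGACGT",
}

for name, site in sites.items():
    total = count_mismatches(primer, site)
    near_end = three_prime_mismatches(primer, site)
    print(f"{name}: {total} mismatches, {near_end} in the last 5 bases")
```

Mismatches near the 3' end are reported separately because they are generally the ones most likely to interfere with extension; the window of 5 bases is an assumption chosen for the example.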

Page 9:

RESEARCH Open Access

Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya

Mwathi Jane Wamaitha1*, Deepti Nigam2, Solomon Maina3,4, Francesca Stomeo5, Anne Wangai1, Joyce Njoki Njuguna5, Timothy A. Holton6, Bramwel W. Wanjala5, Mark Wamalwa5, Tanui Lucas1, Appolinaire Djikeng5,7 and Hernan Garcia-Ruiz2*

Abstract

Background: Maize lethal necrosis is caused by a synergistic co-infection of Maize chlorotic mottle virus (MCMV) and a specific member of the Potyviridae, such as Sugarcane mosaic virus (SCMV), Wheat streak mosaic virus (WSMV) or Johnson grass mosaic virus (JGMV). Typical maize lethal necrosis symptoms include severe yellowing and leaf drying from the edges. In Kenya, we detected plants showing typical and atypical symptoms. Both groups of plants often tested negative for SCMV by ELISA.

Methods: We used next-generation sequencing to identify viruses associated to maize lethal necrosis in Kenya through a metagenomics analysis. Symptomatic and asymptomatic leaf samples were collected from maize and sorghum representing sixteen counties.

Results: Complete and partial genomes were assembled for MCMV, SCMV, Maize streak virus (MSV) and Maize yellow dwarf virus-RMV (MYDV-RMV). These four viruses (MCMV, SCMV, MSV and MYDV-RMV) were found together in 30 of 68 samples. A geographic analysis showed that these viruses are widely distributed in Kenya. Phylogenetic analyses of nucleotide sequences showed that MCMV, MYDV-RMV and MSV are similar to isolates from East Africa and other parts of the world. Single nucleotide polymorphism, nucleotide and polyprotein sequence alignments identified three genetically distinct groups of SCMV in Kenya. Variation mapped to sequences at the border of NIb and the coat protein. Partial genome sequences were obtained for other four potyviruses and one polerovirus.

Conclusion: Our results uncover the complexity of the maize lethal necrosis epidemic in Kenya. MCMV, SCMV, MSV and MYDV-RMV are widely distributed and infect both maize and sorghum. SCMV population in Kenya is diverse and consists of numerous strains that are genetically different to isolates from other parts of the world. Several potyviruses, and possibly poleroviruses, are also involved.

Keywords: Maize lethal necrosis, MCMV, SCMV, MYDV-RMV, MSV, Metagenomics, Phylogenetics, Coat protein variation

Background

Maize (Zea mays L.) is one of the most important cereals in Sub-Saharan Africa and is grown in approximately 25 million hectares [1]. Maize is consumed as a preferred calorie source by 95% of the population, at an average of 1075 kcal/capita/day, which represents more than 50% of the recommended daily intake [2]. Maize production is destined for human consumption or animal feed at a proportion of 88 and 12%, respectively [3, 4].

In 2011 maize lethal necrosis disease was first detected in Kenya [5–7], and confirmed in several countries in East and Central Africa, specifically in Tanzania, Uganda [8], Rwanda [9], DR Congo [10], Ethiopia and South Sudan [11]. Corn lethal necrosis (CLN) was first described in the State of Kansas in 1978 [12]. In their original descriptions, corn lethal necrosis and maize lethal necrosis defined the same disease. Herein we use maize lethal necrosis disease.

* Correspondence: [email protected]; [email protected]
1 Kenya Agricultural and Livestock Research Organization (KALRO), P. O. Box 14733-00800, Nairobi, Kenya
2 Department of Plant Pathology and Nebraska Center for Virology, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
Full list of author information is available at the end of the article

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Wamaitha et al. Virology Journal (2018) 15:90 https://doi.org/10.1186/s12985-018-0999-2

Page 10:

Fig. 1 Geographic distribution of maize-infecting viruses in Kenya. a Representative pictures of asymptomatic and symptomatic plants sampled in this study. b Maize-growing areas and distribution of the main maize viruses detected in this study. Counties are color-coded to illustrate the combinations of viruses found. c Most abundant viruses detected and frequency of mixed infections in asymptomatic and symptomatic plants (68 samples total). d Other viruses detected in this study. Potyvirus and polerovirus are denoted by * and **, respectively. Reference accession number and length are provided. Number of de-novo assembled contigs, range of length and similarity to the reference genome is provided. Identity of samples contributing at least one contig is indicated

a Representative plants
Asymptomatic: sorghum, napier grass, maize
Symptomatic: maize, maize, maize

b Distribution of maize viruses in Kenya
Map legend: MCMV+SCMV+MSV+MYDV-RMV; MCMV and SCMV+MSV or MYDV-RMV; MCMV+ any other; Maize growing areas

c Frequency of mixed infections
Number of samples (asymptomatic and symptomatic plants) for each combination of MCMV, SCMV, MSV and MYDV-RMV in maize, sorghum and napier grass

d Other viruses
Columns: virus; reference accession; reference length (bp); number of contigs; contig length (bp); similarity (%); E-value; n; sample numbers
Hubei Poty-like virus 1*; NC_032912.1; 9356; 41; 203 to 9323; 75.2 to 87.3; <3.6E-30; 19; 6, 14, 17, 20, 23, 24, 25, 27, 30, 32, 34, 35, 40, 41, 48, 66, 67, 68, 72
Barley virus G isolate Gimje**; NC_029906.1; 5620; 26; 242 to 5494; 80.0 to 87.6; <4.9E-65; 11; 18, 29, 30, 32, 33, 37, 40, 41, 42, 44, 45
Scallion mosaic virus*; NC_003399.1; 9324; 9; 260 to 961; 71.6 to 90.0; <1.8E-09; 7; 14, 20, 23, 28, 46, 47, 68
Johnson grass mosaic virus*; NC_003606.1; 9779; 6; 244 to 1630; 75.0 to 84.0; <4.4E-20; 5; 17, 18, 29, 30, 46
Iranian johnsongrass mosaic virus*; NC_018833.1; 9544; 2; 244 to 332; 75.0 to 80.0; <4.4E-25; 2; 26, 36

Wamaitha, M. J. et al., 2018. Virol J 15: 90-97

Symptom variation in maize and other plants
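
The frequency-of-mixed-infections panel groups samples by which of the four main viruses were detected together. A minimal sketch of that kind of tally is shown below, assuming each sample is described by the set of viruses detected in it. The sample names and detections are invented for illustration, and the "+"/"-" pattern string simply mirrors the notation used in the figure.

```python
from collections import Counter

# Order of the four main viruses, as listed in the figure.
VIRUSES = ("MCMV", "SCMV", "MSV", "MYDV-RMV")


def infection_pattern(detected: set[str]) -> str:
    """Encode presence (+) or absence (-) of each virus, in VIRUSES order."""
    return "".join("+" if virus in detected else "-" for virus in VIRUSES)


def tally_patterns(samples: dict[str, set[str]]) -> Counter:
    """Count how many samples show each presence/absence pattern."""
    return Counter(infection_pattern(found) for found in samples.values())


# Hypothetical detections, invented for illustration (not the study's data).
samples = {
    "s1": {"MCMV"},
    "s2": {"MCMV", "SCMV"},
    "s3": {"MCMV", "SCMV", "MSV", "MYDV-RMV"},
    "s4": {"MCMV", "SCMV", "MSV", "MYDV-RMV"},
    "s5": {"MCMV", "SCMV", "MYDV-RMV"},
}

counts = tally_patterns(samples)
for pattern, n in counts.most_common():
    print(pattern, n)
```

Splitting the same tally by host or by symptomatic status would only require running it on the corresponding subset of samples.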

Page 11:

Model
Maize lethal necrosis is caused by novel viruses
Viruses associated to maize lethal necrosis are genetically different from known isolates

Prediction
Symptomatic plants are infected by viruses other than Maize chlorotic mottle virus and Sugarcane mosaic virus
Maize chlorotic mottle virus and Sugarcane mosaic virus from Kenya are genetically different to US isolates

Experiment
RNA sequencing
De-novo assembly
Alignment to plant virus database
Characterization of
Identification of genetic

Page 12:

Sample collection
16 maize growing counties in Kenya
3 plant species: maize, sorghum and napier grass
Leaf tissue
Symptomatic and asymptomatic plants
68 samples

RNA sequencing and analysis
Total RNA extraction
RNA library preparation
Paired-end sequencing (Illumina MiSeq)
Adaptor removal and quality control check (Trimmomatic)
De novo assembly (Trinity)
Contigs equal to or longer than 200 bp
Virus identification: BlastN search against 2166 plant virus genomes and the NCBI "nr" database
Top two hits based on percent similarity
Representative full length or partial virus contigs
Virus geographical distribution
Single and multiple infection

Phylogenetic analysis
Near complete genome contigs
Complete viral sequences in GenBank
Multiple Sequence Alignment
Model selection and validation
Phylogenetic tree

Experimental approach
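
The "top two hits based on percent similarity" step can be sketched as follows, assuming the BlastN results are available as tab-separated text in the standard tabular layout (-outfmt 6), where the first three columns are query ID, subject ID and percent identity. The function name and the example rows are hypothetical; the slides do not show how this step was actually implemented.

```python
import csv
import io
from collections import defaultdict


def top_hits(blast_tsv: str, n: int = 2) -> dict[str, list[tuple[str, float]]]:
    """Return the n subjects with the highest percent identity for each query.

    Expects BLAST tabular text (-outfmt 6): column 1 is the query ID,
    column 2 the subject ID and column 3 the percent identity.
    """
    hits = defaultdict(list)
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        hits[row[0]].append((row[1], float(row[2])))
    return {
        query: sorted(subjects, key=lambda hit: hit[1], reverse=True)[:n]
        for query, subjects in hits.items()
    }


# Hypothetical hits, invented for illustration (names and values are not from the study).
rows = [
    ("contig_1", "refA", "96.7", "4400", "145", "0", "1", "4400", "1", "4400", "0.0", "7300"),
    ("contig_1", "refB", "85.2", "1200", "178", "0", "1", "1200", "1", "1200", "1e-120", "900"),
    ("contig_1", "refC", "90.1", "3000", "297", "0", "1", "3000", "1", "3000", "0.0", "4100"),
    ("contig_2", "refD", "80.0", "500", "100", "0", "1", "500", "1", "500", "1e-50", "300"),
]
example = "\n".join("\t".join(row) for row in rows)

best = top_hits(example)
for contig, subjects in best.items():
    print(contig, subjects)
```

Ranking by percent identity alone ignores alignment length and E-value, so a real workflow would probably filter on those as well before ranking.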

Page 13:

Experimental approach (workflow as on Page 12)

Maize chlorotic mottle virus lacks a poly-A tail
Sugarcane mosaic virus has a poly-A tail
DNA viruses?

Page 14:

De-novo assembly of RNA transcripts

Martin J.A. and Wang Z., Nat. Rev. Genet. (2011) 12:671–682

Virus RNAs, host RNAs and other RNAs like bacteria, fungi etc.
Short reads
Trimmed data (in .fastq format)
De novo assembly (Trinity)
Long contigs (>= 200 bp) compared with the NCBI plant virus database and the NCBI "nr" database
Virus long contigs (used further)
Host long contigs (discarded)
Others long contigs (discarded)
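
The sorting of assembled contigs shown on this slide can be sketched as a length filter followed by a split on the category of each contig's best database hit. The sketch assumes that a category ("virus", "host" or something else) has already been assigned to each contig from the database searches; the function name, contig names and categories below are hypothetical.

```python
MIN_CONTIG_LENGTH = 200  # bp, the cutoff stated on the slide


def partition_contigs(
    contigs: dict[str, str], best_hit_category: dict[str, str]
) -> dict[str, list[str]]:
    """Split contigs into virus, host, other and too-short groups."""
    buckets: dict[str, list[str]] = {
        "virus": [], "host": [], "other": [], "too_short": [],
    }
    for name, sequence in contigs.items():
        if len(sequence) < MIN_CONTIG_LENGTH:
            buckets["too_short"].append(name)
            continue
        category = best_hit_category.get(name, "other")
        if category not in ("virus", "host"):
            category = "other"  # e.g. bacteria, fungi, or no hit at all
        buckets[category].append(name)
    return buckets


# Hypothetical contigs and categories, invented for illustration only.
contigs = {"c1": "A" * 250, "c2": "C" * 199, "c3": "G" * 200, "c4": "T" * 1000}
categories = {"c1": "virus", "c3": "host"}

groups = partition_contigs(contigs, categories)
print(groups)
```

Only the "virus" group would be carried forward; the other groups correspond to the discarded branches of the diagram.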

Page 16:

Fig. 1 (figure, panel d table and caption as on Page 10)

Geographic distribution of maize viruses in Kenya

Page 17:

Fig. 1 (figure, panel d table and caption as on Page 10)

Abundance of maize-infecting viruses in Kenya

Page 18:

Histograms of contig size, one panel per virus
a MCMV (Maize chlorotic mottle virus), Infected = 68
b SCMV (Sugarcane mosaic virus), Infected = 60
c MSV (Maize streak virus), Infected = 52
d MYDV-RMV (Maize yellow dwarf virus-RMV), Infected = 40
x-axis: Contig size (Kb), in 0.5 Kb bins; y-axis: Number of contigs
Legend: Contig size respect to reference genome (Similar, Longer, Shorter)

Size and frequency of de novo assembled virus contigs
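
The histograms group contigs into 0.5 Kb size bins and mark each contig as similar to, longer than, or shorter than the reference genome. A minimal sketch of both steps follows. The slide does not say how "similar" was defined, so the 5% tolerance is an assumption, as are the function names and the contig lengths; 4436 bp is used only as an example reference length.

```python
from collections import Counter


def size_bin(length_bp: int, width_bp: int = 500) -> str:
    """Label the 0.5 Kb bin that contains a contig, e.g. '4.0-4.5 kb'."""
    lower = (length_bp // width_bp) * width_bp
    return f"{lower / 1000:.1f}-{(lower + width_bp) / 1000:.1f} kb"


def compare_to_reference(length_bp: int, reference_bp: int, tolerance: float = 0.05) -> str:
    """Classify a contig as similar, longer or shorter than the reference.

    A contig counts as 'similar' when its length is within `tolerance`
    (a fraction of the reference length) of the reference. The 5% default
    is an assumption made for this sketch.
    """
    if abs(length_bp - reference_bp) <= tolerance * reference_bp:
        return "similar"
    return "longer" if length_bp > reference_bp else "shorter"


# Hypothetical contig lengths (bp), invented for illustration only.
reference_length = 4436
contig_lengths = [203, 1500, 4400, 4436, 4700]

histogram = Counter(size_bin(length) for length in contig_lengths)
labels = [compare_to_reference(length, reference_length) for length in contig_lengths]

print(dict(histogram))
print(labels)
```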

Page 20:

Fig. 2 (See legend on next page.)

a MCMV genome and contig alignment
Genomic RNA (X14736.2); open reading frames P32, P50, P111, P31, P7 and CP
Columns: Sample (symptoms), County, Contig size (Kb), Alignment size (Kb), Similarity (%); contig polarity (+) or (-)

b MCMV coverage
Sample 3; GC content; scale 0.5 Kb

Alignment of de novo-assembled contigs to Maize chlorotic mottle virus

Page 21:

least 96% similar to the Kansas isolate (X14736.2) used as reference (Fig. 2a). In agreement with world wide variation [37], our results showed a clear distribution of MCMV isolates in different clades based on their geographic origin (Fig. 7a). Kenya samples described here clustered in the clade containing isolates from East Africa, close to isolates from China and away from isolates from the American continent (Fig. 7a). Within our Kenya samples, there was no correlation with the county or host of origin. One sample (number 16) lacking 15 nt and 205 nt at the 5’ end and 3’ end, respectively, showed the most distance from the African cluster (Fig. 7a). Results described here and before [37] show that there is low genetic variation in the MCMV population in Kenya.
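
Similarity figures such as "96% similar to the Kansas isolate" can be pictured as percent identity over an alignment. The sketch below computes it for two already-aligned sequences of equal length, skipping any column that contains a gap. This is an illustrative simplification under those assumptions, not necessarily how the values in the paper were calculated, and the sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns where both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned sequences, invented for illustration only.
reference = "ACGTACGTAC"
isolate = "ACGTTCG-AC"

identity = percent_identity(reference, isolate)
print(f"{identity:.1f}% identical over gap-free columns")
```

Whether gap columns are skipped or counted as mismatches changes the result, so the convention should be stated whenever such percentages are compared.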

Fig. 7 Phylogeny of MCMV (a) and SCMV (b). Phylogenetic trees were generated using Bayesian inference in Mr. Bayes 3.2. Scale bar represents nucleotide substitution per site. For SCMV, G1, G2 and G3 correspond to genetic variation and groups described in Fig. 4. Kenya samples described in this study are colored in red and identified by a number and the county of origin. Unless indicated otherwise, samples came from maize. Green background indicates clusters formed by Kenya samples

a MCMV (scale bar 0.001)
Reference isolates: X14736.2 (Kansas, USA); EU358605.1 (Nebraska, USA); KJ782300.1 (Taiwan); GU138674.1 (China); JQ982468.1 (China); JQ982469.1 (China); KF010583.1 (China); KP798452.2 (Ethiopia); MF510219.1 (Porto severe, Ecuador); MF510220.1 (Hawaii); MF510221.1 (Porto original, Ecuador); MF510222.1 (Sta Elena, Ecuador); MF510232.1 (T2F1S3, Kenya); MF510234.1 (T1F7S2, Kenya); MF510238.1 (T1F5S2, Kenya); MF510245.1 (T1F5S3, Kenya)

b SCMV (scale bar 0.01)
Reference isolates: JX188385.1 (Ohio, USA); KF744391.1, Rwanda (G1); KF744392.1, Rwanda (G1); JX047391.1, China (G3); GU474635.1, Mexico (G1); KP860935.1, Ethiopia (G2); KP860936.1, Ethiopia (G1); KP772216.1, Ethiopia (G1)
Kenya samples: 72, Transzoia (G1); 52, Elgeyo Marakwet (G1); 15, Busia (G3); 3, Bomet (G3); 48, Bomet (G1); 21, Bomet (G1); 45, Bomet (G1); 44, Migori (G2); 56, Kericho (G1); 4, Bomet (G1); 57, Transzoia (G1); 36, Embu (G1); 7, Kirinyaga (G2); 8, Embu (Sorghum, G1); 32, Uasin Gishu (G2); 9, Kirinyaga (G2); 34, Kirinyaga (Sorghum, G2); 39, Kirinyaga (Sorghum, G2); 5, Bomet (G3); 18, Busia (G1)

Maize chlorotic mottle virus exhibits low genetic variation

Page 22:

Fig. 5 Maize streak virus (MSV) genome organization and alignment of de novo-assembled contigs. Labels are as in Fig. 2. a MSV genome organization. Open reading frames are represented by cylinders. Genomic DNA is represented by a solid line. Coordinates are based on reference sequence number AF329878.1. Large (LIR) and small (SIR) are represented by shaded boxes. Direction of transcription is indicated by arrows. Every sample categorized as infected contributed one representative contig. Shorter, redundant contigs were not illustrated. b Genome coverage after reference based assembly using Bowtie v2 for one representative sample

a MSV genome and contig alignment
Genomic DNA (AF329878.1), 2.7 Kb; LIR, SIR, MP, CP, Rep A, Rep C1/C2
Columns: Sample (symptoms), Contig size (Kb), Alignment size (Kb), Similarity (%)

b MSV coverage
Sample 33; scale 0.5 Kb

a MSV genome and conting alignment

98.199.197.5

LIR

MP CP

SIR

Rep C1/C2Rep A

2.7 Kb1.0 2.00.5 1.5 2.5

98.798.296.998.5

98.296.598.798.698.5

Contigsize (Kb)

2.1 0.4

2.22.20.71.00.61.31.31.71.31.71.3 97.91.7 98.42.2 98.21.6 98.41.7 98.50.6 97.91.7 98.50.9 95.71.2 98.40.7 98.11.6 97.71.8 98.01.1 98.71.3 99.31.1 98.91.8 97.81.6 97.71.2 98.71.4 99.01.9 98.51.0 98.81.3 98.82.1 97.91.1 98.40.9 97.92.2 97.90.9 97.21.6 98.52.1 97.30.7 98.41.3 98.51.2 96.80.3 92.40.4 100.00.5 98.40.3 97.70.2 97.90.1 99.10.7 99.30.3 99.3

Similarity(%)

20 (A)21 (S)22 (S)23 (S)24 (S)25 (S)26 (S)27 (S)

3 (S)5 (S)6 (S)7 (A)

* 8 (A)9 (A)

11 (S)12 (S)13 (S)14 (A)15 (A)16 (S)17 (A)18 (S)

28 (A)29 (S)30 (S)31 (A)32 (A)

* 34 (A)

40 (A)

33 (S)

35 (S)36 (S)37 (S)38 (S)

** 39 (A)

42 (S)43 (S)44 (S)45 (S)

48 (S)51 (S)53 (S)54 (S)57 (S)

41 (A)

46 (A)47 (A)

58 (S)59 (S)63 (S)65 (S)71 (S)

Sample(symptoms)

b MSV coverage

Sample 33

2.7 Kb1.0 2.00.5 1.5 2.5

Alignmentsize (Kb)

2.40.42.22.20.71.20.61.71.41.71.31.91.51.72.21.81.70.62.01.41.20.71.81.81.11.31.12.81.61.31.41.91.81.32.31.11.12.91.61.92.60.71.31.30.30.40.50.30.20.20.71.0

0 -

6000 -

Genomic DNA (AF329878.1)

Scale

0.5 Kb

Fig. 5 Maize streak virus (MSV) genome organization and alignment of de novo-assembled contigs. Labels are as in Fig. 2. a MSV genomeorganization. Open reading frames are represented by cylinders. Genomic DNA is represented by a solid line. Coordinates are based on referencesequence number AF329878.1. Large (LIR) and small (SIR) are represented by shaded boxes. Direction of transcription is indicated by arrows. Everysample categorized as infected contributed one representative contig. Shorter, redundant contigs were not illustrated. b Genome coverage afterreference based assembly using Bowtie v2 for one representative sample

Wamaitha et al. Virology Journal (2018) 15:90 Page 10 of 19

Wamaitha, M. J. et al., 2018. Virol J 15: 90-97

Alignment of de novo-assembled contigs to Maize streak virus
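Panel b of the figure on this slide reports genome coverage after reference-based assembly. As a minimal sketch of what a per-base coverage track represents, the Python below computes depth from a list of alignment intervals. The function names, the interval format (1-based, inclusive start and end), and the toy reads are assumptions made for illustration; they are not the authors' pipeline or output from Bowtie.

```python
from typing import List, Tuple


def coverage_depth(genome_length: int, alignments: List[Tuple[int, int]]) -> List[int]:
    """Per-base depth for 1-based, inclusive (start, end) alignment intervals."""
    # Difference array: +1 where an alignment starts, -1 just past where it ends.
    diff = [0] * (genome_length + 2)
    for start, end in alignments:
        start = max(start, 1)
        end = min(end, genome_length)
        if start > end:
            continue  # interval lies entirely outside the reference
        diff[start] += 1
        diff[end + 1] -= 1
    # A running sum of the difference array gives the depth at each position.
    depth = []
    running = 0
    for pos in range(1, genome_length + 1):
        running += diff[pos]
        depth.append(running)
    return depth


def breadth(depth: List[int]) -> float:
    """Fraction of reference positions covered by at least one alignment."""
    return sum(1 for d in depth if d > 0) / len(depth)


# Toy reads on a hypothetical 20-nt reference (not data from the study).
toy_depth = coverage_depth(20, [(1, 10), (5, 15), (12, 20)])
print(max(toy_depth), breadth(toy_depth))
```

In practice the intervals would come from the aligner's output for a given sample and reference; the difference-array approach keeps the computation linear in genome length plus number of reads.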

Page 23: Metagenomic analysis of viruses associated to maize lethal ... · RESEARCH Open Access Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya Mwathi Jane Wamaitha1*,

three groups based on nucleotide and amino acid sequence at the C terminus of NIb and N terminus of the coat protein (Fig. 4b).

Maize streak virus exhibits low genetic variation

MSV described in this study showed 96 to 100% similarity to the South African isolate (AF329878.1) used as reference (Fig. 5a). Eight contigs representing almost complete genomes (Additional file 8) and eight from GenBank were used for a phylogenetic analysis. Six of our Kenya contigs clustered near isolates from Uganda, Nigeria, and previously described Kenya isolates (Fig. 8a). Two samples (33 and 44) from Kenya clustered separately, near isolates from New Zealand and South Africa. These and previous results [41] show low genetic variation in the MSV population in Kenya.

Polerovirus complex infecting maize

Based on five contigs (Additional file 9) from this study and seventeen sequences from GenBank, a phylogenetic tree was obtained for MYDV-RMV. Maize yellow mosaic virus and Maize yellow dwarf virus-RMV2 were included for comparison. Sequences from Kenya obtained in this study were 97 to 100% similar to the MYDV-RMV reference (MF974579.2) (Fig. 6a). Four clustered near that reference, while two clustered near a Maize yellow mosaic virus (MaYMV) isolate from Nigeria (Fig. 8b). However, the similarity between MYDV-RMV and MaYMV is 98.67%. These results and the widespread distribution of MaYMV in Rwanda [21] suggest that in Kenya there is a complex of closely related poleroviruses that includes Maize yellow dwarf virus-RMV and Maize yellow mosaic virus, and possibly others, such as Barley virus G, which was detected in 11 of the 68 samples (Fig. 1d).

Fig. 8 Phylogeny of MSV (a) and MYDV-RMV (b). Phylogenetic trees were generated using Bayesian inference in MrBayes 3.2. Scale bar represents nucleotide substitutions per site. Kenya samples described in this study are colored in red and identified by a number.

Wamaitha, M. J. et al., 2018. Virol J 15: 90-97

Maize streak virus exhibits low genetic variation
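The similarity values quoted on this slide (for example, 96 to 100% against the reference) are percentages of matching positions between a contig and a reference sequence. This section does not say how they were computed, so the Python below is only a sketch under one common assumption: pairwise identity over aligned columns, ignoring any column with a gap. The function name and the toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over ungapped columns of two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned pair (hypothetical sequences, not from the study):
# 9 ungapped columns, 8 of which match.
reference = "ACGTACGTAC"
contig = "ACGTTCG-AC"
print(round(percent_identity(reference, contig), 1))
```

Other conventions, such as counting gap columns as mismatches, would give lower values for contigs that carry deletions, so the convention should be stated alongside any reported percentage.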

Page 24: Metagenomic analysis of viruses associated to maize lethal ... · RESEARCH Open Access Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya Mwathi Jane Wamaitha1*,

(NC_003399.1) is 9324 nt long. The longest contig we obtained was 962 nt long and was 80.0% similar to the reference (sample 20). The highest similarity (89.3%) to the reference genome was obtained for a 271-bp contig (sample 68). The JGMV reference genome (NC_003606.1) is 9779 nt long. The longest contig we obtained was 1535 nt long and was 75.0% similar to the reference (sample 46). The highest similarity (85.6%) to the reference genome was obtained for a 967-bp contig (sample 30).

Collectively, these results show that Hubei Poty-like virus 1, Scallion mosaic virus, JGMV, Iranian JGMV, and Barley virus G are part of the virus complex infecting maize in Kenya and their genetic composition is distant from isolates described before (Fig. 1d).

Low genetic diversity of maize chlorotic mottle virus in Kenya

Thirty contigs from this study (Additional file 5) were used for a phylogenetic analysis that included 16 sequences from GenBank representing MCMV worldwide variation [37]. MCMV sequences from Kenya were at

Fig. 6 Maize yellow dwarf virus (MYDV-RMV) genome organization and alignment of de novo-assembled non-overlapping contigs from symptomatic (S) and asymptomatic (A) maize, cultivated (*) or wild (**) sorghum. Labels are as in Fig. 2. a MYDV-RMV genome organization and gene expression, with contig size (Kb) and similarity (%) listed for each sample (symptoms). Open reading frames are represented by cylinders. Genomic RNA is represented by a solid line. Coordinates are based on reference sequence number MF974579.2. Every sample categorized as infected contributed one representative contig. Shorter, redundant contigs were not illustrated. b Genome coverage after reference-based assembly using Bowtie v2 for one representative sample (sample 15).

Wamaitha, M. J. et al., 2018. Virol J 15: 90-97

Alignment of de novo-assembled contigs to Maize yellow dwarf virus

Page 25: Metagenomic analysis of viruses associated to maize lethal ... · RESEARCH Open Access Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya Mwathi Jane Wamaitha1*,


Wamaitha, M. J. et al., 2018. Virol J 15: 90-97

Maize yellow dwarf virus exhibits low genetic variation

Page 26: Metagenomic analysis of viruses associated to maize lethal ... · RESEARCH Open Access Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya Mwathi Jane Wamaitha1*,

Fig. 3 Sugarcane mosaic virus (SCMV) genome organization and mapping of de novo-assembled contigs. Labels are as in Fig. 2. a SCMV genome and polyprotein organization, with contig size (Kb) and similarity (%) listed for each sample (symptoms). Mature proteins are represented by cylinders. Coordinates are based on the Ohio isolate used as reference (JX188385.1). Every sample categorized as infected contributed one representative contig. A variable area was detected between nt 8500 and 8650. Colored arrowheads represent the location of two conserved deletions in the polyprotein coding sequence. A number 2 (group G2) indicates a 39 nt deletion (8487 to 8525) that resulted in an in-frame deletion of 13 amino acids at the C terminus of NIb. A number 3 (group G3) indicates a 45 nt deletion between nt 8487 and 8676 that resulted in a 15-amino acid deletion. In samples not marked (group G1), variation was observed without insertions or deletions. b Genome coverage after reference-based assembly using Bowtie v2 for one representative sample (sample 9).

Wamaitha, M. J. et al., 2018. Virol J 15: 90-97

Alignment of de novo-assembled contigs to Sugarcane mosaic virus

Page 27: Metagenomic analysis of viruses associated to maize lethal ... · RESEARCH Open Access Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya Mwathi Jane Wamaitha1*,

least 96% similar to the Kansas isolate (X14736.2) used as reference (Fig. 2a). In agreement with worldwide variation [37], our results showed a clear distribution of MCMV isolates in different clades based on their geographic origin (Fig. 7a). Kenya samples described here clustered in the clade containing isolates from East Africa, close to isolates from China and away from isolates from the American continent (Fig. 7a). Within our Kenya samples, there was no correlation with the county or host of origin. One sample (number 16) lacking 15 nt and 205 nt at the 5’ end and 3’ end, respectively, showed the most distance from the African cluster (Fig. 7a). Results described here and before [37] show that there is low genetic variation in the MCMV population in Kenya.


Wamaitha, M. J. et al., 2018. Virol J 15: 90-97

Sugarcane mosaic virus is genetically diverse

Page 28: Metagenomic analysis of viruses associated to maize lethal ... · RESEARCH Open Access Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya Mwathi Jane Wamaitha1*,

Other viruses infecting maize

Four potyviruses and one polerovirus were detected in a smaller number of samples (Fig. 1d). Hubei Poty-like virus 1 (19 samples), Scallion mosaic virus (7 samples), JGMV (5 samples), and Iranian JGMV (2 samples) are potyviruses. Barley virus G (11 samples) is a polerovirus.

The Hubei Poty-like virus 1 reference genome (NC_032912.1) is 9356 nt long. The longest contig we obtained was 9323 nt long and was 77.3% similar to the reference (sample 48). The highest similarity (87.3%) to the reference genome was obtained for a 206-bp contig (sample 68). The Scallion mosaic virus reference genome

a SCMV single nucleotide polymorphism: SNPs per 50 nt along the genome (0 to 9450 nt), plotted for all Kenya samples and for Kenya groups 1, 2 and 3.

b SCMV partial polyprotein sequence alignment

JX188385.1 (Ohio,USA) 2710 ALRNLYLGTGIKEEEIEKYFKQFIKDLPGYIEDYNEDVFHQSGTVDAGTQGGSGSQGTTP 2769
Kenya group 1              ALRNLYLGTGIKEEEIEIYFKQFVKDLPGYIEDYNEEVIHQSGTVDAGAQGGSGSQGTTP
KF744391.1 (Rwanda)        ALRNLYLGTGIKEEEIEIYFKQFVKDLPGYIEDYNEEVIHQSGTVDAGAQGGSGSQGTTP
KF744392.1 (Rwanda)        ALRNLYLGTGIKEEEIEKYFKQFVKDLPGYIEDYNEEVIHQSGTVDAGAQGGGGNQGTTP
KP860936.1 (Ethiopia)      ALRNLYLGTGIKEEEIEKYFKQFVKDLPGYIEDYNEEVIHQSGTVDAGAQGGSGSQGTTP
KP772216.1 (Ethiopia)      ALRNLYLGTGIKEEEIEKYFKQFVKDLPGYIEDYNEEVIHQSGTVDAGAQGGSGSQGTTP
GU474635.1 (Mexico)        ALRNLYLGTGIKEEEIEKYFKQFAKDLPGYIEDYNEDVFHQSGSVDAGVQGGSGNQGTTP
Kenya group 2              ALRNLYLGTGIKEEEIEKYFKQFVKDLPGYIEDYNEEVIHQSGTVDAGAQGGGGNQGTTP
KP860935.1 (Ethiopia)      ALRNLYLGTGIKEEEIEKYFKQFVKDLPGYIEDYNEDVIHQSGTVDAGAQGGSGNQGTTP
Kenya group 3              ALRNLYLGTGIKEEEIEKYFRQFVKDLPGYVEDYNEEVIHQSGQVDAGRQGGSGAQGGTP
JX047391.1 (China)         ALRNLYLGTSIKEEEIEKYFRQFVKDLPGYVEDYNEEVIHQSGQVDAGRQGGSGAQGGTP
JX286708.1 (Kenya)         -----------------------------------------SGQVDAGRQGGSGAQGGTP

JX188385.1 (Ohio,USA) 2770 PATGSGAKPATSGAGSGSSTGAGTGVTGSQAGAGGSAGTGSGATGGQSGSGSGTGQINTG 2828
Kenya group 1              PATGSGSKPAASGAGSGSGTGTGTGATGGQTGNGSGAGTGSGATGGQSGSGSGTGQTGTG
KF744391.1 (Rwanda)        PATGSGSKPATSGAGSGSGTGTGTGATGGQTGTGSGAGTGSGATGGQSGSGSGTGQTGTG
KF744392.1 (Rwanda)        PATGGGAKPANSGAGSGSGTGTGTGATGGQTGTGSGAGAGSGATGGQSGSGSGTGQTGTG
KP860936.1 (Ethiopia)      PATGSGARPATSGAGSGSGTGTGAGATGGQTGAGSGAGTGSGAAGGQSGSGSGAGQTGTG
KP772216.1 (Ethiopia)      PATGGGARPAASGAGSGSGTGTGAGATGGQTGAGSGAGTGSGATGGQSGSGSGAGQTGTG
GU474635.1 (Mexico)        PATGSGAKPATSGAGSGSGTGTGTGVTGGQAGASSGAGTGSGATGGQSGSGSGTGQNGTG
Kenya group 2              PATGNG-------------TGTRTGATGGQTGVGGGTTTGSGATGGQTGSGNGAAQTNTS
KP860935.1 (Ethiopia)      PATGGG-------------TGAGTGATGGAAGTGGGAGTGAGATRGQSGSGGGTGQTNTG
Kenya group 3              PAGSGGTGSGTQGNGGQTGS------QGSSGQQGSGGGTGQGAAGN---------NGGGQ
JX047391.1 (China)         PAGSGGTGSGTQGNGGQTGS------QGSGGQQGSGGGTGQGAAGN---------NGGGQ
JX286708.1 (Kenya)         PAGSGGTGSGTQGNGGQTGS------QGSGGQQGSGGGTGQGAAGN---------NGGGQ

JX188385.1 (Ohio,USA) 2829 SAGTSATGGQRDRDVDAGTTGKISVPKLKAMSKKMRLPKAKGKDVLHLDFLLTYKPQQQD 2888
Kenya group 1              SAGTGSTGGQRDKDVDAGTTGNITVPKLKAMSKKMRLPKAKGKDVLHLDFLLTYKPQQQD
KF744391.1 (Rwanda)        SAGTGATGGQRDKDVDAGTTGKISVPKLKAMSKKMRLPKAKGKDVLHLDFLLTYKPQQQD
KF744392.1 (Rwanda)        SAGTGATGGQRDKDVDAGTTGKISVPKLKAMSKKMRLPKAKGKDVLHLDFLLTYKPQQQD
KP860936.1 (Ethiopia)      SAGTGATGGQRDKDVDAGTTGKISVPKLKAMSKKMRLPKAKGKDVLHLDFLLTYKPQQQD
KP772216.1 (Ethiopia)      SAGTGATGGQRDKDVDAGTTGKISVPKLKAMSKKMRLPKAKGKDVLHLDFLLTYKPQQQD
GU474635.1 (Mexico)        SAGTSATGSQRDRDVDAGSTGKISVPKLKAMSKKMRLPKAKGKDVLHLDFLLTYKPQQQD
Kenya group 2              SAGTGATGGQRDKDVDAGTTGKISVPKLKAMSKKMRLPKAKGKDVLHLDFLLTYKPQQQD
KP860935.1 (Ethiopia)      SAGTGATGGQRDKDVDAGTTGKISVPKLKAMSKKMRLPKAKGKDVLHLDFLLTYKPQQQD
Kenya group 3              TGGSSGTSGQRDKDVDAGSAGKISVPKLKAMSKKMRLPKAKGKDVLHLDFLLTYKPQQQD
JX047391.1 (China)         TGGSSGTAGQRDKDVDAGSAGKISVPKLKAMSKKMRLPKAKGKDVLHLDFLLTYKPQQQD
JX286708.1 (Kenya)         TGGSSGTAGQRDKDVDAGSAGKISVPKLKAMSKKMRLPKAKGKDVLHLDFLLTYKPQQQD
                           :.*:..*..***:*****::*:*:************************************

Fig. 4 SCMV genetic variation. Coordinates are based on the Ohio isolate (JX188385.1). a SNP distribution across the SCMV genome for all samples and by genetic group. b Partial polyprotein sequence alignment, using MAFFT, of Kenya samples in variation groups 1, 2 and 3, and isolates from other parts of the world relative to the Ohio isolate. The coat protein detected in the original description of maize lethal necrosis in Kenya was used for comparison (JX286708.1) [6]. NIb and coat protein coding sequences are color coded blue and red, respectively. Green background indicates variation.

Wamaitha, M. J. et al., 2018. Virol J 15: 90-97

Region between the NIb and CP showed hyper-variability
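Panel a of the figure on this slide plots SNPs per 50 nt along the SCMV genome. A minimal sketch of that binning is shown below in Python. It assumes SNP positions are 1-based coordinates on the reference and that windows are fixed and non-overlapping. The function name and the toy positions are hypothetical, and the study's variant-calling step is not reproduced here.

```python
from typing import Iterable, List


def snps_per_window(snp_positions: Iterable[int], genome_length: int, window: int = 50) -> List[int]:
    """Count SNPs in consecutive, non-overlapping windows of `window` nt."""
    n_windows = (genome_length + window - 1) // window  # ceiling division
    counts = [0] * n_windows
    for pos in snp_positions:
        if 1 <= pos <= genome_length:
            # Window 0 covers nt 1..window, window 1 covers the next stretch, and so on.
            counts[(pos - 1) // window] += 1
    return counts


# Toy positions on a hypothetical 150-nt genome (not data from the study).
print(snps_per_window([3, 49, 50, 51, 120], genome_length=150))
```

With a binning like this, a hyper-variable stretch such as the NIb and coat protein junction would appear as a run of windows with high counts.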

Page 29: Metagenomic analysis of viruses associated to maize lethal ... · RESEARCH Open Access Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya Mwathi Jane Wamaitha1*,

Region spanning the NIb and coat protein coding sequences; coordinates are based on the Ohio isolate.

Ohio               8397 CAGTCGGGAACTGTTGATGCAGGTACACAAGGAGGCAGTGGAAGCCAAGGAACAACACCA 8456
Kenya group 1           CAATCGGGAACAGTTGATGCAGGTGCACAAGGCGGCAGCGGAAGCCAAGGAACAACACCA
Kenya group 2           CAATCGGGAACAGTTGATGCAGGCGCACAAGGAGGCGGCGGAAATCAAGGAACAACACCG
Kenya group 3           CAATCTGGTCAAGTTGACGCAGGGAGACAGGGCGGTAGCGGTGCTCAAGGAGGCACGCCA
JX286708.1 (Kenya)      ---TCTGGTCAAGTTGACGCAGGGAGACAGGGCGGTAGCGGCGCTCAAGGAGGCACACCG

Ohio               8457 CCAGCAACAGGCAGTGGAGCAAAACCAGCCACCTCAGGGGCAGGATCTGGTAGTAGCACA 8516
Kenya group 1           CCAGCAACAGGTAGCGGATCGAAACCAGCGGCTTCAGGAGCAGGATCTGGTAGCGGAACA
Kenya group 2           CCAGCAACAGGTAACGGAACAGG-------------------------------------
Kenya group 3           CCAGCAGGAAGTGGAGGCACTGGATCTGGCACTCAAGGCAATGGGGGTCAGA--------
JX286708.1 (Kenya)      CCAGCAGGAAGTGGAGGCACTGGATCTGGCACTCAAGGCAATGGGGGTCAGA--------

Ohio               8517 GGAGCTGGAACTGGTGTAACTGGAAGTCAAGCAGGGGCTGGCGGTAGCGCTGGGACGGGA 8576
Kenya group 1           GGGACTGGAACCGGTGCAACTGGAGGCCAAACAGGAAATGGTAGTGGTGCTGGAACAGGA
Kenya group 2           --AACCAGAACTGGTGCAACTGGAGGCCAAACAGGAGTTGGTGGTGGAACTACAACAGGA
Kenya group 3           ----CGGGATCCCAAGGAAGTAGTGGTCAAC------------------------AAGGG
JX286708.1 (Kenya)      ----CGGGATCCCAAGGAAGTGGCGGTCAAC------------------------AAGGG

Ohio               8577 TCCGGAGCAACCGGAGGCCAAYCAGGATCTGGAAGTGGCACTGGACAGATTAACACGGGT 8636
Kenya group 1           TCTGGAGCGACCGGAGGCCAATCAGGATCTGGAAGTGGCACTGGACAGACTGGCACAGGC
Kenya group 2           TCTGGAGCGACCGGAGGTCAGACAGGATCTGGAAATGGTGCTGCACAGACCAACACGAGC
Kenya group 3           TCCGGTGGGGGCACTGGTCAAGGAGCAGCTGGAAACAA---------CGGCGGAGGTCAG
JX286708.1 (Kenya)      TCCGGTGGGGGCACTGGTCAAGGAGCAGCTGGAAACAA---------CGGCGGAGGTCAG

Ohio               8637 TCAGCAGGAACTAGTGCAACAGGAGGCCAAAGAGATAGGGATGTGGATGCAGGTACAACA 8696
Kenya group 1           TCAGCAGGAACTGGTTCAACGGGAGGCCAGAGAGATAAGGATGTGGATGCAGGTACAACA
Kenya group 2           TCAGCAGGAACTGGTGCAACGGGAGGCCAGAGAGATAAGGATGTAGATGCAGGTACAACA
Kenya group 3           ACAGGAGGCTCTAGTGGGACATCTGGTCAGAGAGATAAGGACGTTGACGCAGGCTCGGCT
JX286708.1 (Kenya)      ACAGGAGGCTCTAGTGGGACAGCTGGTCAGAGAGATAAGGACGTTGACGCAGGCTCGGCT

Wamaitha, M. J. et al., 2018. Virol J 15: 90-97

Partial alignment of contigs to reference genome
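The alignment on this slide can be used to check the deletion sizes reported for SCMV groups G2 and G3 (39 nt and 45 nt, giving in-frame losses of 13 and 15 amino acids). The Python sketch below counts gap characters in the Kenya group 2 and group 3 rows for the three 60-column blocks that contain gaps (nt 8457 to 8636 in Ohio coordinates). The rows are transcribed from the slide, with gap runs written as repeat counts so their lengths are explicit. The names `ALIGNED_ROWS` and `deletion_summary` are hypothetical, and the count equals the deletion size only because the Ohio reference row has no gaps in this region.

```python
# Aligned rows for nt 8457-8636 (Ohio coordinates), transcribed from the slide.
ALIGNED_ROWS = {
    "Kenya group 2": [
        "CCAGCAACAGGTAACGGAACAGG" + "-" * 37,
        "--AACCAGAACTGGTGCAACTGGAGGCCAAACAGGAGTTGGTGGTGGAACTACAACAGGA",
        "TCTGGAGCGACCGGAGGTCAGACAGGATCTGGAAATGGTGCTGCACAGACCAACACGAGC",
    ],
    "Kenya group 3": [
        "CCAGCAGGAAGTGGAGGCACTGGATCTGGCACTCAAGGCAATGGGGGTCAGA" + "-" * 8,
        "----CGGGATCCCAAGGAAGTAGTGGTCAAC" + "-" * 24 + "AAGGG",
        "TCCGGTGGGGGCACTGGTCAAGGAGCAGCTGGAAACAA" + "-" * 9 + "CGGCGGAGGTCAG",
    ],
}


def deletion_summary(rows):
    """Return (nt deleted, in-frame?, codons deleted) from gap counts in aligned rows."""
    gaps = sum(row.count("-") for row in rows)
    return gaps, gaps % 3 == 0, gaps // 3


for name, rows in ALIGNED_ROWS.items():
    nt_deleted, in_frame, codons = deletion_summary(rows)
    print(f"{name}: {nt_deleted} nt deleted, in frame: {in_frame}, codons lost: {codons}")
```

Because both totals are multiples of three, the downstream coat protein reading frame is preserved, which is consistent with the in-frame deletions described for groups G2 and G3.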

Page 30: Metagenomic analysis of viruses associated to maize lethal ... · RESEARCH Open Access Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya Mwathi Jane Wamaitha1*,


Wamaitha, M. J. et al., 2018. Virol J 15: 90-97

Alignment of amino acid sequence to reference genome

Page 31: Metagenomic analysis of viruses associated to maize lethal ... · RESEARCH Open Access Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya Mwathi Jane Wamaitha1*,

Widely distributed viruses
  Maize chlorotic mottle virus
  Sugarcane mosaic virus
  Maize yellow dwarf virus
  Maize streak virus

Other viruses
  Potyviruses
    Hubei Poty-like virus 1
    Scallion mosaic virus
    Johnsongrass mosaic virus
    Iranian Johnsongrass mosaic virus
  Poleroviruses
    Barley virus G

Three genetically distinct strains of Sugarcane mosaic virus were detected

Summary