Proc. Natl. Acad. Sci. USA
Vol. 90, pp. 10539-10543, November 1993
Microbiology
RNA sequence of astrovirus: Distinctive genomic organization and a putative retrovirus-like ribosomal frameshifting signal that directs the viral replicase synthesis

BAOMING JIANG*†, STEPHAN S. MONROE*, EUGENE V. KOONIN‡, SARAH E. STINE*, AND ROGER I. GLASS*

*Viral Gastroenteritis Section, Centers for Disease Control and Prevention, Atlanta, GA 30333; and ‡National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894
Communicated by Bernard Fields, July 27, 1993
ABSTRACT The genomic RNA of human astrovirus was sequenced and found to contain 6797 nt organized into three open reading frames (1a, 1b, and 2). A potential ribosomal frameshift site identified in the overlap region of open reading frames 1a and 1b consists of a "shifty" heptanucleotide and an RNA stem-loop structure that closely resemble those at the gag-pro junction of some retroviruses. This translation frameshift may result in the suppression of in-frame amber termination at the end of open reading frame 1a and the synthesis of a nonstructural fusion polyprotein that contains the putative protease and RNA-dependent RNA polymerase. Comparative sequence analysis indicated that the protease and polymerase of astrovirus are only distantly related to the respective enzymes of other positive-strand RNA viruses. The astrovirus polyprotein lacks the RNA helicase domain typical of other positive-strand RNA viruses of similar genome size. The genomic organization and expression strategy of astrovirus, with the protease and the polymerase brought together by the predicted frameshift, most closely resembled those of plant luteoviruses. Specific features of the sequence and genomic organization support the classification of astroviruses as an additional family of positive-strand RNA viruses, designated Astroviridae.
Astroviruses were originally identified from the feces of infants with gastroenteritis on the basis of distinctive ultrastructural features: five- or six-pointed surface stars are characteristic of this agent (1). These nonenveloped agents were subsequently determined to be positive-strand RNA viruses (2, 3). Five serotypes have been defined to date based on their distinct antigenicity (4). Astroviruses cause acute gastroenteritis in children and adults worldwide (5), but the disease burden has been difficult to determine because of the lack of sensitive diagnostic assays. Recent studies have shown that astroviruses are more frequently found in children with diarrhea than was previously thought (6). In addition, astroviruses have been detected in the diarrheal feces of various animals (5).

Studies of the biochemical properties of purified particles have provided divergent results on the number and size of proteins in astroviruses; two to six polypeptides have been reported, ranging in size from 5.5 kDa to 42 kDa (5). Although the fastidious growth of astroviruses in vitro has hindered characterization of the genome, several investigators (3, 7) have reported partial sequence information from both internal regions and the 3' end of human astrovirus serotype 1 (H-Ast1), and we have recently sequenced and characterized the subgenomic RNA of serotype 2 (H-Ast2; ref. 8). However, the complete sequence and the genomic organization of astroviruses remained unknown, and their classification was tentative.

In the present study, we have sequenced and analyzed the entire genomic RNA of H-Ast2§ and compared the sequence and genomic structure with those of other positive-strand RNA viruses. The results highlight the specific genomic organization of astroviruses and support their classification in a separate virus family.
MATERIALS AND METHODS

Cells and Virus. LLCMK2 cells (ATCC CCL 7.1) were propagated in Earle's minimal essential medium (MEM) supplemented with antibiotics and 10% fetal bovine serum. H-Ast2 was obtained from John Kurtz (Oxford, England) and used to infect LLCMK2 cells in MEM/trypsin at 5 µg/ml as described (2). Virions were partially purified from infected cell lysates by centrifugation through a 30% (wt/vol) sucrose cushion, suspension in TNE buffer [0.05 M Tris (pH 7.5)/0.1 M NaCl/5 mM EDTA]/1% SDS, and extraction with phenol/chloroform. Virion RNA was precipitated with 2 M LiCl and used for both sequencing and PCR assays.

cDNA Synthesis and Sequencing. Single-stranded cDNA was synthesized from virion RNA with Super reverse transcriptase (Molecular Genetics Resources, Tampa, FL) by using primers derived originally from cDNA sequence (8) and subsequently from sequences determined by directly sequencing virion RNA, using a "primer walking" technique. DNA fragments of various lengths were amplified by the PCR assay with Taq polymerase (Perkin-Elmer) and virus-specific primers. Sequences were determined from three sources: virion RNA, PCR DNA, and cDNA clones (8). Virion RNA was directly sequenced by using an RNA sequencing kit (Boehringer Mannheim). Both the PCR DNA and the cloned cDNA were sequenced by using the Sequenase version 2.0 DNA sequencing kit (United States Biochemical). Sequences on both strands of DNA were determined, with each base sequenced at least four times. Sequences were assembled and aligned by using the Genetics Computer Group sequence-analysis package (9), and a consensus sequence was derived.

Sequences of the 5' and 3' ends of the genomic RNA were determined by following the procedure of Lambden et al. (10). Briefly, a synthetic primer 1 was ligated with T4 RNA ligase (GIBCO/BRL) to the 3' ends of virion RNA or of cDNA corresponding to the 5' end of virion RNA. cDNA fragments (400-600 bp) spanning either the 5' or the 3' end were produced by PCR amplification using a primer 2 complementary to primer 1 together with virus-specific primers and were sequenced by using internal primers.

Abbreviations: H-Ast2, human astrovirus serotype 2; RdRp, RNA-dependent RNA polymerase; NLS, nuclear localization signal; ORF, open reading frame; RHDV, rabbit hemorrhagic disease virus.
†To whom reprint requests should be addressed at: Mailstop G04, Centers for Disease Control and Prevention, 1600 Clifton Road, Atlanta, GA 30333.
§The sequence reported in this paper has been deposited in the GenBank data base (accession no. L13745).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

[Figure 1: schematic of the 6797-nt polyadenylylated genome and the subgenomic RNA (nt 4314-6797), drawn against a nucleotide scale. Labels in the drawing: ORF 1a (ending at nt 2842) with MB, Pro, MB, and NLS; RFS; ORF 1b (nt 2773-4329) with Pol; ORF 2 (nt 4325-6712).]

FIG. 1. Genomic organization of human astrovirus. The locations of three ORFs, predicted transmembrane helices (MB), protease (Pro), nuclear localization signal (NLS), ribosomal frameshift structure (RFS), and RNA-dependent RNA polymerase (Pol) are indicated. ORFs 1a and 1b encode a putative nonstructural polyprotein, and ORF 2 codes for a capsid-protein precursor.

Comparative Sequence Analysis. Both nucleotide and deduced amino acid sequences were compared by using the BLAST program (11) and the BLOSUM62 matrix (12). Multiple alignments were done by using the OPTAL or MACAW programs (13, 14). A phylogenetic tree was constructed by using unweighted pair-group clustering with arithmetic averages (UPGMA), neighbor-joining, least-squares (Fitch-Margoliash), and protein-parsimony algorithms as implemented in the PHYLIP package (15).
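As an illustration of the distance-based clustering named under Comparative Sequence Analysis, the following is a minimal sketch of UPGMA in Python. It is not the PHYLIP implementation used in the study; the function name `upgma`, the taxon labels A-D, and the toy distances are all hypothetical and chosen only to show how the size-weighted averaging step works.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA and return a Newick-like nested string.

    names: list of taxon labels (hypothetical labels in the example below).
    dist:  dict mapping frozenset({a, b}) -> pairwise distance.
    """
    clusters = {name: 1 for name in names}  # label -> number of leaves
    d = dict(dist)                          # working copy of the distances
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = f"({a},{b})"
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Distance from the merged cluster to every other cluster is the
        # average of the two old distances, weighted by cluster size.
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


# Toy distance matrix (made-up values, not data from this study).
toy = {
    frozenset(pair): value
    for pair, value in {
        ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
        ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10,
    }.items()
}

print(upgma(["A", "B", "C", "D"], toy))  # (D,(C,(A,B)))
```

The sketch returns only the branching order; branch lengths, tie handling, and bootstrap resampling are left out.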
RESULTS AND DISCUSSION

The genomic RNA of H-Ast2 is 6797 nt in length, excluding 31 adenines [poly(A) tail] at the 3' end. The genome possesses three overlapping open reading frames (ORFs 1a, 1b, and 2; Fig. 1). The sequences surrounding the first AUG codons of ORFs 1a and 2 are predicted to be optimal for initiating translation (16). ORF 1a is preceded by 82 untranslated nucleotides and encodes a polypeptide of 920 aa. Interestingly, ORF 1b, which overlaps ORF 1a by 70 nt, is in reading frame +1, and its first AUG codon, which is predicted to be weak, is located 380 nt downstream of the ORF 1a termination codon. ORF 2, present also in the subgenomic RNA, overlaps ORF 1b by 5 nt, begins with an initiation codon at nt 4325, and ends with a stop codon 82 bases from the 3' end. ORF 2 codes for a capsid-protein precursor of 796 aa with a predicted molecular mass of 88 kDa (8).

The existence of two separate ORFs (1a and 1b) located in two different reading frames prompted us to examine the 70-nt overlap region in more detail. A potential ribosomal frameshift signal was identified, consisting of the "shifty" heptanucleotide (AAAAAAC) from position 2791 to 2797, followed by a stem-loop structure that may form a pseudoknot with a downstream sequence (Fig. 2A). The putative frameshift signal of the astrovirus showed a striking resemblance to those at the gag-pro junction of some retroviruses, such as mouse mammary tumor virus (Fig. 2B), and fit perfectly the simultaneous tRNA slippage model of -1 frameshifting described for the synthesis of the gag-related polyproteins (18). Ribosomal frameshifting recently has been shown to be a normal expression mechanism in several groups of positive-strand RNA viruses, namely animal coronaviruses and arteriviruses and plant luteoviruses and dianthoviruses (19-22). However, the putative frameshifting signal of astrovirus was much less similar to the frameshift regions of these viruses than to those of some retroviruses (data not shown). The ribosomal frameshifting during translation of astrovirus RNA probably directs the synthesis of an ORF 1a/1b fusion nonstructural polyprotein of 1416 aa with a predicted molecular mass of 161 kDa.

[Figure 2: predicted stem-loop and pseudoknot structures. (A) Astrovirus, numbered from nt 2790 to 2860; the sequence around the heptanucleotide reads GCCCCAAAAAACUACAAA, with deduced translations ORF 1b: P K K L Q; ORF 1a: A P K N Y K; 1a-1b: A P K K L Q. (B) MMTV; the sequence around the heptanucleotide reads AAUUCAAAAAACUUG, with deduced translations pro: F K K L L; gag: N S K N L *; gag-pro: N S K K L L.]

FIG. 2. The putative ribosomal frameshifting signal in the astrovirus genome. (A) Nucleotide sequence and predicted RNA secondary structure in the overlap region of astrovirus ORFs 1a and 1b. The putative frameshift site ("shifty" heptanucleotide sequence) is underlined, and the termination codon for ORF 1a is boxed. A potential pseudoknot structure was predicted by searching the region downstream of the stem-loop structure for sequences complementary to the loop sequence. Three base pairs may be sufficient for the pseudoknot formation (17), but the formation of a larger "secondary" stem with a noncanonical GA pair (shown by a dotted line) and two additional canonical base pairs is also possible. The deduced amino acid sequences of ORFs 1a, 1b, and 1a-1b surrounding the frameshift site are shown. (B) Nucleotide sequence and predicted RNA secondary structure in the gag-pro overlap region of mouse mammary tumor virus (MMTV) (18) are shown for comparison. The frameshift site, the termination codon, and the RNA pseudoknot are indicated or described as in A.

The nucleotide sequence of the astrovirus genomic RNA and the deduced amino acid sequences of the nonstructural polyprotein and the capsid protein of H-Ast2 were compared with partial sequences available for H-Ast1. Between serotypes, the ribosomal frameshifting region was completely conserved (data not shown), and the amino acid sequence was highly conserved (>90% identical) in a portion of the nonstructural polyproteins¶ (3) but was less conserved (51-56% identical) in the predicted capsid-protein regions (3, 7). Of interest, a region in the C-terminal portion of the nonstructural polyprotein was significantly similar to the putative RNA-dependent RNA polymerases (RdRps) of plant bymoviruses (P = 0.015) and potyviruses (P = 0.095). This region contained the eight conserved motifs typical of the positive-strand RNA virus RdRps, indicating that it belongs to the so-called supergroup I, which includes the polymerases of picornaviruses, caliciviruses, potyviruses, and several other groups of plant viruses (Fig. 3A; refs. 29 and 30).

¶Willcocks, M. M. & Carter, M. J., Third International Symposium on Positive Strand RNA Viruses, Sept. 19-24, 1992, Clearwater, FL, pp. 2-47 (abstr.).

Comparison of the protein sequence of astrovirus with a data base of sequences of other positive-strand RNA viruses identified a region of similarity with RHDV that included the putative catalytic cysteine of the RHDV protease. Using the previously published alignments of chymotrypsin-related proteases of positive-strand RNA viruses, we identified, in the putative protease domain of astrovirus, the conserved segments surrounding the three catalytic amino acid residues and a fourth distal segment implicated in substrate binding (Fig. 3B; ref. 31). In addition, sequences of the putative proteases of H-Ast2, RHDV, and feline calicivirus were aligned. For a 118-residue overlap, an adjusted alignment score of 5.4 SDs above random expectation with an evolutionary distance of 214 was observed, values indicating a genuine evolutionary and functional relationship, given that additional evidence (e.g., conservation of specific functional motifs) is available (32).

An important feature of the putative protease of H-Ast2 is the substitution of serine for the catalytic cysteine found in most positive-strand RNA-virus proteases of superfamily I. Previously, an analogous substitution was found in the putative proteases of sobemoviruses, luteoviruses, and arteriviruses (Fig. 3B; refs. 20, 31, 33, 34). However, the putative protease of H-Ast2 showed less similarity to these viral proteases than to the cysteine proteases of caliciviruses.

An extensive search of the astrovirus nonstructural polyprotein sequence for motifs defining other conserved domains of positive-strand RNA viruses failed to identify candidate regions for an RNA helicase, methyltransferase, or papain-like protease (35-37). Absence of the helicase domain is remarkable because this domain has always been identified in positive-strand RNA viruses with genomes >6000 nt (38). Absence of the methyltransferase domain suggested that astrovirus may encode VPg, a viral protein covalently linked to the 5' end of the viral genome (39, 40), a conjecture compatible with the affinity of the putative H-Ast2 polymerase to supergroup I RdRps, which mostly belong to VPg-containing viruses (29, 30).

Additional features detected by computer analysis of the nonstructural polyprotein of H-Ast2 included four transmembrane α-helices and an NLS (Fig. 1). The transmembrane helices were located in the region upstream of the protease and may be involved in anchoring the viral RNA replication complex in the membrane, as described for the 3A or 3AB proteins of poliovirus (41). In all positive-strand RNA viruses for which the VPg domain has been localized, it is found within a short region between a (putative) transmembrane segment and the protease (E.V.K., unpublished data) and is linked to the 5' end of the viral RNA by a tyrosine or a serine residue (39, 40). This region of the H-Ast2 polyprotein has no appropriately located tyrosines and has only one serine (Ser-420), suggesting that this serine may be the RNA-linking amino acid of VPg. The NLS, spanning aa 666-682, is identical to that of H-Ast1. This signal may be involved in transport of astrovirus proteins to the nucleus, as substantiated by the fact that astrovirus products were detected by immunofluorescence in the nucleus of bovine astrovirus-infected cells (42). The astrovirus NLS perfectly fits the consensus for the bipartite-signal motif comprising two clusters of basic amino acid residues separated by a 10-aa spacer region (43). In a curious analogy, both the protease and the RdRp of potyviruses contain similar NLSs and are accumulated in the nuclei of infected plant cells (44, 45).

Although the data base screening failed to detect other sequences significantly similar to the capsid protein of H-Ast2, direct comparison of this capsid sequence with the sequences of other positive-strand RNA virus capsid proteins identified a domain (positions 107-286) conserved with hepatitis E virus (positions 159-337), an agent phylogenetically remote from astrovirus and other supergroup I viruses in terms of the comparison of RdRps and the other principal nonstructural domains (data not shown; ref. 46). Because both astrovirus and hepatitis E virus replicate in the human gut, this conserved domain might have resulted from a recombinational event during coinfection. Of interest, astrovirus-like particles have been reported (47) in association with fatal hepatitis in ducklings, suggesting a possible hepatic tropism for this virus.

To gain further insight into the evolutionary relationship of astroviruses, we generated a tentative phylogenetic tree (15) for the supergroup I RdRps, including the H-Ast2 sequence. The result showed that astroviruses constitute a distinct evolutionary lineage not closely associated with any other group of viruses (Fig. 4). We are inclined to interpret the relatively high similarity with the RdRps of bymoviruses and potyviruses (see above) as conservation of ancestral features rather than direct evidence of common origin.

Our data show that astrovirus has no close relatives among other viruses, as demonstrated by comparative sequence
analysis, and that its genomic organization is distinctive among animal viruses. Astrovirus can be distinguished from other positive-strand, nonenveloped RNA viruses (Picornaviridae, Caliciviridae, and hepatitis E virus) by the presence of the ribosomal frameshift and the lack of a helicase domain. Otherwise, the closest similarity in genomic organization is with caliciviruses, which differ from astrovirus in that ORF 2 encoding the capsid protein is separated from the 3' end by a small ORF 3; picornaviruses differ by the 5' terminal localization of the capsid-protein genes and by the absence of subgenomic RNA; and hepatitis E virus has a distinct array of nonstructural genes, even though the gene encoding the structural protein is similarly localized at the 3' end (46, 48-50). It is remarkable, however, that astroviruses combine features typical of several very different groups of positive-strand RNA viruses and even retroviruses (the frameshift signal). Of special interest is the similarity of the genomic organization and expression strategy of astrovirus and plant luteoviruses (51). Both groups of viruses lack the helicase domain, whereas the protease and the polymerase domains are apparently fused via ribosome frameshifting. Moreover, both share the substitution of serine for the catalytic cysteine in the viral proteases.

The present findings strongly support the classification of astroviruses in another family, Astroviridae. The availability of sequence information will be useful in the development of sensitive diagnostic assays to further our understanding of the importance of this group of viruses as a cause of disease in humans and animals.

[Figure 3: multiple sequence alignments; the residue-level alignment blocks are not recoverable from this copy. (A) Polymerase motifs I-VIII with a consensus line; rows, in order: H-Ast, BaYMV, TEV, FCV, RHDV, SRSV, PV, EMCV, FMDV, ECHO22, HAV, PLRV, SBMV. (B) Protease segments for the same viruses, regrouped with H-Ast next to FCV, RHDV, and SRSV.]

FIG. 3. Amino acid sequence alignment of the predicted functional domains of astrovirus with related domains of other positive-strand RNA viruses. (A) Putative RNA-dependent RNA polymerases. The designation of the motifs has been described (23). The consensus shows amino acid residues that are conserved in at least 80% of the polymerases of supergroup I (23-25). U, bulky aliphatic residue (I, L, M, V); @, aromatic residue (F, Y, W); &, bulky hydrophobic residue (aliphatic or aromatic); and *, any residue. Residues conserved in the (putative) polymerases of all positive-strand RNA viruses of eukaryotes are highlighted by boldface type. Stars denote identical residues, and colons denote similar residues in the sequences of the (putative) polymerases of H-Ast2 and barley yellow mosaic virus (BaYMV). The alignment was generated by the MACAW program (14) using the available information on conserved motifs in viral polymerases (26). Distances between the aligned conserved motifs and from the protein (or the polyprotein for astroviruses and caliciviruses) termini are indicated. The sequences were from GenBank (26-28). TEV, tobacco etch potyvirus; RHDV, rabbit hemorrhagic disease virus; FCV, feline calicivirus; SRSV, small round structured virus; FMDV, foot-and-mouth-disease virus (type O1K); EMCV, encephalomyocarditis virus; PV, poliovirus (type 1); HAV, hepatitis A virus; ECHO22, echovirus (type 22); SBMV, southern bean mosaic sobemovirus; and PLRV, potato leafroll luteovirus. (B) Putative chymotrypsin-like proteases. The same set of sequences was used as in A, but the sequences were regrouped to show the closer similarity between the putative proteases of the astrovirus and the caliciviruses. *, Putative catalytic residues; !, residues implicated in substrate binding (13); 3Cl, 3C-like protease (after 3C protease of picornaviruses); NI, nuclear inclusion protein. Other designations, abbreviations, the procedure for alignment generation, and the sources of the sequences are as in A.

FIG. 4. Tentative phylogenetic relationship for RNA-dependent RNA polymerases of supergroup I positive-strand RNA viruses. The same set of sequences was analyzed as for Fig. 3 A and B. The dendrogram was derived by majority-rule consensus of 100 bootstrapped trees generated using the KITSCH program of the PHYLIP package, which implements the distance-matrix algorithm of Fitch and Margoliash under the evolutionary-clock assumption. The other algorithms applied (15) produced very similar tree topologies, except for the protein-parsimony algorithm, which suggested grouping of the astrovirus polymerase with the bymovirus and potyvirus polymerases. For abbreviations, see Fig. 3A legend.
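The -1 frameshift proposed in the Results for the ORF 1a/1b overlap can be illustrated by a conceptual translation of the 18-nt stretch shown in Fig. 2A. The Python sketch below is not the authors' analysis code. The helper names `translate` and `minus_one_fusion` are hypothetical, and the slip position (immediately after the first AAA codon of the heptanucleotide) is an assumption chosen so that the output reproduces the 1a-1b line of Fig. 2A; the sketch makes no claim about which tRNA actually slips.

```python
# Standard genetic code, built from the conventional UCAG ordering.
BASES = "UCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(rna, start=0):
    """Translate complete codons from `start`, stopping at a stop codon."""
    protein = []
    for i in range(start, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def minus_one_fusion(rna, slip_at):
    """Read frame 0 up to index `slip_at`, then back up 1 nt and continue.

    `slip_at` is the 0-based index just past the last codon decoded in
    frame 0 (an assumption of this sketch, not a measured quantity).
    """
    return translate(rna[:slip_at]) + translate(rna, slip_at - 1)


# Fragment around the heptanucleotide AAAAAAC, as printed in Fig. 2A.
fragment = "GCCCCAAAAAACUACAAA"

print(translate(fragment, 0))         # ORF 1a frame: APKNYK
print(translate(fragment, 2))         # ORF 1b frame: PKKLQ
print(minus_one_fusion(fragment, 9))  # fusion:       APKKLQ
```

Reading the same fragment in the two frames gives the ORF 1a and ORF 1b peptides, and switching frames once inside the heptanucleotide joins them into a single product, which is the sense in which the frameshift brings the protease and polymerase into one polyprotein.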
We thank Daniel Bradley, Jon Gentsch, and John O'Connor for critical reading of the manuscript.
1. Madeley, C. R. & Cosgrove, B. P. (1975) Lancet ii, 124.
2. Monroe, S. S., Stine, S. E., Gorelkin, L., Herrmann, J. E., Blacklow, N. R. & Glass, R. I. (1991) J. Virol. 65, 641-648.
3. Matsui, S. M., Kim, J. P., Greenberg, H. B., Young, L. V. M., Smith, L. S., Lewis, T. L., Herrmann, J. E., Blacklow, N. R., Dupis, K. & Reyes, G. R. (1993) J. Virol. 67, 1712-1715.
4. Kurtz, J. B. & Lee, T. W. (1984) Lancet i, 1405.
5. Greenberg, H. B. & Matsui, S. M. (1992) Infect. Agents Dis. 1, 71-91.
6. Herrmann, J. E., Taylor, D. N., Echeverria, P. & Blacklow, N. R. (1991) N. Engl. J. Med. 324, 1757-1760.
7. Willcocks, M. M. & Carter, M. J. (1992) Arch. Virol. 124, 279-289.
8. Monroe, S. S., Jiang, B., Stine, S. E., Koopmans, M. & Glass, R. I. (1993) J. Virol. 67, 3611-3614.
9. Devereux, J., Haeberli, P. & Smithies, O. (1984) Nucleic Acids Res. 12, 387-395.
10. Lambden, P. R., Cooke, S. J., Caul, E. O. & Clarke, I. N. (1992) J. Virol. 66, 1817-1822.
11. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990) J. Mol. Biol. 215, 403-410.
12. Henikoff, S. & Henikoff, J. G. (1992) Proc. Natl. Acad. Sci. USA 89, 10915-10919.
13. Gorbalenya, A. E., Blinov, V. M., Donchenko, A. P. & Koonin, E. V. (1989) J. Mol. Evol. 28, 256-268.
14. Schuler, G. D., Altschul, S. F. & Lipman, D. J. (1991) Proteins 9, 180-190.
15. Felsenstein, J. (1989) Cladistics 5, 164-166.
16. Kozak, M. (1991) J. Biol. Chem. 266, 19867-19870.
17. Pleij, C. W. (1990) Trends Biochem. Sci. 15, 143-147.
18. Chamorro, M., Parkin, N. & Varmus, H. E. (1992) Proc. Natl. Acad. Sci. USA 89, 713-717.
19. Brierley, I., Digard, P. & Inglis, S. C. (1989) Cell 57, 537-547.
20. den Boon, J. A., Snijder, E. J., Chirnside, E. D., de Vries, A. A., Horzinek, M. C. & Spaan, W. J. (1991) J. Virol. 65, 2910-2920.
21. Xiong, Z., Kim, K. H., Kendall, T. L. & Lommel, S. A. (1993) Virology 193, 213-221.
22. Prufer, D., Tacke, E., Schmitz, J., Kull, B., Kaufmann, A. & Rohde, W. (1992) EMBO J. 11, 1111-1117.
23. Eggen, R. & van Kammen, A. (1988) in RNA Genetics, eds. Ahlquist, P., Holland, J. & Domingo, E. (CRC, Boca Raton, FL), Vol. 1, pp. 49-69.
24. Hellen, C. U., Krausslich, H. G. & Wimmer, E. (1989) Biochemistry 28, 9881-9890.
25. Palmenberg, A. C. (1990) Annu. Rev. Microbiol. 44, 603-623.
26. Koonin, E. V. (1991) J. Gen. Virol. 72, 2197-2206.
27. Hyypia, T., Horsnell, C., Maaronen, M., Khan, M., Kalkkinen, N., Auvinen, P., Kinnunen, L. & Stanway, G. (1992) Proc. Natl. Acad. Sci. USA 89, 8847-8851.
28. Meyers, G., Wirblich, C. & Thiel, H. J. (1991) Virology 184, 664-676.
29. Dolja, V. V. & Carrington, J. C. (1992) Semin. Virol. 3, 315-326.
30. Koonin, E. V. & Dolja, V. V. (1993) Crit. Rev. Biochem. Mol. Biol. 28, 375-430.
31. Gorbalenya, A. E., Donchenko, A. P., Blinov, V. M. & Koonin, E. V. (1989) FEBS Lett. 243, 103-114.
32. Doolittle, R. F. (1986) Of URFs and ORFs (University Science Books, Mill Valley, CA).
33. Gorbalenya, A. E., Koonin, E. V., Blinov, V. M. & Donchenko, A. P. (1988) FEBS Lett. 236, 287-290.
34. Bazan, J. F. & Fletterick, R. J. (1989) FEBS Lett. 249, 5-7.
35. Gorbalenya, A. E., Koonin, E. V., Donchenko, A. P. & Blinov, V. M. (1989) Nucleic Acids Res. 17, 4713-4730.
36. Gorbalenya, A. E., Koonin, E. V. & Lai, M. M. (1991) FEBS Lett. 288, 201-205.
37. Rozanov, M. N., Koonin, E. V. & Gorbalenya, A. E. (1992) J. Gen. Virol. 73 (8), 2129-2134.
38. Gorbalenya, A. E. & Koonin, E. V. (1989) Nucleic Acids Res. 17, 8413-8440.
39. Vartapetian, A. B. & Bogdanov, A. A. (1987) Prog. Nucleic Acid Res. Mol. Biol. 34, 209-251.
40. Wimmer, E. (1982) Cell 28, 199-201.
41. Giachetti, C., Hwang, S. S. & Semler, B. L. (1992) J. Virol. 66, 6045-6057.
42. Aroonprasert, D., Fagerland, J. A., Kelso, N. E., Zheng, S. & Woode, G. N. (1989) Vet. Microbiol. 19, 113-125.
43. Dingwall, C. & Laskey, R. A. (1991) Trends Biochem. Sci. 16, 478-481.
44. Carrington, J. C., Freed, D. D. & Leinicke, A. J. (1991) Plant Cell 3, 953-962.
45. Li, X. H. & Carrington, J. C. (1993) Virology 193, 951-958.
46. Koonin, E. V., Gorbalenya, A. E., Purdy, M. A., Rozanov, M. N., Reyes, G. R. & Bradley, D. W. (1992) Proc. Natl. Acad. Sci. USA 89, 8259-8263.
47. Gough, R. E., Collins, M. S., Borland, E. & Keymer, L. F. (1984) Vet. Rec. 114, 279.
48. Lambden, P. R., Caul, E. O., Ashley, C. R. & Clarke, I. N. (1993) Science 259, 516-519.
49. Kitamura, N., Semler, B. L., Rothberg, P. G., Larsen, G. R., Adler, C. J., Dorner, A. J., Emini, E. A., Hanecak, R., Lee, J. J., van der Werf, S., Anderson, C. W. & Wimmer, E. (1981) Nature (London) 291, 547-553.
50. Tam, A. W., Smith, M. M., Guerra, M. E., Huang, C.-C., Bradley, D. W., Fry, K. E. & Reyes, G. R. (1991) Virology 185, 120-131.
51. Bazan, J. F. & Fletterick, R. J. (1990) Semin. Virol. 1, 311-322.