Materials and Methods Virus and Cell Culture · 2003. 5. 29. · M93390], porcine hemagglutinating...
Materials and Methods
Virus and Cell Culture. The newly recognized coronavirus that is associated
with severe acute respiratory syndrome (SARS-CoV, Urbani strain) was isolated
on Vero cells from the throat washings of a patient who was exposed to the virus
in Vietnam and subsequently died from progressive respiratory failure (6). RNA
was purified from infected Vero cells by the guanidinium acid-phenol method and
used for all subsequent experiments.
Reverse Transcription-Polymerase Chain Reaction (RT-PCR) and
Sequencing. The complete sequence of the genome of SARS-CoV was
determined using a combination of techniques. Most of the sequence was
derived from RT-PCR products that were amplified directly from viral RNA.
Initially, degenerate, inosine-containing primers were designed to anneal to sites
encoding conserved amino acid motifs that were identified on the basis of
alignments of available coronavirus ORF1a, ORF1b, S, HE, M, and N gene
sequences. Additional specific primers were designed as sequences were
generated from RT-PCR products amplified with the degenerate primers and as
SARS-CoV sequences became available from the World Health Organization
Laboratory Network (3, 6). In all cases, the RT-PCR products were gel-isolated
and purified for sequencing by means of a QIAquick Gel Extraction kit (Qiagen,
Inc., Santa Clarita, CA). Both strands were sequenced by automated methods,
using fluorescent dideoxy-chain terminators (Applied Biosystems; Foster City,
CA).
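As a rough illustration of why primers aimed at conserved amino acid motifs need to be degenerate, the sketch below counts how many distinct nucleotide sequences could encode a short motif under the standard genetic code. The motif in the example is the conserved N-protein motif shown in Figure S3; it is used only for illustration and is not claimed to be a site used for primer design. The function name is hypothetical.

```python
# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_AA = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def backtranslation_degeneracy(motif: str) -> int:
    """Count the nucleotide sequences that could encode an amino acid motif."""
    total = 1
    for aa in motif.upper():
        total *= CODONS_PER_AA[aa]  # raises KeyError for non-standard residues
    return total


# Illustrative motif only (see Figure S3).
print(backtranslation_degeneracy("FYYLGTGP"))
```

Motifs rich in single- or two-codon residues (M, W, F, Y) keep this number small, which is one reason such sites are attractive primer targets; inosine at highly ambiguous positions reduces the number of distinct oligonucleotides further.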
For RT-PCR products of less than 3 kb, cDNA was synthesized in a 20-µl
reaction mixture containing 500 ng of RNA, 200 U of Superscript II reverse
transcriptase (Invitrogen Life Technologies, Carlsbad, CA), 40 U of RNasin
(Promega Corp., Madison, WI), 100 µM each dNTP (Roche Molecular
Biochemicals, Indianapolis, IN), 4 µl of 5X reaction buffer (Invitrogen Life
Technologies), and 200 pmol of the reverse primer. The reaction mixture, except
for the reverse transcriptase, was heated to 70°C for 2 minutes, cooled to 4°C for
5 minutes and then heated to 42°C in a thermocycler. The mixture was held at
42°C for 4 minutes, and then the reverse transcriptase was added, and the
reactions were incubated at 42°C for 45 minutes. Two microliters of the cDNA
reaction was used in a 50-µl PCR reaction containing 67 mM Tris-HCl (pH 8.8), 1
µM each primer, 17 mM ammonium sulfate, 6 µM EDTA, 2 mM MgCl2, 200 µM
each dNTP, and 2.5 U of Taq DNA polymerase (Roche Molecular Biochemicals).
The thermocycler program for the PCR consisted of 40 cycles of denaturation at
95°C for 30 seconds, annealing at 42°C for 30 seconds, and extension at 65°C
for 30 seconds. For specific primers, the annealing temperature was increased
to 55°C.
For amplification of fragments longer than 3 kb, regions of the genome
between sections of known sequence were amplified by means of a long RT-
PCR protocol and SARS-CoV-specific primers. First-strand cDNA synthesis was
performed at 42°C or 50°C using Superscript II RNase H⁻ reverse transcriptase
(Invitrogen Life Technologies) according to the manufacturer’s instructions with
minor modifications. Coronavirus-specific primers (500 ng) and SARS-CoV RNA
(350 ng) were combined with the PCR Nucleotide Mix (Roche Molecular
Biochemicals, Mannheim, Germany), heated for 1 minute at 94°C, and cooled to
4°C in a thermocycler. The 5X first-strand buffer, dithiothreitol (Invitrogen), and
Protector RNase Inhibitor (Roche Molecular Biochemicals) were added, and the
samples were incubated at 42°C or 50°C for 2 minutes. After reverse
transcriptase (200 U) was added, the samples were incubated at 42°C or 50°C
for 1.5 to 2 hours. Samples were inactivated at 70°C for 15 minutes and
subsequently treated with 2 U of RNase H (Roche Molecular Biochemicals) at
37°C for 30 minutes. Long RT-PCR amplification of 5- to 8-kb fragments was
performed using Taq Plus Precision (Stratagene, La Jolla, CA) and AmpliWax
PCR Gem 100 beads (Applied Biosystems) for “hot start” PCR with the following
thermocycling parameters: denaturation at 94°C for 1 minute followed by 35
cycles of 94°C for 30 seconds, 55°C for 30 seconds, an increase of 0.4 degrees
per second up to 72°C, and 72°C for 7 to 10 minutes, with a final extension at
72°C for 10 minutes. RT-PCR products were separated by electrophoresis on
0.9% agarose TAE gels and purified by use of a QIAquick Gel Extraction Kit
(Qiagen, Inc).
The sequence of the leader was obtained from the subgenomic mRNA
coding for the N gene and from the 5’ terminus of genomic RNA. The 5’ rapid
amplification of cDNA ends (RACE) technique (4) was used with reverse primers
specific for the N-gene or for the 5’ untranslated region. RACE products were
either sequenced directly or were cloned into a plasmid vector before
sequencing. A primer that was specific for the leader of SARS-CoV was used to
amplify the region between the 5’ terminus of the genome and known sequences
in the rep gene. The 3’ terminus of the genome was amplified for sequencing by
use of an oligo-(dT) primer and primers specific for the N gene.
Once the complete genomic sequence had been determined, it was
confirmed by sequencing a series of independently amplified RT-PCR products
spanning the entire genome. Positive- and negative-sense sequencing primers,
at intervals of approximately 300 nt, were used to generate a confirmatory
sequence with an average redundancy of 9.1. The confirmatory sequence was
identical to the original sequence. The sequence has been deposited in the
GenBank sequence database (accession no. AY278741). The sequences of the
primers used for sequencing and RT-PCR are available upon request.
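Average redundancy is usually taken to mean the total number of sequenced bases divided by the genome length. The following is a minimal sketch of that calculation, assuming each sequencing read is described by hypothetical (start, end) genome coordinates (1-based, inclusive). The reads in the example are invented for illustration and are not the authors' data.

```python
def average_redundancy(reads, genome_length):
    """Return (mean per-base depth, minimum per-base depth).

    reads: iterable of (start, end) pairs, 1-based and inclusive.
    """
    depth = [0] * genome_length
    for start, end in reads:
        if not 1 <= start <= end <= genome_length:
            raise ValueError(f"read ({start}, {end}) lies outside the genome")
        for pos in range(start - 1, end):
            depth[pos] += 1
    return sum(depth) / genome_length, min(depth)


# Toy example: a 10-nt "genome" covered once in full and once by two half-length reads.
mean_depth, min_depth = average_redundancy([(1, 10), (1, 5), (6, 10)], 10)
print(mean_depth, min_depth)
```

Reporting the minimum depth alongside the mean is a design choice of this sketch: a high average can hide regions that were covered only once.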
Microarray Design and DNA Recovery. N gene sequences for SARS-CoV
were also obtained using a DNA microarray that contains approximately 11,000
70-mer oligonucleotides representing all complete, previously described
reference viral genome sequences available from the National Center for
Biotechnology Information, National Library of Medicine (7). Total nucleic acid
was amplified from infected cell RNA by using a random-primer protocol as
described (1, 7) with the following modifications: first-strand synthesis was
primed by using primer A (5’-GTTTCCCAGTCACGATCNNNNNNNNN) followed
by PCR amplification with primer B (5’-GTTTCCCAGTCACGATC) for 40 cycles.
Aminoallyl-dUTP was incorporated into the PCR product by using an additional
20 cycles of thermocycling. Microarray spots were visualized by fluorescence
microscopy (Nikon TE300). Amplified viral DNA hybridized to individual
microarray spots was recovered by means of a tungsten wire probe (Omega
Engineering, Inc.) mounted on a micromanipulator to scrape a 100-µm area of
the microarray. Recovered material was PCR-amplified with primer B and
subsequently cloned and sequenced.
Sequence Analyses. Predicted amino acid sequences were compared with
those from reference viruses representing each species for which complete
genomic sequence information was available (group 1 representatives included
human coronavirus 229E [GenBank accession no. AF304460], porcine epidemic
diarrhea virus [GenBank accession no. AF353511], and transmissible
gastroenteritis virus [GenBank accession no. AJ271965]; group 2
representatives included bovine coronavirus [GenBank accession no. AF220295]
and mouse hepatitis virus [GenBank accession no. AF201929]; group 3 was
represented by infectious bronchitis virus [GenBank accession no. M95169]).
Sequences for representative strains of other coronavirus species for which
partial sequence information was available were included for some of the
structural protein comparisons (group 1 representative strains included canine
coronavirus [GenBank accession no. D13096], feline coronavirus [GenBank
accession no. AY204704], and porcine respiratory coronavirus [GenBank
accession no. Z24675]; and group 2 representatives included three strains of
human coronavirus OC43 [GenBank accession nos. M76373, L14643, and
M93390], porcine hemagglutinating encephalomyelitis virus [GenBank accession
no. AY078417], and rat coronavirus [GenBank accession no. AF207551]).
Sequence alignments and neighbor-joining trees were generated by using
ClustalX (5), version 1.83, with the Gonnet protein comparison matrix. The
resulting trees were adjusted for final output by using treetool version 2.0.1.
Uncorrected pairwise distances were calculated from the aligned sequences by
using the Distances program from the Wisconsin Sequence Analysis Package,
version 10.2 (Accelrys, Burlington, MA). Distances, expressed as percentages,
were converted to percent identity by subtracting them from 100.
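The conversion from uncorrected distance to percent identity can be sketched as follows. This is not the Distances program itself: the function names are hypothetical, and the handling of gaps (columns with a gap in either sequence are skipped) is an assumption, since the gap treatment used by the authors is not described here.

```python
def uncorrected_distance(seq_a, seq_b, gap_chars="-."):
    """Percentage of compared alignment columns that differ (no multiple-hit correction)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # assumption: ignore columns with a gap in either sequence
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * differing / compared


def percent_identity(seq_a, seq_b):
    """Percent identity = 100 minus the uncorrected distance (in percent)."""
    return 100.0 - uncorrected_distance(seq_a, seq_b)


# Two eight-residue aligned fragments differing at one position.
print(percent_identity("GYWNRQTR", "GYWNRQIR"))
```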
Poly(A)+ RNA Isolation and Northern Hybridization. Total RNA from infected
or uninfected Vero cells was isolated with Trizol reagent (Invitrogen Life
Technologies) used according to the manufacturer’s recommendations. Poly(A)+
RNA was isolated from total RNA by use of the Oligotex Direct mRNA Kit
(Qiagen) according to the instructions for the batch protocol, followed by ethanol
precipitation. RNA isolated from 1 cm² of cells was separated by electrophoresis
on a 0.9% agarose gel containing 3.7% formaldehyde, followed by partial
alkaline hydrolysis (2). RNA was transferred to a nylon membrane (Roche
Molecular Biochemicals) by vacuum blotting (Bio-Rad, Hercules, CA) and fixed
by UV cross-linking. The DNA template for probe synthesis was generated by
RT-PCR amplification of SARS-CoV nt 29,083 to 29,608, by using a reverse
primer containing a T7 RNA polymerase promoter to facilitate generation of a
negative-sense riboprobe. In vitro transcription of the digoxigenin-labeled
riboprobe, hybridization, and detection of the bands were carried out with the
digoxigenin system by using the manufacturer’s recommended procedures (Roche
Molecular Biochemicals). Signals were visualized by chemiluminescence and
detected with x-ray film.
[Figure S1 panels. In each panel the horizontal axis is HCoV-SARS and the vertical axis is the comparison virus. Axis ranges (amino acid positions):
S: HCoV-SARS 1 to 1,256; HCoV-229E 1 to 1,174; TGEV 1 to 1,449; PEDV 1 to 1,384; MHV 1 to 1,362; BCoV 1 to 1,364; IBV 1 to 1,163
E: HCoV-SARS 1 to 77; HCoV-229E 1 to 78; TGEV 1 to 83; PEDV 1 to 77; MHV 1 to 89; BCoV 1 to 85; IBV 1 to 109
M: HCoV-SARS 1 to 222; HCoV-229E 1 to 226; TGEV 1 to 263; PEDV 1 to 227; MHV 1 to 229; BCoV 1 to 231; IBV 1 to 226
N: HCoV-SARS 1 to 423; HCoV-229E 1 to 390; TGEV 1 to 383; PEDV 1 to 442; MHV 1 to 452; BCoV 1 to 449; IBV 1 to 410
pp1ab: HCoV-SARS 1 to 7,074; HCoV-229E 1 to 6,789; TGEV 1 to 6,695; PEDV 1 to 6,792; MHV 1 to 7,132; BCoV 1 to 7,060; IBV 1 to 6,640]
Figure S1. Identification of conserved regions of coronavirus proteins. The predicted SARS-CoV proteins (S, E, M, N, and pp1ab) were compared to the corresponding proteins from each of six reference viruses for which complete genomic sequence information was available (Group 1: human coronavirus 229E [HCoV-229E], af304460; porcine epidemic diarrhea virus [PEDV], af353511; transmissible gastroenteritis virus [TGEV], aj271965. Group 2: bovine coronavirus [BCoV], af220295; murine hepatitis virus [MHV], af201929. Group 3: infectious bronchitis virus [IBV], m95169) using the compare program of the Wisconsin Sequence Analysis Package version 10.2 (Accelrys, Burlington, MA). A sliding window of 30 amino acids was used for each comparison, with the stringency set in proportion to the pairwise identity. In each panel, the SARS-CoV sequence is depicted along the horizontal axis and the comparison sequence is depicted along the vertical axis. (A) Comparison of coronavirus S proteins, stringency = 15. (B) Comparison of coronavirus E proteins, stringency = 15. (C) Comparison of coronavirus M proteins, stringency = 24. (D) Comparison of coronavirus N proteins, stringency = 20. (E) Comparison of coronavirus pp1ab proteins, stringency = 26.
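The sliding-window comparison behind a dot plot of this kind can be sketched as below. The compare program scores each window with an amino acid comparison table; this sketch swaps in a simple count of identical residues, so its stringency values are not comparable to those given in the caption. The function name and the example fragments (taken from the N-protein alignment in Figure S3) are chosen only for illustration.

```python
def dot_plot(seq_x, seq_y, window=30, stringency=15):
    """Return (i, j) window starts where the two windows share >= stringency identities.

    i indexes seq_x (horizontal axis), j indexes seq_y (vertical axis); both 0-based.
    Runs in O(len(seq_x) * len(seq_y) * window) time, which is fine for a sketch.
    """
    hits = []
    for i in range(len(seq_x) - window + 1):
        for j in range(len(seq_y) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_x[i + k] == seq_y[j + k]
            )
            if matches >= stringency:
                hits.append((i, j))
    return hits


# Short illustrative fragments; a 10-residue window requiring 7 identities.
sars_fragment = "LSPRWYFYYLGTGPEASLPY"
ibv_fragment = "VPDAWYFYYTGTGPAADLNW"
for i, j in dot_plot(sars_fragment, ibv_fragment, window=10, stringency=7):
    print(i, j)
```

Each reported pair would be drawn as one dot; runs of dots along a diagonal indicate a conserved region.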
[Figure S2 tracks, each with a position axis in steps of 100 or 200; virus labels: BCoV, HCoV-OC43, HEV, MHV, RtCoV, SARS-CoV, PRCoV, TGEV, CCoV, FCoV, PEDV, HCoV-229E, IBV]
Figure S2. Predicted alpha-amphipathic regions in the coronavirus S proteins. Alpha-amphipathic regions were calculated according to Eisenberg. Red boxes represent longer regions of heptad repeat regions, whereas blue boxes show shorter regions. Heptad repeat regions in the carboxyl terminal region of the S protein are thought to collapse into coiled-coils after receptor binding, thus bringing the viral membrane into close proximity with the cellular membrane, leading to fusion.
HCoV-229E  GYWNVQKR..FRTRKGKRVDLSPKLHFYYLGTGPHKDAKFRERVEGVVWVA
PEDV       GYWNEQIR..WRMRRGERIEQPSNWHFYYLGTGPHGDLRYRTRTEGVFWVA
TGEV       GYWNRQTR..YRMVKGQRKELPERWFFYYLGTGPHADAKFKDKLDGVVWVA
CCoV       GYWNRQTR..YRMVKGRRKNLPEKWFFYYLGTGPHADAKFKQKLDGVVWVA
FCoV       GYWNRQIR..YRIVKGQRKELAERWFFYFLGTGPHADAKFKDKIDGVFWVA
PRCoV      GYWNRQTR..YRMVKGQRKELPERWFFYYLGTGPHADAKFKDKLDGVVWVA
BCoV       GYWYRHNRRSFKTADGNQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVFWVA
MHV        GYWYRHNRRSFKTPDGQHKQLLPRWYFYYLGTGPHAGAEYGDDIEGVVWVA
HEV        GYWYRHNRRSFKTADGNQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVFWVA
RtCoV      GYWYRHNRRSFKTPDGQQKQLLPRWYFYYLGTGPHAGASFGDSIEGVFWVA
IBV        GYWRRQAR..FKPGKGGRKPVPDAWYFYYTGTGPAADLNWGDTQDGIVWVA
SARS-CoV   GYYRRATRR.VRGGDGKMKELSPRWYFYYLGTGPEASLPYGANKEGIVWVA
Figure S3. Conserved motifs in coronavirus N proteins. The predicted amino acid sequences of the nucleocapsid proteins of the indicated coronaviruses were aligned by Clustalx 1.83. The portion of the aligned sequences around the conserved motif, FYYLGTGP, is shown. Amino acids that were identical in at least 11 of the 12 aligned sequences are highlighted in blue. Sequences used for the alignments included the following for group 1: human coronavirus 229E (HCoV-229E), af304460; porcine epidemic diarrhea virus (PEDV), af353511; transmissible gastroenteritis virus (TGEV), aj271965; canine coronavirus (CCoV), d13096; feline coronavirus (FCoV), ay204704; porcine respiratory coronavirus (PRCoV), z24675; for group 2: bovine coronavirus (BCoV), af220295; murine hepatitis virus (MHV), af201929; porcine hemagglutinating encephalomyelitis virus (HEV), ay078417; rat coronavirus (RtCoV), af207551; for group 3: infectious bronchitis virus (IBV), m95169.
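The highlighting rule in the caption (identity in at least 11 of the 12 aligned sequences) can be checked with a short script. This is a sketch rather than the procedure the authors used: the function name is hypothetical, the sequences are the alignment fragments from Figure S3 with "." as the gap character, and gaps are assumed not to count toward identity.

```python
from collections import Counter

# Alignment fragments from Figure S3 ("." marks a gap).
ALIGNMENT = {
    "HCoV-229E": "GYWNVQKR..FRTRKGKRVDLSPKLHFYYLGTGPHKDAKFRERVEGVVWVA",
    "PEDV":      "GYWNEQIR..WRMRRGERIEQPSNWHFYYLGTGPHGDLRYRTRTEGVFWVA",
    "TGEV":      "GYWNRQTR..YRMVKGQRKELPERWFFYYLGTGPHADAKFKDKLDGVVWVA",
    "CCoV":      "GYWNRQTR..YRMVKGRRKNLPEKWFFYYLGTGPHADAKFKQKLDGVVWVA",
    "FCoV":      "GYWNRQIR..YRIVKGQRKELAERWFFYFLGTGPHADAKFKDKIDGVFWVA",
    "PRCoV":     "GYWNRQTR..YRMVKGQRKELPERWFFYYLGTGPHADAKFKDKLDGVVWVA",
    "BCoV":      "GYWYRHNRRSFKTADGNQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVFWVA",
    "MHV":       "GYWYRHNRRSFKTPDGQHKQLLPRWYFYYLGTGPHAGAEYGDDIEGVVWVA",
    "HEV":       "GYWYRHNRRSFKTADGNQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVFWVA",
    "RtCoV":     "GYWYRHNRRSFKTPDGQQKQLLPRWYFYYLGTGPHAGASFGDSIEGVFWVA",
    "IBV":       "GYWRRQAR..FKPGKGGRKPVPDAWYFYYTGTGPAADLNWGDTQDGIVWVA",
    "SARS-CoV":  "GYYRRATRR.VRGGDGKMKELSPRWYFYYLGTGPEASLPYGANKEGIVWVA",
}


def conserved_columns(alignment, min_count=11, gap="."):
    """Return 0-based columns where one residue occurs in >= min_count sequences."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for col in range(length):
        residues = [s[col] for s in seqs if s[col] != gap]  # assumption: gaps never match
        if residues and Counter(residues).most_common(1)[0][1] >= min_count:
            conserved.append(col)
    return conserved


columns = conserved_columns(ALIGNMENT)
motif_start = ALIGNMENT["SARS-CoV"].index("FYYLGTGP")
print("conserved columns:", columns)
print("motif starts at column", motif_start)
```

In this fragment every column of the FYYLGTGP motif passes the threshold even though FCoV (FYFLGTGP) and IBV (FYYTGTGP) each differ at one position, because each of those columns still has 11 identical residues.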
[Figure S4 gel image; labeled bands: S (180 kD) and N (55 kD)]
Figure S4. SDS-PAGE analysis of purified SARS-CoV virions. SARS-CoV was concentrated from infected Vero cell supernatant medium by precipitation with polyethylene glycol. Virions were purified by centrifugation through a 20-60% sucrose gradient before being subjected to electrophoresis on a 10% SDS-PAGE gel. Proteins were visualized by staining with Coomassie Blue. A preparation of Ebola virus proteins of known molecular weights was included as size markers.
Table S1. Locations of SARS-CoV ORFs and sizes of predicted proteins and mRNAs
                Genome Location                    Predicted Size
ORF    TRS(a)    ORF Start    ORF End    Protein (aa)    mRNA (nt)(b)
1a         72          265     13,398           4,378       29,727
1b                  13,398     21,482           2,695
S      21,491       21,492     25,256           1,255        8,308(c)
X1     25,265       25,268     26,089             274        4,534(c)
X2                  25,689     26,150             154
E                   26,117     26,344              76
M      26,353       26,398     27,060             221        3,446(c)
X3                  27,074     27,262              63
X4     27,272       27,273     27,638             122        2,527(c)
X5     27,778       27,864     28,115              84        2,021(d)
N      28,111       28,120     29,385             422        1,688(c)
a The location is the 3’-most nucleotide in the consensus transcriptional regulatory sequence
(TRS), AAACGAAC.
b Not including poly(A). Predicted size is based on the position of the conserved TRS.
c Corresponding mRNA detected by Northern blot analysis (Fig. 1C)
d No mRNA corresponding to utilization of this consensus TRS was detected by Northern blot
analysis (Fig. 1C)
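The predicted sizes in Table S1 can be reproduced with simple arithmetic. The sketch below rests on two assumptions that are inferred from the table's numbers rather than stated in the text: each subgenomic mRNA is taken to be the leader up to the leader TRS (nt 72) joined to the genome from the body TRS to the 3' end (nt 29,727, without poly(A)), and each protein size is the number of codons between ORF Start and ORF End. Function and constant names are hypothetical.

```python
GENOME_LENGTH = 29_727  # nt, excluding poly(A)
LEADER_TRS = 72         # position of the TRS at the end of the leader


def predicted_mrna_length(body_trs):
    """Leader (up to the leader TRS) plus the genome downstream of the body TRS."""
    return LEADER_TRS + (GENOME_LENGTH - body_trs)


def predicted_protein_length(orf_start, orf_end):
    """Number of codons in the inclusive interval orf_start..orf_end."""
    span = orf_end - orf_start + 1
    if span % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return span // 3


# (TRS, ORF start, ORF end) for the rows of Table S1 that list a TRS.
TABLE_S1 = {
    "1a": (72, 265, 13_398),
    "S": (21_491, 21_492, 25_256),
    "X1": (25_265, 25_268, 26_089),
    "M": (26_353, 26_398, 27_060),
    "X4": (27_272, 27_273, 27_638),
    "X5": (27_778, 27_864, 28_115),
    "N": (28_111, 28_120, 29_385),
}

for orf, (trs, start, end) in TABLE_S1.items():
    print(orf, predicted_protein_length(start, end), predicted_mrna_length(trs))
```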
Table S2. Calculated molecular weights and potential N-linked glycosylation sites of coronavirus S proteins.
Virus        S Protein MW    Potential Glycosylation Sites
                             NXS / NXT / Total
HCoV-229E    128,653          9 / 21 / 30
PEDV         151,371          7 / 22 / 29
TGEV         160,136          6 / 26 / 32
CCoV         160,487          4 / 29 / 33
FCoV         160,489          4 / 31 / 35
PRCoV        134,809          4 / 25 / 29

BCoV         150,889          8 / 11 / 19
MHV          149,861         11 /  9 / 20
HCoV-OC43    150,108         10 / 11 / 21
HEV          149,512         10 /  9 / 19
RtCoV        149,566         11 / 10 / 21

IBV          128,062         12 / 17 / 29

SARS-CoV     137,688          7 / 16 / 23
The predicted molecular weight and consensus glycosylation sites were calculated for representative coronavirus S proteins. Sequences used for the comparisons included the following: for group 1: human coronavirus 229E (HCoV-229E), af304460; porcine epidemic diarrhea virus (PEDV), af353511; transmissible gastroenteritis virus (TGEV), aj271965; canine coronavirus (CCoV), d13096; feline coronavirus (FCoV), ay204704; porcine respiratory coronavirus (PRCoV), z24675; for group 2: bovine coronavirus (BCoV), af220295; murine hepatitis virus (MHV), af201929; human coronavirus OC43 (HCoV-OC43), m76373, l14643, m93390; porcine hemagglutinating encephalomyelitis virus (HEV), ay078417; rat coronavirus (RtCoV), af207551; and for group 3: infectious bronchitis virus (IBV), m95169.
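Counting potential N-linked glycosylation sites of the NXS and NXT types can be sketched with a regular expression. The function name is hypothetical, and excluding proline at the X position follows the usual definition of the sequon; whether the counts in Table S2 applied that exclusion is not stated, so treat it as an assumption.

```python
import re


def count_sequons(protein):
    """Return (NXS, NXT, total) counts of potential N-linked glycosylation sites."""
    # A lookahead is used so that overlapping sequons (for example NNTS) are all counted.
    # Assumption: X may be any residue except proline.
    nxs = len(re.findall(r"N(?=[^P]S)", protein))
    nxt = len(re.findall(r"N(?=[^P]T)", protein))
    return nxs, nxt, nxs + nxt


# Made-up test peptide: NGS and NTS count as NXS, NNT as NXT, NPT is excluded.
print(count_sequons("MNGSANPTNNTS"))
```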
Table S3. Description and comparison of SARS-CoV genomic sequences available in GenBank (as of 25 April 2003).
Source(a)   Strain      Accession     GI         Mod Date      Length   Poly(A)   Unique(b)   5' End(c)
HKU         HKU-39849   AY278491.2    30023963   18-Apr-2003   29,742   15        29,727        0
CUHK        CUHK-W1     AY278554.2    30027610   21-Apr-2003   29,736   24        29,712      -15
CDC         Urbani      AY278741.1    30027617   21-Apr-2003   29,727    0        29,727        0
BCCA GSC    TOR2        AY274119.2    30088476   23-Apr-2003   29,736   24        29,712      -15

Position(d)   Consensus   HKU-39849   CUHK-W1   Urbani   TOR2
 2,601        T           C           *         *        *
 7,746        G           *           T         *        *
 7,919        C           *           *         T        *
 7,930        G           A           *         *        *
 8,387        G           C           *         *        *
 8,417        G           C           *         *        *
 9,404        T           *           C         *        *
 9,479        T           *           C         *        *
13,494        G           A           *         *        *
13,495        T           G           *         *        *
16,622        C           *           *         T        *
17,564        T           *           G         *        *
17,846        C           *           T         *        *
18,065        G           A           *         *        *
19,064        R           A           G         G        A
21,721        G           *           A         *        *
22,222        T           *           C         *        *
23,220        T           *           *         *        G
24,872        T           *           *         C        *
25,298        G           *           *         *        A
25,569        T           A           *         *        *
26,600        C           T           *         *        *
26,857        T           *           *         C        *
27,827        T           *           C         *        *

a Original source of the sequence information: The University of Hong Kong (HKU); Chinese University of Hong Kong (CUHK); US Centers for Disease Control and Prevention (CDC); British Columbia Cancer Agency, Genome Sciences Centre (BCCA GSC).
b Length of the unique sequence without poly(A).
c Number of nucleotides missing from the 5’-end, assuming that the longest reported sequences are full-length.
d Position based on alignment with the two longer sequences.
References
1. S. K. Bohlander et al., Genomics 13, 1322 (1992).
2. T. Brown, in Current Protocols in Molecular Biology, Vol. 1, F. M.
Ausubel, R. Brent, R. E. Kingston, D. D. Moore, J. G. Seidman, J. A.
Smith, K. Struhl, Eds. (John Wiley & Sons, Inc., New York, NY, 1996), Ch. 4.9.
3. C. Drosten et al., N Engl J Med (2003). Available April 17 at
http://nejm.org/earlyrelease/sars.asp#4-2
4. B. H. Harcourt et al., Virology 271, 334 (2000).
5. J. D. Thompson et al., Nucleic Acids Res 25, 4876 (1997).
6. T. G. Ksiazek et al., N Engl J Med 348, 1947 (2003).
7. D. Wang et al., PNAS 99, 15687 (2002).