Analysis of mRNA Poly-A Sequence Variants by High ...

12
Application Note Pharma & Biopharma Author Brian Liau Agilent Technologies, Inc. Introduction The urgency engendered by the SARS-CoV-2 pandemic of 2020 has prompted policy makers and pharmaceutical firms alike to develop and deploy mRNA vaccines with unprecedented speed. mRNA vaccines have shown impressive safety and efficacy in clinical trials 1-4 , outperforming vaccines based on alternative technologies. As mRNA vaccines are considered gene therapies 5 , FDA guidance requires extensive characterization of product-related impurities. These may include populations of mRNA molecules with slight errors in their sequence, known as sequence variants. In addition, mRNA vaccines require lengthy, repetitive sections of A nucleotides (poly-A) at the 3' terminus for optimal stability and biological activity. 6 Both the length and the composition of poly-A sequences are therefore critical quality attributes. Analysis of mRNA Poly-A Sequence Variants by High-Resolution LC/MS

Transcript of Analysis of mRNA Poly-A Sequence Variants by High ...

Page 1: Analysis of mRNA Poly-A Sequence Variants by High ...

Application Note

Pharma & Biopharma

AuthorBrian Liau Agilent Technologies, Inc.

IntroductionThe urgency engendered by the SARS-CoV-2 pandemic of 2020 has prompted policy makers and pharmaceutical firms alike to develop and deploy mRNA vaccines with unprecedented speed. mRNA vaccines have shown impressive safety and efficacy in clinical trials1-4, outperforming vaccines based on alternative technologies. As mRNA vaccines are considered gene therapies5, FDA guidance requires extensive characterization of product-related impurities. These may include populations of mRNA molecules with slight errors in their sequence, known as sequence variants. In addition, mRNA vaccines require lengthy, repetitive sections of A nucleotides (poly-A) at the 3' terminus for optimal stability and biological activity.6 Both the length and the composition of poly-A sequences are therefore critical quality attributes.

Analysis of mRNA Poly-A Sequence Variants by High-Resolution LC/MS

Page 2: Analysis of mRNA Poly-A Sequence Variants by High ...

2

This work uses an Agilent AdvanceBio 6545XT LC/Q-TOF to analyze poly-A tail sequences formed by E. coli Poly-A Polymerase (PAP), which is a common component of in vitro transcription systems. Findings show that PAP is not fully selective for ATP, and can act on both CTP and UTP precursors to incorporate significant quantities of undesirable C and U nucleotides under standard in vitro transcription conditions. As these sequence variants can be regarded as product-related impurities, the results caution against the use of PAP and show the value of LC/MS as a sensitive and efficient method for process optimization and quality control of nucleic acid therapies.

Abbreviations used in this work:

– ATP – adenosine triphosphate

– CTP – cytidine triphosphate

– UTP – uridine triphosphate

– GTP – guanosine triphosphate

– A, C, U, and G nucleotides – adenosine, cytidine, uridine, and guanosine monophosphate

– Poly-A – polyadenosine

– PAP – E. coli Poly-A Polymerase

– RNA-seq – RNA sequencing

Experimental

In-vitro transcription of mRNAA pCMV3 plasmid encoding a 3822 nt gene flanked by an upstream T7 promoter and a downstream BGH terminator sequence was purchased from Sino Biological. The DNA sequence was PCR amplified for 35 cycles using T7 and BGH terminator primers (Agilent Herculase II Fusion DNA Polymerase, part number 600677). After cleanup (Agilent StrataPrep PCR Purification kit, part number 400771), the amplified dsDNA was analyzed on an Agilent 2100 Bioanalyzer with a DNA 7500 kit (part number 5067-1506) to measure its concentration and to assess the uniformity of amplification. The amplified dsDNA (~13 nM) was then transcribed in vitro using a HiScribe T7 ARCA mRNA Kit (New England Biolabs, part number E2060S) and tailed with the included PAP enzyme using the manufacturer’s recommended protocol, then precipitated with LiCl. Aliquots of transcribed mRNA before and after PAP tailing were analyzed on a 2100 Bioanalyzer with an RNA 6000

Nano kit (part number 5064-1511) to monitor the reaction.

For PAP selectivity studies, a synthetic 10-mer poly-A sequence with 5' and 3'-OH (Integrated DNA Technologies) was extended with PAP enzyme using only one precursor nucleoside triphosphate per reaction (1 mM of either ATP, CTP, UTP, or GTP) for 30 minutes at 37 °C, as illustrated in Figure 1A.

Sample preparationApproximately twenty picomoles of poly-A tailed mRNA was digested with 1,000 U of RNase T1 for 3 hours at 37 °C to liberate poly-A sequences. Each sample was subjected to five rounds of purification using 200 µL of oligo-dT magnetic beads to pull down poly-A sequences.7 Each pull-down was eluted in 50 µL of 1x IDTE buffer (Integrated DNA Technologies, part number 11-05-01-05) and pooled into a final volume of 250 µL. Prior to LC/MS analysis, the pooled eluate was desalted into 60 µL of deionized water using Vivaspin 500 cartridges with 10 kDa MWCO (Sartorius, part number VS0102).

Figure 1. Schematic of tailing reactions performed in this application note. (A) Reactions on RNA primer carried out with only one precursor per reaction. (B) Reactions carried out on in vitro transcribed mRNA under standard conditions (all precursors).

Poly-A tailing experiments

RNA primer (A10)+

ATP

CTPUTPGTP

E. Coli Poly-A

Polymerase

A

B

In vitro transcribed mRNA

E. Coli Poly-A

PolymerasePoly(A) +

C, U and G+

ATP

CTPUTPGTP

Page 3: Analysis of mRNA Poly-A Sequence Variants by High ...

3

LC-DAD/MS of poly-A sequencesInstrumentation consisted of:

– 1290 Infinity II LC with diode array detector (P/N G7117B)

– 6545XT AdvanceBio LC/Q-TOF

Care was taken to eliminate glass from the flow path to reduce alkaline metal adduction. Agilent Nalgene bottles (part number 9301-6460) were used as mobile phase containers, and each solvent line was equipped with a steel frit. Agilent polypropylene sample vials were used (part number 5190-2242). Before first use, the LC system and column were flushed with a 50% MeOH + 0.1% formic acid solution overnight to further reduce alkaline metal adducts. If required, a 30-minute flush with 50% MeOH + 0.1% formic acid was usually enough to clean the system between experiments.8

Poly-A sequences were separated on a PLRP-S column (2.1 × 50 mm, 5 µm, 1,000 Å, part number PL1912-1502). To achieve higher chromatographic resolution, an Infinity Poroshell 120 HPH-C18 column (2.1 × 50 mm, 1.9µm, 120 Å, part number  699675-702) was used in PAP selectivity experiments. The mobile phase and LC gradients are shown in Table 1. The mass spectrometer was operated in negative ion mode with settings in Table 2, and data analysis was performed in BioConfirm 10.0 with deconvolution settings in Table 3.

Agilent 1290 Infinity II LC System

Column InfinityLab Poroshell 120 HPH-C18, 1.9 µm, 2.1 × 50 mm,120 Å Agilent PLRP-S, 5 µm, 2.1 × 50 mm, 1,000 Å

Solvent A 15 mM dibutylamine + 25 mM HFIP in DI water

Solvent B 15 mM dibutylamine + 25 mM HFIP in methanol

Gradient0 to 2 min, 15% B 12 min, 30% B 12.1 to 13 min, 90% B

0 to 1 min, 15% B 10.5 min, 45% B 10.6 to 11.5 min, 90% B

Column Temperature 50 °C 80 °C

Flow Rate 0.4 mL/min

Injection Volume 10 to 20 µL

Table 1. Mobile phase and LC gradients.

Table 2. Mass spectrometer settings.

Agilent 6545XT AdvanceBio LC/Q-TOF

LC/MS LC/MS/MS

Acquisition Mode Negative, standard (3,200 m/z) mass range, high sensitivity (2 Ghz)

Gas Temperature 350 °C

Gas Flow 12 L/min

Nebulizer 55 psig

Sheath Gas Temperature 275 °C

Sheath Gas Flow 10 L/min

Vcap 4,500 V

Nozzle Voltage 2,000 V

Fragmentor 250 V

Skimmer 65 V

MS1 Range 400 to 3,200 m/z

MS1 Scan Rate 2 Hz 5 Hz

MS2 Range

N/A

50 to 3,200 m/z

MS2 Scan Rate 3 Hz

MS2 Isolation Width Medium (~4 amu)

Collision Energy 0, 40, 60 V

Threshold for MS2 On; 3 repeat then exclude for 0.2 min

Precursor Abundance Based Scan Speed

Yes

Target (Counts/Spectrum) 25,000

Use MS2 Accumulation Time Limit

Yes

Purity 100% stringency, 30% cutoff

Sort Precursors By abundance only; +3, +2, +1

Reference Mass 1,033.9881

Table 3. Deconvolution settings.

Agilent MassHunter BioConfirm B10.0 Settings

Oligonucleotide Length ≤30 nt ≥90 nt

Deconvolution Algorithm Maximum Entropy

Subtract Baseline 1

Adduct Proton loss

Mass Range 3,000 to 10,000 Da 30,000 to 60,000 Da

Mass Step 0.05 Da 0.05 Da

Use Limited m/z Range 1,040 to 3,200 800 to 2,500

Page 4: Analysis of mRNA Poly-A Sequence Variants by High ...

4

Results and discussionThis first test analyzed poly-A sequences extended by PAP on a synthetic RNA primer consisting of 10 repeated A nucleotides (A10) in the presence of 1 mM ATP. As shown in Figure 2 this resulted in a bimodal distribution of poly-A sequences, with one population consisting of shorter oligonucleotides eluting from 2.5 to 6 minutes, and

another consisting of longer nucleotides eluting in a broad peak at ~10.6 minutes. Mass spectrometric analysis indicated the shorter population ranged in size from 11 to 22 nt (Figure 3D), whereas the longer population ranged from 108 to 149 nt in length (35,492.89 to 48,990.17 Da, Figure 4C).

Extracted and deconvoluted mass spectra from three selected peaks from the shorter oligonucleotide

population are shown in Figure 3. Mass spectra consisted primarily of doubly and triply charged ions generated through proton loss, as well as minor populations of sodium adducts. Isotopically resolved deconvoluted mass spectra were assigned identities of A20, A21, and A22 (Figures 3B to 3D) with <5 ppm error based on the respective monoisotopic peaks.

Figure 2. UV absorbance at 260 nm (A: reference = 360 nm) and total ion chromatogram (B) of RNA primers extended with PAP in the presence of only ATP. Separation was carried out on a PLRP-S column.

00.51.01.52.02.53.03.54.04.55.0

10.675

3.4893.184

2.879 3.8122.555 4.207 4.599 4.957 5.275 5.830

00.10.20.30.40.50.60.70.80.91.0 10.639

3.1722.865 3.4722.549 3.796 4.187 4.578 4.952 5.268 5.817

2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0

A11 to A22

A108 to A149RNA Primer + ATPUV: λ = 260 nm

Total ion current

×102

×108

Acquisition time (min)

Resp

onse

Resp

onse

A

B

Page 5: Analysis of mRNA Poly-A Sequence Variants by High ...

5

Figure 3. A11 to A22 oligonucleotides formed by PAP. (A) Total ion current chromatogram showing the three selected peaks with extracted mass spectra shown in (B to D). Deconvoluted mass spectra of A20 (Mobs = 6,519.12 Da, Mtheo = 6,519.09 Da), A21 (Mobs = 6,848.16 Da, Mtheo = 6,848.15 Da), and A22 (Mobs = 7,177.22 Da, Mtheo = 7,177.20 Da) are shown as insets. Mobs: observed monoisotopic mass; Mtheo: theoretical monoisotopic mass.

0.20.30.40.50.60.70.80.91.01.11.21.31.41.5

3.1722.8653.472

2.549

3.796

4.187

4.578

4.952

5.2685.550

5.817 *6.066

2.4 2.8 3.2 3.6 4.0 4.4 4.8 5.2 5.6 6.0

00.40.81.21.62.02.42.83.23.64.04.44.85.2

2392

.392

7023

91.7

2447

2393

.061

99

2390 2392 2394 2396 2398 2400 2402 2404 2406 2408 24100

0.51.01.52.02.53.03.54.04.55.05.56.06.57.07.5

2282

.709

6722

82.0

4118

2283

.376

84

2280 2282 2284 2286 2288 2290 2292 2294 2296 2298 2300

00.10.20.30.40.50.60.70.80.91.01.11.21.31.4

2172

.691

2121

73.3

6040

2172

.357

10

2173

.694

37

2172

.023

29

2170 2172 2174 2176 2178 2180 2182 2184 2186 2188 2190

1 2 3

[M–3H]3–

[M–3H]3–

[M–4H+Na]3–

[M–5H+2Na]3–

[M–3H]3–

[M–4H+Na]3–

[M–5H+2Na]3–[M–4H+Na]3–

[M–5H+2Na]3–

00.20.40.60.81.01.21.41.61.8 6521.13

6523.156520.13

6524.15

6525.156519.126526.16

6527.166529.15

6518 6520 6522 6524 6526 6528

0123456789

6851.18

6852.19

6849.186853.20

6854.216848.16

6855.216856.21

6858.24

6848 6850 6852 6854 6856 68580

1

2

3

4

5

6

77180.23

7179.227181.25

7178.237182.24

7183.257184.267177.22

7185.257186.31

7178 7180 7182 7184 7186

Deconvoluted Deconvoluted

Deconvoluted

Peaks

×101

×103 ×103 ×103 ×103

×104 ×104

A

C

B

D

Acquisition time (min)

Coun

ts

Coun

ts

Coun

ts

Coun

ts

Coun

ts

Coun

tsRe

spon

se (%

)

Mass-to-charge (m/z)

Mass-to-charge (m/z)Mass-to-charge (m/z)

TIC: A11 to A22 Peak 1: A20

Peak 2: A21 Peak 3: A22

Deconvoluted mass (amu)

Deconvoluted mass (amu)Deconvoluted mass (amu)

A portion of the longer oligonucleotides was sampled for deconvolution (Figure 4A). The charge envelope in the extracted mass spectra from 10 to 10.3 minutes primarily fell between 800 to 2,500 m/z (Figure 4B) and was deconvoluted to a destination mass range of 30 to 60 kDa. The deconvoluted mass spectra (Figure 4C) clearly showed a heterogenous population of sample peaks from 34 to 50 kDa, which were evenly separated by 329.2 ±1 Da (Figure 4D). These mass increments were consistent with single additions of

A nucleotides, increasing the theoretical average mass by 329.209 Da. Table 4 shows that mass peaks in Figure 4D were confidently annotated as A121 to A138 with differences between theoretical and observed masses ≤1.16 Da.

To assess the selectivity of PAP for ATP, duplicate experiments were conducted where PAP was added to the RNA primer in the presence of only 1mM CTP, UTP, or GTP. Although the extension of long polymeric chains were not observed, chromatographically resolved

additions of up to two monomers of C nucleotides (Figure 5A) or one U nucleotide (Figure 5B) to the RNA primer, indicating that PAP was not wholly selective for ATP. The addition of guanosine monophosphate was not observed in this experiment (Figure 5C) but could not rule out the possibility that appreciable quantities might be added with longer reaction times or higher GTP concentrations. Overall, PAP showed the highest activity with ATP, followed by CTP, UTP, and GTP in descending order.

Page 6: Analysis of mRNA Poly-A Sequence Variants by High ...

6

×102

×102 ×102

×103

A

C

B

D

Acquisition time (min)

Coun

tsCo

unts

Coun

tsRe

spon

se (%

)

Mass-to-charge (m/z)

TIC: A108 to A149 Extracted MS: 10 to 10.3 minutes

Deconvoluted MS Deconvoluted MS (zoomed)

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.010.639

9.3 9.5 9.7 9.9 10.1 10.3 10.5 10.7 10.9 11.1

00.20.40.60.81.01.21.41.61.82.02.22.42.62.83.03.23.43.6

40000 40500 41000 41500 42000 42500 43000 43500 44000 445000

0.20.40.60.81.01.21.41.61.82.02.22.42.62.83.03.23.4

41418.82

45040.6839114.50

51398.77 55306.54

57539.2852717.55

4997

4.73

32000 36000 40000 44000 48000 52000 56000

00.20.40.60.81.01.21.41.61.82.02.22.42.62.8

1033.98812524.96535

966.00006

723.90809

400 600 800 10001200140016001800200022002400260028003000

Deconvolute10 to 10.3 minutes

Deconvoluted mass (amu) Deconvoluted mass (amu)

41418.82

43393.78

40101.5342406.3539773.03 42735.8741748.39

44381.5443723.0940759.75 43065.0444710.2844052.9340431.09

329.52329.17

328.74 329.30

329.85

328.61328.73

329.28328.68

329.56329.36329.71

328.66329.56328.49

Figure 4. A108 to A149 oligonucleotides formed by PAP. (A) Total ion current chromatogram showing the region 10 to 10.3 minutes sampled for deconvolution. (B) Charge envelope and (C) Deconvoluted mass spectrum of sampled region. Dashed arrows (left = 35,492.89 Da, right = 48,990.17 Da) indicate the range of mass peaks that could be confidently assigned identities A108 to A149. (D) Enlarged deconvoluted mass spectrum showing regular intervals between peaks from 39,773.03 to 44,710.28 Da.

Table 4. Annotated mass peaks from Figure 4D.

Oligonucleotide Observed Mass (Da) Theoretical Mass (Da) Mass Difference (Da)

A121 39,773.03 39,772.08 0.95

A122 40,101.53 40,101.28 0.25

A123 40,431.09 40,430.49 0.6

A124 40,759.75 40,759.70 0.05

A125 41,089.46 41,088.90 0.56

A126 41,418.82 41,418.11 0.71

A127 41,748.39 41,747.32 1.07

A128 42,077.06 42,076.52 0.54

A129 42,406.35 42,405.73 0.62

A130 42,735.87 42,734.94 0.93

A131 43,065.04 43,064.15 0.89

A132 43,393.78 43,393.35 0.43

A133 43,723.09 43,722.56 0.53

A134 44,052.93 44,051.77 1.16

A135 44,381.54 44,380.97 0.57

A136 44,710.28 44,710.18 0.1

Page 7: Analysis of mRNA Poly-A Sequence Variants by High ...

7

Figure 5. UV absorbance (Abs = 260 nm, Ref = 360 nm) chromatograms showing promiscuous activity of PAP towards (A) GTP, (B) UTP, and (C) CTP. No addition of guanosine monophosphate was detected. (D) Relative quantitation of A10, A10C, and A10CC as shown in panel (C). Separation was carried out on an Agilent Poroshell 120 HPH-C18 column.

0

0.2

0.4

0.6

0.8

1.0 4.158

0

0.2

0.4

0.6

0.8

1.04.155

4.285

0

1

2

3

4

5

6

7

8

9

4.117

4.387

5.052

4 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 5 5.1 5.2 5.3 5.4

RNA Primer + CTP, UTP, or GTP

+GTP

+UTP

+CTP

A10

A10

A10

A10U

A10C

A10CC

×103

×103

×102

A

B

C

DAcquisition time (min)

Rres

pons

eRr

espo

nse

Rres

pons

e

Page 8: Analysis of mRNA Poly-A Sequence Variants by High ...

8

The deconvoluted mass spectra of the unmodified RNA primer and those extended with C or U nucleotides are shown in Figure 6. Isotopically resolved deconvoluted mass spectra were assigned identities of A10, A10C, A10CC and A10U with <13 ppm error based on the respective monoisotopic peaks. MS/MS experiments showed that C

and U nucleotides were indeed added to the 3' terminus of the RNA primer (Figure 7), resulting in the formation of characteristic doubly charged y-ions 1601.271 m/z and 1601.758 m/z. In contrast, the unmodified RNA primer was terminated with a 3' A nucleotide, yielding a doubly charged y-ion 1448.749 m/z upon fragmentation.

Next, full-length, in vitro transcribed mRNA were analyzed on a Bioanalyzer equipped with RNA 6000 Nano kit and by LC/MS. Before tailing, transcribed mRNA showed the expected length of ~3,800 nt which increased to ~4,200 nt after reaction with PAP (Figure 8A), indicating that successful poly-A tailing had been achieved.

Figure 6. Extracted and deconvoluted mass spectra of (A) Unmodified A10 RNA primer (Mobs = 3,228.61 Da, Mtheo = 3,228.57 Da), (B) Extended with one C nucleotide (Mobs = 3,533.65 Da, Mtheo = 3,533.61 Da), (C) Extended with two C nucleotides (Mobs = 3,838.67 Da, Mtheo = 3,838.65 Da), (D) Extended with one U nucleotide (Mobs = 3,534.62 Da, Mtheo = 3,534.59 Da). Mobs: observed monoisotopic mass; Mtheo: theoretical monoisotopic mass.

×106 ×106

×104×105 ×105

×105 ×106

×105

A

C

B

D

Coun

tsCo

unts

Coun

tsRe

spon

se (%

)

Mass-to-charge (m/z)Mass-to-charge (m/z)

Mass-to-charge (m/z)Mass-to-charge (m/z)

A10 A10C

A10UA10CC

00.51.01.52.02.53.03.54.04.55.05.56.06.57.07.58.0

1766

.299

6417

65.7

9879

1767

.302

9917

67.8

0368

1777

.289

2217

78.2

9111

1762 1766 1770 1774 1778 1782 1786 1790 17940

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1.0

1.1

1613

.778

9816

14.2

8019

1613

.277

64

1614

.781

79

1624

.768

66

1615

.783

84

1612 1616 1620 1624 1628 1632 1636

00.20.40.60.81.01.21.41.61.82.02.22.4 3229.61

3228.61 3230.61

3231.61

3232.613233.61

3235.61

3228 3230 3232 3234 3236

00.51.01.5

2.02.53.03.54.04.5

5.05.56.06.5

1918

.819

5119

18.3

1778

1919

.821

4419

20.3

2232

1929

.808

5519

30.8

1204

1940

.798

93

1918 19201922 1924 19261928 1930 19321934 1936 19381940 1942

00.20.40.60.81.01.21.41.6 3839.68

3840.68

3838.673841.68

3842.683843.68

3844.68

3838 3840 3842 3844 3846

00.10.20.30.40.50.60.70.80.91.01.11.21.31.41.5

1766

.791

1617

66.2

8933

1767

.792

74

1778

.294

27

1768

.293

94

1777

.278

69

1779

.299

06

1788

.777

08

176417661768177017721774177617781780178217841786178817901792

00.40.81.21.62.02.4 3535.63

3536.633534.62

3537.63

3538.633539.63

3540.62

3534 3536 3538 3540 3542

Deconvoluted

00.20.40.60.81.01.21.4 3534.66

3533.65

3536.66

3537.663538.66

3539.65

3533 3535 3537 3539 3541

Deconvoluted Deconvoluted

Deconvoluted

[M–2H]2–

[M–3H+Na]2–

[M–4H+2Na]2–

[M–2H]2–

[M–3H+Na]2–

[M–4H+2Na]2–

[M–2H]2–

[M–3H+Na]2–

[M–4H+2Na]2–

[M–2H]2–

[M–3H+Na]2–

[M–4H+2Na]2–

Deconvoluted mass (amu) Deconvoluted mass (amu)

Deconvoluted mass (amu)Deconvoluted mass (amu)

Page 9: Analysis of mRNA Poly-A Sequence Variants by High ...

9

Figure 7. MS/MS of selected oligonucleotides showing diagnostic ions characteristic of their different 3' termini.

00.20.40.60.81.01.21.4

1766

.300

71

1698

.775

12

657.

0990

8

1601

.775

30

1315

.202

60

900.

1856

8

1229

.234

64

571.

1303

6

986.

1463

4

1888

.333

82

96.9

7037

1480

.224

44

1100

.180

19

777.

0150

8

2845

.475

04

2074

.621

58

2217

.399

90

351.

5737

0

1975

.061

67

2342

.413

32

188.

5551

0

2657

.235

76

692.

7610

8

2541

.467

01

504.

0329

443

8.19

732

2772

.836

40

2439

.888

99

2961

.484

08

3101

.573

9330

42.0

3588

00.20.40.60.81.01.21.41.6

1766

.789

07

1699

.257

17

986.

1470

1

1230

.210

69

572.

1109

6

901.

1632

8

1602

.764

57

657.

0996

6

1315

.200

47

1889

.329

22

771.

1311

1

1383

.460

45

2092

.235

46

2304

.624

40

2864

.153

46

2632

.455

64

2961

.431

48

485.

0672

7

2434

.747

89

1973

.305

95

1504

.148

58

2217

.358

39

81.6

6265

329.

0501

3

159.

5327

8

2727

.184

66

1069

.432

00

3152

.754

40

265.

7971

1

2548

.434

73

3036

.676

75

0

0.5

1.0

1.5

2.0

2.5

1613

.781

24

1546

.252

61

657.

0977

3

986.

1492

6

595.

1419

1

924.

1941

9

1315

.203

92

1448

.753

97

1253

.249

44

771.

1240

7

1912

.344

99

1381

.725

62

1974

.310

90

2240

.407

74

346.

0572

5

522.

0429

0

2303

.365

25

408.

0112

5

1100

.181

08

2569

.464

22

851.

0947

2

75.6

9614

1168

.627

26

1758

.270

00

2632

.434

53

1838

.252

19

2460

.912

0023

68.5

1287

2817

.989

75

2088

.323

33

241.

6136

6

2167

.292

22

2702

.688

02

3032

.880

26

3143

.121

22

2911

.445

90

200 400 600 800 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800 3000

00.40.81.21.62.02.42.8

1601

.773

23

1601

.270

66

1602

.269

99

1602

.778

01

1603

.263

62

1603

.787

54

1604

.779

38

1603

.568

12

1602

.168

36

1602

.514

50

1601

.894

86

1602

.694

62

1603

.017

76

1605

.057

83

1604

.357

2516

04.5

7208

1601

.564

87

1605

.564

09

1600

.992

75

1604

.164

36

1600

.686

13

1603

.948

83

1600.5 1601.5 1602.5 1603.5 1604.5 1605.5

00.51.01.52.02.53.03.54.04.55.0

1602

.764

57

1601

.758

70

1602

.273

40

1603

.270

41

1602

.654

40

1603

.749

40

1605

.500

56

1603

.929

94

1605

.246

51

1604

.442

3016

04.2

6304

1603

.125

58

1602

.069

85

1604

.941

42

1600.5 1601.5 1602.5 1603.5 1604.5

00.51.01.52.02.53.03.54.0

1449

.252

77

1448

.749

40

1449

.753

83

1450

.255

83

1450

.748

13

1447

.246

66

1451

.254

62

1448

.246

68

1451

.738

56

1447

.008

17

1447

.718

59

1451

.440

59

1449

.936

45

1449

.393

2114

49.1

2714

1450

.975

16

1452

.647

45

1453

.031

20

1450

.420

05

1452

.398

0014

52.0

5549

1453

.229

16

1448

.917

84

1447 1448 1449 1450 1451 1452 1453

MS/MS of RNA oligonucleotides

A10

A10C

A10U

3’ diagnostic y-ion 1,448.749 (z = –2)

3’ diagnostic y-ion 1,601.271 (z = –2)

3’ diagnostic y-ion 1,601.758 (z = –2)

×103×102

×102

×102

×103

×103

Coun

tsCo

unts

Coun

ts

Mass-to-charge (m/z)

200 400 600 800 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800 3000Mass-to-charge (m/z)

200 400 600 800 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800 3000Mass-to-charge (m/z)

Mass-to-charge (m/z)

Mass-to-charge (m/z)

Mass-to-charge (m/z)

Page 10: Analysis of mRNA Poly-A Sequence Variants by High ...

10

Full length mRNA samples were digested with RNase T1, followed by repeated pull-downs with oligo dT magnetic beads to yield purified tail sequences. As with PAP-extended RNA primers, tail sequences derived from in vitro transcribed mRNA consisted of both a shorter population of oligonucleotides eluting between 3.7 to 7.5 minutes and a longer population eluting ~10.6 minutes (Figure 8B). Extracted and deconvoluted

mass spectra from selected peaks in the shorter population revealed poly-A sequences ranging in length from 16 to 27 nt, with each containing a single misincorporated U nucleotide (Figure 9). Although not seen in this dataset, misincorporated C nucleotides were also observed in other experiments.

As noted by M. Beverly et al.7, tail sequences formed by PAP are considerably more heterogenous in

length as compared to genetically templated poly-A sequences. The results indicate that this heterogeneity is compounded by the misincorporation of differing numbers of C and U nucleotides when the tailing reaction takes place under standard conditions with all four precursor nucleoside triphosphates present, making the mass spectra of longer tail sequences very challenging to deconvolute.

0

00.10.20.30.40.50.60.70.80.91.0

10.625

3.740 8.0724.089

4.430 6.3845.129 * 5.9195.3704.780

Acquisition time (min)

Resp

onse

Resp

onse

2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0

0.10.20.30.40.50.60.70.80.91.01.1 10.644

8.7108.1523.760 7.7797.5356.932

7.2605.407 6.3794.120 5.947

5.129 5.6804.4444.7973.420

3.0302.785

In vitro transcribed mRNA

Ladd

er

T7 O

nly

T7 +

PAP

25200500

1000200040006000

Bioanalyzer RNA 6000 Nano Kit

(A16 to A27) + U

BA

UV: λ = 260 nm

Total ion current

×102

×102

Figure 8. (A) Bioanalyzer analysis of in vitro transcribed mRNA before (lane 1) and after (lane 2) tailing with PAP. (B) UV absorbance at 260 nm (top panel) and total ion chromatogram (bottom panel) of poly-A sequences appended to in vitro transcribed mRNA in the presence of all four nucleoside phosphate precursors. Separation was carried out on a PLRP-S column.

Page 11: Analysis of mRNA Poly-A Sequence Variants by High ...

11

ConclusionThis study shows that: (1) the intact masses of long (121 to 136 nt), heterogenous poly-A sequences can be accurately measured by deconvolution of their ensemble mass spectra, and (2) PAP is not fully selective for ATP under standard in vitro transcription conditions, causing both C and U nucleotides to be added to poly-A tail sequences.

Although these sequence variants may be inconsequential for in vitro studies, they are highly significant from a regulatory standpoint. Notably, other in vitro transcription enzymes such as T7 polymerase may also produce sequence variants through mechanisms such as slippage or transcriptional arrest9, underscoring the need for highly sensitive and selective methods for detecting these impurities.

To achieve such sensitivity and selectivity, one prior study demonstrated PAP’s off-target activity by using radiolabeled nucleotides.10 Such techniques can be hazardous and are ill-suited for production environments. LC/MS can achieve single-nucleotide selectivity without the need for such reagents. Moreover, LC/MS can detect and quantify sequence variants without the lengthy reverse transcription, ligation and amplification steps characteristic of RNA-seq, which are known to introduce biases and artifacts.

-0.50

0.5

1.52.0

1.0

2.5

3.0

3.5

4.04.5

5.0

3.740

4.089 4.430 6.384*6.1685.129*5.919

5.3704.780 *5.678

3.6 3.8 4.0 4.2 4.4 4.6 4.8 5.0 5.2 5.4 5.6 5.8 6.0 6.2 6.4 6.6 6.8

00.5

1.01.52.0

2.53.0

3.5

4.04.5

5.05.5

2275

.033

5022

74.6

9989

2275

.367

96

2274 2275 2276 2277 2278 2279 2280 2281 2282 2283 2284 22850

0.5

1.0

1.5

2.0

2.5

3.0

3.5

4.0

4.5

5.0

2165

.348

9721

65.0

1600

2165

.684

41

2164

.681

07

2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176

0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

4.0

4.5

5.0

5.5

2055

.331

1820

55.6

6660

2054

.996

83

2055

.999

55

2054 2055 2056 2057 2058 2059 2060 2061 2062 2063 2064 2065 2066

1 2 3 00.10.20.30.40.50.60.70.80.91.0 6169.05

6168.04 6171

.05

6172

.05

6167.04 6173

.05

6175

.05

6177

.05

6166 6170 6174 6178

012345678 6499.11

6497.10 6500.11

6501.11

6502.126496.106503.10

6496 6498 6500 6502 6504012345678

6828.16

6829.16

6826.166830.17

6831.166833.166825.15

6824 6826 6828 6830 6832 6834

Peaks

[M–3H]3–

[M–4H+Na]3–

[M–3H]3–

[M–3H]3–

[M–4H+Na]3–[M–4H+Na]3–

Deconvoluted Deconvoluted

Deconvoluted

×103×103 ×103 ×103

×103 ×104

A

C

B

D

Acquisition time (min)

Coun

tsCo

unts

Coun

tsRe

spon

se

Mass-to-charge (m/z)

Mass-to-charge (m/z)Mass-to-charge (m/z)

TIC: (A16 to A27) + U Peak 1: A18 + U

Peak 2: A19 + U Peak 3: A20 + U

Deconvoluted mass (amu)

Deconvoluted mass (amu)Deconvoluted mass (amu)

Figure 9. A16 to A27 oligonucleotides with misincorporated U nucleotide. (A) Total ion current chromatogram showing three selected peaks with extracted mass spectra shown in (B to D). Deconvoluted mass spectra of A18 + U (Mobs = 6,167.04 Da, Mtheo = 6,167.01 Da), A19 + U (Mobs = 6,496.10 Da, Mtheo = 6,496.07 Da), and A20 + U (Mobs = 6,825.15 Da, Mtheo = 6,825.12 Da) are shown as insets. Mobs: observed monoisotopic mass; Mtheo: theoretical monoisotopic mass.

Page 12: Analysis of mRNA Poly-A Sequence Variants by High ...

www.agilent.com/chem

For Research Use Only. Not for use in diagnostic procedures.

RA44231.5456365741

This information is subject to change without notice.

© Agilent Technologies, Inc. 2021 Printed in the USA, March 8, 2021 5994-3005EN

References1. Jackson, L. A. et al. An MRNA

Vaccine against SARS-CoV-2 — Preliminary Report. N. Engl. J. Med. 2020, 383, 1920–1931.

2. Mulligan, M. J. et al. Phase I/II Study of COVID-19 RNA Vaccine BNT162b1 in Adults. Nature 2020, 586, 589–593.

3. Pfizer and BioNTech Announce Vaccine Candidate Against COVID-19 Achieved Success in First Interim Analysis from Phase 3 Study.

4. Promising Interim Results from Clinical Trial of NIH-Moderna COVID-19 Vaccine.

5. Wadhwa, A. et al. Opportunities and Challenges in the Delivery of MRNA-Based Vaccines. Pharmaceutics 2020, 12, 102.

6. Kahvejian, A. et al. Mammalian Poly (A)-Binding Protein Is a Eukaryotic Translation Initiation Factor, Which Acts via Multiple Mechanisms. Genes Dev. 2005, 19, 104–113.

7. Beverly, M.; Hagen, C.; Slack, O. Poly A Tail Length Analysis of In Vitro Transcribed MRNA by LC/MS. Anal. Bioanal. Chem. 2018, 410, 1667–1677.

8. Birdsall, R. E. et al. Reduction of Metal Adducts in Oligonucleotide Mass Spectra in Ion-Pair Reversed-Phase Chromatography/Mass Spectrometry Analysis. Rapid Commun. Mass Spectrom. 2016, 30, 1667–1679.

9. Tateishi-Karimata, H.; Isono, N.; Sugimoto, N. New Insights into Transcription Fidelity: Thermal Stability of Non-Canonical Structures in Template DNA Regulates Transcriptional Arrest, Pause, and Slippage. PLOS ONE 2014, 9.

10. Yehudai-Resheff, S.; Schuster, G. Characterization of the E.Coli Poly(A) Polymerase: Nucleotide Specificity, RNA-Binding Affinities and RNA Structure Dependence. Nucleic Acids Res. 2000, 28, 1139–1144.