Determining Functional Conformations of Two HDV III Strains



Wojciech Kasprzak,1 Sarah D. Linnstaedt,2 John L. Casey,2 and Bruce A. Shapiro3

1 Basic Research Program, SAIC-Frederick, Inc., NCI-Frederick, Frederick, MD

2 Department of Microbiology and Immunology, Georgetown University Medical Center, Washington, DC

3 Center for Cancer Research Nanobiology Program, National Cancer Institute at Frederick, Frederick, MD

Abstract

Hepatitis delta virus (HDV) is a subviral human pathogen that aggravates hepatitis B virus (HBV) liver infections. The short HDV genome (~1680 nt) is a single-stranded, circular RNA encoding only one protein, the hepatitis delta antigen (HDAg). The host enzyme ADAR1 edits the HDV stop codon (UAG) into a tryptophan (W) codon (UGG), enabling expression of two forms of the protein, short (HDAg-S) and long (HDAg-L), from the same open reading frame. HDAg-S is required for replication, while HDAg-L enables viral particle formation and inhibits replication. The balance between the two forms is crucial, so editing must be regulated.

We have applied our programs, MPGAfold and StructureLab, to predict and examine the folding conformations/states of an HDV III construct. This construct includes the editing site (amber/W) and has the editing capabilities of the full HDV III. The predicted secondary structure folding dynamics indicate that the HDV III RNA forms a metastable branched structure and a stable rod structure. Both were observed in vitro, and the branched structure was identified as the one enabling editing. Computational predictions and the experimental data also indicate that an Ecuadorian strain folds into the editing-capable structures more readily than a Peruvian strain, and we indicate the reasons for the difference. Thus the folding dynamics of HDV III strains appear to strongly influence their RNA editing levels.

Funded in part by NCI Contract N01-CO-12400

Introduction

Hepatitis Delta Virus: Background

The HDV genome is a single-stranded, circular RNA encoding only the hepatitis delta antigen protein (HDAg).

RNA editing produces HDAg-S and HDAg-L from the same open reading frame.

Editing takes place at the amber/W site: ADAR1 deaminates the adenosine of the UAG stop codon to inosine (UIG), which is read as the UGG tryptophan (W) codon.

HDAg-S is required for replication. HDAg-L enables viral particle formation and inhibits replication. The balance between them is crucial, so editing must be regulated.

Hepatitis delta virus (HDV) increases the severity of liver disease in hepatitis B virus (HBV) infections.

Control of Editing Levels in HDV III Strains

HDV type III Peruvian and Ecuadorian isolates have been examined by computational analysis and by in vitro and in vivo experiments.

We have been studying how these two strains differ in their ability to distribute their RNA between branched (editing-capable) and unbranched structures, as well as the efficiency of editing.

Both structure and substrate quality were found to contribute to overall editing levels.

This presentation concentrates on the structural issues responsible for the differences in the levels of editing conformations in the HDV III Ecuadorian and Peruvian strains.
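The amber/W editing event described in the background can be illustrated with a few lines of Python. This is only a sketch of the codon logic (A deaminated to inosine, inosine read as G); the function names `edit_amber_w`, `read_by_ribosome`, and `meaning` are hypothetical and are not part of MPGAfold or StructureLab.

```python
# Illustrative sketch of amber/W editing: ADAR1 deaminates A to inosine (I),
# and inosine is read as G during translation. All names are hypothetical.

STOP_CODONS = {"UAA", "UAG", "UGA"}

def edit_amber_w(codon: str) -> str:
    """Deaminate the middle adenosine of the amber codon: UAG -> UIG."""
    if codon != "UAG":
        raise ValueError("amber/W editing applies to the UAG stop codon")
    return codon[0] + "I" + codon[2]

def read_by_ribosome(codon: str) -> str:
    """Inosine pairs like guanosine, so I is read as G."""
    return codon.replace("I", "G")

def meaning(codon: str) -> str:
    """Classify only the codons relevant to this example."""
    if codon in STOP_CODONS:
        return "stop"
    if codon == "UGG":
        return "Trp (W)"
    return "other"

edited = edit_amber_w("UAG")
decoded = read_by_ribosome(edited)
print(f"UAG ({meaning('UAG')}) -> {edited} -> {decoded} ({meaning(decoded)})")
```

Unedited RNA stops translation at UAG and yields HDAg-S; edited RNA reads through the UGG codon and yields HDAg-L.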

[Figure: HDV antigenome RNA schematic showing the coding and non-coding sides, the amber/W site (UAG edited to UGG), the HDV RNA construct, and the HDAg-S and HDAg-L products.]

HDV III Ecuador: MPGAfold and Experiments

[Figure: Stem trace of the most frequent structure at each generation (generations 1 to 581; stems in 5´ to 3´ order). Labeled states: Branched, Branched editing, near-rod, Linear rod.]

[Figure: Predicted structures. Branched editing structure with the 5’-3’ stem (A), SL1 (B), LINKER (C), and SL2 (D), E = -188.4 kcal/mol. Linear (rod) structure, E = -196.6 kcal/mol.]

[Figure: Experimental panels showing branched and linear RNA versus time (min) for the WT and MUT HDV constructs. The MUT construct carries mutations destabilizing the linear structure.]

MPGAfold Statistics

Structure (% of runs)    Population: 4K    8K    16K    32K    64K
Linear (all)                         35    40     57     81     98
Branched (all)                       65    60     43     19      2
B. edit final                        31    30     16      6      0
B. edit trans.                       23    30     41     57     71

All percentages are based on 100 runs for each MPGAfold population, with the output based on the peak histogram structures.

Reference: RNA (2006), 12:1521-1533.
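The trends in the MPGAfold statistics can be checked with a short script. This is a minimal sketch that only re-enters the percentages from the table; the dictionary keys and the helper `is_monotonic` are hypothetical names chosen for illustration, and reading "B. edit trans." as runs in which the Be structure appears as a transitional state is an assumption based on the conclusions.

```python
# Percentages copied from the MPGAfold statistics table (100 runs per population).
populations = ["4K", "8K", "16K", "32K", "64K"]
percent = {
    "linear_all":   [35, 40, 57, 81, 98],
    "branched_all": [65, 60, 43, 19, 2],
    "be_final":     [31, 30, 16, 6, 0],
    "be_transient": [23, 30, 41, 57, 71],
}

def is_monotonic(values, increasing=True):
    """True if the sequence never decreases (or never increases)."""
    pairs = list(zip(values, values[1:]))
    if increasing:
        return all(a <= b for a, b in pairs)
    return all(a >= b for a, b in pairs)

# Linear and branched final states should account for every run.
totals = [lin + br for lin, br in zip(percent["linear_all"], percent["branched_all"])]

# Fraction of branched final structures that are the editing structure (Be).
be_share_of_branched = [
    round(be / br, 2) for be, br in zip(percent["be_final"], percent["branched_all"])
]

for pop, total, share in zip(populations, totals, be_share_of_branched):
    print(f"{pop:>3}: linear + branched = {total}%, Be share of branched = {share}")

print("Be as final state falls with population:",
      is_monotonic(percent["be_final"], increasing=False))
print("Be as transitional state rises with population:",
      is_monotonic(percent["be_transient"], increasing=True))
```

In this table, larger populations end in the linear structure more often, while the Be structure shifts from being a final state to being a transitional one.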

Conclusions

MPGAfold captures the folding states of the HDV III Ecuadorian strain, showing the edit conformation (Be) as both a final and a transitional state.

The predicted high levels of Be conformers agree with experimental observations in vitro and in patients.

Folding results for the HDV III Peruvian strain reflect the experimentally observed differences in the levels of the Be edit structures and their relative stability.

MPGAfold: Massively Parallel Genetic Algorithm

1. Seed a population of a chosen size with initial structure elements (stems from a pre-generated stem pool).

2. Apply random structure mutations (stems) and recombinations (sets of stems) to produce new structures for each generation.

3. Apply the fitness function to the new structures and select the most fit (best free energy, including coaxial stem stacking calculations) for the next generation.

4. Repeat steps 2 and 3 for N generations, iterating toward the optimal solution.

The GA is a stochastic algorithm, so multiple runs are required to find the prevailing conformation.
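The four steps above can be sketched as a toy genetic algorithm over stems. This is not the MPGAfold implementation: the fitness function `toy_energy` (more paired bases is better) stands in for the real free-energy calculation with coaxial stacking, selection here is global rather than parallel and neighborhood-based, and the stem pool and all function names are hypothetical.

```python
import random

# A stem is a triplet (five_prime, three_prime, size): 'size' stacked base
# pairs whose outermost pair is (five_prime, three_prime).

def positions(stem):
    i, j, n = stem
    return set(range(i, i + n)) | set(range(j - n + 1, j + 1))

def compatible(a, b):
    """Stems can coexist if they share no nucleotide and do not cross."""
    if positions(a) & positions(b):
        return False
    (i1, j1, _), (i2, j2, _) = sorted([a, b])
    return j1 < i2 or j2 < j1  # side by side, or nested

def add_if_compatible(structure, stem):
    if all(compatible(stem, s) for s in structure):
        structure.add(stem)

def mutate(structure, pool, rng):
    """Step 2a: randomly drop a stem, then try to add one from the pool."""
    child = set(structure)
    if child and rng.random() < 0.3:
        child.remove(rng.choice(sorted(child)))
    add_if_compatible(child, rng.choice(pool))
    return frozenset(child)

def recombine(p1, p2, rng):
    """Step 2b: merge stems from both parents, keeping a compatible subset."""
    stems = sorted(p1 | p2)
    rng.shuffle(stems)
    child = set()
    for stem in stems:
        add_if_compatible(child, stem)
    return frozenset(child)

def toy_energy(structure):
    """ASSUMPTION: stand-in fitness; lower is better, one unit per base pair."""
    return -sum(size for _, _, size in structure)

def run_ga(pool, pop_size=32, generations=50, seed=0):
    rng = random.Random(seed)
    # Step 1: seed the population with single stems from the stem pool.
    population = [frozenset([rng.choice(pool)]) for _ in range(pop_size)]
    for _ in range(generations):  # Step 4: repeat for N generations.
        children = [
            mutate(recombine(rng.choice(population), rng.choice(population), rng),
                   pool, rng)
            for _ in range(pop_size)
        ]
        # Step 3: keep the fittest structures for the next generation.
        population = sorted(population + children, key=toy_energy)[:pop_size]
    return min(population, key=toy_energy)

# Hypothetical stem pool, for illustration only.
pool = [(1, 40, 5), (3, 20, 4), (10, 30, 6), (22, 38, 4), (45, 60, 5), (42, 70, 3)]
best = run_ga(pool)
print(sorted(best), toy_energy(best))
```

Because the algorithm is stochastic, changing `seed` can change the result, which is why many runs are needed to identify the prevailing conformation.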

[Figure: Parent 1, Parent 2, and Child structures within a 3x3 neighborhood.]

StructureLab: Stem Trace Data Visualization

Stem Trace plots all of the unique stems, defined as triplets (5´, 3´, size), for all of the structures in a solution space.
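The idea behind a stem trace can be sketched as a presence matrix of unique stems against structures. This is an illustration only, not StructureLab code: the function `stem_trace`, its `sort_by_five_prime` flag, and the example structures are hypothetical.

```python
def stem_trace(structures, sort_by_five_prime=False):
    """Return (stems, matrix): unique stems and their presence per structure.

    Each structure is a list of (five_prime, three_prime, size) triplets.
    matrix[r][c] is True when stems[r] occurs in structures[c].
    """
    first_seen = {}
    for structure in structures:
        for stem in structure:
            first_seen.setdefault(stem, len(first_seen))
    stems = list(first_seen)   # raw order: order of first appearance
    if sort_by_five_prime:
        stems.sort()           # order by 5' position, then 3' position, then size
    as_sets = [set(s) for s in structures]
    matrix = [[stem in s for s in as_sets] for stem in stems]
    return stems, matrix

# Hypothetical structures, for illustration only.
structures = [
    [(10, 30, 6), (45, 60, 5)],
    [(1, 40, 5), (10, 30, 6)],
    [(1, 40, 5), (10, 30, 6), (45, 60, 5)],
]

raw_stems, raw_matrix = stem_trace(structures)
sorted_stems, sorted_matrix = stem_trace(structures, sort_by_five_prime=True)

for stem, row in zip(sorted_stems, sorted_matrix):
    print(f"{str(stem):<14}", "".join("#" if present else "." for present in row))
```

A stem present in every structure shows up as an unbroken row, while rows with gaps mark stems that distinguish one conformation from another.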

Computational Tools

RNA Structure Prediction and Analysis Tools

The massively parallel genetic algorithm (MPGAfold) captures RNA folding pathways, including functional intermediates and final states, in a highly combinatorial solution space.

A significant amount of information comes from each MPGAfold run, as well as from sets of runs, including runs with varying population sizes.

Interpretation of the results is facilitated by the visualization tools that are part of StructureLab and MPGAfold.

Each of these tools views the data from a somewhat different perspective.

Ultimately, these perspectives are combined to reach an understanding of the folding patterns of the RNA in question.

HDV III Peru

[Figure: Be conformation with SL 1, SL 2, LINKER, the 5’-3’ stem, and extra nt labeled. Markers indicate the Peruvian WT mutations and the amber/W site.]

Folding pathway results for the Peruvian strain have not yet been published (as of November 2007), and we decided not to show this information at this point, even though it was presented in poster form at the IMA 2007 meeting.

HDV III Ecuador: Branched Conformations

[Figure: Branched structures with SL 1, SL 2, LINKER, 5’-3’, and post-SL 2 regions labeled and the amber/W site marked. Be/Bx: E = -189.8 kcal/mol. Be: E = -188.4 kcal/mol. Bx: E = -188.2 kcal/mol.]

Linear Conformations

[Figure: Linear structures labeled L2x, Lx, and Lrod, with SL 1, 5’-3’, and post-SL 2 regions labeled. Energies shown: E = -195.7, -196.6, and -189.0 kcal/mol.]

[Figure: Stem Trace example for HIV-1 MN (366 nt; 20 runs of GA; population: 16K). Axes: STRUCTURES versus STEMS (5’, 3’, size). Panels show the raw stem trace (stems in order of appearance in the input structures) and the 5’-position sorted stem trace, with structures 4 and 20 marked and the SL3, PBS, TAR, SL1s, polyA, DIS, and LDI motifs labeled.]

[Figure: Population Fitness Map and Histogram.]

Selected References

Linnstaedt, S.D., Kasprzak, W., Shapiro, B.A., and Casey, J.L.: The Role of Metastable RNA Secondary Structure in Hepatitis Delta Virus Genotype III RNA Editing. RNA, 12(8): 1521-1533, 2006.

Shapiro, B.A., Kasprzak, W., Grunewald, C., and Aman, J.: Graphical Exploratory Data Analysis of RNA Secondary Structure Dynamics Predicted by the Massively Parallel Genetic Algorithm. Journal of Molecular Graphics and Modelling, 25(4): 514-531, 2006.

Shapiro, B.A., Bengali, D., Kasprzak, W., and Wu, J-C.: RNA Folding Pathway Functional Intermediates: Their Prediction and Analysis. Journal of Molecular Biology, 312: 27-44, 2001.

Kasprzak, W. and Shapiro, B.A.: Stem Trace: An Interactive Visual Tool for Comparative RNA Structure Analysis. Bioinformatics, 15(1): 16-31, 1999.

Shapiro, B.A. and Wu, J-C.: Predicting RNA H-type Pseudoknots with the Massively Parallel Genetic Algorithm. Comput Appl Biosci., 13: 459-71, 1997.

Shapiro, B.A. and Navetta, J.: A Massively Parallel Genetic Algorithm for RNA Secondary Structure Prediction. The Journal of Supercomputing, 8: 195-207, 1994.