Trans-splicing in Trypanosoma brucei: results from genome-wide experiments

Trans-splicing in Trypanosoma brucei— results from genome-wide experiments Shai Carmi Bar-Ilan University Department of physics and the faculty of life sciences February 2010


Page 1: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Trans-splicing in Trypanosoma brucei—

results from genome-wide experiments

Shai Carmi

Bar-Ilan University

Department of physics and the faculty of life sciences

February 2010

Page 2: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

mRNA processing in T. brucei

Almost all genes have no promoters.

Gene expression is regulated by controlling splicing (?), mRNA stability, and translation.

[Diagram: Gene1, Gene2, Gene3 and Gene4 are transcribed as one polycistronic transcript. Trans-splicing (addition of the SL) and polyadenylation (AAAA) yield mature transcripts, which are then translated. Name on slide: Itai Dov Tkacz.]

Page 3: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Splicing overview

SL: Spliced Leader RNA

See also: Liang et al., Euk. Cell (2003).

Page 4: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

cis-splicing machinery and consensus

3’ splice-site

snRNPs

Yeast conserved branch site: TACTAAC

10-12 nts

mammalian

Page 5: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Splicing regulation

splicing enhancer

splicing silencer

SR proteins create ’bridges’ to stabilize the spliceosome

hnRNP

In trypanosomes:
• U2F65 and U2F35 exist and do not interact.
• U2F65 interacts with SF1.
• Interacting SR proteins were identified.
• hnRNP proteins exist.

Page 6: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Open questions

3’ splice site recognition and selection.

Spatial organization of splicing factors: protein-protein and protein-RNA interactions.

Splicing efficiency and gene expression regulation.

Detailed molecular mechanism of trans-splicing and spliceosome assembly, structure of the 5’ splice site, SL-RNA biogenesis, and coupling to polyadenylation: not in this talk.

Page 7: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Past studies of splicing regulation

• Clayton et al., Mol. Biochem. Parasit. (2005): calculated the statistical properties of the splice sites based on a couple of hundred ESTs.

• Clayton et al., Mol. Cell. Biol. (1994); Ullu et al., Mol. Cell. Biol. (1998); Cross et al., Mol. Cell. Biol. (2005): used reporter gene systems with the splice sites of model genes (tubulin, actin, procyclin) to study the effect of splice-site composition on splicing efficiency.

• Limited applicability.

[Diagram of the reporter construct: promoter, intron, AG, 5’UTR, reporter gene. Labels: 3’ splice-site; taken from endogenous gene and mutated.]

Page 8: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Major known facts

• Polyadenylation is coupled to downstream trans-splicing.
• A hierarchy of trans-splicing and polyA signals exists.
• Specific sequences in the 5’UTR (exon) are required for splicing.
• The optimal PPT is 25 nts long, U-dominated but interspersed with Cs, and has no two consecutive purines.
• The optimal PPT-AG spacer is 20-25 nts long, has U at position -3, and never has AC at [-3,-4].

[Diagram: reporter gene, 3’UTR, polyA-site, intergenic region, 3’ splice-site, 5’UTR, reporter gene.]
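The sequence rules on this slide can be written as a simple checklist. The following is a minimal Python sketch, not the procedure used in the cited reporter-gene studies. The function name `check_optimal_signals`, the DNA alphabet (T stands for U), the reading of "U dominated" as more than half of the tract, and the coordinate convention (the spacer ends right before the AG, so its last base is position -3) are all assumptions made for illustration.

```python
PURINES = set("AG")


def check_optimal_signals(ppt: str, spacer: str) -> dict:
    """Check a PPT and a PPT-AG spacer against the rules listed on the slide.

    Both sequences are DNA (T stands for U), written 5'->3'.
    `spacer` is the sequence between the PPT and the AG dinucleotide, so
    its last base is position -3 and its second-to-last base is
    position -4 (assumed convention).
    """
    ppt = ppt.upper()
    spacer = spacer.upper()
    # True when no two adjacent bases of the PPT are both purines.
    no_purine_pair = all(
        not (a in PURINES and b in PURINES) for a, b in zip(ppt, ppt[1:])
    )
    return {
        "ppt_length_25": len(ppt) == 25,
        # Assumption: "U dominated" means more than half of the bases are U.
        "ppt_u_dominated": ppt.count("T") > len(ppt) / 2,
        "ppt_has_c": "C" in ppt,
        "ppt_no_consecutive_purines": no_purine_pair,
        "spacer_length_20_25": 20 <= len(spacer) <= 25,
        "u_at_minus_3": spacer.endswith("T"),
        # Assumption: "AC at [-3,-4]" is the dinucleotide AC right before AG.
        "no_ac_at_minus_4_3": not spacer.endswith("AC"),
    }


if __name__ == "__main__":
    report = check_optimal_signals("TTTTC" * 5, "ACGT" * 5)
    for rule, ok in report.items():
        print(f"{rule}: {ok}")
```

Returning one boolean per rule, rather than a single pass/fail, makes it easy to see which rule a given splice-site region violates.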

Page 9: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Research strategy: outline

Sequence all messenger RNAs to map transcript boundaries.

Silence splicing factors and measure the effect on each transcript.

Examine the splice site regions of regulated genes to infer possible roles for splicing factors and mechanisms of splicing regulation.

Page 10: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Methods: deep sequencing

Illumina guide.

Page 11: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Deep sequencing of T. brucei mRNA

Experiment performed at Ullu and Tschudi’s lab, Yale University.

Library preparation:

Total RNA

Poly(A)+ RNA selection

Terminator exonuclease treatment

First strand cDNA synthesis with random hexamer or oligo(dT) primers

First strand cDNA synthesis with random hexamer primers

Second strand cDNA synthesis with RNaseH-derived RNA primers

Second strand cDNA synthesis with SL primer

cDNA fragmentation and size selection

Addition of adapters and amplification

Illumina sequencing

15 million useful reads!

Page 12: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Ullu’s lab results

• 532 transcripts with a misannotated start codon.
• 805 annotated genes not producing a transcript.
• 442 genes with an alternative transcript in their UTRs.
• 1,114 new transcripts, conserved coding and non-coding.
• Trans-splicing and polyadenylation of snoRNA clusters.
• The experimental method can be slightly modified to discover pol-II transcription initiation sites. These sites were found at strand-switch regions, in proximity to tRNA genes, and within transcription units.

Digital gene expression.

[Histogram: number of genes (y axis, 0-30) against relative abundance (x axis, log scale, 1 to 1,000,000). Abundance classes: 0-1, 1-10, 10-100 and >100 mRNA molecules per cell. Annotation: 75% of genes.]

Page 13: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Examples of reannotated features

Chr VIII

Chr X

Chr VII

Chr XI

Chr VII

Correctly annotated gene cluster.
Blue: number of reads from SL-enriched library.
Red: number of reads from polyA-enriched library.

A novel transcript.

A misannotated start codon.
Blue lines at the bottom denote SL reads.

An ORF which is part of a larger transcript.

A short transcript at the 3’ end of a gene.
Red lines at the bottom denote polyA reads.

Examples were experimentally verified for all cases.

Page 14: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Statistics of UTR lengths

UTR length distribution is approximately log-normal.

5’ UTR: median 91 nts

3’ UTR: median 388 nts

Page 15: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Splice-site composition

PPT

No signal observed in the exon

No G allowed at the -3 position

Non AG splice-sites due to sequencing errors and strain differences.

Maximum at about -25; distance from AG varies: unique to trypanosomes.

Page 16: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Splice-site composition

Pyrimidine content

Sites closer to the PPT are stronger.

PPT distributed along tens of nucleotides.
Purines favored in the exon.

exon

AG

Page 17: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Splice-site composition

AC is not preferred at positions [-3,-4] of the 3’ splice-site: splice sites with AC are less abundant.

Page 18: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Splicing heterogeneity

Not alternative splicing in the regular sense: it leads to the same protein.

Average distance (nts) of all weak splice sites from the strongest splice site.

Uncertainty of splice-site usage.

$H = -\sum_i p_i \ln p_i$

[Plot labels: log-scale; Uncertainty.]

6967 genes: one major site
978 genes: two major sites
21 genes: three major sites
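The uncertainty of splice-site usage is the Shannon entropy of how a gene's reads are split among its sites. Below is a minimal Python sketch under stated assumptions: the function name `usage_entropy` and the input format (a list of read counts, one per splice site of a single gene) are hypothetical, and $p_i$ is taken to be the fraction of reads at site $i$.

```python
import math


def usage_entropy(read_counts):
    """Shannon entropy H = -sum_i p_i ln p_i of splice-site usage.

    `read_counts` holds the number of reads supporting each splice site
    of one gene (assumed input format); p_i is the fraction of the
    gene's reads that fall on site i.
    """
    total = sum(read_counts)
    if total <= 0:
        raise ValueError("need at least one read")
    h = 0.0
    for count in read_counts:
        if count > 0:  # sites with no reads contribute nothing
            p = count / total
            h -= p * math.log(p)
    return h


if __name__ == "__main__":
    print(usage_entropy([120]))         # a single site: no uncertainty
    print(usage_entropy([60, 60]))      # two equally used sites: ln 2
    print(usage_entropy([100, 15, 5]))  # one dominant site: in between
```

A gene with one site has H = 0, and H grows as usage is spread more evenly over more sites, which is why it serves as a measure of heterogeneity.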

Page 19: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Splicing heterogeneity illustrated

• Each row corresponds to one gene.
• Each site is denoted with a bar.
• Sites are centered around the strongest site.
• Bar color is according to relative usage.

[Plot: relative usage of trans-splice sites (y axis, 0-60) against nt position relative to the START codon (ATG; x axis, -300 to 300).]

Downstream sites are more popular.
Some sites are found in frame.

Page 20: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Predicting splicing heterogeneity

• What determines if a gene will be differentially spliced?
• Look at 100 nts upstream and downstream of the strongest site.
• Rank all potential splice sites: TAG: 3; AAG, CAG: 2; GAG: 1.
• Heterogeneity rank of a gene = sum of ranks of all other AG dinucleotides / rank of strongest site.
• The average heterogeneity rank is about 10 for high-uncertainty genes, but only about 7 for low-uncertainty genes ($P = 10^{-20}$).
• Signatures do not look meaningful, but analysis shows that longer 5’UTRs, shorter PPTs, and longer PPT-AG distance also contribute significantly to heterogeneity.
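The heterogeneity rank defined above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis code behind the slide: the names `ag_sites` and `heterogeneity_rank` are hypothetical, sequences are DNA (T stands for U), a site is identified by the index of the A in its AG, and the ranks are read as TAG = 3, AAG = CAG = 2, GAG = 1.

```python
# Rank of an NAG site, keyed by the base N immediately before the AG
# (assumed reading of the ranking on the slide).
RANK = {"T": 3, "A": 2, "C": 2, "G": 1}


def ag_sites(seq):
    """Map the index of the A of every AG dinucleotide to the site's rank."""
    seq = seq.upper()
    return {
        i: RANK[seq[i - 1]]
        for i in range(1, len(seq) - 1)
        if seq[i:i + 2] == "AG" and seq[i - 1] in RANK
    }


def heterogeneity_rank(seq, strongest, window=100):
    """Sum of ranks of all other AGs within `window` nts, divided by the
    rank of the strongest (most used) site.

    `strongest` is the index of the A of the strongest site's AG.
    """
    sites = ag_sites(seq)
    if strongest not in sites:
        raise ValueError("strongest must index the A of an AG dinucleotide")
    others = sum(
        rank
        for i, rank in sites.items()
        if i != strongest and abs(i - strongest) <= window
    )
    return others / sites[strongest]


if __name__ == "__main__":
    toy = "CTAGGGAGTCAGA"  # AG sites at indices 2 (TAG), 6 (GAG), 10 (CAG)
    print(heterogeneity_rank(toy, strongest=2))
```

A strong site (high rank in the denominator) surrounded by few or weak competing AGs gives a low value, which matches the slide's observation that low-uncertainty genes have lower heterogeneity ranks.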

Page 21: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

What is heterogeneity good for?

• Unclear at the moment. Such heterogeneity is not found in other organisms.
• In cis-splicing, exon boundaries must be conserved to maintain an intact coding sequence. In trans-splicing, such evolutionary pressure does not exist.
• However, trans-splicing heterogeneity was not observed in C. elegans.
• It can reflect another level of complexity in gene expression regulation, as the degree of heterogeneity varies significantly throughout the genome.

Page 22: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Explaining abundance

A-rich exons are more abundant.

Other correlations: genes with longer PPT and shorter 5’UTR are more abundant.

Splice-site ambiguity is anti-correlated with abundance.

Page 23: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

A possible model for splicing factors organization?

• U2F65 does not bind U2F35, so the AG can be far from the PPT.
• Variable distance between AG and PPT allows regulation of the splicing efficiency by differential binding.

[Diagram: intergenic region, BP, PPT, AG, 5’UTR; distance labels 0-80 and 10-30; optimal: 25 and 25; AC-rich; a competitor AG splice-site.]

Page 24: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Silencing methods: RNAi

Stem-loop construct

T7-opposing construct

Inducible by tetracycline. Gene is silenced after 3 days.

Wang et al., JBC (2000).

Page 25: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Silencing methods: microarrays

• Microarrays are chips on which thousands of DNA oligos are printed in an array. Each oligo represents a fragment of one gene.
• Expression profiles of entire genomes are obtained in a single experiment.

Wikipedia

Page 26: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Genome-wide observations

Hundreds of genes are upregulated: an unprecedented phenomenon.

U2F65 and SF1 physically interact and thus have similar patterns.

Vazquez et al., Mol. Biochem. Parasitol. 164, 137 (2009).

Red: up; green: down.

Page 27: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Genome-wide correlations

Potential protein-protein interactions should be biochemically verified. Interactions may be indirect.

Spearman correlation coefficient

          Prp43     SmD1      U2F35     Prp31     U2F65     SF1       U1        PTB1      PTB2      Tsr1      Tsr1IP    hnRNP_FH  Prp19
Prp43     1         0.278349  -0.13685  0.294357  0.240051  0.342593  0.149605  0.125257  0.130586  -0.02391  0.221945  0.204737  0.10404
SmD1      0.278349  1         0.044152  0.383218  0.333834  0.315953  -0.01695  0.230517  0.163068  0.041223  0.28852   0.494197  0.068624
U2F35     -0.13685  0.044152  1         -0.3023   0.435671  0.190754  0.378621  0.010175  0.264658  0.375165  0.500294  0.255059  0.088768
Prp31     0.294357  0.383218  -0.3023   1         0.217689  0.248819  0.017184  0.179219  -0.13272  0.024078  0.106424  -0.11128  0.126101
U2F65     0.240051  0.333834  0.435671  0.217689  1         0.698639  0.428154  0.071559  0.290715  0.394992  0.742415  0.366936  0.169848
SF1       0.342593  0.315953  0.190754  0.248819  0.698639  1         0.261155  0.175059  0.276896  0.056967  0.682194  0.344552  0.199872
U1        0.149605  -0.01695  0.378621  0.017184  0.428154  0.261155  1         0.007941  0.189195  0.312908  0.38916   0.174986  0.078526
PTB1      0.125257  0.230517  0.010175  0.179219  0.071559  0.175059  0.007941  1         0.254598  -0.11872  0.169165  0.024827  0.21833
PTB2      0.130586  0.163068  0.264658  -0.13272  0.290715  0.276896  0.189195  0.254598  1         0.178874  0.345913  0.37377   0.178053
Tsr1      -0.02391  0.041223  0.375165  0.024078  0.394992  0.056967  0.312908  -0.11872  0.178874  1         0.30302   0.147911  0.07821
Tsr1IP    0.221945  0.28852   0.500294  0.106424  0.742415  0.682194  0.38916   0.169165  0.345913  0.30302   1         0.348961  0.231646
hnRNP_FH  0.204737  0.494197  0.255059  -0.11128  0.366936  0.344552  0.174986  0.024827  0.37377   0.147911  0.348961  1         0.052611
Prp19     0.10404   0.068624  0.088768  0.126101  0.169848  0.199872  0.078526  0.21833   0.178053  0.07821   0.231646  0.052611  1
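Each entry of the matrix is a Spearman correlation between the genome-wide responses to silencing two factors. The sketch below shows how one such coefficient can be computed in plain Python: rank both vectors (ties share their mean rank), then take the Pearson correlation of the ranks. The function names and the input format (one fold change per gene, in the same gene order for both factors) are assumptions for illustration; the slide does not say how the authors computed the values.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long vectors with >= 2 entries")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical log fold changes of five genes under two silencings.
    factor_a = [1.2, -0.4, 0.3, 2.1, -1.0]
    factor_b = [0.9, -0.1, 0.2, 1.5, -0.8]
    print(spearman(factor_a, factor_b))
```

Working on ranks makes the coefficient insensitive to the scale of the fold changes, which suits noisy microarray data better than a plain Pearson correlation.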

Page 28: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Processes affected by splicing defects

• Upregulated: mostly ribosomal and translation-related proteins, peptidases, and chaperones. 10 candidates verified experimentally by RT-PCR.
• Downregulated: mostly metabolic enzymes and transporters.

Page 29: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Downregulated genes

• The sequence at the splice site of the genes most impacted by silencing may indicate the role of the splicing factor.
• Look at PPT length and distance to 3’ splice-site.
• Most results are negative (discuss reason later).

Genes with shorter PPT require SF1 (P-value = 0.001).
Genes with longer PPT-AG distance require PTB1 (P-value = 0.004).

Page 30: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Sequence motifs

• Using the DRIM tool of Yael Mandel-Gutfreund’s lab.
• Hard to assess the significance of the motifs.
• Surprisingly, no pyrimidine-rich motifs were identified.
• Other tools are not suited for RNA motifs or are intended for the human genome and thus perform poorly.
• Should look at which elements are conserved.

U2F65 SF1 U2F65 SF1 U2F65 SF1
Up Up Down Down Both Both
5'UTR 3'UTR 5'UTR 3'UTR 5'UTR 3'UTR 5'UTR 3'UTR 5'UTR 3'UTR 5'UTR 3'UTR
AGGGT TTAAG TTGCT TAAGG ACTTC TTTAG None TGTCA ACTCT AAGGG None GCGGG
TACAT GAAAA CAACC AAAAC ATAAA AAGCG AATTT TAAGG
CCCCA GGCAG AGAGA TCAAT GCGGG
TAAGT GGGGT GGTAA CAAAA
CTTTT ACTCA TTAGT
ACATA CTACC

hnRNPF/H binding sites.

Page 31: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Mechanisms of regulation

RNA level regulation can be mediated via two mechanisms:

1. mRNA stability.
• The 3’UTR carries a specific sequence that causes stabilization or destabilization under given experimental conditions (silencing).
• Demonstrated experimentally for a few upregulated genes.
• Binding can be directly to the silenced splicing factor (U2F65, SF1, …). Splicing factors have been shown to bind mature mRNA in human cells (Carmo-Fonseca et al., 2006).
• Alternatively, binding can be to some other factor which is affected by the silencing (secondary effect).
• Binding can induce both up- and down-regulation of different genes, depending on the context (e.g., competing with stabilizing/destabilizing proteins).
• Regulation might not be due to binding but due to secondary structure.

2. Splicing defects.
• The absence of a splicing factor might cause downregulation of genes for which it is required for splicing.
• Such genes may have certain properties such as a weak splice site, long PPT-AG distance, short PPT, competition with other AGs, etc.

Page 32: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Discussion (problems)

• Computational approaches are limited by low reproducibility of the microarrays, noisy fold changes, and the very small number of genes affected by more than one factor.
• Genes with splicing defects are masked by many more genes which are regulated by mRNA stability. It is unclear at the moment if there is a significant number of genes regulated by splicing.
• mRNA stability can be mediated by more than one factor (primary and secondary effects).
• Thus, a clean set of genes which undergo the same regulation is hard to obtain.

Page 33: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Discussion (future plans)

Computational:
• Deep-sequencing of Leishmania at Ullu’s lab may provide information about conserved regulatory elements.
• Secondary structure of 3’UTR will be explored.

Experimental:
• Reporter gene system with the intergenic region of a model gene.
• CLIP-seq (in vivo cross-linking and immunoprecipitation followed by deep-sequencing) should yield RNA binding sites.
• Examine splicing defects (accumulation of SL-RNA or Y-structure) of individual genes or genome-wide (co-silencing of the exosome).

Page 34: Trans -splicing in  Trypanosoma brucei—  results from genome-wide experiments

Thank you for your attention!