Alternative Splicing Institute of Enzymology, Budapest hegyi/ Szeged University, Biochemistry Course...

Alternative Splicing. Hedi Hegyi, PhD @ Institute of Enzymology, Budapest. http://www.enzim.hu/~hegyi/ Szeged University, Biochemistry Course, Oct 31, 2007


Page 1: Alternative Splicing Institute of Enzymology, Budapest hegyi/ Szeged University, Biochemistry Course Oct 31, 2007 Alternative Splicing.

Alternative Splicing

Hedi Hegyi, PhD @ Institute of Enzymology, Budapest

http://www.enzim.hu/~hegyi/

Szeged University, Biochemistry Course

Oct 31, 2007

Page 2:

Scientific American 2005/04

Page 3:

Scientific American 2005/04

- Spring of 2000: molecular biologists place dollar bets on how many genes the human genome contains.

90,000? 153,000? (C. elegans: 19,500; maize: 40,000)

35,000, 30,000, a paltry 25,000!

Page 4:

C-value paradox: Complexity does not correlate with genome size. (C.A. Thomas, Jr, 1971)

Homo sapiens: 3.4 x 10^9 bp

Amoeba dubia: 6.7 x 10^11 bp

Page 5:

N-value paradox: Complexity does not correlate with gene number.

~25,000 genes, ~26,000 genes, ~50,000 genes

Page 6:

Discovery of Alternative Splicing

- Alternative splicing gives two forms of the protein with different C-termini:

  - One form is shorter and secreted
  - The other stays anchored in the plasma membrane via its C-terminus

- First predicted by Walter Gilbert in 1978

- First discovered for an immunoglobulin heavy chain gene in 1980 (Edmund Choi, Michael Kuehl & Randolph Wall, Nature 286, 776-779)

Page 7:

Alternative splicing of the mouse immunoglobulin μ heavy chain gene

S - signal peptide
V - variable region
C - constant region
Red - untranslated region
Green - membrane anchor
Yellow - end of coding region for secreted form

Page 8:

Splicing & the spliceosome

Structure
• 60S dynamic structure: a large complex consisting of ~150 proteins
• Five small nuclear RNAs (U1, U2, U4, U5 & U6)
• RNAs assemble with proteins to form snRNPs (“snurps”)
• Protein splicing factors

Assembly of the spliceosome requires ATP

Splicing defects
• Estimate: 15% of all genetic diseases are associated with mutated splice sites

Green globule: RNA pol

Yellow globule: spliceosome

Page 9:

snRNAs

snRNA   Length (nts)   Function
U1      165            Binds 5’ splice site, then 3’ splice site
U2      185            Binds the branch site and forms part of the catalytic center
U4      116            Masks the catalytic activity of U6
U5      145            Binds the 5’ splice site
U6      106            Catalyzes splicing

Page 10:

Secondary structure of snRNAs

[Figure: secondary structures of U1, U2, U4, U5 and U6]

Orange - interaction with 5’ splice site
Green - interaction with branch site
Blue - interaction between U2 and U6
Tan - Sm-binding site (PuAU4-6GPu) flanked by two stem-loop structures

Page 11:

U1 snRNA

• Contains conserved sequence complementary to 5’ splice site of nuclear mRNA introns

• Contains pseudouridine (Ψ)

Upstream exon, then 5’ splice site:

pre-mRNA   5’---GUAAGU-------3’
                ::::::
U1 snRNA   3’---CAUUCA---cap-5’
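The base pairing drawn on this slide can be checked mechanically. The following is a minimal sketch, assuming only Watson-Crick pairs count (G-U wobble pairs are ignored) and that both strands are written as on the slide, the pre-mRNA 5’ to 3’ and U1 3’ to 5’. The function name `paired_positions` is hypothetical and used only for illustration.

```python
# Watson-Crick pairs in RNA (G-U wobble pairs are ignored in this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def paired_positions(site_5to3, u1_3to5):
    """Compare two antiparallel strands position by position.

    site_5to3 is the pre-mRNA read 5'->3'; u1_3to5 is the U1 segment
    written 3'->5', exactly as the two strands are drawn on the slide.
    Returns one boolean per position (True where the bases pair).
    """
    if len(site_5to3) != len(u1_3to5):
        raise ValueError("strands must have the same length")
    return [(a, b) in PAIRS for a, b in zip(site_5to3, u1_3to5)]


splice_site = "GUAAGU"   # 5' splice site from the slide, 5'->3'
u1_segment = "CAUUCA"    # U1 snRNA from the slide, 3'->5'

pairs = paired_positions(splice_site, u1_segment)
print(pairs)
print(sum(pairs), "of", len(pairs), "positions pair")
```

Passing a mutated splice site to the same function shows which positions lose their pairing with U1.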

Page 12:

Splice-site recognition

---AGGUAAGU-----------A--------(Py)nNCAGG

Layout, 5’ to 3’: upstream exon, 5’ splice site, intron with branch site (A), 3’ splice site, downstream exon

Branch site lies ~20-50 nts upstream of the 3’ splice site

Branch site in yeast: often 5’-UACUAAC-3’
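The consensus layout above (GUAAGU at the 5’ splice site, a polypyrimidine tract followed by NCAG at the 3’ splice site) can be turned into a toy scanner. This is only a sketch under strong assumptions: the motifs must match exactly, the tract needs at least 6 pyrimidines (a threshold chosen here, not taken from the slide), and the branch site is not checked separately. `find_introns`, `splice` and the toy sequence are hypothetical.

```python
import re

# Assumption: strict consensus motifs. Real sites are degenerate, so a
# practical scanner would score candidates instead of requiring exact matches.
INTRON = re.compile(
    r"GU[AG]AGU"        # 5' splice site at the start of the intron
    r"[ACGU]*?"         # intron body (contains the branch-site A)
    r"[CU]{6,}"         # (Py)n: polypyrimidine tract, n >= 6 assumed here
    r"[ACGU]CAG"        # N-C-A-G at the 3' splice site
)


def find_introns(pre_mrna):
    """Return (start, end) spans of candidate introns (0-based, end exclusive)."""
    return [m.span() for m in INTRON.finditer(pre_mrna)]


def splice(pre_mrna):
    """Remove every candidate intron and join the flanking exons."""
    return INTRON.sub("", pre_mrna)


# Toy pre-mRNA: exon, intron with a yeast-like branch site, exon.
exon1 = "AUGGCCAG"
intron = "GUAAGU" + "ACG" + "UACUAAC" + "GA" + "UUUUUUCC" + "ACAG"
exon2 = "GAUCCUAA"
pre_mrna = exon1 + intron + exon2

print(find_introns(pre_mrna))
print(splice(pre_mrna))
```

Exact matching is the weak point of this sketch: weakened splice sites, as discussed later in the lecture, would be missed by it.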

Page 13:

Splice Site Conservation

Splice junctions (5’ to 3’): exon (E) | XX ... intron (I) ... YY | exon (E)

Class        XX = Donor (5’) SS   YY = Acceptor (3’) SS
U2_GT_AG     GT                   AG
U2_GC_AG     GC                   AG
U12_GT_AG    GT                   AG
U12_AT_AC    AT                   AC

Page 14:

Splice Site Conservation

Splice junctions (5’ to 3’): exon (E) | XX ... intron (I) ... YY | exon (E)

Class                XX = Donor (5’) SS   YY = Acceptor (3’) SS
U2_GT_AG (13289)     GT                   AG
U2_GC_AG (1085)      GC                   AG
U12_GT_AG (688)      GT                   AG
U12_AT_AC (187)      AT                   AC
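To see how strongly the GT-AG class dominates, the numbers in parentheses can be converted to shares of the total. A small sketch, assuming the parenthesized values are counts of introns in each class (the slide does not label them):

```python
# Counts as given on the slide; their unit (introns per class) is an assumption.
counts = {
    "U2_GT_AG": 13289,
    "U2_GC_AG": 1085,
    "U12_GT_AG": 688,
    "U12_AT_AC": 187,
}

total = sum(counts.values())
shares = {cls: 100 * n / total for cls, n in counts.items()}

for cls, n in counts.items():
    print(f"{cls:10s} {n:6d} {shares[cls]:5.1f}%")

# Share handled by the minor (U12-type) spliceosome.
u12_share = shares["U12_GT_AG"] + shares["U12_AT_AC"]
print(f"U12-type total: {u12_share:.1f}%")
```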

Page 15:

Splicing mechanism

[Figure: Exon 1, intron, Exon 2, with the 5’ splice site (GU), branch site (A) and 3’ splice site (AG); U1 and U2 are shown on the pre-mRNA first, then U4, U5 and U6 join in a step marked ATP]

Page 16:

Factors Playing a Role in Exon Recognition

1. Evolution appears to have weakened splice sites

Derived from 253; only 3% of the S. cerevisiae genes contain introns. No alternative splicing.

Derived from 4,697 S. pombe genes; approximately 43% of all genes contain introns. Intron retention.

Derived from 49,778; nearly 100% contain introns. 75% alternative splicing.

- G T C C A T T C A - 5' U1

Page 17:

Factors Playing a Role in Exon Recognition

Exon recognition is complex

Complexity means multiple points of possible regulation, and exons can be skipped when not all of the pieces get into place

Nature, Vol. 418, p. 236, 2002

The ability to form or disrupt these interactions is thought to play a key role in alternative splicing!

Exon Definition

Intron Definition

Page 18:

Intron statistics

Species     Average exon No.   Average intron No.   Average gene length (kb)   Average mRNA (kb)   % mRNA per gene
Yeast       1                  0                    1.6                        1.6                 100
Nematode    4                  3                    4.0                        3.0                 75
Fruit fly   4                  3                    11.3                       2.7                 24
Chicken     9                  8                    13.9                       2.4                 17
Mammals     7                  6                    16.6                       2.2                 13

Human genes               Median     Mean
Size of internal exons    122 bp     145 bp
Number of exons           7          8.8
Size of introns           1023 bp    3356 bp
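The last column of the species table appears to be the mRNA length divided by the gene length. A quick sketch that recomputes it, assuming the two length columns are average gene length and average mRNA length in kb (the column headers are garbled in the source, so this reading is an assumption):

```python
# (species, gene length in kb, mRNA length in kb, % mRNA per gene as listed)
rows = [
    ("Yeast", 1.6, 1.6, 100),
    ("Nematode", 4.0, 3.0, 75),
    ("Fruit fly", 11.3, 2.7, 24),
    ("Chicken", 13.9, 2.4, 17),
    ("Mammals", 16.6, 2.2, 13),
]


def percent_mrna(gene_kb, mrna_kb):
    """Fraction of the gene that ends up in the mature mRNA, in percent."""
    return 100 * mrna_kb / gene_kb


for species, gene_kb, mrna_kb, listed in rows:
    computed = percent_mrna(gene_kb, mrna_kb)
    print(f"{species:10s} computed {computed:5.1f}%  listed {listed}%")
```

The trend is the point of the slide: mRNA length stays roughly constant while genes grow, so the extra length is intron.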

Page 19:

Recognizing Exons - Are Splice Sites Enough?

Most mammalian genes contain more than one intron

Most genes are uninterrupted in yeast, but most genes are interrupted in flies and mammals

Extreme Examples:

Collagen gene - 50 exons and a 40 kb precursor RNA

DMD gene - 79 exons, 2.3 Mb precursor, intron 20 is 180 kb

Neurexin - has a 0.5 Mb intron!

Page 20:

Support for the Exon Definition Model

Exon size is conserved in vertebrates; intron size is not

Recognition requires a consistent target size

Page 21:

Additional Cis Elements and Trans-Acting Factors

- Splicing has enhancers and silencers
- They function to modulate spliceosome formation

ESE - Exonic Splicing Enhancer
ESS - Exonic Splicing Silencer
ISE - Intronic Splicing Enhancer
ISS - Intronic Splicing Silencer

Recruiting SR proteins can stabilize E complex formation

Blocking snRNP or protein interactions can prevent E complex formation

Page 22:

Models of Exon Recognition - Final Points

Trans-factor interaction with exons differs from interaction with introns

Page 23:

Types of Alternative Splicing

[Figure: types of alternative splicing with their frequencies: 38%, 18%, 8%, 3%, remaining 33%]

Cell 126:37 (2006); Nature Reviews Genetics 5:773 (2004)

Page 24:

How Prevalent is Alternative Splicing?

No one really knows for sure.

EST database estimates: between 35 and 60% of protein-coding genes have alternative mRNAs

Caveat - These databases contain sequences derived from aberrant as well as alternative splicing, they are typically 3' and 5' end biased, and they contain too few sequences to infer frequency

Therefore, database mining may overestimate the rate of alternative splicing

Page 25:

Array-Based Numbers

74%

Science 302, 2141-44 (2003)

Page 26:

Genome-Wide Survey of Human Alternative Pre-mRNA Splicing with Exon Junction Microarrays (Science, 2003)

10,000 multi-exon human genes in 52 tissues

Conclusion: 74% of multi-exon human genes are alternatively spliced

Page 27:

Number of Splicing Isoforms per Gene by EST Comparison

Harrington et al. Nature Genetics 36:916 (2004)

3.8

Page 28:

Regulation of/by Alternative Splicing

• Sex determination in Drosophila involves 3 regulatory genes that are differentially spliced in females versus males; 2 of them affect alternative splicing

1. Sxl (sex-lethal) - promotes alternative splicing of tra pre-mRNA (exon 2 is skipped) and of its own pre-mRNA (exon 3 is skipped)

2. Tra - promotes alternative splicing of dsx (last 2 exons are excluded)

3. Dsx (double-sex) - the alternatively spliced form of dsx is needed to maintain the female state

Fig. 14.38

Page 29:

Alternative splicing in Drosophila maintains the female state

Sxl and Tra are SR proteins

Tra binds exon 4 in dsx mRNA, causing it to be retained in the mature mRNA.

Page 30:

Known Roles of Alternative Splicing
Stamm et al. Gene 344:1-20 (2005)

• Introduction of stop codons -

25-35% of alternative splicing events introduce stop codons that either produce truncated proteins or regulate mRNA stability through the nonsense-mediated decay (NMD) pathway

Page 31:

Nonsense Mediated Decay

- A surveillance mechanism that selectively degrades nonsense mRNAs

- Regulates gene expression by alternative splicing

- Transcripts containing a PTC (premature termination codon) are degraded rapidly

1/3rd of alternative transcripts contain premature termination codons

Brenner, SE et al, PNAS January 7, 2003 vol. 100 no. 1 189–192
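How a splicing event can introduce a premature termination codon is easy to show on a toy transcript: skipping an exon whose length is not a multiple of 3 shifts the reading frame. The sketch below is illustrative only; the exon sequences are invented, and it merely locates the first in-frame stop codon. How the cell decides that a stop is premature is not modeled.

```python
STOPS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(cdna):
    """Index of the first in-frame stop codon, reading from position 0.

    Returns None when the reading frame contains no stop codon.
    """
    for i in range(0, len(cdna) - 2, 3):
        if cdna[i:i + 3] in STOPS:
            return i
    return None


# Hypothetical exons; exon 2 is 4 nt long, so skipping it shifts the frame.
exon1, exon2, exon3 = "ATGGCT", "GAAC", "TGAAGGCTTAA"

full_length = exon1 + exon2 + exon3   # ATG GCT GAA CTG AAG GCT TAA
exon2_skipped = exon1 + exon3         # ATG GCT TGA ...

print(first_in_frame_stop(full_length))    # stop at the end of the ORF
print(first_in_frame_stop(exon2_skipped))  # stop appears much earlier
```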

Page 32:

Known Roles of Alternative Splicing
Stamm et al. Gene 344:1-20 (2005)

• Add new protein parts -

75% of alternative splicing involves the protein-coding region; in addition to truncations, it can change the overall protein sequence

Page 33:

Known Roles of Alternative Splicing
Stamm et al. Gene 344:1-20 (2005)

• Consequences of new protein parts -

Alter protein binding properties, e.g. receptor/ligand

Alter intracellular localization, e.g. membrane insertion

Alter extracellular localization, e.g. secretion

Alter enzymatic or signaling activities, e.g. TK truncations

Alter protein stability, e.g. inclusion of cleavage sites

Insertion of post-translational modification domains

Change ion channel properties, e.g. slo

Page 34:

Known Roles of Alternative Splicing
Stamm et al. Gene 344:1-20 (2005)

• Coordinated Regulation of Biological Events

Potassium channel activity associated with hearing (slo)

Muscle contraction

Neurite (axon or dendrite) growth

Cell differentiation

Apoptosis

Neuron development (Dscam) (TIBS 31:581-588, 2006)

Page 35:

The Power of Alternative RNA Splicing

Drosophila DSCAM gene codes for an axon guidance receptor

Exon 4: 12 alternatives
Exon 6: 48 alternatives
Exon 9: 33 alternatives
Exon 17: 2 alternatives

Encoded regions, in the same order: Ig loop 3, Ig loop 4, Ig loop 7, transmembrane

12 x 48 x 33 x 2 = 38,016 possible mRNAs

Genome has only 14,800 genes!

The final mRNA chooses 24 exons from 115 possibilities (20 constitutive exons and 4 alternatively spliced ones)
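The arithmetic on this slide can be reproduced directly from the four alternative exon clusters. A short sketch, assuming one variant is chosen independently from each cluster:

```python
from math import prod

# Number of mutually exclusive variants per alternative exon (from the slide).
alternatives = {4: 12, 6: 48, 9: 33, 17: 2}
constitutive_exons = 20

# One variant is picked from each cluster, so the choices multiply.
possible_mrnas = prod(alternatives.values())

# Exons available in the gene versus exons present in one mature mRNA.
exon_pool = constitutive_exons + sum(alternatives.values())
exons_per_mrna = constitutive_exons + len(alternatives)

print(possible_mrnas)   # number of distinct mRNAs
print(exon_pool)        # exons to choose from
print(exons_per_mrna)   # exons in each final mRNA
```

A single gene can therefore encode more isoforms than the 14,800 genes the slide quotes for the whole genome.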

Page 36:

Evolutionary Overview of Alternative Splicing

• Introns unlikely to have been derived from ancient genes

• Multi-intron genes probably predated alternative splicing

• Most eukaryotes have introns, but alternative splicing is prevalent only in multicellular organisms

• S. cerevisiae has only 253 introns (3% of its genes) and only 6 genes have 2 introns

• S. pombe: 43% of its genes have introns (usually 40-75 nt)

• S. cerevisiae and S. pombe have NO alternative splicing

Page 37:

Finding Alternatively Spliced Exons

• Compare cDNA & genomic DNA sequences

• Compare ESTs & genomic DNA sequences

• Compare protein & genomic sequences
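Once cDNAs or ESTs have been aligned to the genome, comparing them reduces to comparing exon coordinates. The sketch below assumes the alignment step has already been done by some external tool and that each transcript is given as a list of (start, end) genomic exon intervals; the transcript names and coordinates are invented for illustration.

```python
def classify_exons(transcripts):
    """Split exons into constitutive and alternative ones.

    transcripts maps a transcript name to its list of (start, end) exon
    intervals on the genome. An exon is called constitutive when every
    transcript contains it, and alternative otherwise.
    """
    exon_sets = [set(exons) for exons in transcripts.values()]
    all_exons = set().union(*exon_sets)
    constitutive = set.intersection(*exon_sets)
    return sorted(constitutive), sorted(all_exons - constitutive)


# Hypothetical alignments of one cDNA and one EST to the same gene.
transcripts = {
    "cDNA_1": [(100, 200), (300, 380), (500, 650)],
    "EST_7": [(100, 200), (500, 650)],
}

constitutive, alternative = classify_exons(transcripts)
print("constitutive:", constitutive)
print("alternative: ", alternative)
```

Because ESTs cover only part of a transcript, a missing exon may reflect incomplete coverage rather than skipping; a real pipeline would only compare exons inside the region each sequence spans.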

Page 38:

Large-scale multiple alignment of expressed sequences

• Databases:
  • tens of thousands of mRNAs
  • millions of ESTs

• From large-scale alignments: 60-80% of all human genes undergo alternative splicing.

Page 39:

Alu elements

• Length = ~300 bp
• Repetitive: > 1,400,000 times in the human genome
• Constitute >10% of the human genome
• Found mostly in intergenic regions and introns
• Propagate in the genome through retroposition (RNA intermediates).

Page 40:

Alu elements can be divided into subfamilies

The subfamilies are distinguished by ~16 diagnostic positions.

Page 41:

Alu-containing exons

• Out of 1,182 alternatively spliced cassette exons, 62 have a significant hit to an Alu sequence.

• Out of 4,151 constitutively spliced exons, none has a significant hit to an Alu sequence.

All Alu-containing exons are alternatively spliced.

Graur et al., Genome Res. (2002)

Page 42:

The minus strand of Alu elements contains “near” splice sites

• The minus strand of Alu contains ~3 sites that resemble the acceptor recognition site:

Consensus acceptor site: YYYYYYNCAG/R
Alu-J (127-114):         TTTTTTGtAG/A

• The minus strand of Alu contains ~9 sites that resemble the consensus donor site:

Consensus donor site: CAG/GTRAGT
Alu-J (25-17):        CAG/GTGtGA
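How close an Alu site is to a consensus can be counted position by position using the IUPAC ambiguity codes that appear in the consensus strings (R = A or G, Y = C or T, N = any base). A minimal sketch; the function name `mismatches` is hypothetical, and "/" is treated as a marker for the exon-intron boundary that is ignored in the comparison.

```python
# IUPAC codes used in the two consensus strings on the slide.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",      # purine
    "Y": "CT",      # pyrimidine
    "N": "ACGT",    # any base
}


def mismatches(consensus, site):
    """0-based positions where site does not satisfy the consensus.

    The "/" boundary marker is removed from both strings, and the site
    is upper-cased so lowercase letters are compared like any other base.
    """
    consensus = consensus.replace("/", "")
    site = site.replace("/", "").upper()
    if len(consensus) != len(site):
        raise ValueError("consensus and site must have the same length")
    return [
        i for i, (code, base) in enumerate(zip(consensus, site))
        if base not in IUPAC[code]
    ]


acceptor = mismatches("YYYYYYNCAG/R", "TTTTTTGtAG/A")
donor = mismatches("CAG/GTRAGT", "CAG/GTGtGA")
print("acceptor mismatches at:", acceptor)
print("donor mismatches at:   ", donor)
```

A site with only one or two mismatches is a single point mutation or two away from a functional splice site, which is what makes Alu elements candidates for exonization.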

Page 43:

Alu exonization

The AGs selected in the 3’ splice sites (3’SSs) of Alu-derived exons are underlined

3 genetic diseases:
1. COL4A3 - Alport syndrome
2. GUSB - Sly syndrome
3. OAT - OAT deficiency

Lev-Maor G, et al, Science, 2003.

Page 44:

Factors Playing a Role in Exon Recognition

Sorek, Genome Res 14:1617 (2004)

- Sequence features of alternatively regulated exons are different from those of constitutive exons.
- These features are conserved between species.

Page 45:

Large-scale identification of alternative splice variants of human gene transcripts using 56,419 cDNAs

Takeda, J.-i. et al. Nucl. Acids Res. 2006 34:3917-3928; doi:10.1093/nar/gkl507

Distribution of the length difference between the alternative splicing variants

Page 46:

Large-scale identification of human alternative splice variants

(A) ‘motif-changed’
(B) ‘subcellular localization-changed’
(C) ‘transmembrane domain-changed’

Page 47:

Alternative splicing databases (1,560,000 hits in Google)

- Alternative Splicing & Transcript Diversity Db (ASTD), http://www.ebi.ac.uk/astd/
- SpliceMiner (querying EVDB - Evidence Viewer Database), http://discover.nci.nih.gov/spliceminer/
- Hollywood, http://hollywood.mit.edu
- Human Alternative Splicing Db (HASDB), http://www.bioinformatics.ucla.edu/~splice/HASDB/
- Putative Alternative Splicing Database (PALS db), http://palsdb.ym.edu.tw/

Page 48:

Structure of ASTD

Stamm, S. et al. Nucl. Acids Res. 2006 34:D46-D55

Databases are integrated, cross-linked and available through a variety of interface tools

ASTD data are integrated with Ensembl genome annotation

Page 49:

Spliceminer (NCBI)

Querying EVDB (Evidence Viewer DB). Composite of five separate interactive queries. Each query corresponds to a different Affymetrix HG-U133A Probe. The composite permits facile comparison of the exons that are targeted by each of the probes. For example, the probes for exons 16 and 18 uniquely identify the splice variants NM_006487 and NM_006485, respectively.

Page 50:

Evolutionarily conserved and diverged alternative splicing events show different expression and functional profiles (Kan, NAR, 2005)

Kan, Z. et al. Nucl. Acids Res. 2005 33:5659-5666; doi:10.1093/nar/gki834

Page 51:

Evolutionarily conserved and diverged alternative splicing events show different expression and functional profiles (Kan et al, NAR, 2005)

• Alternative splicing events in 10818 pairs of human and mouse genes

• 43% (8921) of mouse alternative splices could be found in the human genome but not in human transcripts

• Only 7% of human alternative splices are conserved in mouse transcripts

• 5 of 11 tested mouse predictions were observed in human tissues

• Diverged alternative splicing is more prevalent in cancerous cell-lines

Page 52:

Evolutionarily conserved and diverged alternative splicing events show different expression and functional profiles (Kan et al, NAR, 2005)

Page 53:

Microarray expression of alternatively spliced human-mouse pairs (ASP) of genes in different tissues (Kan et al, 2005)

(i) The level of conserved alternative splicing is most elevated in brain
(ii) Diverged alternative splicing is most enriched in testis

Page 54:

Corticotrophin releasing hormone receptor 2 (CRHR2) alternative splices

Functional transcripts for the α, β (brain, periphery) and γ (brain) receptors

Catalano et al, Molecular Endocrinology. First published December 18, 2002 as doi:10.1210/me.2002-0302

[Figure: α, β and γ isoforms, showing the extracellular domain and the cytoplasmic domain]

Page 55:

http://www.ensembl.org

Page 56:

Page 57:

The implications of alternative splicing in the ENCODE protein complement (cont’d)

Fig. 2. The potential effect of splicing on protein structure. Four splice isoforms mapped onto the nearest structural templates. Structures are colored in purple where the sequence of the splice isoform is missing. (a) Hemoglobin, (b) SET domain-containing protein 3, (c) Mitochondrial cysteine desulfurase, (d) Eukaryotic initiation factor 6.

Page 58:

How many genes and transcripts in human genome?

• Ensembl NCBI 35 release (Dec, 2005): 33,869 transcripts derived from 22,218 genes

• Ensembl NCBI 36 release (May, 2006): 48,851 transcripts derived from 23,710 genes
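The two releases are easier to compare as transcripts per gene. A small sketch using only the numbers on this slide:

```python
# (transcripts, genes) for the two Ensembl releases quoted on the slide.
releases = {
    "NCBI 35 (Dec 2005)": (33869, 22218),
    "NCBI 36 (May 2006)": (48851, 23710),
}


def transcripts_per_gene(transcripts, genes):
    """Average number of annotated transcripts per gene."""
    return transcripts / genes


ratios = {name: transcripts_per_gene(t, g) for name, (t, g) in releases.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} transcripts per gene")
```

Between the two releases the gene count barely moves while the transcript count grows by almost half, so most of the change comes from newly annotated splice variants rather than new genes.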

Page 59:

How many genes and transcripts in human genome? Cont’d

• Major source of uncertainty: nucleic acid-based identification of putative proteins in humans and other organisms.

• Most proteins have not been seen as proteins per se; their existence has been inferred from genomic DNA, cDNA and ESTs.

Page 60:

Surveillance systems

• NMD (nonsense-mediated decay) controls alternative splices that cause a premature stop codon

• ERAD (Endoplasmic Reticulum Associated Degradation) eliminates AS (alternative splicing) isoforms with no stable 3D structure

Page 61:

Computer-modeled surveillance?

• Selecting predicted proteins with domains of abnormal length

– Study Pfam-A domains derived from Swissprot proteins

– 8129 Pfam-A domain families currently in Pfam

– How many Pfam-A families can be used for such a surveillance?

Page 62:

20 most irregular Pfams in human Swissprot proteins

PfamA     Hum   Hum irr   Hum fx   seed average   seed SD   PfamA Description
PF01391   487   192       185      60             0         Collagen triple helix repeat
PF00023   717   107       30       33.48          2.35      Ankyrin repeat
PF00028   630   97        90       94.53          4.81      Cadherin domain
PF00400   688   54        30       39.23          2.35      WD domain, G-beta repeat
PF00076   224   36        24       70.6           2.53      RNA recognition motif
PF00435   315   32        19       105.9          2.71      Spectrin repeat
PF00041   402   27        16       84.44          5.47      Fibronectin type III domain
PF00069   302   24        20       270.8          17.3      Protein kinase domain
PF00053   167   24        12       50.67          5.07      Laminin EGF-like (Domains III and V)
PF01344   171   24        12       47.22          3.06      Kelch motif
PF00681   76    19        9        44.95          0.3       Plectin repeat
PF00595   202   18        10       84.1           5.01      PDZ domain (Also known as DHR or GLGF)
PF00018   151   16        13       56.11          1.09      SH3 domain
PF00071   107   14        11       165.2          6.36      Ras family
PF00036   303   13        4        28.89          0.56      EF hand
PF00067   57    12        8        459.1          11.4      Cytochrome P450
PF00063   38    12        2        678.3          17.5      Myosin head (motor domain)
PF00324   21    12        12       467.6          23        Amino acid permease
PF00989   19    12        5        111.6          5.12      PAS fold
PF00515   247   11        3        34             0.07      Tetratricopeptide repeat

Page 63:

[Figure: Relative length of matching human Sushi domains. x-axis: pid (% identity), 0 to 120; y-axis: relative length (rel_length), 0 to 1.4]

Page 64:

[Figure: Relative length of matching human Cadherin domains. x-axis: pid (% identity), 0 to 120; y-axis: relative length (rel_length), 0 to 1.4]

Page 65: Alternative Splicing Institute of Enzymology, Budapest hegyi/ Szeged University, Biochemistry Course Oct 31, 2007 Alternative Splicing.

Nested and overlapping Nested and overlapping domainsdomains

Pfam1 short name ave1 std1 Pfam2 short name ave2 std2 cooccur

PF00413 Peptidase_M10 164.59 31.97 PF00040 fn2 41.34 1.28 766

PF00406 ADK 177.13 12.95 PF05191 ADK_lid 36.06 0.32 97

PF00389 2-Hacid_dh 317.81 8.52 PF02826 2-Hacid_dh_C 178.23 5.4 89

PF02463 SMC_N 1033.91 307.22 PF06470 SMC_hinge 116.73 7.95 89

PF07716 bZIP_2 54.53 1.16 PF00170 bZIP_1 58.94 2.31 57

PF05221 AdoHcyase 457.38 21.27 PF00670 AdoHcyase_NAD 162.25 1.02 40

PF00478 IMPDH 446.25 54.33 PF00571 CBS 123.73 15.22 25

PF04857 CAF1 305.75 122.27 PF01424 R3H 56.08 1.64 25

PF00443 UCH 417.91 131.18 PF00627 UBA 40.49 0.82 18

PF05192 MutS_III 305.26 9.61 PF05190 MutS_IV 93.63 2.43 18

PF03917 GSH_synth_ATP 479.4 10.82 PF03199 GSH_synthase 104.55 2.16 17

PF02166 Androgen_recep 436.75 9.22 PF02155 GCR 374.5 2.12 12

PF01546 Peptidase_M20 322.42 24.47 PF07687 M20_dimer 115.06 21.64 9

PF02463 SMC_N 1033.91 307.22 PF04423 Rad50_zn_hook 55.21 1.18 9

PF01137 RTC 329.09 9.34 PF05189 RTC_insert 101.9 6.48 8

PF01193 RNA_pol_L 195.56 68.81 PF01000 RNA_pol_A_bac 120.27 8.7 8

PF00514 Arm 41.26 1.21 PF02985 HEAT 38.72 1.01 5

PF07723 LRR_2 25.92 1.03 PF00560 LRR_1 22.97 2.17 5

PF04998 RNA_pol_Rpb1_5 580.84 52.22 PF04990 RNA_pol_Rpb1_7 134.29 1.99 4

PF04998 RNA_pol_Rpb1_5 580.84 52.22 PF04992 RNA_pol_Rpb1_6 189.43 8.17 4

PF07690 MFS_1 368.24 29.28 PF05978 DUF895 156.27 1.42 4
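A minimal sketch of how two domain hits on the same protein could be classified as nested or overlapping, assuming each hit is given as 1-based inclusive residue coordinates. The function name `classify_pair` and the example coordinates are hypothetical; the slides do not state how the co-occurrence counts were actually derived.

```python
from typing import Tuple

# A domain hit as (start, end), 1-based inclusive residue coordinates (assumed format).
Span = Tuple[int, int]


def classify_pair(a: Span, b: Span) -> str:
    """Return 'nested', 'overlapping', or 'separate' for two domain hits."""
    a_start, a_end = a
    b_start, b_end = b
    # No shared residue at all.
    if a_end < b_start or b_end < a_start:
        return "separate"
    # One hit lies completely inside the other.
    a_in_b = b_start <= a_start and a_end <= b_end
    b_in_a = a_start <= b_start and b_end <= a_end
    if a_in_b or b_in_a:
        return "nested"
    # Shared residues, but neither hit contains the other.
    return "overlapping"


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    print(classify_pair((100, 260), (150, 190)))
    print(classify_pair((1, 50), (40, 90)))
```

Treating identical spans as "nested" is a design choice of this sketch, not something taken from the slides.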

Page 66:

Nested and truncated domains

- Matrix metalloproteinase 9

- Matrix metalloproteinase 11

- Glycosyl hydrolase domain

- Peptidoglycan-binding domain+DUF

Existing proteins (PFAM models)

Putative proteins (PFAM models)

Page 67:

Computer-modeled surveillance? Cont'd

• Select those predicted proteins with domains of abnormal length (>=40% of domain missing). Use for filtering:

– 8129 Pfam-A domain families currently in Pfam

– 2752 Pfam-A domain families in human Swissprot proteins

– How many of these 2752 Pfam-A families are regular enough to be used for such a surveillance? Answer: 2529
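The length filter above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used for the slides: it assumes each domain hit reports the matched stretch in model coordinates (`hmm_start`, `hmm_end`, 1-based inclusive) along with the full model length, and all function and parameter names are hypothetical.

```python
def missing_residues(hmm_start: int, hmm_end: int, model_length: int) -> int:
    """Number of model positions NOT covered by the match."""
    covered = hmm_end - hmm_start + 1
    return model_length - covered


def is_abnormal_length(hmm_start: int, hmm_end: int, model_length: int,
                       min_missing_pct: int = 40) -> bool:
    """True if at least min_missing_pct percent of the domain model is missing.

    Integer arithmetic avoids floating-point edge cases at exactly 40%.
    """
    missing = missing_residues(hmm_start, hmm_end, model_length)
    return missing * 100 >= min_missing_pct * model_length


if __name__ == "__main__":
    # Hypothetical hits against a model of length 100.
    print(is_abnormal_length(1, 60, 100))   # 40 of 100 positions missing
    print(is_abnormal_length(1, 61, 100))   # 39 of 100 positions missing
```

The 40% threshold comes from the slide; how "regular" families were selected is not specified there and is not modeled here.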

Page 68:

Ensembl Proteome Coverage with regular Pfam-A domains*

Columns, in order: comparison; # of proteins; proteins w/ regular Pfams; proteome coverage (%); domain matches; Pfams; truncated domains; truncated proteins; truncated proteins (%); internally truncated proteins; internally truncated proteins (%)

Human proteome vs Human Pfam 48851 20283 42.14 37894 2460 1250 1188 5.86% 326 27.44%

Human vs Vert 48851 19118 39.14 35903 2401 1092 1072 5.61% 273 25.47%

Mouse vs Vert 31986 11692 36.55 20736 2373 414 403 3.45% 74 18.36%

Rat vs Vert 32543 12261 37.68 21887 2297 600 553 4.51% 98 17.72%

Opossum vs Vert 26943 9107 33.80 15779 2192 436 405 4.45% 73 18.02%

Chicken vs Vert 24168 8090 33.47 13615 2045 577 553 6.84% 109 19.71%

Xenopus vs Vert 28324 8146 28.76 13487 1927 417 384 4.71% 47 12.24%

Fugu vs Vert 22102 6614 29.92 10502 1839 412 389 5.88% 62 15.93%

Sea squirt vs MF 20000 2202 11.01 3101 773 149 147 6.68% 26 17.69%

C.elegans vs MF 26032 2949 11.32 3976 1037 57 57 1.93% 6 10.52%

Dros vs MF 19606 3415 17.42 5291 1144 57 52 1.52% 9 17.31%

TrHuman vs Human Pfam 58077 20167 34.72 32804 2149 3114 3051 15.13% 562 18.42%

TrHuman (nonfrag) vs Human 35374 11136 31.48 20163 1949 962 931 8.36% 246 26.42%

* Belonging to one of the 2529 regular Pfam-A families
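A short sketch of how the percentage columns appear to follow from the counts, using the "Human vs Vert" row. The denominators are an assumption inferred from the arithmetic (coverage over all proteins, truncated percentage over proteins with regular Pfams, internally truncated percentage over truncated proteins); the slide does not state them. The helper name `pct` is hypothetical.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of part in whole, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Counts copied from the "Human vs Vert" row of the table.
proteins = 48851
with_reg_pfams = 19118
truncated_proteins = 1072
internally_truncated = 273

# Assumed denominators (inferred, not stated on the slide).
coverage = pct(with_reg_pfams, proteins)
truncated_pct = pct(truncated_proteins, with_reg_pfams)
internal_pct = pct(internally_truncated, truncated_proteins)

if __name__ == "__main__":
    print(coverage, truncated_pct, internal_pct)
```

With these denominators the three values match the 39.14, 5.61% and 25.47% shown in that row.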

Page 69:

Structural Studies of AS

Pyrophosphorylase

Sulphotransferase

RAC1

Tumor necrosis factor

Glutathione S-transferase

Disordered AS regions

Structured AS regions

Page 70:

Parallel Paradigms

Catalysis: AA seq → 3-D Structure → Function

Signaling: AA seq → Disordered Ensemble → Function

Page 71:

Alternative Splicing and Intrinsic Disorder

• Find proteins with both ordered and disordered regions.

• Find mRNA alternative splicing information for these proteins and map to the ordered and disordered regions.

• Do alternatively spliced regions of mRNA code more often for ordered protein or for disordered protein?

Romero PR et al. Proc Natl Acad Sci U S A. 2006 May 30;103(22):8390-5
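The mapping step in the bullets above can be sketched as a residue count. This is a hypothetical illustration, not the method of the cited paper: it assumes ordered, disordered and alternatively spliced regions are all given as 1-based inclusive residue spans on the same protein, and every name and coordinate here is made up for the example.

```python
from typing import Iterable, Set, Tuple

# A region as (start, end), 1-based inclusive residue coordinates (assumed format).
Span = Tuple[int, int]


def residues(spans: Iterable[Span]) -> Set[int]:
    """Expand spans into the set of residue positions they cover."""
    covered: Set[int] = set()
    for start, end in spans:
        covered.update(range(start, end + 1))
    return covered


def as_disorder_split(as_regions: Iterable[Span],
                      disordered: Iterable[Span],
                      ordered: Iterable[Span]) -> Tuple[int, int]:
    """Count AS residues falling in disordered vs. ordered regions."""
    as_res = residues(as_regions)
    n_disordered = len(as_res & residues(disordered))
    n_ordered = len(as_res & residues(ordered))
    return n_disordered, n_ordered


if __name__ == "__main__":
    # Hypothetical protein: two AS regions, two disordered stretches, one ordered core.
    as_regions = [(10, 29), (60, 69)]
    disordered = [(1, 24), (60, 80)]
    ordered = [(25, 59)]
    print(as_disorder_split(as_regions, disordered, ordered))
```

Residues with no structural characterization are simply not counted, which is why the two counts need not sum to the total AS length.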

Page 72:

Studying the Relationship: Intrinsic Disorder and AS

DisProt: database of proteins with experimentally determined structure and disorder (www.disprot.org)

ASG (AS Gallery)

SwissProt (VarSplic)

ASED dataset (Alternative Splicing & Experimental Disorder):

– 46 proteins

– 74 characterized AS regions

– >19,000 characterized residues, 35% intrinsically disordered (ID)

Romero PR et al. Proc Natl Acad Sci U S A. 2006 May 30;103(22):8390-5

Page 73:

Results on ASED

Distribution of structurally characterized AS regions

Romero PR et al. Proc Natl Acad Sci U S A. 2006 May 30;103(22):8390-5

Page 74:

Take-home Message

• Alternative splicing evolved simultaneously with multicellular organisms

• It increases the functional diversity and complexity of an organism

• Not all of alternative splicing is functional (e.g. Alu-exonization)

• It is a factor in genetic diseases (cancer, etc.)

• It is strongly associated with protein disorder

Page 75:

The End

…to get updates on this lecture and other related, updated info, go to http://www.enzim.hu/~hegyi/