RNA sequencing module - Wiki.uio.no · oslo.genomics.no
oslo.genomics.no
RNA sequencing module

Wednesday 15.10.14 – Day 1
09.00 Welcome
09.10 RNA sequencing introduction
10.00 RNA-seq data analysis – Introduction
10.45 RNA-seq – practical part 1
12.00 Lunch
12.45 RNA-seq – practical part 2

Thursday 16.10.14 – Day 2
09.00 Introduction to genome-guided transcriptome assembly
09.30 Transcriptome assembly – practical part 1
12.00 Lunch
12.45 Transcriptome assembly – practical part 2
14.30 Functional annotation
15.00 Alternative RNA-seq applications
15.30 Questions and discussion
Dr. Susanne Lorenz
Genomics Core Facility Helse Sør-Øst
Dept. of Tumor Biology
The Norwegian Radium Hospital, OUS
RNA sequencing - Introduction
From a Gene to RNA
[Figure: Gene A (exon 1 to exon 4) in the genome is transcribed into pre-mRNA, which is processed by splicing and poly(A) tailing into mRNA. Labels: Blood, Brain, Transcript A1, Transcript A2.]
→ Messenger RNA (mRNA) will be translated into protein (coding RNAs)
→ In human: 20,000-25,000 protein-coding genes
From a Gene to RNA
[Figure: Gene B (exon 1, exon 2) in the genome is transcribed into pre-non-coding RNA, which is spliced into the non-coding RNA transcript. Labels: Blood, Brain.]
→ Non-coding RNA (ncRNA) will not be translated into protein
→ Some types of ncRNAs have a poly(A) tail, others do not
→ Three main categories: housekeeping RNAs, short ncRNAs (< 200 bp) and long ncRNAs (> 200 bp)
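The length cut-off between short and long ncRNAs can be written as a small helper. This is only a sketch: the function name is hypothetical, and because the slide does not say how an RNA of exactly 200 bp is classified, the code assumes it counts as long. Housekeeping RNAs are defined by function rather than length, so they are not covered here.

```python
def classify_ncrna_by_length(length_bp: int, threshold_bp: int = 200) -> str:
    """Classify a non-housekeeping ncRNA as short or long by its length.

    Assumption: a length exactly equal to the threshold is treated as long.
    """
    if length_bp <= 0:
        raise ValueError("length must be a positive number of bases")
    return "short ncRNA" if length_bp < threshold_bp else "long ncRNA"


print(classify_ncrna_by_length(22))    # a microRNA-sized molecule
print(classify_ncrna_by_length(2000))  # a lincRNA-sized molecule
```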
Non-coding RNAs

Category                 Name                              Length (bp)     Function
Housekeeping RNAs        Ribosomal RNA (rRNA)              120-5000        ribosome structure
                         Transfer RNA (tRNA)               73-94           protein translation
                         Small nuclear RNA (snRNA)         ~150            splicing
                         Small nucleolar RNA (snoRNA)      70-200          post-transcriptional modification
Short non-coding RNAs    microRNAs                         16-30 (21-24)   translational repression
(small RNAs)             PIWI-interacting RNAs             26-31           regulate transposon activity and chromatin state
                         Promoter-associated short RNAs    ~18             may regulate gene expression at chromatin level
Long non-coding RNAs     Long intergenic ncRNA             > 200           epigenetic, transcriptional and post-transcriptional regulation
                         Pseudogenes                       > 200           competitive endogenous RNA
                         Enhancer RNA                      50-2000         not known
                         Antisense RNA                     > 200           gene expression
                         Long intronic ncRNA               > 200           not known
                         Repeat-associated long RNA        > 200           not known
→ Ribosomal RNA represents a challenge for RNA sequencing as it constitutes up to 80 % of total RNA
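A rough calculation shows why this matters for sequencing depth. The sketch below assumes, as a simplification, that the share of reads coming from rRNA equals its share of total RNA; the function name and the 50 million read figure are illustrative only.

```python
def non_rrna_reads(total_reads: int, rrna_fraction: float) -> int:
    """Estimate how many reads remain for non-rRNA transcripts.

    Assumption: reads are distributed in proportion to RNA abundance.
    """
    if not 0.0 <= rrna_fraction <= 1.0:
        raise ValueError("rrna_fraction must be between 0 and 1")
    return round(total_reads * (1.0 - rrna_fraction))


# Without poly-A selection or rRNA depletion, a run of 50 million reads
# would leave roughly 10 million reads for everything that is not rRNA.
print(non_rrna_reads(50_000_000, 0.8))
```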
RNA sequencing
What is RNA sequencing?
Massively parallel sequencing to characterize and quantify transcriptomes (all actively transcribed genes)
What does RNA sequencing offer?
• Identification of all actively transcribed genes in a cell type/tissue
• Differential gene expression
• Identification of new transcripts
• Detection of alternative splicing events
• Detection of fusion transcripts
• Strand-specific measurements
• Mutation analysis – expression level of genomic mutations, RNA editing
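To give a feel for what differential gene expression means numerically, here is a minimal sketch that normalizes raw read counts to counts per million (CPM) and computes a log2 fold change. It is not a statistical test and not the method used in the practical; the gene names, counts and pseudocount are made up for illustration.

```python
import math


def counts_per_million(counts: dict) -> dict:
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}


def log2_fold_change(cpm_a: dict, cpm_b: dict, pseudocount: float = 1.0) -> dict:
    """log2(B / A) per gene; the pseudocount avoids division by zero."""
    return {
        gene: math.log2((cpm_b[gene] + pseudocount) / (cpm_a[gene] + pseudocount))
        for gene in cpm_a
        if gene in cpm_b
    }


# Hypothetical counts for two samples
control = {"geneA": 100, "geneB": 300, "geneC": 600}
tumour = {"geneA": 400, "geneB": 300, "geneC": 300}

lfc = log2_fold_change(counts_per_million(control), counts_per_million(tumour))
for gene, value in lfc.items():
    print(gene, round(value, 2))
```

A positive value means higher expression in the second sample, a negative value lower expression. Real analyses add replicate-aware statistics on top of this.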
RNA sequencing in comparison
“RNA-Seq: a revolutionary tool for transcriptomics”, Wang Z. et al., 2009, Nature Reviews Genetics
RNA sequencing protocols
1. mRNA (protein coding) stranded sequencing
→ only poly-A tailed RNA
→ no rRNA contamination, but genes encoding proteins of the ribosome
2. Total RNA stranded transcriptome (ribosomal RNA depletion)
→ total RNA isolation followed by rRNA depletion
→ generates information about all RNA molecules longer than 120 bp, except rRNAs
3. Capturing systems for stranded RNA sequencing
→ hybridization based
→ dependent on annotation
→ increased sequencing depth at coding regions
→ suitable for very low amounts of starting material (10 ng)
Illumina TruSeq strand-specific RNA protocols
[Figure: workflows for mRNA sequencing and total RNA sequencing; the visible step label is "1. Poly-A selection".]
Strand-specific total RNA sequencing - advantages
• more even coverage along the transcript
→ significantly less 3´ bias compared to poly-A selection
→ more accurate quantification of gene expression
Strand-specific total RNA sequencing - advantages
• robust and efficient method even for low quality samples
[Figure: comparison of a fresh frozen high quality sample (RNA RIN value 9.0) and a formalin-fixed paraffin-embedded sample (RNA RIN value 6.0).]
Strand-specific total RNA sequencing - advantages
• improved discrimination of overlapping transcripts
→ more accurate quantification of gene expression
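The benefit for overlapping transcripts can be illustrated with a toy read-assignment routine. This is a sketch under stated assumptions: gene names and coordinates are invented, and `read_strand` is taken to be the already inferred strand of the originating transcript (in real stranded protocols this has to be derived from the library type).

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def assign_read(read_start: int, read_end: int, read_strand: str,
                genes: List[Gene], stranded: bool = True) -> List[str]:
    """Return the names of all genes a read could originate from."""
    hits = []
    for gene in genes:
        overlaps = read_start <= gene.end and read_end >= gene.start
        if overlaps and (not stranded or gene.strand == read_strand):
            hits.append(gene.name)
    return hits


# Two hypothetical genes overlapping on opposite strands
genes = [Gene("geneX", 100, 500, "+"), Gene("geneY", 400, 800, "-")]

print(assign_read(420, 470, "-", genes, stranded=True))   # unambiguous
print(assign_read(420, 470, "-", genes, stranded=False))  # ambiguous
```

With strand information the read in the overlap is assigned to a single gene; without it the read is ambiguous and is typically discarded or split, which distorts quantification.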
RNA sequencing - Illumina
1. Hybridization and amplification on the flow cell
2. Sequencing
3. Image analysis and base calling
4. Millions of short sequences in fastq format
@HWUSI-EAS100R:6:73:941:1973#0/1
AGCGTAACCGGTAACGATAGCAGAT
+HWUSI-EAS100R:6:73:941:1973#0/1
bbbbbbbb%%%++)(%%%%)1**((((***+
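A fastq record consists of four lines: a header starting with `@`, the sequence, a separator starting with `+`, and one quality character per base. The sketch below reads such records and decodes the quality string. It assumes strictly four-line records and a Phred+33 quality encoding (older Illumina pipelines used an offset of 64); the example record is made up.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from four-line fastq records."""
    it = iter(lines)
    while True:
        header = next(it, None)
        if header is None:
            return
        header = header.strip()
        if not header:
            continue  # skip blank lines between records
        rest = [next(it, None) for _ in range(3)]
        if None in rest:
            raise ValueError("truncated fastq record")
        seq, plus, qual = (s.strip() for s in rest)
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed fastq record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:], seq, qual


def phred_scores(qual: str, offset: int = 33) -> List[int]:
    """Convert quality characters to Phred scores (offset is an assumption)."""
    return [ord(c) - offset for c in qual]


example = ["@read1/1", "ACGTN", "+", "IIII#"]
records = list(parse_fastq(example))
for read_id, seq, qual in records:
    print(read_id, seq, phred_scores(qual))
```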
RNA sequencing - Illumina
[Figure: a cDNA fragment with Read1 and Read2 marked.]
Single-end sequencing (Read1 only)
Paired-end sequencing (Read1 and Read2)
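The difference between the two modes can be sketched by simulating reads from a single cDNA fragment. This is a simplified model, assuming error-free reads of fixed length: Read1 is read from one end of the fragment, Read2 from the opposite end on the complementary strand. The fragment sequence and function names are illustrative.

```python
from typing import Optional, Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_reads(fragment: str, read_length: int,
                   paired: bool = True) -> Tuple[str, Optional[str]]:
    """Return (Read1, Read2); Read2 is None for single-end sequencing."""
    n = min(read_length, len(fragment))
    read1 = fragment[:n]
    if not paired:
        return read1, None
    read2 = reverse_complement(fragment[-n:])
    return read1, read2


fragment = "ACGTTGCAAGGCTTAA"
print(simulate_reads(fragment, 5, paired=False))
print(simulate_reads(fragment, 5, paired=True))
```

Because both ends of the same fragment are known, paired-end reads constrain where the fragment can map, which helps with splice junctions and repetitive regions.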
Scientific RNA sequencing case 1
“Autism spectrum disorder (ASD) is a common, highly heritable neuro-developmental condition characterized by marked genetic heterogeneity.”
RNA-seq is used to investigate gene expression in autistic brain compared to normal brain.
Transcriptomic analysis of autistic brain
Heatmap of the top 200 differentially expressed genes between autism and control cortex samples
→ distinct clustering of the majority of autism cortex samples, in contrast to genomic heterogeneity (shown in GWAS study)
Transcriptomic analysis of autistic brain
A) Significant expression differences between frontal and temporal cortex in control samples (top) and autism samples (bottom).
B) Top 20 genes differentially expressed between frontal and temporal cortex in controls. None of the genes show significant expression differences between frontal and temporal cortex in autism.
Transcriptomic analysis of autistic brain
Results:
• Distinct transcriptomic differences between autism and control cortex samples, even though autism is heterogeneous at the genomic level (GWAS)
• Gene ontology analysis showed down-regulated genes related to synaptic function, whereas up-regulated genes were related to immune and inflammatory response
• Consistent expression in frontal and temporal cortex in autism samples, in contrast to the differential expression between these regions in normal samples
→ Gained knowledge about the biology behind the disease that can improve the development of diagnosis and treatment strategies
Scientific RNA sequencing case 2
“To identify the precise genetic elements and study the exclusive nature of three immunohistochemically different breast cancer types, we employed massively parallel mRNA sequencing.”
Transcriptomic landscape of breast cancer through mRNA sequencing
PCA plots showing the clustering of the TNBC (magenta), Non-TNBC (red) and HER2-positive (green) breast cancer samples based on the transcriptomic expression profiles.
Table showing the number of statistically significant differentially expressed transcripts.
Transcriptomic landscape of breast cancer through mRNA sequencing
The table presents the six most common highly abundant primary transcripts and all of the associated information. The bottom four lines of the table show the primary transcript expression profiles specific for the TNBC and Non-TNBC (APOE) and HER2-positive (FN1, PP1B and OAZ1) groups.
Transcriptomic landscape of breast cancer through mRNA sequencing
• Comparative transcriptomic analyses elucidated differentially expressed transcripts between the three breast cancer groups, identifying several new modulators of breast cancer.
• Identification of common transcriptional regulatory elements, such as highly abundant primary transcripts, including osteonectin, RACK1, calnexin, calreticulin, FTL, and B2M, and “genomic hotspots” enriched in primary transcripts between the three groups.
• The study opens previously unexplored niches that could enable a better understanding of the disease and the development of potential intervention strategies.
Scientific RNA sequencing case 3
Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses
Moran N. Cabili, Cole Trapnell, […], and John L. Rinn (2011)
In this study a reference catalog of > 8000 human lincRNAs is defined and characterized by sequence, structural and transcriptional features across 24 tissues and cell types.
Integrative annotation of human large intergenic noncoding RNAs
Computational approach for comprehensive annotation of lincRNAs
[Figure: panels A and B.]
Integrative annotation of human large intergenic noncoding RNAs
Expression level of lincRNAs and protein coding genes across the tissues (color intensity represents fractional density across the row)
Integrative annotation of human large intergenic noncoding RNAs
(B) Expression abundance of the 1508 most highly expressed lincRNAs compared to the 8906 most highly expressed protein coding genes
→ lincRNAs are expressed at lower levels
(C) Distribution of maximal tissue specificity scores calculated from data in A
→ lincRNAs show higher tissue specificity
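One way to compute a tissue specificity score is to compare a gene's expression profile with an idealized profile in which the gene is expressed in a single tissue, using the Jensen-Shannon (JS) distance. The sketch below follows that idea and takes the maximum over all tissues. It is an illustration only: the exact preprocessing used in the paper (for example, how expression values are transformed) is not reproduced here, and the example values are invented.

```python
import math
from typing import Sequence


def _entropy(p: Sequence[float]) -> float:
    """Shannon entropy in bits; zero entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence between two probability vectors."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return _entropy(m) - (_entropy(p) + _entropy(q)) / 2


def max_tissue_specificity(expression: Sequence[float]) -> float:
    """Maximum over tissues of 1 - sqrt(JS divergence to a one-tissue profile)."""
    total = sum(expression)
    if total <= 0:
        raise ValueError("gene is not expressed in any tissue")
    p = [x / total for x in expression]
    best = 0.0
    for t in range(len(p)):
        q = [1.0 if i == t else 0.0 for i in range(len(p))]
        jsd = max(js_divergence(p, q), 0.0)  # guard against rounding below zero
        best = max(best, 1.0 - math.sqrt(jsd))
    return best


print(round(max_tissue_specificity([0, 0, 5, 0]), 3))  # single-tissue gene
print(round(max_tissue_specificity([1, 1, 1, 1]), 3))  # ubiquitous gene
```

A score near 1 indicates expression restricted to one tissue, while evenly expressed genes score much lower.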
Non-coding RNAs in human diseases
lincRNA HOTAIR
HOTAIR binds to Polycomb proteins that remodel chromatin marks, which leads to epigenetic silencing of, e.g., HOXD and increases the invasiveness of cancer cells.
Non-coding RNAs in human diseases
lncRNA in Alzheimer's disease
BACE1-AS, an antisense lncRNA, regulates the expression of the sense BACE1 gene (labelled BACE1-S in the figure) through the stabilization of its mRNA. BACE1-AS is elevated in Alzheimer’s disease, increasing the amount of BACE1 protein and, subsequently, the production of β-amyloid peptide.
Non-coding RNAs in human diseases
snoRNA in Prader-Willi syndrome
The loss of the snoRNA in Prader-Willi syndrome (PWS) changes the alternative splicing of the serotonin receptor HTR2C precursor mRNA (pre-mRNA), resulting in a protein with reduced function.
RNA-seq data set for the practical part
Aim: Identification of dysregulated genes in osteosarcoma
• Most common primary malignant tumour of bone
• Occurs mainly in long bones (arm and leg) of children/adolescents
• High-grade tumours that are very aggressive
• Complex genomic aberrations
→ The high number of genomic aberrations is likely to have an effect on gene expression