
RNAseq: Normalization and differential expression I

Jens Gietzelt

22.05.2012

Robinson, Oshlack. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biology. 2010

Hardcastle, Kelly. baySeq: Empirical Bayesian methods for identifying differential expression in sequence count data. BMC Bioinformatics. 2010


Outline of the presentation

1 Introduction

2 Pairwise calibration (EdgeR)

3 Differential expression


Introduction

normalization:

expression levels of different genes within a sample are directly comparable (same scale)

however, technical effects introduce a bias in the comparison between samples

⇒ normalization is crucial before performing differential expression analysis

the calibration method of EdgeR takes advantage of within-sample comparability

differential expression:

appropriate distribution for count data

incorporate calibration parameters


Framework

$Y_{g,k}$ ... observed count for gene $g$ in library $k$

$N_k = \sum_{g=1}^{G} Y_{g,k}$ ... total number of reads for library $k$

$\eta_{g,k}$ ... number of transcripts of gene $g$ in library $k$

$L_g$ ... length of gene $g$

$S_k = \sum_{g=1}^{G} \eta_{g,k} L_g$ ... total RNA output of sample $k$

$E(Y_{g,k}) = \frac{\eta_{g,k} L_g}{S_k} N_k$

counts are a linear function of the number of transcripts

library size calibration ($Y_{g,k}/N_k$) is appropriate for the comparison of replicates

comparison of biologically different samples may be biased by varying RNA composition
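The composition bias can be made concrete with a toy calculation. The sketch below assumes that the expected count of a gene is its share $\eta_{g,k} L_g / S_k$ of the total RNA output times the library size $N_k$; all gene lengths, transcript numbers and the read depth are made-up values, and `expected_counts` is a hypothetical helper.

```python
# Toy illustration of composition bias; all numbers are made up.
lengths = [1.0, 1.0, 1.0, 1.0]        # gene lengths L_g (hypothetical)
eta_a = [100.0, 100.0, 100.0, 0.0]    # transcripts eta_g in sample a
eta_b = [100.0, 100.0, 100.0, 300.0]  # sample b additionally expresses gene 4


def expected_counts(eta, lengths, n_reads):
    """Assumed model: E(Y_g) = eta_g * L_g / S * N for one library."""
    s = sum(e * l for e, l in zip(eta, lengths))  # total RNA output S
    return [e * l / s * n_reads for e, l in zip(eta, lengths)]


n_reads = 6000.0  # same sequencing depth N for both libraries
y_a = expected_counts(eta_a, lengths, n_reads)
y_b = expected_counts(eta_b, lengths, n_reads)

# Library-size calibration Y/N for genes 1-3: they appear two-fold lower in b,
# although their transcript numbers are identical in both samples.
ratios = [(yb / n_reads) / (ya / n_reads) for ya, yb in zip(y_a[:3], y_b[:3])]
print(ratios)  # values close to 0.5
```

Gene 4 takes up half of the reads in sample b, so the remaining genes are undersampled there. This is the effect that a calibration factor beyond the library size has to correct.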


kidney vs. liver dataset

Trimmed mean of log-foldchange

RNA production Sk of one sample cannot be determined directly

estimation of relative differences of RNA production $f_k = S_k/S_r$ of a pair of samples $(k, r)$

assumption: most genes are not differentially expressed

⇒ compute robust mean over log-foldchanges:

double filtering (trimming) over both mean and difference of log-values

calculate a weighted mean over the remaining log-foldchanges

⇒ rescale factors $f_k = \mathrm{TMM}(k,r)$, where $r$ is the reference sample

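$\log_2 \mathrm{TMM}(k,r) = \frac{\sum_{g \in G^*} w_{g,(k,r)} \left( \log_2 (Y_{g,k}/N_k) - \log_2 (Y_{g,r}/N_r) \right)}{\sum_{g \in G^*} w_{g,(k,r)}}$

where $G^*$ is the set of genes that remain after trimming, and the weights are the inverse approximate variances of the log-foldchanges (Robinson, Oshlack 2010):

$w_{g,(k,r)} = \left( \frac{N_k - Y_{g,k}}{N_k Y_{g,k}} + \frac{N_r - Y_{g,r}}{N_r Y_{g,r}} \right)^{-1}$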
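The trimmed mean can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the EdgeR implementation: the trimming fractions (30% on the log-foldchanges, 5% on the mean log-values), the inverse-variance weights and the function name `tmm_factor` are assumptions of this sketch, and the example counts are made up.

```python
import math


def tmm_factor(y_k, y_r, trim_m=0.3, trim_a=0.05):
    """Simplified TMM: weighted trimmed mean of log2 fold-changes (k vs. r)."""
    n_k, n_r = sum(y_k), sum(y_r)
    genes = []
    for yk, yr in zip(y_k, y_r):
        if yk <= 0 or yr <= 0:
            continue  # log-ratios are undefined for zero counts
        m = math.log2(yk / n_k) - math.log2(yr / n_r)          # log-foldchange
        a = 0.5 * (math.log2(yk / n_k) + math.log2(yr / n_r))  # mean log-value
        var = (n_k - yk) / (n_k * yk) + (n_r - yr) / (n_r * yr)
        genes.append((m, a, 1.0 / var))  # weight = inverse approximate variance

    def kept(values, trim):
        """Indices that survive trimming a fraction `trim` from each tail."""
        order = sorted(range(len(values)), key=lambda i: values[i])
        cut = int(len(values) * trim)
        return set(order[cut:len(values) - cut])

    # Double filtering: a gene must survive both the M and the A trimming.
    keep = kept([g[0] for g in genes], trim_m) & kept([g[1] for g in genes], trim_a)
    num = sum(genes[i][2] * genes[i][0] for i in keep)
    den = sum(genes[i][2] for i in keep)
    return 2.0 ** (num / den)


# Made-up example: 20 unchanged genes plus one gene that dominates library k.
y_r = [100 * i for i in range(1, 21)] + [10]
y_k = [50 * i for i in range(1, 21)] + [10500]
f_k = tmm_factor(y_k, y_r)
print(f_k)  # close to 0.5
```

In this example the dominant gene is trimmed away, and the factor recovers that the remaining genes are sampled at about half their relative depth in library k.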


kidney vs. liver dataset

Simulation: pair of samples

simulated data sampled from a Poisson distribution

Simulation: replicates

Cloonan: log-transformation and quantile normalization

methods in use:

DEGseq (normal distribution)

EdgeR (negative binomial)

DESeq (negative binomial, multiple groups)

baySeq (negative binomial, multiple groups)

Myrna (permutation based)


technical replicates: Poisson distribution

biologically different samples: negative binomial distribution

$Y \sim \mathrm{NB}(p, m)$

$Y$ ... number of successes in a sequence of Bernoulli trials with success probability $p$ before $m$ failures occur

alternative parametrization:

$q_{g,e}$ ... proportion of sequenced RNA of gene $g$ for experimental group $e$

$Y_{g,k,e} \sim \mathrm{NB}(q_{g,e} N_k f_k, \phi_g)$

$E(Y_{g,k,e}) = \mu_{g,k,e} = q_{g,e} N_k f_k$, $\mathrm{Var}(Y_{g,k,e}) = \mu_{g,k,e} + \mu_{g,k,e}^2 \phi_g$

test whether $q_{g,1}$ is significantly different from $q_{g,2}$

dispersions $\phi_g$ are moderated towards a common dispersion
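The mean-dispersion parametrization can be checked numerically. The following sketch writes the negative binomial probability mass function in terms of the mean $\mu$ and the dispersion $\phi$ (size parameter $1/\phi$) and sums over the support to confirm the stated mean and variance. The parameter values and the function name `nb_log_pmf` are hypothetical, and the sketch illustrates the parametrization only, not the test used by EdgeR.

```python
import math


def nb_log_pmf(y, mu, phi):
    """log P(Y = y) for a negative binomial with mean mu and dispersion phi."""
    size = 1.0 / phi
    return (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
            + size * math.log(size / (size + mu))
            + y * math.log(mu / (size + mu)))


# Hypothetical values for q_{g,e}, N_k, f_k and phi_g.
q, n_k, f_k, phi = 1e-4, 2_000_000, 1.1, 0.2
mu = q * n_k * f_k  # mu = q * N_k * f_k

# Sum over a support that is wide enough to make the truncated tail negligible.
support = range(0, 20000)
probs = [math.exp(nb_log_pmf(y, mu, phi)) for y in support]
mean = sum(y * p for y, p in zip(support, probs))
var = sum((y - mean) ** 2 * p for y, p in zip(support, probs))

print(mean, mu)                  # numerical mean vs. mu
print(var, mu + phi * mu ** 2)   # numerical variance vs. mu + mu^2 * phi
```

For $\phi \to 0$ the variance approaches $\mu$, which is the Poisson case used for technical replicates.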


baySeq I

empirical Bayes approach to detect differential expression

$D_g = \{Y_{g,k}, N_k, f_k\}_{k=1,\dots,K}$

$M$ ... user-specified model

$\theta_M$ ... vector of parameters of model $M$

$P(M \mid D_g) = \frac{P(D_g \mid M) \, P(M)}{P(D_g)}$

calculate marginal likelihood:

$P(D_g \mid M) = \int P(D_g \mid \theta_M, M) \, P(\theta_M \mid M) \, d\theta_M$


baySeq II

$P(D_g \mid M) = \int P(D_g \mid \theta_M, M) \, P(\theta_M \mid M) \, d\theta_M$

the integral has a closed form under conjugacy, e.g. Poisson-Gamma; however, no such conjugacy exists for negative binomial data

⇒ define an empirical distribution on $\theta_M$ and estimate the marginal likelihood numerically
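A numerical estimate of the marginal likelihood can be sketched as an average of the likelihood over a sample of parameter values. The example below uses the Poisson-Gamma case only because its closed form allows the estimate to be checked; the Gamma prior parameters, the sample size and the function names are assumptions of this sketch, and how baySeq constructs its empirical distribution from the data is not shown here.

```python
import math
import random


def poisson_log_pmf(y, lam):
    """log P(Y = y) for a Poisson distribution with rate lam."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def marginal_likelihood(counts, theta_sample):
    """Estimate P(D | M) as the average of P(D | theta) over sampled theta."""
    total = 0.0
    for lam in theta_sample:
        total += math.exp(sum(poisson_log_pmf(y, lam) for y in counts))
    return total / len(theta_sample)


rng = random.Random(0)
shape, rate = 3.0, 0.5  # hypothetical Gamma prior on the Poisson rate
# random.gammavariate takes shape and scale, so scale = 1 / rate.
theta_sample = [rng.gammavariate(shape, 1.0 / rate) for _ in range(100_000)]

y = 4
estimate = marginal_likelihood([y], theta_sample)

# Closed form under Poisson-Gamma conjugacy (a negative binomial probability).
exact = math.exp(math.lgamma(y + shape) - math.lgamma(shape) - math.lgamma(y + 1)
                 + shape * math.log(rate / (rate + 1.0))
                 - y * math.log(rate + 1.0))
print(estimate, exact)
```

With negative binomial data only the sampled average is available, which is one reason why the approach is computationally intensive.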

prior P (M) is estimated by iteration:

$P(M) = p_g$, $p_g^* = P(M \mid D_g)$
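One way to read this iteration is sketched below for two competing models. It assumes that the marginal likelihoods of every gene are already available and that the new prior is taken as the mean of the posteriors over all genes; this update rule, the function name `iterate_prior` and the example values are assumptions of the sketch rather than details given on the slide.

```python
def iterate_prior(lik_m, lik_alt, p=0.5, n_iter=50):
    """Two-model sketch: re-estimate the prior p = P(M) from the posteriors.

    lik_m[g] and lik_alt[g] are the marginal likelihoods of gene g under
    model M and under a competing model, respectively.
    """
    posteriors = []
    for _ in range(n_iter):
        # Bayes' rule per gene with the current prior p.
        posteriors = [p * a / (p * a + (1.0 - p) * b)
                      for a, b in zip(lik_m, lik_alt)]
        # Assumed update: the new prior is the mean posterior over all genes.
        p = sum(posteriors) / len(posteriors)
    return p, posteriors


# Made-up marginal likelihoods for three genes.
lik_m = [0.9, 0.8, 0.1]
lik_alt = [0.1, 0.2, 0.9]
prior, posteriors = iterate_prior(lik_m, lik_alt)
print(prior, posteriors)
```

Genes whose data are better explained by model M end up with a higher posterior, and the prior settles at a value between 0 and 1 that reflects the share of such genes.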

baySeq:

applicable to complex experimental designs

computationally intensive
