BS312 Bioinformatics Antonio Marco - WordPress.com · Recommended readings Gene expression...

Gene expression analysis BS312 Bioinformatics Antonio Marco School of Biological Sciences University of Essex 10-Nov-15

Page 1: BS312 Bioinformatics Antonio Marco - WordPress.com · Recommended readings Gene expression analysis: Pevsner J (2009) Bioinformatics and Functional Genomics. John Wiley & Sons. Chapter

Gene expression analysis

BS312 Bioinformatics

Antonio Marco

School of Biological Sciences

University of Essex

10-Nov-15

Outline

1 Gene Expression

2 Measuring RNA expression levels

3 Data processing

4 Visualizing Gene Expression Patterns

5 Applications

6 Practical overview

The ‘central dogma’ of molecular biology

DNA makes RNA makes Protein

Francis Crick (1956)

“I just didn’t know what dogma meant”

A more realistic scenario of gene expression

Analyzing Gene Expression: overview

Northern blot

The ‘IS IT THERE?’ approach

Microarrays

The ‘ARE ANY OF THESE THERE?’ approach

RNA-Seq

The ‘WHAT’S IN THERE?’ approach

DNA sequencing is done in small fragments

RNA-Seq is a quantitative technique

Assessing data quality

• Ideally, sequencers would always report the true sequence of each read

• In reality, reads often contain errors

• The good news: sequencers report how confident they are in each base call

Phred Score and FASTQ format

• The Phred score encodes the probability that a base call is wrong

• The FASTQ format stores Phred scores as one character per base

@SEQUENCE NAME
CATGGCTAGCTGCTAGCTAGCTAGACATTCATCGAAATCGCTAGCCTAGCTACGA
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>C%%%
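
The one-character code can be decoded with a few lines of Python. This is a minimal sketch, not part of the original slides. It assumes the common Phred+33 encoding (the slides do not state the offset) and the standard definition Q = -10 log10(P), where P is the probability that the base call is wrong. The function names and the short quality string are illustrative.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    offset=33 is an assumption (Phred+33 encoding); older files may use 64.
    """
    return [ord(ch) - offset for ch in quality]


def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


# A few characters taken from the quality line of the example record.
quality = "!''*5CF"
for ch, q in zip(quality, phred_scores(quality)):
    print(ch, q, f"{error_probability(q):.4f}")
```

Under this encoding a higher character code means a more reliable base call: a score of 20 corresponds to a 1 in 100 chance of error, and a score of 40 to 1 in 10,000.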

Read quality

Normalization: principles

• Bilbo believes his sword is big, five ‘hands’ in length

• Gandalf thinks Bilbo’s sword is rather short

• Who’s right?

Normalization: RPKM
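
RPKM stands for reads per kilobase of transcript per million mapped reads. It puts read counts on a common scale by dividing by both the transcript length and the sequencing depth. The sketch below is an illustration, not taken from the slides; the function name and the example numbers are hypothetical.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    Equivalent to read_count / (gene_length_bp / 1e3) / (total_mapped_reads / 1e6).
    """
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: the same number of reads on a long and a short gene,
# in a library of 10 million mapped reads.
long_gene = rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=10_000_000)
short_gene = rpkm(read_count=500, gene_length_bp=1000, total_mapped_reads=10_000_000)
print(long_gene, short_gene)
```

Both genes collect 500 reads, but the shorter gene gets twice the RPKM because the same reads cover half the length.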

Normalization: Smear plots

• ‘Smear plots’: average number of reads (X) against log fold difference (Y)

• Average log fold difference (red line) should be about 0

• Normalization does this!
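
The effect of normalization on the red line can be sketched numerically. The example below is an assumption-laden illustration: it uses simple library-size scaling (the slides do not say which normalization method was used), made-up counts, and hypothetical function names. It computes the two coordinates of a smear plot for each gene and compares the median log fold difference before and after scaling.

```python
import math
from statistics import median


def smear_points(counts_a, counts_b, pseudocount=1.0):
    """Return one (average log2 count, log2 fold difference) pair per gene."""
    points = []
    for a, b in zip(counts_a, counts_b):
        log_a = math.log2(a + pseudocount)  # pseudocount avoids log2(0)
        log_b = math.log2(b + pseudocount)
        points.append(((log_a + log_b) / 2, log_b - log_a))
    return points


def scale_to_depth(counts, target_depth):
    """Rescale counts so that they sum to target_depth."""
    total = sum(counts)
    return [c * target_depth / total for c in counts]


# Hypothetical counts: sample B was sequenced twice as deeply as sample A.
sample_a = [100, 200, 400, 800]
sample_b = [200, 400, 800, 1600]

raw_center = median(y for _, y in smear_points(sample_a, sample_b))

depth = sum(sample_a)
norm_a = scale_to_depth(sample_a, depth)
norm_b = scale_to_depth(sample_b, depth)
norm_center = median(y for _, y in smear_points(norm_a, norm_b))

print(f"median log2 fold difference before: {raw_center:.2f}, after: {norm_center:.2f}")
```

Before scaling, every gene looks about twofold higher in sample B only because of sequencing depth; after scaling, the center of the cloud sits at 0.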

Hierarchical clustering

• In previous lectures you learnt that phylogenetic trees reflect sequence similarity

• Likewise, trees can be built to reflect expression similarity

• Most frequent algorithm is UPGMA
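
UPGMA repeatedly merges the two closest clusters, where the distance between clusters is the average distance over all pairs of their members. The following is a minimal sketch in plain Python, assuming Euclidean distance between expression profiles (the slides do not specify a distance measure). The gene names and expression values are hypothetical.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def upgma(profiles):
    """Cluster expression profiles with UPGMA (average linkage).

    profiles: dict mapping gene name -> list of expression values.
    Returns the tree as nested tuples and the list of merges performed.
    """
    # Each cluster is keyed by its subtree and stores the genes it contains.
    clusters = {name: [name] for name in profiles}
    merges = []

    def average_distance(a, b):
        # Mean distance over all pairs of members from the two clusters.
        pairs = [(g, h) for g in clusters[a] for h in clusters[b]]
        return sum(euclidean(profiles[g], profiles[h]) for g, h in pairs) / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: average_distance(*pair))
        merges.append((a, b, average_distance(a, b)))
        clusters[(a, b)] = clusters.pop(a) + clusters.pop(b)

    (tree,) = clusters
    return tree, merges


# Hypothetical expression values across three conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [1.1, 2.1, 2.9],
    "geneC": [8.0, 7.5, 9.0],
    "geneD": [8.2, 7.4, 9.1],
}
tree, merges = upgma(profiles)
print(tree)
```

Genes with similar profiles end up as neighbours in the tree, which is what determines the row order in a clustered heatmap.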

HC and Heatmaps

Time-course series

Drosophila (fruit fly) development

Adapted from: Graveley et al. (2011) Nature 471:473

Cancer

MicroRNAs in different cancer cells

Volinia et al. (2012) PNAS 109:3024

Reconstruct transcripts

Practical overview

Characterize the transcription profile of the Down’s Syndrome Critical Region

• Check the quality of the reads

• Map reads to a reference genome (human)

• Assemble reads into transcripts

• Visualize annotated transcripts

Recommended readings

Gene expression analysis:

• Pevsner J (2009) Bioinformatics and Functional Genomics. John Wiley & Sons. Chapter 9

• Mutz K-O et al. (2013) Transcriptome analysis using next-generation sequencing. Curr Op Biotech 24:22-30

Web resources:

• RNA-Seq at http://en.wikipedia.org/wiki/RNA-Seq