Date posted: 22-Dec-2015
Computational methods to quantify transcriptome
changes in bacteria
Rebecca Pankow
Mentor: Dr. Jeff Chang
Botany and Plant Pathology
Oregon State University
What makes a pathogen?
Infections caused by Pseudomonas syringae
• Overcome host defenses
• Manipulate host cell
• Survive in host environment
Hypothesis
Genes that are expressed in conditions that mimic the plant are candidates for host-associated genes.
Experimental Setup
Grow P. syringae in KB (rich media)
No virulence gene expression
Grow P. syringae in minimal media: simulates environment of plant host
Virulence gene expression
Identify differential expression of genes
How to identify expressed genes?
Transcriptome: all mRNAs in a cell at a given time
DNA → mRNA → protein
sequenced transcriptome
completely sequenced genome
aligning back
AGAGCAATAGCA
TAATTCTCGTTATCGTCCGGATTAAGAGCAATAGCAGGCC
AGAGCAATAGCA
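The "aligning back" step can be sketched as an exact string search of a read against the genome. This is only a minimal illustration using the example sequences on the slide; the function name `find_exact_matches` is hypothetical, and the actual alignment tool used in the pipeline is not stated.

```python
def find_exact_matches(genome, read):
    """Return every 0-based start position where `read` occurs in `genome`."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        # Resume one base later so overlapping occurrences are also found.
        start = genome.find(read, start + 1)
    return positions


# Example sequences taken from the slide.
genome = "TAATTCTCGTTATCGTCCGGATTAAGAGCAATAGCAGGCC"
read = "AGAGCAATAGCA"
hits = find_exact_matches(genome, read)
print(hits)
```

A plain scan like this is fine for a toy genome; real read aligners rely on indexed data structures to stay fast on full genomes.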
How to quantify transcriptome changes?
Next-Generation Illumina IIG Genome Sequencer
ACATAGGAGCTAGATAGCTATGCATCGATCGACATGGATCGACATGAGAGTTACGAGTAGACTGAGAGATATCTGAGAGATATGTTTACCCAGATTACTCTCCGATGCGATCGACATGAGAGTTACGAGTAGACTGAGAGATAT
mRNAs in transcriptome
36 base-long reads (36-mers)
Computational Pipeline
TGTTTACCCAGATTACTCTCCGATGCCAGGGAGAAT GATCGACAGATGCATGTTTACCCAGATTACTCTCCG ACATAGGAGCTAGATAGCTATGCATCGATCGACAGAGATCGACAGATGCATGTTTACCCAGATTACTCTCCG
Processed 36-mers
Align to ref. genome
Signal Processing
genome coordinates of a potential transcription unit
# reads that map to coordinates
Graph signal
Not very informative!
…0010100234201231201001022410301022040102020…
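One way to build such a per-coordinate signal is sketched below. It assumes the signal counts how many aligned reads cover each genome coordinate; the slide does not say whether reads are counted by coverage or by start position, and the name `coverage_signal` is hypothetical.

```python
def coverage_signal(genome_length, read_starts, read_length=36):
    """Count, for each genome coordinate, how many reads cover it.

    `read_starts` holds the 0-based alignment start of every mapped read.
    Counting coverage (not just start positions) is an assumption.
    """
    signal = [0] * genome_length
    for start in read_starts:
        # Clip reads that would run past the end of the genome.
        end = min(start + read_length, genome_length)
        for pos in range(start, end):
            signal[pos] += 1
    return signal


# Three toy reads of length 4 on a 10-base genome.
example_signal = coverage_signal(10, [0, 2, 2], read_length=4)
print(example_signal)
```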
Signal Processing
Using sliding window approach to minimize noise
[Figure: old signal vs. processed signal. Each value is set to the sum of reads in the sliding window ("sliding window" = 15); the successive window sums shown are 19, 20, and 22.]
Resulting signal
old signal
scaled and processed signal
More informative, but signal is jagged
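The sliding window step can be sketched as follows. This is a rough illustration that assumes the window is centered on each coordinate and truncated at the ends of the signal; the slide gives only the window size (15), so the centering, the edge handling, and the name `sliding_window_sum` are assumptions. The scaling mentioned on the slide is not reproduced here.

```python
def sliding_window_sum(signal, window=15):
    """Replace each value by the sum of the values in a window around it.

    The window is centered on each position and shrinks at the edges
    (an assumption; the slide only states the window size).
    """
    half = window // 2
    n = len(signal)
    summed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        summed.append(sum(signal[lo:hi]))
    return summed


# Small window so the effect is easy to follow by hand.
raw = [0, 0, 1, 0, 1, 0, 0]
processed = sliding_window_sum(raw, window=3)
print(processed)
```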
Smoothing the Signal
Iteration of the sliding window
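Iterating the window can be sketched as repeated passes of a moving average. A mean is used here instead of a sum so that the signal keeps its scale across passes; that choice, the number of passes, and the names `moving_average` and `smooth` are assumptions, not details from the slides.

```python
def moving_average(signal, window=15):
    """One smoothing pass: mean of the values in a centered window."""
    half = window // 2
    n = len(signal)
    averaged = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        averaged.append(sum(signal[lo:hi]) / (hi - lo))
    return averaged


def smooth(signal, window=15, iterations=3):
    """Apply the moving average repeatedly; more passes give a smoother curve."""
    result = list(signal)
    for _ in range(iterations):
        result = moving_average(result, window)
    return result


# A single spike is spread out after one pass with a window of 3.
one_pass = smooth([0, 0, 9, 0, 0], window=3, iterations=1)
print(one_pass)
```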
Deconvoluting Signal
Changes in the signal found by using the sliding window on the first and second derivatives of
the signal.
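As a rough illustration of how derivatives expose changes in the signal, the sketch below uses finite differences and looks for sign changes. It omits the sliding window applied to the derivatives, and the names `derivative` and `zero_crossings` are hypothetical.

```python
def derivative(values):
    """Finite-difference approximation: difference between neighbors."""
    return [values[i + 1] - values[i] for i in range(len(values) - 1)]


def zero_crossings(values):
    """Indices in `values` at which the sign flips relative to the previous entry."""
    crossings = []
    for i in range(len(values) - 1):
        if values[i] * values[i + 1] < 0:
            crossings.append(i + 1)
    return crossings


signal = [0, 1, 3, 6, 8, 9, 8, 6, 3, 1, 0]
d1 = derivative(signal)        # slope of the signal
d2 = derivative(d1)            # curvature of the signal
peaks_and_valleys = zero_crossings(d1)   # slope changes sign
inflections = zero_crossings(d2)         # curvature changes sign
print(peaks_and_valleys, inflections)
```

In this toy signal the sign change in the first derivative marks the peak, and the sign changes in the second derivative mark where the rise and fall start to level off.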
Deconvoluting Signal
• Refine signal divisions by looking in-between previous divisions
• Categorize signal divisions as increasing, decreasing, or flat
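The categorization step might look like the sketch below, which labels each division by its net change in signal. The tolerance threshold, the use of net change as the criterion, and the name `classify_segments` are assumptions made for illustration.

```python
def classify_segments(signal, boundaries, tolerance=1.0):
    """Label each division between consecutive boundaries.

    A division is "increasing" or "decreasing" when its net change exceeds
    `tolerance` (a hypothetical threshold), otherwise "flat".
    """
    labels = []
    for start, end in zip(boundaries, boundaries[1:]):
        change = signal[end] - signal[start]
        if change > tolerance:
            label = "increasing"
        elif change < -tolerance:
            label = "decreasing"
        else:
            label = "flat"
        labels.append((start, end, label))
    return labels


signal = [0, 0, 0, 5, 10, 10, 10, 4, 0]
divisions = classify_segments(signal, boundaries=[0, 2, 4, 6, 8])
for start, end, label in divisions:
    print(start, end, label)
```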
Processing Empirical Data
Next-Generation Illumina IIG Genome Sequencer
ACATAGGAGCTAGATAGCTATGCATCGATCGACATGGATCGACATGAGAGTTACGAGTAGACTGAGAGATATCTGAGAGATATGTTTACCCAGATTACTCTCCGATGCGATCGACATGAGAGTTACGAGTAGACTGAGAGATAT
36 base-long reads (36-mers)
Problems
Mistakes in sequencing can be made!
ACATAGGAGCTAGATAGCTATGCATCGATCGACATGGATCGACATGAGAGTTACGAGTAGACTGAGAGATATCTGAGAGATATGTTTACCCAGATTACTCTCCGATGCGATCGACATGAGAGTTACGAGTAGACTGAGAGATAT
30% of reads match P. syringae genome
Solution
Account for mismatches by treating each base in a 36-mer as a wildcard
ACATAGGAGCTAGATAGCTATGCATCGATCGACATG
_CATAGGAGCTAGATAGCTATGCATCGATCGACATG
A_ATAGGAGCTAGATAGCTATGCATCGATCGACATG
AC_TAGGAGCTAGATAGCTATGCATCGATCGACATG
36-mers containing wildcards are mapped back to the original genome
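The wildcard idea can be sketched as follows: generate one variant of the read per position, with that base replaced by a wildcard, and accept any genome position that matches a variant. This tolerates a single mismatch per read. The brute-force scan and the names `wildcard_variants`, `matches_at`, and `find_with_wildcards` are assumptions for illustration, not the pipeline's actual implementation.

```python
def wildcard_variants(read, wildcard="_"):
    """One variant per position, with that base replaced by the wildcard."""
    return [read[:i] + wildcard + read[i + 1:] for i in range(len(read))]


def matches_at(genome, pattern, start, wildcard="_"):
    """True if `pattern` matches `genome` at `start`; wildcards match any base."""
    segment = genome[start:start + len(pattern)]
    if len(segment) < len(pattern):
        return False
    return all(p == wildcard or p == g for p, g in zip(pattern, segment))


def find_with_wildcards(genome, read, wildcard="_"):
    """Start positions where the read matches with at most one mismatch."""
    hits = set()
    for pattern in wildcard_variants(read, wildcard):
        for start in range(len(genome) - len(pattern) + 1):
            if matches_at(genome, pattern, start, wildcard):
                hits.add(start)
    return sorted(hits)


genome = "TAATTCTCGTTATCGTCCGGATTAAGAGCAATAGCAGGCC"
read_with_error = "AGAGCTATAGCA"  # one base differs from the genome
print(find_with_wildcards(genome, read_with_error))
```

A read carrying one sequencing error has no exact match, yet it is recovered because the variant with the wildcard at the erroneous base still matches.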
Conclusions
• Computational pipeline developed to
– Generate and smooth signal
– Divide signal into sections that are going up, down, or are flat
• 30% of reads from transcriptome map back to original genome
Future Work
Quantify changes in bacterial transcriptome under different treatments
Acknowledgements
Jeff Chang
Jason Cumbie
Jeff Kimbrel
Bill Thomas
Cait Thireault
Allison Smith
Ryan Lilley
Phillip Hillenbrand
Jayme Stout
HHMI/USDA
Kevin Ahern