Biocomputation: Comparative Genomics Tanya Talkar Lolly Kruse Colleen O’Rourke.

Post on 17-Jan-2016


Biocomputation: Comparative Genomics

Tanya Talkar, Lolly Kruse, Colleen O’Rourke

DNA

Junk DNA

Conserved DNA

What is Biocomputation?

Statistics

Computer Science

Molecular Biology

Four Main Parts
• Biomolecular Computation
• Biological Computation
• Computational Biology
• Bioinformatics

Bioinformatics:

Biology

Computer Science

Information Technology

Sequence Analysis
• Very functional!
• Compare DNA between species
• Small fragments
• Return full sequence

Computational Genomics
• Needleman-Wunsch
• Not used much
• More mapped genomes = Computational Genomics!

Alignment

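Global Alignment: Needleman-Wunsch
• O(N^3) as originally published; the standard dynamic-programming form with a fixed per-gap penalty runs in O(N^2)
• Fewest edit operations
• Similar strings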
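The global alignment idea can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' code: it assumes a fixed per-gap penalty, and the scores (match +2, mismatch -1, gap -1) and the function name `needleman_wunsch` are choices made for the example.

```python
def needleman_wunsch(a, b, match=2, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment.

    Assumed scoring: fixed scores per match, mismatch, and gap column.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("_"); i -= 1
        else:
            out_a.append("_"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Every cell of the table is filled exactly once, which is where the quadratic cost of this form comes from.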

Local Alignment: Smith-Waterman
• O(N^2)
• Dissimilar strings
• Find high similarity regions
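A local alignment differs from a global one mainly in that scores are never allowed to drop below zero, so an alignment can start and stop anywhere. The sketch below is illustrative only; the scores (match +2, mismatch -1, gap -1) and the name `smith_waterman` are assumptions made for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment."""
    n, m = len(a), len(b)
    # h[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    h = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(
                0,                      # start a fresh alignment here
                h[i - 1][j - 1] + sub,  # extend diagonally
                h[i - 1][j] + gap,      # gap in b
                h[i][j - 1] + gap,      # gap in a
            )
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)
    # Trace back from the highest-scoring cell until the score reaches zero.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and h[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if h[i][j] == h[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif h[i][j] == h[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("_"); i -= 1
        else:
            out_a.append("_"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Two otherwise dissimilar strings that share the region "ACGT".
print(smith_waterman("XXXACGTXXX", "YYACGTYY"))  # (8, 'ACGT', 'ACGT')
```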

Comparison

S1 P Q R A X A B C S T V Q

S2 X Y A X B A C S L T

A X A B C S

A X B A C S

S1      A   X   A   B   _   C   S
S2      A   X   _   B   A   C   S
Score   2   2  -1   2  -1   2   2
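The score row can be reproduced column by column. The sketch below assumes what the slide's numbers imply: +2 for a matching column and -1 for a column containing a gap. Treating a mismatch as -1 as well is an assumption, since the example contains no mismatched column. The name `score_alignment` is hypothetical.

```python
def score_alignment(row1, row2, match=2, other=-1, gap_char="_"):
    """Score two already-aligned rows column by column.

    Returns (per_column_scores, total). Any column that is not an exact
    match (a gap or a mismatch) gets the `other` score; that is an assumption.
    """
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    columns = []
    for x, y in zip(row1, row2):
        if x == y and x != gap_char:
            columns.append(match)
        else:
            columns.append(other)
    return columns, sum(columns)


columns, total = score_alignment("AXAB_CS", "AX_BACS")
print(columns)  # [2, 2, -1, 2, -1, 2, 2]
print(total)    # 8
```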

Advantages: Global Alignment

Advantages: Local Alignment

BLAST
• Basic Local Alignment Search Tool
• FASTA

Improvements
• Increased speed
• Locate initial alignment hot spots
• Statistical significance

Terminology
• Segment pairs
• Locally maximal segment pairs
• Maximal segment pairs

How it works
• Query sentence, P
• Database
• Must have score over C!
• Multiple segment pairs combined

A B C D E F G

A G C B F D E

B E D G A F B

G F B E D C A

How it works
• Extends each hit
• Done efficiently
• Truncates
• Doesn’t find all pairs
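The "extend, then truncate" step can be sketched as an ungapped extension that stops once the running score falls too far below the best score seen. This is a simplified illustration, not BLAST's actual implementation; the function name `extend_hit` and the `x_drop` parameter are hypothetical, and the default scores of +2 and -3 are the match and mismatch values listed on the Scoring slide.

```python
def extend_hit(query, subject, i, j, w, match=2, mismatch=-3, x_drop=5):
    """Extend an exact word hit query[i:i+w] == subject[j:j+w] without gaps.

    Returns (score, (q_start, q_end), (s_start, s_end)) with end-exclusive spans.
    Extension in each direction stops when the running score drops more than
    `x_drop` below the best score seen, and is truncated back to that best point.
    """
    def walk(qi, sj, step):
        best = run = best_len = k = 0
        while 0 <= qi < len(query) and 0 <= sj < len(subject):
            run += match if query[qi] == subject[sj] else mismatch
            k += 1
            if run > best:
                best, best_len = run, k
            elif best - run > x_drop:
                break  # give up; keep only the best prefix
            qi += step
            sj += step
        return best, best_len

    right_score, right_len = walk(i + w, j + w, +1)
    left_score, left_len = walk(i - 1, j - 1, -1)
    score = w * match + left_score + right_score
    return (score,
            (i - left_len, i + w + right_len),
            (j - left_len, j + w + right_len))


# The 4-letter word "ACGT" matches at position 2 in both strings.
print(extend_hit("TTACGTAAGG", "CCACGTAACC", 2, 2, 4))  # (12, (2, 8), (2, 8))
```

Because extension gives up early, this kind of search is fast but can miss segment pairs that a full dynamic-programming alignment would find.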

Proteins
• Fixed length, W
• Words above threshold
• Each hit extended

DNA
• Word list
• Exact matches
• NOT dynamic programming
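The word-list idea for DNA can be illustrated with a simple lookup table: index every word of length w in the database sequence, then look up each word of the query. This is a toy sketch under assumed names (`find_seeds`) and a very short word length chosen so the example is readable.

```python
from collections import defaultdict


def find_seeds(query, subject, w=4):
    """Return (query_pos, subject_pos) pairs where a length-w word matches exactly."""
    # Index every word of length w in the subject by its start position.
    index = defaultdict(list)
    for j in range(len(subject) - w + 1):
        index[subject[j:j + w]].append(j)
    # Look up each query word; no dynamic programming is involved.
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


print(find_seeds("ACGTAC", "TTACGTTT"))  # [(0, 2)]
```

Each hit found this way is only a starting point; it still has to be extended and scored.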

Scoring
• BLOSUM62 matrix
• Match (+2), Mismatch (-3)
• Gaps penalized

Substitution Matrix
• Represents scoring functions
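One way to picture a substitution matrix is as a lookup table from a pair of symbols to a score. The sketch below builds a toy nucleotide matrix from the match (+2) and mismatch (-3) values on the Scoring slide; it is not BLOSUM62, and the names `MATRIX` and `score_ungapped` are hypothetical.

```python
ALPHABET = "ACGT"

# Toy substitution matrix: +2 on the diagonal, -3 everywhere else.
MATRIX = {(x, y): (2 if x == y else -3) for x in ALPHABET for y in ALPHABET}


def score_ungapped(s, t, matrix=MATRIX):
    """Sum the matrix entries for two equal-length sequences aligned without gaps."""
    if len(s) != len(t):
        raise ValueError("sequences must have the same length")
    return sum(matrix[(x, y)] for x, y in zip(s, t))


print(score_ungapped("ACGT", "ACGA"))  # 2 + 2 + 2 - 3 = 3
```

A protein matrix works the same way, with one entry per pair of amino acids instead of a single match and mismatch value.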

Multiple Sequence Alignment

Methods of MSA
• Progressive Alignment Construction
• Iterative Methods
• Hidden Markov Models
• Genetic Algorithms and Simulated Annealing

Comparative Genomics
• Compare species
• Find evolutionary significance!
• Low level
• High level

Importance of Non Coding DNA