T-COFFEE, a novel method for Multiple Sequence Alignments Cédric Notredame.


Page 1

T-COFFEE, a novel method for Multiple Sequence Alignments

Cédric Notredame

Page 2

chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
***. ::: .: .. . : . . * . *: *

chite AATAKQNYIRALQEYERNGG-
wheat ANKLKGEYNKAIAAYNKGESA
trybr AEKDKERYKREM---------
mouse AKDDRIRYDNEMKSWEEQMAE
* : .* . :

Potential Uses of A Multiple Sequence Alignment?

Extrapolation

Motifs/Patterns

Phylogeny

Profiles

Structure Prediction

Multiple Alignments Are CENTRAL to MOST Bioinformatics Techniques.

Page 3

Why Is It Difficult To Compute A Multiple Sequence Alignment?

A CROSSROAD PROBLEM

BIOLOGY: What is A Good Alignment?

COMPUTATION: What is THE Good Alignment?

chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
***. ::: .: .. . : . . * . *: *

Page 4

Why Is It Difficult To Compute A Multiple Sequence Alignment?

BIOLOGY

CIRCULAR PROBLEM....

Good Sequences

Good Alignment

COMPUTATION

Page 5

Dynamic Programming Using A Substitution Matrix

Progressive Alignment
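
To make the dynamic programming step concrete, here is a minimal sketch of a global pairwise alignment (Needleman-Wunsch style) driven by a substitution score. It is an illustration only, not the code used by T-Coffee or ClustalW: `toy_score` is a hypothetical stand-in for a real substitution matrix, and the linear gap penalty is an assumption. In a progressive alignment, a step of this kind is repeated along a guide tree, aligning sequences or groups of already aligned sequences.

```python
def toy_score(x, y):
    # Hypothetical stand-in for a substitution matrix lookup.
    return 2 if x == y else -1


def needleman_wunsch(a, b, score=toy_score, gap=-4):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # dp[i][j] holds the best score for aligning a[:i] with b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # match or mismatch
                dp[i - 1][j] + gap,                            # gap in b
                dp[i][j - 1] + gap,                            # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return dp[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


total, top, bottom = needleman_wunsch("KPKRP", "KPKP")
print(total)
print(top)
print(bottom)
```

The table costs time and memory proportional to the product of the two sequence lengths, which is where the L2 factor in the complexity figures comes from.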

Page 6

The T-Coffee Algorithm

Page 7

Progressive Alignment Principle and its Limitations…

Page 8

The Extended Library Principle…

Page 9

The Extended Library Principle…

Page 10

The Triplet Assumption

SEQ A

SEQ B

Page 11

Weighting And Extension

Extension = Using Information from Other Sequences

Weighting = Using The Surrounding Information (Coffee)
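
The extension step can be sketched as follows, assuming the triplet rule described in Notredame, Higgins and Heringa (2000): the extended weight of a residue pair is its primary weight plus, for every residue of a third sequence linked to both, the smaller of the two linking weights. The data layout (residues as `(sequence name, position)` tuples, each pair listed once) and the function name `extend_library` are choices made for this illustration, not T-Coffee's internal representation.

```python
from collections import defaultdict


def extend_library(primary):
    """primary maps (residue1, residue2) -> weight, each pair listed once.

    A residue is a (sequence_name, position) tuple. Returns a dict keyed by
    frozenset({residue1, residue2}) holding the extended weights.
    """
    # neighbours[r] = {residue linked to r in the primary library: weight}
    neighbours = defaultdict(dict)
    for (r1, r2), w in primary.items():
        neighbours[r1][r2] = w
        neighbours[r2][r1] = w

    extended = defaultdict(int)
    # Direct contribution: the primary weight of each pair.
    for (r1, r2), w in primary.items():
        extended[frozenset((r1, r2))] += w
    # Triplet contribution: r1 and r2 are both linked to a residue "mid"
    # that belongs to a third sequence.
    for mid, linked in neighbours.items():
        items = list(linked.items())
        for x in range(len(items)):
            for y in range(x + 1, len(items)):
                (r1, w1), (r2, w2) = items[x], items[y]
                if r1[0] == r2[0] or mid[0] in (r1[0], r2[0]):
                    continue  # the three residues must come from three sequences
                extended[frozenset((r1, r2))] += min(w1, w2)
    return dict(extended)


primary = {
    (("A", 1), ("B", 1)): 88,
    (("A", 1), ("C", 1)): 77,
    (("C", 1), ("B", 1)): 100,
}
extended = extend_library(primary)
print(extended[frozenset({("A", 1), ("B", 1)})])  # 88 + min(77, 100) = 165
```

A pair supported only by its own pairwise alignment keeps its primary weight, while a pair that is also consistent with alignments through other sequences gains weight.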

Page 12

T-Coffee Progressive Alignment

Notredame, Higgins, Heringa, 2000

Dynamic Programming Using The extended Library

Page 13

Local Alignment Global Alignment

Extension

Multiple Sequence Alignment

Mixing Local and Global Alignments

Page 14

What is a library?

Extension+T-Coffee

Library Based Multiple Sequence Alignment

2
Seq1 MySeq
Seq2 MyotherSeq
#1 2
1 1 25
3 8 70
….

3
Seq1 anotherseq
Seq2 atsecondone
Seq3 athirdone
#1 2
1 1 25
#1 3
3 8 70
….
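
A small reader for the simplified library layout shown on this slide might look like the sketch below. It assumes the first line gives the number of sequences, each following line gives a name and a sequence, a `#i j` line selects a pair of sequences by 1-based index, and each record after it reads "residue in first sequence, residue in second sequence, weight". That reading of the columns is an assumption, and the real T-Coffee library files carry more fields than the slide shows. `parse_library` is a hypothetical name.

```python
def parse_library(text):
    """Parse the simplified library layout; return (sequences, pairs).

    sequences is a list of (name, sequence) tuples.
    pairs maps ((seq_i, seq_j), res_i, res_j) -> weight.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n_seq = int(lines[0])
    sequences = [tuple(ln.split(None, 1)) for ln in lines[1:1 + n_seq]]
    pairs = {}
    current = None  # the sequence pair selected by the last "#i j" line
    for ln in lines[1 + n_seq:]:
        if ln.startswith("#"):
            i, j = ln[1:].split()
            current = (int(i), int(j))
            continue
        fields = ln.split()
        if current is None or len(fields) != 3 or not all(f.isdigit() for f in fields):
            continue  # skip anything that is not a "res_i res_j weight" record
        res_i, res_j, weight = map(int, fields)
        pairs[(current, res_i, res_j)] = weight
    return sequences, pairs


example = """2
Seq1 MySeq
Seq2 MyotherSeq
#1 2
1 1 25
3 8 70
"""
sequences, pairs = parse_library(example)
print(sequences)
print(pairs)
```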

Page 15

How Long Does It Take?

Page 16

Primary Lib: O(N^2 L^2)

Extension: O(N^3 L^2)

Tree: O(N^2 L^2) + O(N^3)

Aln: O(N L^2)

Page 17

N times slower than

ClustalW
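
The factor of N can be read off the leading terms of the complexity figures, with N the number of sequences and L their length. The sketch below assumes that ClustalW's running time is dominated by its O(N^2 L^2) pairwise stage and ignores all constant factors, so the numbers are relative orders of magnitude rather than timings. `estimated_costs` is a hypothetical helper written for this illustration.

```python
def estimated_costs(n, length):
    """Leading-order operation counts for n sequences of the given length."""
    return {
        "primary_library": n**2 * length**2,
        "extension": n**3 * length**2,
        "tree": n**2 * length**2 + n**3,
        "progressive_alignment": n * length**2,
    }


costs = estimated_costs(50, 300)
# Extension dominates; compare it with the pairwise stage shared with ClustalW.
ratio = costs["extension"] / costs["primary_library"]
print(ratio)  # equals n, here 50.0
```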

Page 18

Validating T-Coffee

Page 19

What Is BaliBase?

BaliBase is a collection of reference multiple alignments.

The structures of the sequences are known and were used to assemble the multiple alignment (MALN).

Evaluation is carried out by comparing the structure-based reference alignment with its sequence-based counterpart.
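
One common way to carry out such a comparison is a sum-of-pairs score: the fraction of residue pairs aligned in the reference that are also aligned in the alignment under test. The sketch below is a simplified illustration under that assumption. It scores every column of the reference, whereas the benchmark's own scoring may restrict the comparison to selected regions, and the function names are hypothetical.

```python
def aligned_pairs(alignment):
    """alignment maps name -> gapped string, all of equal length.

    Returns the set of residue pairs that share a column, each residue
    identified by (sequence name, index in the ungapped sequence).
    """
    names = sorted(alignment)
    length = len(alignment[names[0]])
    counters = {name: 0 for name in names}
    pairs = set()
    for col in range(length):
        present = []
        for name in names:
            if alignment[name][col] != "-":
                present.append((name, counters[name]))
                counters[name] += 1
        for x in range(len(present)):
            for y in range(x + 1, len(present)):
                pairs.add((present[x], present[y]))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs recovered by the test alignment."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


reference = {"a": "AC-G", "b": "ACTG"}
candidate = {"a": "ACG-", "b": "ACTG"}
print(sum_of_pairs_score(candidate, reference))  # 2 of 3 reference pairs recovered
```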

Page 20

BaliBase

DALI, Sap …

Method X

Comparison

Page 21

Validation Using BaliBase

T-Coffee Results

Page 22

Validation Using BaliBase

Page 23

Taking T-Coffee Further:

Using Structures

Page 24

Mixing Heterogeneous Information With T-Coffee

Local Alignment Global Alignment

Multiple Sequence Alignment

Multiple Alignment

Structural Specialist

Page 25

Running T-Coffee ONLINE

Page 26

WHERE ?

[email protected]

www.tcoffee.org

Page 27

The T-Coffee Server

Page 28

The T-Coffee Server

ES45, 4 Proc, 1 Gb RAM

Page 29
Page 30

Future…

Page 31

Large Scale…

Page 32

Tailor Made…

Page 33

WHERE ?

[email protected]

www.tcoffee.org

Page 34

WHO ?

WHO USES T-Coffee ?

Dali Domain Dictionary

Pfam

SwissProt

WHO Makes T-Coffee ?

Cédric Notredame

Des Higgins

Chantal Abergel

Olivier Poirot

Orla O’Sullivan

Page 35