Aligning Sequences With T-Coffee

Cédric Notredame, Comparative Bioinformatics Group, Bioinformatics and Genomics Program


Transcript of Aligning Sequences With T-Coffee

Page 1: Aligning Sequences With T-Coffee

Aligning Sequences With T-Coffee

Cédric Notredame
Comparative Bioinformatics Group
Bioinformatics and Genomics Program

Page 2: Aligning Sequences With T-Coffee

T-Coffee and Consistency…

SeqA GARFIELD THE LAST FAT CAT

SeqB GARFIELD THE FAST CAT

SeqC GARFIELD THE VERY FAST CAT

SeqD THE FAT CAT

SeqA GARFIELD THE LAST FA-T CAT
SeqB GARFIELD THE FAST CA-T ---
SeqC GARFIELD THE VERY FAST CAT
SeqD -------- THE ---- FA-T CAT

Page 3: Aligning Sequences With T-Coffee

Consistency: Conflicts and Information

[Figure: pairwise alignments among W, X, Y and Z, combined (+) into alternative arrangements (OR), one labeled "Consistent" and one "Non Consistent".]

Page 4: Aligning Sequences With T-Coffee

T-Coffee and Consistency…

SeqA GARFIELD THE LAST FAT CAT     Prim. Weight = 88
SeqB GARFIELD THE FAST CAT ---

SeqA GARFIELD THE LAST FA-T CAT    Prim. Weight = 77
SeqC GARFIELD THE VERY FAST CAT

SeqA GARFIELD THE LAST FAT CAT     Prim. Weight = 100
SeqD -------- THE ---- FAT CAT

SeqB GARFIELD THE ---- FAST CAT    Prim. Weight = 100
SeqC GARFIELD THE VERY FAST CAT

SeqC GARFIELD THE VERY FAST CAT    Prim. Weight = 100
SeqD -------- THE ---- FA-T CAT
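
The slide does not say how the primary weights are obtained. A minimal sketch, assuming each weight is the percent identity over the ungapped columns of the pairwise alignment; the function name `primary_weight` is hypothetical, and the figures on the slide may be illustrative rather than computed exactly this way:

```python
def primary_weight(row_a: str, row_b: str) -> int:
    """Percent identity over columns where both rows hold a residue."""
    # Skip gap columns and the blanks that separate the words on the slide.
    pairs = [(a, b) for a, b in zip(row_a, row_b)
             if a not in "- " and b not in "- "]
    if not pairs:
        return 0
    matches = sum(a == b for a, b in pairs)
    return 100 * matches // len(pairs)  # truncated to an integer


seq_a = "GARFIELD THE LAST FAT CAT"
seq_b = "GARFIELD THE FAST CAT ---"
print(primary_weight(seq_a, seq_b))
```

Under this assumption every aligned residue pair of a pairwise alignment would enter the primary library with the weight of the alignment it comes from.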

Page 5: Aligning Sequences With T-Coffee

T-Coffee and Consistency…

SeqA GARFIELD THE LAST FAT CAT     Weight = 88
SeqB GARFIELD THE FAST CAT ---

SeqA GARFIELD THE LAST FA-T CAT    Weight = 77
SeqC GARFIELD THE VERY FAST CAT
SeqB GARFIELD THE ---- FAST CAT

SeqA GARFIELD THE LAST FA-T CAT    Weight = 100
SeqD -------- THE ---- FA-T CAT
SeqB GARFIELD THE ---- FAST CAT
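
The three blocks above show SeqA and SeqB compared directly, then through SeqC, then through SeqD. A minimal sketch of that extension step, assuming the contribution of an intermediate residue is the smaller of its two primary weights and that contributions are summed with the direct weight. Each sequence is reduced here to a single residue for brevity, and the SeqB/SeqD weight is a made-up value, since the slides do not list that pair:

```python
from collections import defaultdict
from itertools import combinations


def extend_library(primary):
    """primary maps frozenset({res_x, res_y}) -> weight; a residue is (seq, index)."""
    neighbours = defaultdict(dict)
    for pair, weight in primary.items():
        x, y = tuple(pair)
        neighbours[x][y] = weight
        neighbours[y][x] = weight

    extended = defaultdict(int, primary)  # start from the direct weights
    for middle, linked in neighbours.items():
        # Every two residues aligned to the same intermediate support each other.
        for x, y in combinations(linked, 2):
            if x[0] == y[0]:
                continue  # never pair two residues of the same sequence
            extended[frozenset((x, y))] += min(linked[x], linked[y])
    return dict(extended)


A, B, C, D = ("SeqA", 0), ("SeqB", 0), ("SeqC", 0), ("SeqD", 0)
primary = {
    frozenset((A, B)): 88,
    frozenset((A, C)): 77,
    frozenset((A, D)): 100,
    frozenset((B, C)): 100,
    frozenset((C, D)): 100,
    frozenset((B, D)): 100,  # assumption: not given on the slides
}
extended = extend_library(primary)
print(extended[frozenset((A, B))])  # 88 direct + 77 via SeqC + 100 via SeqD
```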

Page 6: Aligning Sequences With T-Coffee

T-Coffee and Consistency…


Page 7: Aligning Sequences With T-Coffee

T-Coffee and Consistency…

Page 8: Aligning Sequences With T-Coffee

T-Coffee and Consistency…

Page 9: Aligning Sequences With T-Coffee

T-Coffee and Consistency…

Page 10: Aligning Sequences With T-Coffee

Methods

Data

Scalability

Page 11: Aligning Sequences With T-Coffee

Running T-Coffee over the Web

Page 12: Aligning Sequences With T-Coffee

Available Servers and Flavors

Page 13: Aligning Sequences With T-Coffee

Which MSA Method ???

Page 14: Aligning Sequences With T-Coffee

Combining Many MSAs into ONE

MUSCLE

MAFFT

ClustalW

???????

T-Coffee
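
One way to picture combining several MSAs into one is to count, for every pair of residues, how many of the input alignments place them in the same column, and to use that count as the pair's weight. A minimal sketch under that assumption; the function names and the toy alignments are hypothetical and this is not the M-Coffee implementation:

```python
from collections import Counter
from itertools import combinations


def aligned_pairs(msa):
    """Yield the residue pairs that share a column; msa maps name -> gapped row."""
    names = list(msa)
    position = dict.fromkeys(names, 0)  # next residue index per sequence
    for column in zip(*(msa[name] for name in names)):
        present = []
        for name, char in zip(names, column):
            if char != "-":
                present.append((name, position[name]))
                position[name] += 1
        for x, y in combinations(present, 2):
            yield frozenset((x, y))


def combine(msas):
    """Count how many alignments support each residue pair."""
    support = Counter()
    for msa in msas:
        support.update(aligned_pairs(msa))
    return support


method_1 = {"s1": "AC-T", "s2": "ACGT"}  # toy output of one aligner
method_2 = {"s1": "ACT-", "s2": "ACGT"}  # toy output of another aligner
support = combine([method_1, method_2])
print(support[frozenset((("s1", 0), ("s2", 0)))])  # pair found by both methods
```

Pairs supported by several methods end up with a higher weight than pairs proposed by only one of them.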

Page 15: Aligning Sequences With T-Coffee

Consistency and Accuracy

Page 16: Aligning Sequences With T-Coffee

What To Do Without Structures

Page 17: Aligning Sequences With T-Coffee

Using the M-Coffee Server

Page 18: Aligning Sequences With T-Coffee

Using the M-Coffee Server

Page 19: Aligning Sequences With T-Coffee
Page 20: Aligning Sequences With T-Coffee

Integrating New Types of Data

Template Based Sequence Alignments

Page 21: Aligning Sequences With T-Coffee

[Figure: template-based alignment workflow. Labels: Experimental Data, TARGET, Templates, Template Aligner, Template-Sequence Alignment, Template Alignment, Template based Alignment of the Sequences, Primary Library.]

Page 22: Aligning Sequences With T-Coffee

Exploring The Template World

Template            Generator      Alignment Method
RNA Structure       Prediction     RNA Aligner
Protein Structure   BLAST vs PDB   3D Aligner
Profile             BLAST vs NR    Profile/Profile Alignment
Gene Structure      ENSEMBL        Genome Aligner
Promoter            Transfac       Meta-Aligner

Page 23: Aligning Sequences With T-Coffee

Exploring The Template World

Template            Generator    Alignment Method   Mode
RNA Structure       Prediction   RNA Aligner        R-Coffee
Protein Structure   BLAST/PDB    3D Aligner         3D-Coffee
Profile             BLAST/NR     Profile/Profile    PSI-Coffee
Gene Structure      ENSEMBL      Genome Aligner     Exoset
Promoter            Transfac     Meta-Aligner       Meta-Coffee
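
For readers working from the command line rather than the web server, some of these modes can be selected with the `-mode` option of the `t_coffee` program. A minimal sketch that only builds and launches the command; the helper names and the input file name are hypothetical, and Exoset and Meta-Coffee are left out because no mode name for them is assumed here:

```python
import shutil
import subprocess

# Mode names passed to `t_coffee -mode`; only flavors named on these slides.
KNOWN_MODES = {
    "R-Coffee": "rcoffee",
    "Expresso": "expresso",
    "PSI-Coffee": "psicoffee",
}


def build_command(flavor, fasta_path):
    """Return the argument list for one T-Coffee flavor."""
    return ["t_coffee", fasta_path, "-mode", KNOWN_MODES[flavor]]


def run(flavor, fasta_path):
    """Launch T-Coffee if it is installed locally."""
    if shutil.which("t_coffee") is None:
        raise RuntimeError("t_coffee was not found on PATH")
    return subprocess.run(build_command(flavor, fasta_path), check=True)


print(build_command("PSI-Coffee", "my_sequences.fasta"))
```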

Page 24: Aligning Sequences With T-Coffee

3D-Coffee/Expresso

Incorporating Structural Information

Page 25: Aligning Sequences With T-Coffee

Expresso: Finding the Right Structure

[Figure: Expresso workflow. Labels: Sources, BLAST, Templates, SAP, Template Alignment, Source Template Alignment, Remove Templates, Library.]

Page 26: Aligning Sequences With T-Coffee

PSI-Coffee

Homology Extension

Page 27: Aligning Sequences With T-Coffee

Exploring The Template World

Page 28: Aligning Sequences With T-Coffee

What is Homology Extension?

[Figure: residue L shown against residues L L, with a "?".]

-Simple scoring schemes result in alignment ambiguities

Page 29: Aligning Sequences With T-Coffee

What is Homology Extension?

[Figure: residues L and L L shown with the alignment columns LLLLLL, LLIVIL and LLLLLL. Labels: Profile 1, Profile 2.]
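
A single L matches any other L equally well, which is the ambiguity raised on the previous slide. Replacing each residue by the column of its homologues makes the candidates distinguishable. A minimal sketch, assuming columns are compared through the dot product of their residue frequencies; the scoring function is an illustration rather than the PSI-Coffee scoring scheme, and which column belongs to which residue in the figure is also an assumption:

```python
from collections import Counter


def column_profile(column):
    """Residue frequencies of one alignment column."""
    counts = Counter(column)
    return {residue: n / len(column) for residue, n in counts.items()}


def profile_similarity(col_a, col_b):
    """Dot product of two frequency profiles (1.0 for identical pure columns)."""
    freq_a, freq_b = column_profile(col_a), column_profile(col_b)
    return sum(p * freq_b.get(residue, 0.0) for residue, p in freq_a.items())


query = "LLLLLL"        # column behind the residue to place
candidate_1 = "LLLLLL"  # fully conserved leucine column
candidate_2 = "LLIVIL"  # mixed hydrophobic column
print(profile_similarity(query, candidate_1))
print(profile_similarity(query, candidate_2))
```

With plain residues both candidates are simply "L versus L"; with columns, the conserved one scores higher than the mixed one.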

Page 30: Aligning Sequences With T-Coffee

What is Homology Extension?


Page 31: Aligning Sequences With T-Coffee

PSI-Coffee: Homology Extension

[Figure: PSI-Coffee workflow. Labels: Sources, BLAST, Templates, Profile Aligner, Template Alignment, Source Template Alignment, Remove Templates, Library.]

Page 32: Aligning Sequences With T-Coffee

Benchmarks

Page 33: Aligning Sequences With T-Coffee

Do Benchmarks All Tell the Same Story?

Based on

Page 34: Aligning Sequences With T-Coffee

Method       Strategy      Template   Score   Comment
ClustalW-2   Progressive   NO         22.74
PRANK        Gap           NO         26.18   Science 2008
MAFFT        Iterative     NO         26.18
Muscle       Iterative     NO         31.37
ProbCons     Consistency   NO         40.80
ProbCons     MonoPhasic    NO         37.53
T-Coffee     Consistency   NO         42.30
M-Coffee4    Consistency   NO         43.60
PSI-Coffee   Consistency   Profile    53.71
PROMALS      Consistency   Profile    55.08
PROMALS-3D   Consistency   PDB        57.60
3D-Coffee    Consistency   PDB        61.00   Expresso

Score: fraction of correct columns when compared with a structure-based reference (BB11 of BAliBASE).
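
The score can be pictured as follows: a column of the tested alignment counts as correct when the reference contains a column with exactly the same residues. A minimal sketch of such a column score, expressed as a percentage like the table; the function names are hypothetical and the official BAliBASE scoring program may treat gapped or unreliable columns differently:

```python
def columns(alignment):
    """Describe each column by the residue index it holds in every sequence."""
    position = [0] * len(alignment)  # next residue index per sequence
    described = set()
    for column in zip(*alignment):
        key = []
        for row, char in enumerate(column):
            if char == "-":
                key.append(None)  # gap
            else:
                key.append(position[row])
                position[row] += 1
        described.add(tuple(key))
    return described


def column_score(test, reference):
    """Percentage of reference columns reproduced exactly by the test alignment."""
    reference_columns = columns(reference)
    shared = reference_columns & columns(test)
    return 100.0 * len(shared) / len(reference_columns)


reference = ["AC-T", "ACGT"]  # toy structure-based reference
test = ["ACT-", "ACGT"]       # toy alignment under evaluation
print(column_score(test, reference))
```

Here two of the four reference columns are recovered, so the toy alignment scores 50.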

Page 35: Aligning Sequences With T-Coffee

Consistency

Page 36: Aligning Sequences With T-Coffee

Homology Extension

Page 37: Aligning Sequences With T-Coffee

Structural Extension

Page 38: Aligning Sequences With T-Coffee

T-Coffee and The World

BLAST/SOAP

-Some templates are obtained with a BLAST
-Queries can be sent to the EBI or the NCBI
-No need for a local BLAST installation

User sequences