Post on 02-Jan-2016

ORDered ALignment Information Explorer

Alignment editor

Conservation computation

“barcode” = schematic alignment

Phylogenetic tree

3D viewer

=> sequence / structure / function / evolution cross-talks

Sequence Clustering

Features Editor

Alignment Positions

Taxa

Contexts

Exploring Alignment Information up to the residue Level

Global level

Clusterings level

Single Taxa level

Full length

Domains

Motifs, secondary structures, …

Residues

3D structure

conservation

phylogeny

Reads ALN, MSF, TFA, RSF, Macsims/XML, ORD file formats

What is an alignment?
- description of the alignment (NorMD score, date, etc.)
- set of sequences
  - generic information (length, EC, phylogeny, …)
  - features (PFAM-A, PROSITE, BLOCK, etc.)
- clustering = groups of sequences
- conservation scores based on clustering
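The components listed above can be pictured as a small data model. The sketch below is only an illustration of that list: the class and field names (`Feature`, `Sequence`, `Alignment`, `clusters`, `conservation`) are hypothetical and are not taken from Ordalie itself.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    source: str   # e.g. "PFAM-A", "PROSITE", "BLOCK"
    start: int    # first position covered by the feature
    end: int      # last position covered by the feature (inclusive)


@dataclass
class Sequence:
    name: str
    residues: str                                   # aligned residues, gaps as '-' or '.'
    info: dict = field(default_factory=dict)        # length, EC, phylogeny, ...
    features: list = field(default_factory=list)    # list of Feature objects


@dataclass
class Alignment:
    description: dict                                   # NorMD score, date, ...
    sequences: list                                     # list of Sequence objects
    clusters: dict = field(default_factory=dict)        # group name -> sequence names
    conservation: dict = field(default_factory=dict)    # method name -> per-column scores

    def width(self):
        """Number of alignment columns (0 for an empty alignment)."""
        return len(self.sequences[0].residues) if self.sequences else 0
```

Keeping clusters and conservation scores on the alignment rather than on each sequence mirrors the slide: conservation is computed from a clustering, so both belong to the alignment as a whole.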

and Alignments :

Sequence editing Clustering editing

Current Alignment

Overwrite current Create new MACSIM

Ordalie parameters (colors, fonts, thresholds, …)

Description of the alignment (name, NorMD score, creation date, ...)

Original set of aligned sequences
- general information (length, pI, mol. weight, …)
- features (Pfam domain, secondary structures, …)
- AA sequence

Coordinates of 3D structures corresponding to PDB entries

Description of 3D objects (representation type, colors, etc.)

Inside:

M 1 – original alignment: Original Sequences set

M 2 – macsims clustering: Macsims Clustering, Original Sequences set -> original conservation

M 3 – new clustering: Clustering 1, Sequences set 1 -> conservation

M 4 – edit sequences: Clustering 1, Edit Sequences -> conservation

M 5 – clust. + edit: Clustering 2, Edit Sequences -> conservation

ORD: file format

SQLite database
- accessible through SQL statements
- ODBC compatible

Platform independent

Lightweight

Contains all Ordalie data, preferences, performances
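Since an ORD file is described as an SQLite database, it should be possible to inspect one with any SQLite client. The sketch below uses Python's standard `sqlite3` module to list the tables in such a file. The slides do not describe the ORD schema, so no table names are assumed here; the function name `list_tables` and the example file name are hypothetical.

```python
import sqlite3


def list_tables(path):
    """Return the names of the tables stored in an SQLite file, sorted by name."""
    con = sqlite3.connect(path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]


# Hypothetical usage on an ORD file saved by Ordalie:
# print(list_tables("toto.ord"))
```

Querying `sqlite_master` is the generic way to discover what an SQLite file contains before writing any schema-specific SQL.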

Modes:
- features
- search
- pairwise identity
- sequences editor
- features editor
- clustering
- trees
- conservation
- superposition

Clustering:

Zone selection:
• Whole alignment
• By feature
• User defined

Criteria:
• % identity
• pI
• Length
• Composition (amino acid, physico-chemical groups)

Clustering methods:
• Manual clustering by inserting/removing separators
• Hierarchical classification + Secator
• Kmeans + DPC
• Mixture model + AIC
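To make the "% identity" criterion concrete, here is a minimal sketch that computes pairwise percent identity between aligned sequences and groups them with a fixed threshold. The grouping is a plain single-linkage stand-in chosen for brevity; it is not Secator, DPC or AIC, and the function names, the gap characters and the 50% default threshold are all assumptions made for this illustration.

```python
def percent_identity(a, b, gaps="-."):
    """Percent identity over the columns where both sequences have a residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = same = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in gaps or y in gaps:
            continue  # skip columns with a gap in either sequence
        compared += 1
        same += (x == y)
    return 100.0 * same / compared if compared else 0.0


def group_by_identity(seqs, threshold=50.0):
    """Single-linkage grouping: sequences joined by a chain of pairs >= threshold."""
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if percent_identity(seqs[a], seqs[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())
```

For example, `group_by_identity({"s1": "ACDE-", "s2": "ACDEF", "s3": "WWWWW"})` puts `s1` and `s2` in one group and leaves `s3` alone.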

Conservation Methods:

Threshold
- Global Identity -> 100% identity
- Global Conserved -> >80% identity
- Group Identity -> 100% identity in group

Mean Distance: cf. ClustalX

Vector Norm: based on a vectorial (polarity, volume) representation of amino acids

Liu2: based on Blosum62

Entropy: takes gaps and physico-chemical properties of AA into account

Validity of score clustering?
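The Threshold method can be sketched per alignment column as follows. This is an illustration under stated assumptions, not Ordalie's implementation: gaps are counted as mismatches, "group identity" is reported as soon as one group is fully identical, and the function names are hypothetical. The entropy function is plain Shannon entropy, which is simpler than the Entropy method named on the slide since it ignores physico-chemical properties.

```python
from collections import Counter
from math import log2

GAPS = "-."


def column_identity(column):
    """Fraction of rows sharing the most common residue (gaps count as mismatches)."""
    residues = [c for c in column if c not in GAPS]
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(column)


def classify_column(column, groups):
    """Label one column; `groups` is a list of lists of row indices."""
    ident = column_identity(column)
    if ident == 1.0:
        return "global identity"
    if ident > 0.8:
        return "global conserved"
    # Assumption: one fully identical group is enough for "group identity".
    for rows in groups:
        if column_identity([column[i] for i in rows]) == 1.0:
            return "group identity"
    return "none"


def shannon_entropy(column):
    """Shannon entropy in bits, treating a gap as one more symbol."""
    n = len(column)
    return -sum(k / n * log2(k / n) for k in Counter(column).values())
```

With two groups of three sequences, `classify_column("AAALLL", [[0, 1, 2], [3, 4, 5]])` gives "group identity": the column is only 50% identical globally, but each group is fully conserved. This is the kind of position that a clustering-based score is meant to reveal.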

Key Usage Points :

Always leave a mode before entering a new one

Sequences selection: Windows-style
- <Button-1> selects a sequence
- <Control-Button-1> adds the current seq. to the selection
- <Shift-Button-1>

Zone selection:
- All (button)
- selecting a feature: <Control-Button-1>
- manually:
  - <Button-1> for starting point
  - <Button-3> for ending point
  - <Shift-Button-3> to delete a selected zone

TODO List :

Short term:
- Bugs, if any … ;-)
- group naming
- project handling
- MacOS X version
- documentation and tutorials
- publication

Long term:
- Bugs, if any … ;-)
- on-line web services
- on-line Macsims calculation
- on-line sequence, information, feature updating
- 3D surface mapping of features
- …

Running Ordalie :

On surf/lameX:
- setordalie
- ordalie <filename>
- ordalie <filename> option value option value

File formats: MSF, TFA, ALN, RSF, XML/Macsims and ORD

Conversion:
ordalie toto.msf -convert ALN
-> toto.aln
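For converting many files, the command line shown above can be generated from a script. The sketch below only builds the argument list and the expected output name; it does not run anything. It assumes the option is spelled `-convert` with a plain hyphen and that the output keeps the input's base name, as in the toto.msf example. Only the `ALN` keyword appears on the slide, so the other keywords in `FORMATS` are assumptions, as are the function names.

```python
from pathlib import Path

# Only "ALN" is shown on the slide; the other keywords are assumed.
FORMATS = {"MSF", "TFA", "ALN", "RSF", "ORD"}


def conversion_command(filename, target):
    """Build the argument list for `ordalie <filename> -convert <TARGET>`."""
    target = target.upper()
    if target not in FORMATS:
        raise ValueError(f"unsupported target format: {target}")
    return ["ordalie", str(filename), "-convert", target]


def expected_output(filename, target):
    """Name of the converted file, assuming only the extension changes."""
    return Path(filename).with_suffix("." + target.lower()).name


# Hypothetical usage, on a machine where `setordalie` has been run:
# import subprocess
# subprocess.run(conversion_command("toto.msf", "ALN"), check=True)
```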
