Standardizer Molecular Cosmetics for Chemoinformatics György Pirok Nóra Máte István Cseh...

Post on 26-Mar-2015


Standardizer

Molecular Cosmetics for Chemoinformatics

György Pirok, Nóra Máté, István Cseh, Szilárd Dóránt, Péter Kovács, Szabolcs Csepregi, Ferenc Csizmadia

Why standardize structures?

Canonicalisation: uniformization of structures without changing the chemical content, to recognize duplicates and functional groups (aromatization, mesomers, tautomers, ...)

Beautification: making the structures visually more attractive (dearomatization, cleaning coordinates, wedge orientation, ...)

Modification: conversion of structures by modifying their original content as a preparation step for further chemoinformatics tasks (transformations, removing stereo, removing R-groups, ...)

It is often difficult to categorize the standardization actions.

Canonicalisation

Hydrogens: making hydrogens explicit, making hydrogens implicit

Resonant structures: aromatizing Kekulé rings, converting to canonical mesomer form, transforming to user-defined mesomer form

Tautomers: converting to canonical tautomer form, transforming to user-defined tautomer form

Other: removing user-defined fragments, removing small fragments, expanding stoichiometry, setting the chiral flag

Mesomers

Tautomers: oxo-enol, enamine-imine

Tautomers: pyridone-pyridol

Fragment removal

Specific counterion removal

Solvent removal
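The fragment-removal actions named above can be illustrated with a toy sketch. It is not ChemAxon's implementation: it works on dot-separated SMILES strings, estimates fragment size with a crude atom counter that only understands the organic subset and bracket atoms, and recognizes unwanted counterions or solvents by exact string match where a real tool would match structures. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy illustration only; names and logic are assumptions, not the Standardizer API.
class FragmentRemovalSketch {

    /** Crude atom count for one SMILES fragment (organic subset plus bracket atoms). */
    static int countAtoms(String fragment) {
        int atoms = 0;
        for (int i = 0; i < fragment.length(); i++) {
            char c = fragment.charAt(i);
            if (c == '[') {
                // A bracket atom such as [Na+] counts as one atom.
                atoms++;
                int close = fragment.indexOf(']', i);
                if (close < 0) break; // malformed input: stop counting
                i = close;
            } else if (Character.isUpperCase(c)) {
                atoms++;
                // Skip the second letter of the two-letter symbols Cl and Br.
                if (i + 1 < fragment.length()) {
                    char next = fragment.charAt(i + 1);
                    if ((c == 'C' && next == 'l') || (c == 'B' && next == 'r')) i++;
                }
            } else if ("bcnops".indexOf(c) >= 0) {
                atoms++; // aromatic atom written in lower case
            }
        }
        return atoms;
    }

    /** "Removing small fragments": keep only the largest fragment (first one wins ties). */
    static String keepLargestFragment(String smiles) {
        String best = "";
        int bestCount = -1;
        for (String fragment : smiles.split("\\.")) {
            int count = countAtoms(fragment);
            if (count > bestCount) {
                best = fragment;
                bestCount = count;
            }
        }
        return best;
    }

    /** "Removing user-defined fragments": drop fragments listed by the user. */
    static String removeListed(String smiles, Set<String> unwanted) {
        List<String> kept = new ArrayList<>();
        for (String fragment : smiles.split("\\.")) {
            if (!unwanted.contains(fragment)) kept.add(fragment);
        }
        return String.join(".", kept);
    }

    public static void main(String[] args) {
        System.out.println(keepLargestFragment("CC(=O)[O-].[Na+]"));
        System.out.println(removeListed("c1ccccc1C(=O)O.O.O", Set.of("O")));
    }
}
```

In this sketch, sodium acetate reduces to the acetate fragment, and a benzoic acid dihydrate written as `c1ccccc1C(=O)O.O.O` loses both water fragments when `O` is listed as unwanted.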

Stoichiometry expansion: expanding salt stoichiometry

Stoichiometry expansion: expanding reaction stoichiometry
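As a rough illustration of what expanding salt stoichiometry means, the sketch below turns a coefficient attached to a fragment into that many explicit copies of the fragment. The `n*fragment` notation is invented for this example (it is not a SMILES feature or a Standardizer input format), and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy illustration only; the "2*[Na+]" coefficient notation is an assumption.
class StoichiometryExpansionSketch {

    // A leading integer followed by '*' marks a stoichiometric coefficient.
    private static final Pattern COEFFICIENT = Pattern.compile("^(\\d+)\\*(.+)$");

    /** Replaces every "n*fragment" part by n explicit copies of the fragment. */
    static String expand(String input) {
        List<String> fragments = new ArrayList<>();
        for (String part : input.split("\\.")) {
            Matcher m = COEFFICIENT.matcher(part);
            int copies = 1;
            String fragment = part;
            if (m.matches()) {
                copies = Integer.parseInt(m.group(1));
                fragment = m.group(2);
            }
            for (int i = 0; i < copies; i++) {
                fragments.add(fragment);
            }
        }
        return String.join(".", fragments);
    }

    public static void main(String[] args) {
        // Sodium sulfate written with a coefficient on the cation.
        System.out.println(expand("2*[Na+].[O-]S(=O)(=O)[O-]"));
    }
}
```

With this notation, `2*[Na+].[O-]S(=O)(=O)[O-]` expands to `[Na+].[Na+].[O-]S(=O)(=O)[O-]`; parts without a coefficient pass through unchanged.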

Beautification

Hydrogens: making hydrogens implicit

Resonant structures: converting aromatic rings to Kekulé format

Cleaning: calculating 2D coordinates, reallocating wedge bonds, template-based cleaning, 3D geometry optimization

Groups: contracting/expanding/ungrouping abbreviated and multiple groups

Template-based Cleaning: 2D-coordinate calculation of macrocycles or bridged systems

Template-based Cleaning: aligning search results to the query

Canonicalization During Database Import

[Diagram: the client sends input structures to the server, where Standardizer (JChem Base/Cartridge) applies the canonicalization configuration; the relational database stores the original structures alongside the canonicalized structures.]

Sending Query to the Database

[Diagram: the client sends the query structure to the server, where Standardizer (JChem Base/Cartridge) applies the canonicalization configuration; the canonicalized query is compared to the canonicalized structures in the relational database.]

Displaying Result Structures

[Diagram: the server reads the original structures from the relational database, Standardizer (JChem Base/Cartridge) applies the beautification configuration, and the beautified structures are returned to the client.]
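The import, query, and display flow above can be sketched with an in-memory stand-in for the database. Everything here is an assumption made for illustration: `ToyStructureTable` is hypothetical, and `canonicalKey` is a deliberately naive substitute for a real canonicalization configuration (it only drops two hard-coded counterions and sorts the remaining SMILES fragments). The point is the pattern: the same canonicalization is applied at import time and at query time, while the original structures are what gets returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy in-memory stand-in for a structure table; not JChem Base/Cartridge.
class ToyStructureTable {

    // Hypothetical "configuration": counterion fragments to ignore when comparing.
    private static final Set<String> COUNTERIONS =
            new HashSet<>(Arrays.asList("[Na+]", "[Cl-]"));

    // Canonicalized key -> original structures that share it.
    private final Map<String, List<String>> originalsByKey = new HashMap<>();

    /** Naive canonical form: drop listed counterions, sort the remaining fragments. */
    static String canonicalKey(String smiles) {
        List<String> kept = new ArrayList<>();
        for (String fragment : smiles.split("\\.")) {
            if (!COUNTERIONS.contains(fragment)) kept.add(fragment);
        }
        Collections.sort(kept);
        return String.join(".", kept);
    }

    /** Import: store the original under its canonicalized form. */
    void importStructure(String original) {
        originalsByKey
                .computeIfAbsent(canonicalKey(original), k -> new ArrayList<>())
                .add(original);
    }

    /** Query: canonicalize the query the same way, return matching originals. */
    List<String> search(String query) {
        List<String> hits = originalsByKey.get(canonicalKey(query));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        ToyStructureTable table = new ToyStructureTable();
        table.importStructure("CC(=O)[O-].[Na+]");
        table.importStructure("[Na+].CC(=O)[O-]");
        System.out.println(table.search("CC(=O)[O-]"));
    }
}
```

Both sodium acetate records are found by the bare acetate query because all three strings share one canonical key. A beautification step would then be applied to the returned originals before display, which this sketch leaves out.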

Modification

custom transformations

API and command line interface

// Load the standardization actions from an XML configuration file, then apply them to mol
Standardizer st = new Standardizer(new File("standardize.xml"));
st.standardize(mol);

standardize input.sdf -c config.xml -o output.smiles

Live Demonstration

Applications: Virtual Synthesis

Applications: Structure Databases

How can ChemAxon Help

Free for non-commercial websites

Free for academic teaching and research ("Academic Package")

Free Academic Package to be extended to cover academic networks (campus-wide roll-out)

Acknowledgments

Ferenc Csizmadia, Nóra Máté, István Cseh, Szabó Attila, Szilárd Dóránt, Péter Kovács, Szabolcs Csepregi