Standardizer

Molecular Cosmetics for Chemoinformatics

György Pirok, Nóra Máté, István Cseh, Szilárd Dóránt, Péter Kovács, Szabolcs Csepregi, Ferenc Csizmadia

Why standardize structures?

Canonicalisation: making structures uniform without changing their chemical content, in order to recognize duplicates and functional groups (aromatization, mesomers, tautomers, ...)

Beautification: making structures visually more attractive (dearomatization, cleaning coordinates, wedge orientation, ...)

Modification: converting structures by modifying their original content as a preparation step for further chemoinformatics tasks (transformations, removing stereo, removing R-groups, ...)

It is often difficult to categorize standardization actions.

Canonicalisation

Hydrogens: making hydrogens explicit; making hydrogens implicit

Resonant structures: converting to canonical mesomer form; transforming to user-defined mesomer form; aromatizing Kekulé rings

Tautomers: converting to canonical tautomer form; transforming to user-defined tautomer form

Other: removing user-defined fragments; removing small fragments; expanding stoichiometry; setting the chiral flag

Mesomers

Tautomers: oxo-enol, enamine-imine

Tautomers: pyridone-pyridol

Fragment removal
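
To make the idea of fragment removal concrete, here is a minimal sketch that keeps only the largest component of a dot-disconnected SMILES string. It is not the Standardizer implementation: the class name `LargestFragment`, both method names, and the crude atom-counting heuristic are assumptions made for illustration only (for example, mercury and other symbols starting with H are not counted).

```java
import java.util.Arrays;
import java.util.Comparator;

/** Toy illustration (hypothetical, not ChemAxon code) of removing small fragments. */
class LargestFragment {

    /** Crude size proxy: counts atom symbols in a SMILES fragment, ignoring 'H'. */
    static int heavyAtomCount(String fragment) {
        int count = 0;
        for (int i = 0; i < fragment.length(); i++) {
            char c = fragment.charAt(i);
            // Upper-case letters start an element symbol (C, N, Cl, Na, ...).
            boolean upperAtom = Character.isUpperCase(c) && c != 'H';
            // Lower-case c, n, o, s, p are aromatic atoms unless they are the
            // second letter of a two-letter symbol such as "Sc" or "Co".
            boolean aromaticAtom = "cnosp".indexOf(c) >= 0
                    && (i == 0 || !Character.isUpperCase(fragment.charAt(i - 1)));
            if (upperAtom || aromaticAtom) {
                count++;
            }
        }
        return count;
    }

    /** Splits on '.' and returns the component with the most counted atoms. */
    static String keepLargest(String smiles) {
        return Arrays.stream(smiles.split("\\."))
                .max(Comparator.comparingInt(LargestFragment::heavyAtomCount))
                .orElse(smiles);
    }

    public static void main(String[] args) {
        // Sodium acetate: the acetate anion is kept, the sodium counterion is dropped.
        System.out.println(keepLargest("CC(=O)[O-].[Na+]"));
    }
}
```

A real standardizer works on a parsed molecule graph rather than on the SMILES text, which is why the counting above should be read only as a stand-in for "fragment size".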

Specific counterion removal

Solvent removal

Stoichiometry expansion: expanding salt stoichiometry

Stoichiometry expansion: expanding reaction stoichiometry

Beautification

Hydrogens: making hydrogens implicit

Resonant structures: converting aromatic rings to Kekulé format

Cleaning: calculating 2D coordinates; template-based cleaning; 3D geometry optimization; reallocating wedge bonds

Groups: contracting/expanding/ungrouping abbreviated and multiple groups

Template-based Cleaning: 2D-coordinate calculation of macrocycles or bridged systems

Template-based Cleaning: aligning search results to the query

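Canonicalization During Database Import

[Diagram: the client sends input structures to the server. On the server, Standardizer (with JChem Base / Cartridge) applies the canonicalization configuration, and the relational database holds both the original structures and the canonicalized structures.]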

Sending Query to the Database

[Diagram: the client sends a query structure to the server. On the server, Standardizer (with JChem Base / Cartridge) applies the canonicalization configuration, and the canonicalized query is compared to the canonicalized structures in the relational database.]
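
The import and query workflow can be sketched in a few lines: the same canonicalization is applied to every imported structure and to every query, so that the comparison happens between canonical forms while the originals are kept for display. The sketch below is a toy model, not JChem Base code. The class `CanonicalIndex`, its methods, and the stand-in canonicalizer `sortComponents` (which only sorts dot-separated SMILES components) are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Toy structure table (hypothetical): stores originals, indexed by canonical form. */
class CanonicalIndex {
    private final UnaryOperator<String> canonicalizer;
    private final Map<String, List<String>> byCanonicalForm = new HashMap<>();

    CanonicalIndex(UnaryOperator<String> canonicalizer) {
        this.canonicalizer = canonicalizer;
    }

    /** Import: keep the original structure, filed under its canonical form. */
    void importStructure(String original) {
        byCanonicalForm
                .computeIfAbsent(canonicalizer.apply(original), key -> new ArrayList<>())
                .add(original);
    }

    /** Query: canonicalize with the same configuration, then compare canonical forms. */
    List<String> findDuplicates(String query) {
        return byCanonicalForm.getOrDefault(canonicalizer.apply(query), List.of());
    }

    /** Stand-in canonicalizer (assumption): sorts the dot-separated components. */
    static String sortComponents(String smiles) {
        String[] parts = smiles.split("\\.");
        Arrays.sort(parts);
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        CanonicalIndex index = new CanonicalIndex(CanonicalIndex::sortComponents);
        index.importStructure("[Na+].[Cl-]");
        index.importStructure("CCO");
        // The query lists the ions in the opposite order but still finds the stored salt.
        System.out.println(index.findDuplicates("[Cl-].[Na+]"));
    }
}
```

The point of the design is that the canonicalizer is passed in once and used for both import and query; if the two sides used different configurations, equivalent structures could fail to match.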

Displaying Result Structures

[Diagram: original structures from the relational database are processed on the server by Standardizer (with JChem Base / Cartridge) using the beautification configuration, and the beautified structures are sent to the client.]

Modification

custom transformations

API and command line interface

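Java API:

// Load the standardization configuration from an XML file
Standardizer st = new Standardizer(new File("standardize.xml"));
// Apply the configured actions to the molecule
st.standardize(mol);

Command line:

standardize input.sdf -c config.xml -o output.smiles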

Live Demonstration

Applications: Virtual Synthesis

Applications: Structure Databases

How can ChemAxon help?

Free for non-commercial websites

Free for academic teaching and research ("Academic Package")

Free Academic Package to be extended to cover academic networks – campus-wide roll out

Acknowledgments

Ferenc Csizmadia, Nóra Máté, István Cseh, Szabó Attila, Szilárd Dóránt, Péter Kovács, Szabolcs Csepregi