Standardizer Molecular Cosmetics for Chemoinformatics György Pirok Java Solutions for Cheminformatics


Standardizer

Molecular Cosmetics for Chemoinformatics

György Pirok

Java Solutions for Cheminformatics


Why standardize structures?

Canonicalisation: Uniformization of structures without changing the chemical content, in order to recognize duplicates and functional groups (aromatization, mesomers, tautomers, ...)

Beautification: Making the structures visually more attractive (dearomatization, cleaning coordinates, wedge orientation, ...)

Modification: Conversion of structures by modifying their original content as a preparation step for further chemoinformatics tasks (transformations, removing stereo, removing R-groups, ...)


Canonicalisation

Hydrogens: making hydrogens explicit, making hydrogens implicit

Resonant structures: converting to canonical mesomer form, transforming to user-defined mesomer form, aromatizing Kekulé rings

Tautomers: converting to canonical tautomer form, transforming to user-defined tautomer form

Other: removing user-defined fragments, removing small fragments, expanding stoichiometry, setting the chiral flag


Mesomers


Tautomers: oxo-enol, enamine-imine


Fragment removal


Specific counterion removal


Solvent removal
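The fragment, counterion and solvent removal steps can be illustrated with a small string-level sketch. It assumes structures are given as SMILES, where disconnected fragments are separated by a dot. The class and method names are hypothetical and this is not the Standardizer implementation: the real tool works on molecule objects and matches fragments structurally, whereas this sketch counts atoms with a rough heuristic and compares fragment text literally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of any ChemAxon API.
class FragmentRemovalSketch {

    // Rough atom count for one SMILES fragment: each bracket expression
    // counts as one atom; outside brackets, each uppercase letter or
    // aromatic symbol (b, c, n, o, p, s) counts as one atom.
    static int countAtoms(String fragment) {
        int atoms = 0;
        boolean inBracket = false;
        for (int i = 0; i < fragment.length(); i++) {
            char ch = fragment.charAt(i);
            if (ch == '[') {
                inBracket = true;
                atoms++;
            } else if (ch == ']') {
                inBracket = false;
            } else if (!inBracket
                    && (Character.isUpperCase(ch) || "bcnops".indexOf(ch) >= 0)) {
                atoms++;
            }
        }
        return atoms;
    }

    // "Removing small fragments": keep the fragment with the most atoms.
    // The first fragment wins ties.
    static String keepLargestFragment(String smiles) {
        String best = "";
        int bestCount = -1;
        for (String fragment : smiles.split("\\.")) {
            int count = countAtoms(fragment);
            if (count > bestCount) {
                best = fragment;
                bestCount = count;
            }
        }
        return best;
    }

    // "Removing user-defined fragments": drop fragments listed by the user.
    // Assumption: exact text match, so "O" is removed but "[OH2]" is not.
    static String removeListedFragments(String smiles, Set<String> unwanted) {
        List<String> kept = new ArrayList<>();
        for (String fragment : smiles.split("\\.")) {
            if (!unwanted.contains(fragment)) {
                kept.add(fragment);
            }
        }
        return String.join(".", kept);
    }

    public static void main(String[] args) {
        // Sodium acetate: the sodium counterion is the smaller fragment.
        System.out.println(keepLargestFragment("CC(=O)[O-].[Na+]"));
        // Ethanol with water and sodium listed as unwanted fragments.
        Set<String> unwanted = new HashSet<>(Arrays.asList("O", "[Na+]"));
        System.out.println(removeListedFragments("CCO.O.[Na+]", unwanted));
    }
}
```

Keeping the largest fragment needs no configuration, while a list of unwanted fragments gives control over exactly which counterions or solvents are stripped.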


Beautification

Hydrogens: making hydrogens implicit

Resonant structures: converting aromatic rings to Kekulé format

Cleaning: calculating 2D coordinates, reallocating wedge bonds, template-based cleaning, 3D geometry optimization

Groups: contracting/expanding/ungrouping abbreviated and multiple groups


Template-based Cleaning: 2D-coordinate calculation of macrocycles or bridged systems


Template-based Cleaning: orienting search results to the query

[Figure label: query]


Canonicalization During Database Import

[Diagram labels: client, input structures, server, Standardizer, canonicalization configuration, JChem Base / Cartridge, relational database, original structures, canonicalized structures]


Sending Query to the Database

[Diagram labels: client, query structure, server, Standardizer, canonicalization configuration, canonicalized query, JChem Base / Cartridge, relational database]

The query is compared to the canonicalized structures.
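The import and query slides describe one idea: the same canonicalization is applied to structures at import time and to the query at search time, so that differently drawn forms of one compound meet on the same key. A minimal in-memory sketch of that idea follows. The class `CanonicalStoreSketch` and its methods are hypothetical, the "database" is a plain map, and the canonicalizer used in `main` is a toy rule that rewrites a charge-separated nitro group written literally as `[N+](=O)[O-]`; none of this is the JChem Base or Cartridge API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; a map stands in for the relational database.
class CanonicalStoreSketch {

    // Stands in for the canonicalization configuration.
    private final UnaryOperator<String> canonicalizer;

    // Canonicalized structure -> original structures as imported.
    private final Map<String, List<String>> originalsByCanonical = new HashMap<>();

    CanonicalStoreSketch(UnaryOperator<String> canonicalizer) {
        this.canonicalizer = canonicalizer;
    }

    // Import: keep the original, index it under its canonicalized form.
    void importStructure(String original) {
        String key = canonicalizer.apply(original);
        originalsByCanonical.computeIfAbsent(key, k -> new ArrayList<>()).add(original);
    }

    // Query: canonicalize with the same rule, then compare against the
    // canonicalized structures and return the stored originals.
    List<String> search(String query) {
        String key = canonicalizer.apply(query);
        return originalsByCanonical.getOrDefault(key, Collections.emptyList());
    }

    public static void main(String[] args) {
        // Toy canonicalizer (assumption): literal text replacement of one
        // nitro mesomer form by the other.
        UnaryOperator<String> nitroRule = s -> s.replace("[N+](=O)[O-]", "N(=O)=O");

        CanonicalStoreSketch store = new CanonicalStoreSketch(nitroRule);
        store.importStructure("C[N+](=O)[O-]"); // nitromethane, charge-separated form
        store.importStructure("CCO");           // ethanol

        // The query uses the other mesomer form but still finds the record.
        System.out.println(store.search("CN(=O)=O")); // prints [C[N+](=O)[O-]]
    }
}
```

Returning the stored originals rather than the canonical keys mirrors the slides: canonical forms are used for comparison, while the original structures remain available for display.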


Displaying Result Structures

[Diagram labels: relational database, original structures, server, Standardizer, beautification configuration, JChem Base / Cartridge, beautified structures, client]


Modification

custom transformations


API and command line interface

Standardizer st = new Standardizer(new File("standardize.xml"));
st.standardize(mol);

standardize input.sdf -c config.xml -o output.smiles
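Both interfaces take a configuration and apply it to each structure. One plausible way to picture this is as an ordered list of actions run one after another, which the sketch below models. It is an assumption about the design, not the actual Standardizer class or its XML format: `PipelineSketch` and both actions are hypothetical, and they operate on SMILES text rather than on molecule objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; does not use or mirror the ChemAxon API.
class PipelineSketch {

    // Actions are applied in the order they were added.
    private final List<UnaryOperator<String>> actions = new ArrayList<>();

    PipelineSketch add(UnaryOperator<String> action) {
        actions.add(action);
        return this;
    }

    String standardize(String smiles) {
        String current = smiles;
        for (UnaryOperator<String> action : actions) {
            current = action.apply(current);
        }
        return current;
    }

    // "Removing stereo": strip the SMILES chirality marks (@) and the
    // double-bond direction marks (/ and \).
    static String removeStereo(String smiles) {
        return smiles.replace("@", "").replace("/", "").replace("\\", "");
    }

    // Toy mesomer rule (assumption): rewrite a nitro group only when it is
    // written literally as [N+](=O)[O-].
    static String normalizeNitro(String smiles) {
        return smiles.replace("[N+](=O)[O-]", "N(=O)=O");
    }

    public static void main(String[] args) {
        PipelineSketch st = new PipelineSketch()
                .add(PipelineSketch::removeStereo)
                .add(PipelineSketch::normalizeNitro);

        System.out.println(st.standardize("F/C=C/F"));
        System.out.println(st.standardize("C[C@@H](O)c1ccccc1[N+](=O)[O-]"));
    }
}
```

Modelling each step as a function from structure to structure keeps the steps independent, so a canonicalization pipeline and a beautification pipeline can reuse the same mechanism with different action lists.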


Standardizer GUI


Applications: Virtual Synthesis


Applications: Structure Databases


Acknowledgments

Ferenc Csizmadia, Nóra Máté, István Cseh, Szabó Attila, Alex Allardyce, Szilárd Dóránt, Péter Kovács, Szabolcs Csepregi

Java Solutions for Cheminformatics