András Volford András Strácz Iván Solt - ChemAxon · Information you don’t know it exists...


Page 1: András Volford András Strácz Iván Solt - ChemAxon · Information you don’t know it exists Universal, domain agnostic, simple access to the complete research history.

Project Haystack

András Volford

András Strácz

Iván Solt

Page 2:

Picture about drug development

Hit 2 Lead

Assay / Corporate database / IP

Purchase or synthesize

Building blocks

SAR by catalog

Scaffold hopping

Off-target measurements

Similar assay data

Page 3:

Page 4:

Information you don’t know it exists

Universal, domain-agnostic, simple access to the complete research history.

Page 5:

Goal

Index an arbitrary amount of data and run substructure and similarity searches.

Quickly explore your chemical space.
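As an illustration of the similarity-search half of this goal, here is a minimal sketch. It assumes molecules are already encoded as fingerprints (sets of "on" bit positions) and that similarity is measured with the Tanimoto coefficient, a common choice that the slides do not confirm. The molecule ids, fingerprint bits, and function names are hypothetical. Substructure search needs a chemistry toolkit and is not covered here.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_search(
    query: FrozenSet[int],
    index: Dict[str, FrozenSet[int]],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return (molecule id, score) pairs at or above threshold, best first."""
    scored = [(mol_id, tanimoto(query, fp)) for mol_id, fp in index.items()]
    hits = [(mol_id, score) for mol_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy index: hypothetical ids mapped to made-up fingerprint bits.
index = {
    "mol-1": frozenset({1, 2, 3, 4}),
    "mol-2": frozenset({1, 2, 3, 9}),
    "mol-3": frozenset({7, 8}),
}

hits = similarity_search(frozenset({1, 2, 3, 4}), index, threshold=0.5)
for mol_id, score in hits:
    print(mol_id, round(score, 2))
```

A linear scan like this only suits small sets. At the scale the talk targets, the comparison would have to run inside an indexed database search.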

Page 6:

Current, partial solutions

Some searchable implementations for specific databases.

Page 7:

Information flood

Page 8:

Databases

Your Corporate DB

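[Figure: database sizes shown on the slide; the database names are not recoverable from the transcript: 626 k, 1.7 M, 17.6 M, 64 k, 39.9 M, 28.3 M, 5.8 M, 94.5 M, 18 M, 18.5 M, 75 M or 142 M, 6.5 M or 105 M]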

Page 9:

Database

Relational Database

ChEMBL

SDF file

Page 10:

Database

Relational Database

ChEMBL

SDF file

ChEMBL table columns: id | canmol descriptor | duplicate | molecule | metadata

Page 11:

Database

Relational Database

ChEMBL, eMolecules, BindingDB, PubChem, ZINC

Page 12:

Database

PostgreSQL Cartridge

ChEMBL, eMolecules, BindingDB, PubChem, ZINC

Main table

Molecule <=> TId number

id | canmol descriptor | molecule | reference IDs

Page 13:

Insert DB

PostgreSQL Cartridge

ChEMBL, eMolecules, BindingDB, PubChem, ZINC

Main table

Molecule <=> TId number

1. Load into a new table; mark duplicates based on InChIKey.

2. Import to Main table. For non-duplicate structures: if a new entry is created, insert the molecule; else update the TId numbers.
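The two import steps above can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for the PostgreSQL cartridge, the "TId numbers" are read as per-source reference IDs stored in a comma-separated column, and the table names, column names, the `import_source` function, and the placeholder InChIKeys are all hypothetical.

```python
import sqlite3


def import_source(conn: sqlite3.Connection, source: str, records) -> None:
    """Import (source_id, inchikey, molecule) records from one source database."""
    cur = conn.cursor()

    # Step 1: load into a new staging table, then mark duplicates by InChIKey
    # (every row after the first with the same key is flagged).
    cur.execute("DROP TABLE IF EXISTS staging")
    cur.execute(
        "CREATE TABLE staging (source_id TEXT, inchikey TEXT, "
        "molecule TEXT, duplicate INTEGER DEFAULT 0)"
    )
    cur.executemany(
        "INSERT INTO staging (source_id, inchikey, molecule) VALUES (?, ?, ?)",
        records,
    )
    cur.execute(
        "UPDATE staging SET duplicate = 1 WHERE rowid NOT IN "
        "(SELECT MIN(rowid) FROM staging GROUP BY inchikey)"
    )

    # Step 2: import only the non-duplicate structures into the main table.
    rows = cur.execute(
        "SELECT source_id, inchikey, molecule FROM staging WHERE duplicate = 0"
    ).fetchall()
    for source_id, inchikey, molecule in rows:
        ref = f"{source}:{source_id}"
        existing = cur.execute(
            "SELECT id, ref_ids FROM main WHERE inchikey = ?", (inchikey,)
        ).fetchone()
        if existing is None:
            # New structure: insert the molecule with its first reference ID.
            cur.execute(
                "INSERT INTO main (inchikey, molecule, ref_ids) VALUES (?, ?, ?)",
                (inchikey, molecule, ref),
            )
        else:
            # Known structure: only extend its list of reference IDs.
            main_id, ref_ids = existing
            if ref not in ref_ids.split(","):
                cur.execute(
                    "UPDATE main SET ref_ids = ? WHERE id = ?",
                    (ref_ids + "," + ref, main_id),
                )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main (id INTEGER PRIMARY KEY, inchikey TEXT UNIQUE, "
    "molecule TEXT, ref_ids TEXT)"
)
# Placeholder keys, not real InChIKeys.
import_source(conn, "chembl", [("C1", "KEY-A", "CCO"), ("C2", "KEY-A", "OCC"),
                               ("C3", "KEY-B", "c1ccccc1")])
import_source(conn, "zinc", [("Z9", "KEY-A", "CCO")])
main_rows = conn.execute("SELECT inchikey, ref_ids FROM main ORDER BY id").fetchall()
print(main_rows)
```

In this sketch a duplicate inside a single source (`C2` here) is skipped entirely. Whether the real pipeline also records the reference IDs of such rows is not stated on the slide.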

Page 14:

Technical

PostgreSQL Cartridge

ChEMBL, eMolecules, BindingDB, PubChem

Main table

Molecule <=> TId number

Search

etc…

Query

Molecule + additional data
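The query flow on this slide (search the main table, then return each molecule with its additional data) might look like the sketch below. It assumes each main-table hit carries reference IDs that point back into the source databases. The in-memory dictionaries, their contents, and the `enrich` function are hypothetical stand-ins for the real tables.

```python
from typing import Any, Dict, List

# Hypothetical stand-in for the main table: one row per unique structure.
MAIN_TABLE: Dict[int, Dict[str, Any]] = {
    1: {"molecule": "CCO", "ref_ids": ["chembl:C1", "zinc:Z9"]},
    2: {"molecule": "c1ccccc1", "ref_ids": ["chembl:C3"]},
}

# Hypothetical stand-in for per-source records, keyed by reference ID.
SOURCE_METADATA: Dict[str, Dict[str, str]] = {
    "chembl:C1": {"note": "placeholder assay record"},
    "zinc:Z9": {"note": "placeholder vendor record"},
    "chembl:C3": {"note": "placeholder assay record"},
}


def enrich(hit_ids: List[int]) -> List[Dict[str, Any]]:
    """Attach the source records behind each main-table hit."""
    results = []
    for main_id in hit_ids:
        row = MAIN_TABLE[main_id]
        extra = {ref: SOURCE_METADATA.get(ref, {}) for ref in row["ref_ids"]}
        results.append(
            {"id": main_id, "molecule": row["molecule"], "additional_data": extra}
        )
    return results


# Pretend the structure search returned main-table id 1.
results = enrich([1])
print(results)
```

Searching once over the deduplicated main table and resolving references afterwards keeps the expensive structure search independent of how many source databases are loaded.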

Page 15:

Page 16:

Page 17:

See it live at Booth #543