InstantJChem: a flexible chemical database system

G. Marcou, D. Horvath
Laboratoire d’infochimie, Université de Strasbourg, 1, rue Blaise Pascal, 67000 Strasbourg

Introduction

The goal is to present InstantJChem for the storage and manipulation of chemical information.

1. General presentation
2. Database search
3. Creation of a database from scratch

What is a database?

• A database stores data on a precise subject in an ordered form.
• A relational database stores information in tables that reference one another.
• A relational database management system (RDBMS) is software that manages relational databases.

InstantJChem is not a database and is not an RDBMS.
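To make the idea of tables that reference one another concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (compound, assay), the column names, and the rows are hypothetical and chosen only for illustration; they are not taken from InstantJChem.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks references only when asked

# Hypothetical tables: each assay row references exactly one compound row.
conn.executescript("""
CREATE TABLE compound (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    smiles TEXT NOT NULL
);
CREATE TABLE assay (
    id          INTEGER PRIMARY KEY,
    compound_id INTEGER NOT NULL REFERENCES compound(id),
    solubility  TEXT
);
""")

conn.execute("INSERT INTO compound VALUES (1, 'benzene', 'c1ccccc1')")
conn.execute("INSERT INTO compound VALUES (2, 'pyridine', 'c1ccncc1')")
conn.execute("INSERT INTO assay VALUES (1, 2, 'Good')")

# The reference lets the RDBMS combine both tables in one query.
rows = conn.execute("""
    SELECT compound.name, assay.solubility
    FROM assay JOIN compound ON compound.id = assay.compound_id
""").fetchall()
print(rows)

# A row that points at a compound that does not exist is rejected.
rejected = False
try:
    conn.execute("INSERT INTO assay VALUES (2, 99, 'Poor')")
except sqlite3.IntegrityError:
    rejected = True
print("dangling reference rejected:", rejected)
```

The join is the RDBMS's job; a tool such as InstantJChem sits on top of it and presents the result to the user.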

What is InstantJChem?

InstantJChem is a friendly interface between an RDBMS, chemical information, and the user.

[Diagram: InstantJChem connecting the User, the RDBMS, and Chemical Information]

Key concepts of InstantJChem

• Projects
• Schema
• Databases and Tables
• Entities
• Data Trees
• Views

Exercise 1

Create a new project named IJCExercises…

Key concept: Project

A project contains resources and connections to one or more databases.

Exercise 1

…and import the file SC100.SDF into it…
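For readers unfamiliar with the SDF format that is imported here, the following is a minimal sketch of how such a file is laid out: each record holds a molfile block, then data fields introduced by a `> <FIELD>` header line, and ends with a `$$$$` line. The parser, the function names, and the sample record are illustrative assumptions; this is not how InstantJChem reads the file, and the sample is hand-written rather than taken from SC100.SDF.

```python
def parse_record(lines):
    """Split one SDF record into its molfile block and its data fields."""
    molblock, fields = [], {}
    current = None      # name of the data field being read, if any
    in_data = False     # becomes True at the first "> <FIELD>" header
    for line in lines:
        if line.startswith(">"):
            in_data = True
            start, end = line.find("<"), line.rfind(">")
            current = line[start + 1:end] if 0 <= start < end else None
            if current is not None:
                fields[current] = []
        elif in_data:
            if line.strip() == "":
                current = None          # a blank line closes the field
            elif current is not None:
                fields[current].append(line)
        else:
            molblock.append(line)
    return {
        "molblock": "\n".join(molblock),
        "fields": {name: "\n".join(value) for name, value in fields.items()},
    }


def parse_sdf(text):
    """Return one dict per record; lines after the last $$$$ are ignored."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(parse_record(current))
            current = []
        else:
            current.append(line)
    return records


# Hand-written sample record (methane), not taken from SC100.SDF.
SAMPLE = "\n".join([
    "methane",
    "  hand-written example",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "M  END",
    "> <NAME>",
    "methane",
    "",
    "> <ID>",
    "1",
    "",
    "$$$$",
    "",
])

records = parse_sdf(SAMPLE)
print(len(records), records[0]["fields"])
```

On import, each data field name typically becomes a column candidate and the molfile block becomes the structure.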

Key concept: Schema

A schema (database) contains the connection to a database and special tables (JChemProperties).

Key concept: Database and Tables

The database and its tables are managed by the RDBMS. The tables are what actually store the information.

What can be stored

Type             Description

Standard table
Integer          Long integer: 2^32 = 4294967296
Text             User can specify widths of text fields as large as needed.
Real             Double-precision real number
Date             Stores dates.
Boolean          Value is True or False
List (Standard)  Stores a list of database items

JChem table
Chemical terms   A list of functions evaluated on chemical structures: logD, pKa, tautomers, ...
Structure        Chemical structure, automatically created with a JChem table

Key concept: Entities

An entity is a representation of data. It is a single, uniform interface to conceptually different types of tables (Standard, Chemical, SQL, Extractions, etc.).

Key concept: Data Trees

A data tree is a collection of entities and views. It organizes information as a hierarchy (parent-child relationships between entities).

Exercise 1

…Customize a browser for it.

Key concept: Views

A view is an interface to data. For simple data, a spreadsheet view is appropriate. For complex relational data, a form is necessary.
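The key concepts nest inside one another, which the following sketch tries to summarize with plain Python dataclasses. All class names, attribute names, and the example values other than IJCExercises and SC100 are assumptions made for illustration; this is a mental model of the slides, not InstantJChem's API or internal structure.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A representation of data; children express parent-child relationships."""
    name: str
    children: list["Entity"] = field(default_factory=list)


@dataclass
class View:
    """An interface to data: a spreadsheet-like grid or a form."""
    name: str
    kind: str  # "grid" or "form" (labels chosen for this sketch)


@dataclass
class DataTree:
    """A collection of entities (as a hierarchy) and views."""
    name: str
    root: Entity
    views: list[View] = field(default_factory=list)


@dataclass
class Schema:
    """Holds the database connection; here reduced to its data trees."""
    name: str
    data_trees: list[DataTree] = field(default_factory=list)


@dataclass
class Project:
    """Contains connections to one or more databases (schemas)."""
    name: str
    schemas: list[Schema] = field(default_factory=list)


def entity_names(entity):
    """Walk the hierarchy depth-first, parent before children."""
    yield entity.name
    for child in entity.children:
        yield from entity_names(child)


# Hypothetical layout: the "Assays" child entity and "localdb" are invented.
root = Entity("SC100", children=[Entity("Assays")])
tree = DataTree("SC100", root, views=[View("Grid view", "grid"), View("Form 1", "form")])
project = Project("IJCExercises", schemas=[Schema("localdb", data_trees=[tree])])

print(list(entity_names(root)))
```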

Exercise 2

In the SC100 database, search for fluorobenzene-containing and pyridine-containing molecules. Use Substructure or Similarity search.

Substructure search: 20 hits
Similarity search: 0 hits

Substructure search: 14 hits
Similarity search: 0 hits

Similarity search uses Chemical Hashed Fingerprints defined at database creation.

Chemical Hashed Fingerprints (CHF)

• Pattern Length: number of bonds of a pattern

• Fingerprint Length: total number of bits to store the fingerprint

• Bits per pattern: number of bits each pattern switches on

Efficient annotation to accelerate structure search

www.chemaxon.com
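The three parameters above can be illustrated with a toy hashed fingerprint. This is a minimal sketch under stated assumptions: the patterns are hand-written strings standing in for the paths a real toolkit would enumerate from the structure, SHA-256 is an arbitrary choice of hash, and the function names are invented here. It is not ChemAxon's algorithm.

```python
import hashlib


def pattern_bits(pattern, fingerprint_length, bits_per_pattern):
    """Bit positions switched on by one pattern (at most bits_per_pattern)."""
    positions = set()
    for i in range(bits_per_pattern):
        digest = hashlib.sha256(f"{i}:{pattern}".encode("utf-8")).digest()
        positions.add(int.from_bytes(digest[:8], "big") % fingerprint_length)
    return positions


def fingerprint(patterns, fingerprint_length=512, bits_per_pattern=2):
    """Fold all patterns of a molecule into one integer used as a bit string."""
    bits = 0
    for pattern in patterns:
        for position in pattern_bits(pattern, fingerprint_length, bits_per_pattern):
            bits |= 1 << position
    return bits


def may_contain(target_fp, query_fp):
    """Substructure screen: every query bit must also be set in the target."""
    return target_fp & query_fp == query_fp


def tanimoto(fp_a, fp_b):
    """Shared bits divided by bits set in either fingerprint."""
    union = bin(fp_a | fp_b).count("1")
    return bin(fp_a & fp_b).count("1") / union if union else 1.0


# Hypothetical pattern lists: a small query and a larger molecule containing it.
query_patterns = ["C", "F", "C:C", "C-F", "C:C-F"]
target_patterns = query_patterns + ["N", "C-N", "C:C-N", "C=O"]

query_fp = fingerprint(query_patterns)
target_fp = fingerprint(target_patterns)

print(may_contain(target_fp, query_fp))
print(round(tanimoto(query_fp, target_fp), 2))
```

The screen only rules molecules out: a target that fails it cannot contain the query, while one that passes still needs an exact atom-by-atom check. Under this sketch a small query can pass the screen against a larger molecule while scoring a modest Tanimoto similarity, which is consistent with the hit counts reported for Exercise 2.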

Exercise 3

Combine molecules 25 and 89 into a pseudo-molecule to perform a superstructure query.

Exercise 4

Use compound 46 as a Full query and as a Full fragment query to search the database. Repeat after removing the bromide from the query.

Structure Searches

www.chemaxon.com

Exercise 5

Search for benzene-containing compounds whose name contains “pyrimidin” and that are annotated as “Good” for aqueous solubility.

Exercise 6

Search for compounds with at least one aromatic ring containing at least one nitrogen atom.

Exercise 7

Search for compounds with MolWeight > 200 that do not contain a benzene ring.

Exercise 8

Search for compounds with MolWeight > 200, then for compounds without a benzene ring, and search for the union of the two hit lists.
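The difference between Exercises 7 and 8 is the difference between intersecting and uniting two hit lists. A minimal sketch with Python sets follows; the compound IDs are invented for illustration and are not results from the SC100 database.

```python
# Hypothetical hit lists, identified by compound ID.
heavy = {2, 5, 7, 9}          # compounds with MolWeight > 200
no_benzene = {1, 2, 9, 12}    # compounds without a benzene ring

# Exercise 7: both conditions in one query, i.e. the intersection.
both = heavy & no_benzene

# Exercise 8: two separate queries, then the union of their hit lists.
either = heavy | no_benzene

print(sorted(both))    # [2, 9]
print(sorted(either))  # [1, 2, 5, 7, 9, 12]
```

The union is always at least as large as either list, while the intersection is never larger than the smaller one.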

Exercise 9

Search for compounds possessing more than 4 microspecies at pH = 4.0…

Exercise 9

…Export your hit list.

Exercise 10

Import the file ISICCRsm.RDF into your project…

Exercise 10

…Create a browser for this database.

Exercise 11

Search for reactions with an imidazole ring in their reactants, then in their products.

Exercise 12

Add a new data tree and structure entity named AlkanBoilingPoint to your schema…

Exercise 12

…and add a floating-point field named BoilingPoint.

Exercise 13

Add the following data to the AlkanBoilingPoint entity.

Exercise 14

Add a new date field named Date to the AlkanBoilingPoint entity and fill it.
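Exercises 12 to 14 are done through the InstantJChem interface, but it can help to picture roughly what such steps amount to at the table level. The sketch below is an analogy using Python's sqlite3 module, not what InstantJChem actually executes: the `id` and `structure` columns, the SMILES strings, the date, and the three rows are assumptions. The boiling points are approximate literature values in °C and are not the data shown on the exercise slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Exercise 12 analogue: a table with a structure column and a floating-point field.
conn.execute("""
    CREATE TABLE AlkanBoilingPoint (
        id           INTEGER PRIMARY KEY,
        structure    TEXT NOT NULL,   -- SMILES string standing in for a structure
        BoilingPoint REAL
    )
""")

# Exercise 13 analogue: add rows (illustrative values, see the note above).
conn.executemany(
    "INSERT INTO AlkanBoilingPoint (structure, BoilingPoint) VALUES (?, ?)",
    [("C", -161.5), ("CC", -88.6), ("CCC", -42.1)],
)

# Exercise 14 analogue: add a date field to the existing table and fill it.
conn.execute('ALTER TABLE AlkanBoilingPoint ADD COLUMN "Date" TEXT')
conn.execute('UPDATE AlkanBoilingPoint SET "Date" = ?', ("2024-01-01",))

rows = conn.execute(
    'SELECT structure, BoilingPoint, "Date" FROM AlkanBoilingPoint ORDER BY BoilingPoint'
).fetchall()
for row in rows:
    print(row)
```

Adding a field to an entity that already holds data leaves existing rows empty in the new column until they are filled, which is why the exercise asks for both steps.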

Exercise 15

Add a calculated LogP value to the AlkanBoilingPoint entity using a Chemical terms field.

Summary

• Create a project and schema
• Import data
• Search by substructure, superstructure, similarity, and exact match
• Search by keyword
• Combine queries and result lists
• Export query results
• Create a new database

Conclusion

• InstantJChem is a chemoinformatics layer above a standard DBMS.
• It provides many more chemoinformatics services (database overlap, QSPR modeling, plots, enumeration, scripting).

[Diagram: InstantJChem layered above the DBMS]