Page 1:

“Good annotation practice” for chemical data: ChEBI experience

Kirill Degtyarenko, European Patent Office

Page 2:

Good anNODation practice

• good Naming practice: how to give most appropriate names
• good Ontology practice: how to link the entity of interest by defined logical relationships to other entities
• good Drawing practice: how to draw unambiguous 2-D diagrams

Page 3:

Good Naming Practice
or
How to Give Most Appropriate Names

Page 4:

Systematic Name (IUPAC)

2-{[3-(trifluoromethyl)phenyl]amino}benzoic acid

[Structure diagram of the compound, with positions 1 to 6 numbered on each benzene ring]

Page 5:

Common Name

• flufenamic acid (INN English)
• acide flufénamique (INN French)
• ácido flufenámico (INN Spanish)
• acidum flufenamicum (INN Latin)
• Flufenaminsäure (German)

[Structure diagram: flufenamic acid]

Page 6:

The Unpronounceables

CHEBI:48935 (E)-roxithromycin

IUPAC name: (3R,4S,5S,6R,7R,9R,10E,11S,12R,13S,14R)-4-(2,6-dideoxy-3-C-methyl-3-O-methyl-α-L-ribo-hexopyranosyloxy)-14-ethyl-7,12,13-trihydroxy-10-{[(2-methoxyethoxy)methoxy]imino}-6-[3,4,6-trideoxy-3-(dimethylamino)-β-D-xylo-hexopyranosyloxy]-3,5,7,9,11,13-hexamethyloxacyclotetradecan-2-one

[Structure diagram: (E)-roxithromycin]

Page 7:

What is the common name of roxithromycin?

• CHEBI:32109 (Z)-roxithromycin
• CHEBI:48935 (E)-roxithromycin; INN: roxithromycin

[Structure diagrams: (Z)-roxithromycin and (E)-roxithromycin]

Page 8:

• CHEBI:48844 roxithromycin
• (E)-roxithromycin
• (Z)-roxithromycin

[Structure diagrams: roxithromycin, (E)-roxithromycin and (Z)-roxithromycin]

Page 9:

What is thiamine?

• CHEBI:18385 thiamine(1+), aka thiamine
• CHEBI:33283 thiamine(1+) chloride; INN: thiamine
• CHEBI:49105 thiamine(2+) dichloride, aka thiamine chloride hydrochloride, aka thiamine hydrochloride

[Structure diagrams of the three entities]
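The naming problem on the roxithromycin and thiamine slides can be made concrete with a small name index. The sketch below is illustrative only: the identifiers and names are copied from the slides, while the tuple layout and the helper names `build_name_index` and `ambiguous_names` are assumptions and do not reflect how ChEBI stores its data.

```python
from collections import defaultdict

# (ChEBI ID, primary name, other names) as shown on the slides.
# The layout of these records is hypothetical.
ENTRIES = [
    ("CHEBI:48844", "roxithromycin", []),
    ("CHEBI:48935", "(E)-roxithromycin", ["roxithromycin"]),  # INN
    ("CHEBI:32109", "(Z)-roxithromycin", []),
    ("CHEBI:18385", "thiamine(1+)", ["thiamine"]),            # "aka"
    ("CHEBI:33283", "thiamine(1+) chloride", ["thiamine"]),   # INN
    ("CHEBI:49105", "thiamine(2+) dichloride",
     ["thiamine chloride hydrochloride", "thiamine hydrochloride"]),
]


def build_name_index(entries):
    """Map every lower-cased name or synonym to the set of IDs that use it."""
    index = defaultdict(set)
    for chebi_id, name, synonyms in entries:
        for label in [name, *synonyms]:
            index[label.lower()].add(chebi_id)
    return index


def ambiguous_names(index):
    """Return only the names that point to more than one entity."""
    return {name: sorted(ids) for name, ids in index.items() if len(ids) > 1}


NAME_INDEX = build_name_index(ENTRIES)
AMBIGUOUS = ambiguous_names(NAME_INDEX)
```

With the data above, `AMBIGUOUS` contains exactly two keys, `"roxithromycin"` and `"thiamine"`, each pointing to two different entities. A common name alone therefore does not identify a single entry.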

Page 10:

Plurals and singulars

• Problem is not unique to ChEBI
• Cf. phenol vs phenols; phenol metabolism vs phenols metabolism
• Bad solution: article use (“a phenol metabolism”?)
• Solution: prepositional phrases (“metabolism of phenols”)

Page 11:

Good Drawing Practice
or
How to Draw Unambiguous 2-D Diagrams

Page 12:

Linear forms of monosaccharides

[Structure diagrams: alternative depictions of a linear monosaccharide]

Page 13:

Pyranose forms of monosaccharides

[Structure diagrams: alternative depictions of a pyranose ring]

Page 14:

Fused systems

(R)-camphor

[Structure diagrams: an ambiguous and an unambiguous depiction of (R)-camphor]

Page 15:

Square planar geometry

[Structure diagrams: cisplatin and transplatin]

InChI=1/2ClH.2H3N.Pt/h2*1H;2*1H3;/q;;;;+2/p-2

SMILES: [H][N]([H])([H])[Pt](Cl)(Cl)[N]([H])([H])[H]
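The slide gives a single InChI and a single SMILES for two different compounds, presumably to show that these line notations do not capture the cis/trans arrangement around platinum. As a rough check, the sketch below splits an InChI into its layers and looks for stereochemical layers, which in InChI start with `b`, `t`, `m` or `s`. The function name `stereo_layers` is an assumption made for illustration, and the L-alanine string is included only as a contrasting example that is not part of the talk.

```python
# Layer prefixes that carry stereochemistry in an InChI string:
# b = double-bond, t = tetrahedral, m and s = further stereo descriptors.
STEREO_PREFIXES = ("b", "t", "m", "s")


def stereo_layers(inchi):
    """Return the stereochemical layers found in an InChI string."""
    parts = inchi.split("/")
    # parts[0] is the version ("InChI=1"), parts[1] is the formula layer;
    # every later layer begins with a one-letter prefix.
    return [layer for layer in parts[2:] if layer[:1] in STEREO_PREFIXES]


# InChI from the slide, shared by cisplatin and transplatin.
PLATIN_INCHI = "InChI=1/2ClH.2H3N.Pt/h2*1H;2*1H3;/q;;;;+2/p-2"

# Contrasting example (not from the talk): standard InChI of L-alanine.
L_ALANINE_INCHI = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"

PLATIN_STEREO = stereo_layers(PLATIN_INCHI)
ALANINE_STEREO = stereo_layers(L_ALANINE_INCHI)
```

`PLATIN_STEREO` is an empty list, so nothing in this string separates the two isomers, whereas the contrasting string yields three stereo layers.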

Page 16:

Uncertainty and ambiguity in chemistry

• Compositional uncertainty
• Positional uncertainty
• Configurational uncertainty
• Conformational uncertainty

Page 17:

Compositional uncertainty

Examples

• an alkali metal cation
• vanadate(V) anion
• [2H]ethanol

Page 18:

Positional uncertainty

Examples

• L-bromohistidine residue
• pteroic acid (several tautomers)

Page 19:

Configurational uncertainty

Examples

• androstane
• rel-(2R,3R)-2-amino-3-methylpentanoic acid
• tetradec-11-enoic acid

Page 20:

Conformational uncertainty

Examples

• cyclohexane: chair, boat, twist
• protein secondary structure: α, β, …

Page 21:

Good Ontology Practice
or
How to Link the Entity of Interest by Defined Logical Relationships to Other Entities

Page 22:

ChEBI ontology

• Molecular structure ontology
• Subatomic particle ontology
• Biological role ontology
• Application ontology

Page 23:

Relationships in ChEBI

• ∆ Is A (generic)
• ⋄ Is Part Of (generic)
• ♯ Is Conjugate Acid Of (specific)
• ♭ Is Conjugate Base Of (specific)
• Is Enantiomer Of (specific)
• Is Tautomer Of (specific)
• ℛ Is Substituent Group From (specific)
• ℋ Has Parent Hydride (specific)
• ℱ Has Functional Parent (specific)

Page 24:

Is A relationship

L-cysteine is a (∆) cysteine

[Structure diagrams: L-cysteine and cysteine]

Page 25:

Is Part Of

• L-cysteinium is part of L-cysteine hydrochloride
• L-cysteine hydrochloride has part L-cysteinium

[Structure diagrams: L-cysteinium and L-cysteine hydrochloride]

Page 26:

Is Enantiomer Of

L-cysteine is enantiomer of D-cysteine

[Structure diagrams: L-cysteine and D-cysteine, each linked by ∆ (is a) to a third, unlabelled structure]
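One way to read this slide is that "is enantiomer of" holds in both directions, while "is a" points from each enantiomer to a shared parent. The sketch below models that reading under two assumptions: the unlabelled parent structure is cysteine, as on the "Is A relationship" slide, and the names `IS_A`, `ENANTIOMER_PAIRS`, `enantiomer_of` and `share_parent` are hypothetical helpers rather than anything from ChEBI.

```python
# "is a" links: child -> parent (parent assumed to be cysteine).
IS_A = {"L-cysteine": "cysteine", "D-cysteine": "cysteine"}

# A symmetric relationship is stored once, as an unordered pair.
ENANTIOMER_PAIRS = {frozenset({"L-cysteine", "D-cysteine"})}


def enantiomer_of(entity):
    """Return the enantiomer of an entity, or None if none is recorded."""
    for pair in ENANTIOMER_PAIRS:
        if entity in pair:
            (other,) = pair - {entity}
            return other
    return None


def share_parent(first, second):
    """True if both entities have the same recorded "is a" parent."""
    parent = IS_A.get(first)
    return parent is not None and parent == IS_A.get(second)
```

Storing the pair as a `frozenset` means the relationship can be queried from either side without recording it twice.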

Page 27:

Is Tautomer Of

[Structure diagrams: 1H-pyrrole, 2H-pyrrole and 3H-pyrrole]

Page 28:

Is Conjugate Acid Of

L-cysteine is conjugate acid of L-cysteinate(1–)

[Structure diagrams: L-cysteinium, L-cysteine, L-cysteinate(1–) and L-cysteinate(2–), linked by ♯ (is conjugate acid of)]

Page 29:

Is Conjugate Base Of

[Structure diagrams: L-cysteinium, L-cysteine, L-cysteinate(1–) and L-cysteinate(2–), linked by ♭ (is conjugate base of)]

Page 30:

Acid/base relationships

[Structure diagrams: L-cysteinium, L-cysteine, L-cysteinate(1–) and L-cysteinate(2–), linked by both ♯ (is conjugate acid of) and ♭ (is conjugate base of)]
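The two acid/base relationships are inverses of each other, so one can be derived from the other. The following sketch records only "is conjugate acid of" pairs and generates the matching "is conjugate base of" statements. Only the L-cysteine / L-cysteinate(1–) pair is spelled out in the slide text; the other two pairs follow from the protonation states of the four entities shown. The function names are hypothetical.

```python
# (acid, base) pairs: the acid "is conjugate acid of" the base.
CONJUGATE_ACID_OF = [
    ("L-cysteinium", "L-cysteine"),
    ("L-cysteine", "L-cysteinate(1–)"),
    ("L-cysteinate(1–)", "L-cysteinate(2–)"),
]


def with_inverses(acid_pairs):
    """Expand (acid, base) pairs into triples in both directions."""
    triples = []
    for acid, base in acid_pairs:
        triples.append((acid, "is conjugate acid of", base))
        triples.append((base, "is conjugate base of", acid))
    return triples


def protonation_series(acid_pairs):
    """Order entities from most to least protonated.

    Assumes the pairs form one unbranched chain, as on the slide.
    """
    bases = {base for _, base in acid_pairs}
    next_base = dict(acid_pairs)
    # The starting point is the only acid that is nobody's conjugate base.
    start = next(acid for acid, _ in acid_pairs if acid not in bases)
    series = [start]
    while series[-1] in next_base:
        series.append(next_base[series[-1]])
    return series


TRIPLES = with_inverses(CONJUGATE_ACID_OF)
SERIES = protonation_series(CONJUGATE_ACID_OF)
```

Keeping one direction as the source of truth avoids the two relationships drifting out of step when an entry is edited.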

Page 31:

Is Substituent Group From

[Structure diagrams: L-cysteine and the groups L-cysteinyl, L-cysteino and L-cysteine residue, with attachment points marked *]

Page 32:

Has Parent Hydride

• salutaridinol has parent hydride (ℋ) morphinan
• morphinan is parent hydride of salutaridinol

[Structure diagrams: salutaridinol and morphinan]

Page 33:

Has Functional Parent

• 7-O-acetylsalutaridinol has functional parent salutaridinol
• salutaridinol is functional parent of 7-O-acetylsalutaridinol

[Structure diagrams: 7-O-acetylsalutaridinol and salutaridinol]
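The last two slides chain together: 7-O-acetylsalutaridinol has functional parent salutaridinol, which in turn has parent hydride morphinan. A minimal sketch of walking such a chain is given below. The triples come from the slides, while `parent_path`, `inverse` and the assumption that each entity has at most one outgoing link are simplifications made for illustration.

```python
RELATIONS = [
    ("7-O-acetylsalutaridinol", "has functional parent", "salutaridinol"),
    ("salutaridinol", "has parent hydride", "morphinan"),
]

# Inverse labels as shown on the slides.
INVERSE_LABEL = {
    "has functional parent": "is functional parent of",
    "has parent hydride": "is parent hydride of",
}


def parent_path(entity, relations):
    """Follow parent links upward from an entity.

    Simplification: each entity has at most one outgoing link.
    """
    outgoing = {subject: (label, target) for subject, label, target in relations}
    path = []
    while entity in outgoing:
        label, parent = outgoing[entity]
        path.append((entity, label, parent))
        entity = parent
    return path


def inverse(triple):
    """Turn 'X has ... parent Y' into 'Y is ... parent of X'."""
    subject, label, target = triple
    return (target, INVERSE_LABEL[label], subject)


PATH = parent_path("7-O-acetylsalutaridinol", RELATIONS)
```

Here `PATH` holds both triples in order, ending at morphinan, and `inverse(PATH[1])` gives the "is parent hydride of" statement from the slide.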

Page 34:

Live annotation demo

Page 35:

Going to SourceForge…

Page 36:

Reading a request…

Page 37:

Going to curator tool…

Page 38:

Search result…

Page 39:

Adding new entry…

Page 40:

Editing new entry…

Page 41:

Success!

Page 42:

Let’s draw

Page 43:

Page 44:

Success again!

Page 45:

Using ACD/Name (1)

Page 46:

Using ACD/Name (2)

Page 47:

Adding IUPAC name (1)

Page 48:

Adding IUPAC name (2)

Page 49:

Classifying (1)

Page 50:

Classifying (2)

Page 51:

Classifying (3)

Page 52:

Classifying (4)

Page 53:

The last touch (1)

Page 54:

The last touch (2)

Page 55:

Responding to request…

Page 56:

A job well done…

Page 57:

The team

• Rafael Alcántara
• Michael Ashburner
• Volker Ast *
• Michael Darsow *
• Paula de Matos
• Marcus Ennis
• Janna Hastings
• Alan McNaught *
• Chris Steinbeck
• Martin Zbinden *

Page 58:

Thanks

• Kristian Axelsen
• Hélène Courrier
• Anne Morgat
• Ian Unwin
• Our faithful Users
• EU: funding