HELM Notation Overview
http://pistoiaalliance.org @PistoiaAlliance
Pistoia Alliance HELM Project - What About the Big Guys?
The emerging HELM standard for macromolecular representation
Domain Lead – Sergio Rotstein
Business Technology, Pfizer
What is a “Biomolecule”?
Peptides
Therapeutic Proteins
ADCs
Antibodies
Vaccines
ASOs
siRNAs
For our purposes, anything that is not a small molecule is a biomolecule
Goal
• Eliminate biomolecule penalty
• Make these entities first-class citizens of the Informatics tool portfolio
So what’s the problem?
[Figure labels: Small Molecules (Small Molecule Tools); Sequences (Sequence-Based Tools); Biomolecules (GAP)]
“Fit-for-Purpose” Structure Representation
We need to enable the representation, manipulation and visualization of each molecule type in a way that is appropriate for its size and complexity
Fit for Purpose: “Monomer” Level
• While you could draw out an oligonucleotide like this:
• The representation is likely more intuitive / practical:
Fit for Purpose: Sequence Level
• But even the monomer-level representation would not scale well to proteins with hundreds of amino acids. Larger molecules require a more sequence-oriented representation:
Fit for Purpose: Component Level
• For multi-component structures such as antibody drug conjugates, component-level representations are required to enable each component to be dealt with separately.
[Figure labels: “Collapsed” Antibody (Ab); Expanded Drug]
Hierarchical Editing Language for Macromolecules
– Hierarchical – Amenable to the various “levels”
  • Complex Polymer ⇒ Simple Polymer ⇒ Monomer ⇒ Atom
– Extensible
  • Allowing addition of new biopolymer types
– (Reasonably) comprehensive
  • e.g. Allowing representation of oligonucleotide hybridization
– Canonicalizable
  • Facilitating uniqueness checking
– (Somewhat) human-readable
HELM Example: Simple polymer
• HELM notation: A.R.G.[dF].C.K.[ahA].E.D.A
– Non-natural amino acid codes are enclosed in square brackets
• Natural equivalent: ARGFCKXEDA
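The relationship between the two strings above can be sketched in a few lines of Python. This is only an illustration, not the HELM toolkit's parser: the function names are hypothetical, and the NATURAL_ANALOG table contains only the two mappings implied by the slide's own example (dF to F, ahA to X). Falling back to X for unknown bracketed codes is an assumption.

```python
def split_monomers(notation: str) -> list[str]:
    """Split a simple-polymer string on '.' while keeping bracketed codes intact."""
    monomers, current, depth = [], [], 0
    for ch in notation:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ']'")
        if ch == "." and depth == 0:
            # A dot outside brackets ends the current monomer.
            monomers.append("".join(current))
            current = []
        else:
            current.append(ch)
    if depth != 0:
        raise ValueError("unbalanced '['")
    monomers.append("".join(current))
    if any(m == "" for m in monomers):
        raise ValueError("empty monomer code")
    return monomers


# Hypothetical lookup table, filled in from the slide's example only.
NATURAL_ANALOG = {"dF": "F", "ahA": "X"}


def natural_equivalent(notation: str) -> str:
    """Replace each bracketed non-natural code with a one-letter analog."""
    letters = []
    for monomer in split_monomers(notation):
        if monomer.startswith("[") and monomer.endswith("]"):
            # Assumption: unknown non-natural codes collapse to 'X'.
            letters.append(NATURAL_ANALOG.get(monomer[1:-1], "X"))
        else:
            letters.append(monomer)
    return "".join(letters)


sequence = "A.R.G.[dF].C.K.[ahA].E.D.A"
print(split_monomers(sequence))
print(natural_equivalent(sequence))  # ARGFCKXEDA
```

The one-letter string loses information (which non-natural residue sits behind each X, for instance), which is why the bracketed codes are kept in the notation itself.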
HELM Example: Complex Polymer
Monomer Database
• Each monomer used in the notation needs to be predefined in a monomer database
• The database includes the chemical structure of the monomer and a description of all acceptable attachment points
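As a rough sketch of what such a database entry might hold, the Python below models a monomer as a code, a structure string, and a set of attachment point labels. Everything here is an assumption made for illustration: the class and function names are hypothetical, the structures are plain SMILES with stereochemistry omitted, and the R1/R2/R3 labels are example values rather than a reproduction of the actual monomer database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Monomer:
    code: str                     # code used in the notation, e.g. "A"
    structure: str                # chemical structure (illustrative SMILES)
    attachment_points: frozenset  # labels of the acceptable attachment points


# Illustrative entries only; not taken from the real monomer database.
MONOMER_DB = {
    "A": Monomer("A", "CC(N)C(=O)O", frozenset({"R1", "R2"})),
    "C": Monomer("C", "NC(CS)C(=O)O", frozenset({"R1", "R2", "R3"})),
}


def undefined_monomers(codes, db):
    """Return the codes that are not predefined in the database."""
    return [code for code in codes if code not in db]


def can_attach(code, point, db):
    """True if the monomer exists and declares the given attachment point."""
    monomer = db.get(code)
    return monomer is not None and point in monomer.attachment_points


print(undefined_monomers(["A", "C", "dF"], MONOMER_DB))  # ['dF']
print(can_attach("C", "R3", MONOMER_DB))                 # True
print(can_attach("A", "R3", MONOMER_DB))                 # False
```

Checking a notation string against the database before accepting it is one way to enforce the rule that every monomer must be predefined.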
J. Chem. Inf. Model. 2012, 52, 2796-2806