PubChem—Substance, Compound, BioAssay Part 3: Essentials.
![Page 1: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/1.jpg)
PubChem—Substance, Compound, BioAssay
Part 3: Essentials
![Page 2: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/2.jpg)
Global Entrez Search Page
All[Filter]
![Page 3: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/3.jpg)
Overall Goal:
An on-line resource providing comprehensive information on the
biological activities of small molecules
![Page 4: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/4.jpg)
Why Are Small Molecules Important?
• Constituents of all macromolecules (DNA, RNA, proteins, carbohydrates, etc.)
• Serve as cofactors and signaling molecules for thousands of proteins
• The chemistry part of “biochemistry”
• Most drug entities and drug types are small molecules
• Most biomarkers used in clinical chemistry are small molecules
![Page 5: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/5.jpg)
PubChem Databases and Tools:
http://pubchem.ncbi.nlm.nih.gov/
![Page 6: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/6.jpg)
The Molecular Libraries Roadmap: An Integrated Initiative
• Chemical Diversity
• Technology Development
• Screening
• Instrumentation
• Assay Development
• Predictive ADMET
• Compound Repository (MLSMR)
• Informatics
• Cheminformatics Research Centers
• Molecular Libraries Screening Centers Network (MLSCN)
![Page 7: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/7.jpg)
PubChem = Repository for small molecules and bioactivity assay data
• Part of the Entrez search and linking system
• Links to other NCBI databases, e.g.,
  – PubMed, MeSH
  – Protein structures (MMDB)
  – Protein/Nucleotide sequences (GenPept/GenBank)
• Contains complete chemical structures
• Standardized for uniformity
• Small set of computed properties
• Structure similarity searching
![Page 8: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/8.jpg)
Other Depositors to PubChem
and more…
![Page 9: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/9.jpg)
PubChem: Bird’s Eye View
Depositors
PubChem BioAssays
PubChem Compound
PubChem Substance
Chemical Structure Similarity
![Page 10: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/10.jpg)
How does data get into PubChem?
![Page 11: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/11.jpg)
PubChem integration in Entrez
• Protein Sequences
• Literature
• VAST Structure Similarity
• Bioactivity Assay Results
• Small Molecule Structures
• 3D Structures
• Term Frequency Statistics
• Chemical Structure Similarity
• Activity Profile Similarity
![Page 12: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/12.jpg)
![Page 13: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/13.jpg)
Primary Database
![Page 14: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/14.jpg)
Depositor Data
• No “Global” rules or standards
  – Based on organizational needs
  – Lots of data overlap
  – Often based on individual scientist preferences
• PubChem accepts data from many organizations
  – Previously unseen data representation
  – Combinatorial explosion of ways for drawing the same structure
![Page 15: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/15.jpg)
Redundancy, mixtures
Mixture
![Page 16: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/16.jpg)
Derivative Database
![Page 17: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/17.jpg)
Chemical structures may be represented in many different ways
![Page 18: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/18.jpg)
Chemical structures may be represented in many different ways
![Page 19: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/19.jpg)
Compound
Substance
![Page 20: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/20.jpg)
Known stereochemistry
Unknown stereo
Unknown E/Z isomers
Compound
Substance
![Page 21: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/21.jpg)
Substances (heterogens) from Protein 3D structures (PDB)
PDB lacks chemical detail
– no bond order information
– no hydrogens
Deposited structure receives
– bond information
– hydrogens
– stereochemistry (where possible)
Most molecules come out right, even complex ones
Example: Vancomycin
Sometimes there is a need to fix problems, e.g., bond orders
Example: Need to fix heme bond orders → Result
Example: Dopamine
![Page 22: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/22.jpg)
PubChem Compound Processing
• Chemical Data Verification
  – Atom description (label, element?)
  – Functional group clean-up
  – Atom valence verification to prevent nonsense
• “Normalize” and “Standardize”
  – Valence-bond canonicalization (for tautomer invariance)
  – Aromaticity detection and self-consistency
  – Stereochemistry detection
  – Explicit hydrogen assignment
• Calculation
  – 2-D coordinate generation
  – Image depictions
  – Fingerprints
  – IUPAC name
  – SMILES, InChI, hash codes
  – xLogP, TPSA, HBD, HBA, MW, MF
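As a rough illustration of the "atom valence verification" step listed above, the sketch below flags atoms whose summed bond orders exceed a maximum valence. The `MAX_VALENCE` table, the `find_valence_violations` helper, and the atom/bond tuple layout are assumptions made for this example. They are not PubChem's actual rules, which also have to handle charges, radicals, and hypervalent elements.

```python
# Toy valence check: a sketch only, not PubChem's standardization code.
# Assumed maximum valences for a few neutral, uncharged elements.
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}


def find_valence_violations(atoms, bonds):
    """Return indices of atoms whose total bond order exceeds MAX_VALENCE.

    atoms: list of element symbols, e.g. ["C", "H", "H"]
    bonds: list of (atom_index_a, atom_index_b, bond_order) tuples
    """
    totals = [0] * len(atoms)
    for a, b, order in bonds:
        totals[a] += order
        totals[b] += order
    violations = []
    for index, element in enumerate(atoms):
        limit = MAX_VALENCE.get(element)
        # Elements missing from the table are skipped rather than guessed at.
        if limit is not None and totals[index] > limit:
            violations.append(index)
    return violations


# Methane (CH4) passes; a carbon drawn with five hydrogens is flagged.
methane = (["C", "H", "H", "H", "H"], [(0, i, 1) for i in range(1, 5)])
bad_carbon = (["C"] + ["H"] * 5, [(0, i, 1) for i in range(1, 6)])
print(find_valence_violations(*methane))
print(find_valence_violations(*bad_carbon))
```

A record that returns a non-empty list here would be a candidate for clean-up or rejection before any properties are calculated.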
![Page 23: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/23.jpg)
Chemical Structure “Sanitization”
• Chemical structures that fail sanitization
  – Are not part of the aggregated PubChem Compound database
  – Are still “searchable” via the PubChem Substance database
• Keeps the PubChem Compound database “clean” for chemical informatics analysis
• Collapses structures represented in various ways into a uniform, identical representation
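The relationship between deposited substances and aggregated compounds can be pictured with a small sketch. It assumes each substance record already carries a standardized structure key (a hypothetical `std_key` field, set to `None` when sanitization failed). How PubChem actually derives and compares standardized structures is not described in these slides.

```python
from collections import defaultdict


def aggregate_substances(substances):
    """Group substance IDs by standardized structure key.

    substances: list of dicts with "sid" and "std_key" (None if the
    structure failed sanitization). Both field names are hypothetical.
    Returns (compounds, substance_only): a dict mapping each key to the
    SIDs that share it, and a list of SIDs with no compound record.
    """
    compounds = defaultdict(list)
    substance_only = []
    for record in substances:
        if record["std_key"] is None:
            # Failed sanitization: stays searchable as a substance only.
            substance_only.append(record["sid"])
        else:
            compounds[record["std_key"]].append(record["sid"])
    return dict(compounds), substance_only


# Made-up records: two depositors drew the same structure differently.
deposits = [
    {"sid": 1, "std_key": "structure-A"},
    {"sid": 2, "std_key": "structure-A"},
    {"sid": 3, "std_key": "structure-B"},
    {"sid": 4, "std_key": None},
]
compounds, substance_only = aggregate_substances(deposits)
print(compounds)
print(substance_only)
```

In this toy data, substances 1 and 2 collapse into one compound entry, while substance 4 remains a substance without a compound record.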
![Page 24: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/24.jpg)
Compound for mixture
Component compounds
![Page 25: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/25.jpg)
Components of a mixture
![Page 26: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/26.jpg)
Substance vs. Compound
Substance summary Compound summary
![Page 27: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/27.jpg)
Substance vs. Compound
![Page 28: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/28.jpg)
Examples of queries
"InChI=1/Ca.3H2O/h;3*1H2/q+2;;;/p-3/fCa.3HO/h;3*1h/qm;3*-1"[InChI]
200[MW]
300:500[MW]
"dopamine"[CompleteSynonym]
"pcsubstance structure"[Filter]
"ca"[Element] AND 300:500[MW] AND "chemidplus"[SourceName]
"lipinski"[Filter] AND "antineoplastic agents"[PharmAction]
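Queries like the ones above can also be assembled programmatically. The following sketch builds field-tagged terms and an NCBI E-utilities `esearch` URL without sending any request. The helper names `field_term`, `range_term`, and `esearch_url` are made up for this example, and using the `esearch.fcgi` endpoint with the `pcsubstance` database is assumed to be the programmatic counterpart of the Entrez web searches shown in the slides.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities search endpoint (assumed here to be the
# programmatic counterpart of the Entrez web search shown in the slides).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def field_term(value, field):
    """Quote a text value and attach an index field tag: "ca"[Element]."""
    return f'"{value}"[{field}]'


def range_term(low, high, field):
    """Build a numeric range term such as 300:500[MW]."""
    return f"{low}:{high}[{field}]"


def esearch_url(term, db="pcsubstance", retmax=20):
    """Return an esearch URL for the given query; no request is sent."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


query = " AND ".join([
    field_term("ca", "Element"),
    range_term(300, 500, "MW"),
    field_term("chemidplus", "SourceName"),
])
print(query)
print(esearch_url(query))
```

Keeping term construction separate from URL construction makes it easy to reuse the same query string in the web interface.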
Lipinski rule of 5: a molecule is more likely to be orally active if it has:
• not more than 5 hydrogen bond donors (OH and NH groups)
• not more than 10 hydrogen bond acceptors (N or O atoms)
• a molecular weight under 500
• a LogP under 5
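The rule can be written as a simple predicate over four computed properties. This is a minimal sketch: the function name `passes_lipinski` and its parameter names are chosen for this example, and the acceptor limit uses the common formulation of at most 10 acceptors. The exact cut-offs behind PubChem's `"lipinski"[Filter]` are not given in the slides and may differ at the boundaries.

```python
def passes_lipinski(hbd, hba, mol_weight, logp):
    """Check the four rule-of-5 criteria.

    hbd: hydrogen bond donor count; hba: hydrogen bond acceptor count;
    mol_weight: molecular weight in g/mol; logp: computed LogP (e.g. XLogP).
    """
    return hbd <= 5 and hba <= 10 and mol_weight < 500 and logp < 5


# Illustrative, made-up property values (not taken from PubChem records).
print(passes_lipinski(hbd=2, hba=4, mol_weight=320.4, logp=2.1))
print(passes_lipinski(hbd=7, hba=12, mol_weight=640.0, logp=5.6))
```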
![Page 29: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/29.jpg)
Examples of PubChem Index Fields …
All [ALL] -- All of the following fields are searched; default search field.
Uid [UID] -- The integer represents the SID for the PCSubstance database. By default, an integer without a field alias is recognized as a UID. Same as [SID].
Filter [Filter] -- Limits the records to various indexed filters.
ActiveAid [AA] -- Active BioAssay identifier, integer.
ActiveAidCount [AC, ACNT] -- Number of bioassays where tested active.
AtomChiralCount [ACC, ACCNT] -- Total count of chiral atoms in a given compound.
BioAssayID [BAID, AID] -- BioAssay identifier.
BondChiralCount [BCC, BCCNT] -- Number of chiral bonds.
Comment [CMT] -- Substance or bioassay comment.
CompleteSynonym [CSYN, CSYNO] -- Exactly matching name for a substance/compound.
CompoundID [CID] -- Compound identifier, integer.
DepositDate [DDAT, DEPDAT] -- Deposition timestamp for a substance.
Element [ELMT, EL] -- Chemical element in a substance/compound.
ExactMass [EMAS, EXMASS] -- The calculated mass of an ion or molecule with the most likely isotopic composition for a single random molecule, corresponding to the mass of the most intense ion/molecule peak in a mass spectrum. A real number.
HeavyAtomCount [HAC, HACNT] -- Count of atoms other than hydrogen in a compound, integer.
HydrogenBondAcceptorCount [HBAC, HBACNT] -- Hydrogen bond acceptors for a compound, integer.
HydrogenBondDonorCount [HBDC, HBDCNT] -- Hydrogen bond donors for a compound, integer.
InChI [inchi] -- IUPAC International Chemical Identifier.
![Page 30: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/30.jpg)
Examples of PubChem Index Fields, contd.
IUPACName [UPAC, IUPAC] -- Standard IUPAC name for a compound.
MeSHDescription [MHD]
MeSHTerm [MSHT, MESHT] -- Medical Subject Heading term.
MeSHTreeNode [MSHN, MESHTN] -- Medical Subject Heading tree node (tree structures).
MolecularWeight [MW, MWT, MOLWT] -- Mass of a molecule calculated using the average mass of each element weighted for its natural isotopic abundance. E.g., carbon has two natural isotopes, 12 and 13, with relative abundances of 98.9% and 1.1%, which yield an average mass of 12.011 g/mol. A real number.
MonoisotopicMass [MMAS, MIMASS] -- Mass of a molecule calculated using the mass of the most abundant isotope of each element. E.g., carbon has a monoisotopic mass of 12.000 g/mol. A real number.
PharmAction [PHMA, PHARMA] -- MeSH pharmacological actions heading.
RotatableBondCount [RBC, RBCNT] -- Number of rotatable bonds.
SourceCategory [SRCC, SRCCAT, SRCCATG] -- Depositor categories.
SourceID [SRID, SRCID] -- Depositor's external ID.
SourceName [SRC, SRCNAM, SRCNAME] -- Official depositor name.
SubstanceID [SID] -- Substance ID. Same as [UID].
Synonym [SYNO] -- Synonyms for a substance.
TautomerCount [TC, TCNT, TTMC] -- Possible tautomer count for each given structure, ≤ 200.
TotalFormalCharge [TFC, CHG, CHRG] -- Total formal charge.
TPSA [TPSA] -- Topological Polar Surface Area.
XLogP [XLGP, LOGP]
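The difference between the MolecularWeight and MonoisotopicMass fields can be checked with a short calculation. The sketch below is illustrative only: the mass tables hold rounded standard values for four elements, the helpers `parse_formula` and `mass` are written for this example and handle only simple formulas without parentheses or charges, and dopamine's formula C8H11NO2 is supplied by hand.

```python
import re

# Average atomic masses (g/mol) and masses of the most abundant isotope.
# Values are rounded standard figures for four common elements only.
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
MONOISOTOPIC_MASS = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def parse_formula(formula):
    """Parse a simple formula such as 'C8H11NO2' into {element: count}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def mass(formula, table):
    """Sum per-element masses from the given table for a formula."""
    return sum(table[el] * n for el, n in parse_formula(formula).items())


# Carbon's average mass from its two natural isotopes, as in the field notes.
carbon_average = 0.989 * 12.000 + 0.011 * 13.00335
print(round(carbon_average, 3))

dopamine = "C8H11NO2"
print(round(mass(dopamine, AVERAGE_MASS), 2))
print(round(mass(dopamine, MONOISOTOPIC_MASS), 4))
```

The two results differ in the first decimal place, which is why a range query on [MW] and one on [MMAS] can return different records near a boundary.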
![Page 31: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/31.jpg)
Preview/Index Tab
![Page 32: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/32.jpg)
History Tab
Substances of MW 300-500 Da having antineoplastic properties and obeying the Lipinski rule of 5
![Page 33: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/33.jpg)
Links
For the whole set or only selected records
![Page 34: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/34.jpg)
Property Report
![Page 35: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/35.jpg)
SDF format
![Page 36: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/36.jpg)
![Page 37: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/37.jpg)
![Page 38: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/38.jpg)
![Page 39: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/39.jpg)
Medical Subject Headings (MeSH)
• MeSH is the National Library of Medicine's controlled vocabulary thesaurus.
• Consists of sets of terms naming descriptors in a hierarchical and alphabetic structure, e.g., “Mental Disorders”, “Pharmacological action”, “Catecholamine hormones”, etc.
• Permits searching at various levels of specificity
• MeSH thesaurus is used for indexing articles for the MEDLINE/PubMed database
• MeSH is continually updated
• PubChem assigns MeSH headings to Compound records
![Page 40: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/40.jpg)
Primary Database
• Contains bioactivity screens of chemical substances described in PubChem Substance
• Provides searchable descriptions of each bioassay, including descriptions of the conditions and readouts specific to a screening protocol
• Depositor decides on data definitions and interpretation
• Data can be plotted as graphs of statistical histograms
• Cross-indexed to other Entrez databases
![Page 41: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/41.jpg)
![Page 42: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/42.jpg)
![Page 43: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/43.jpg)
![Page 44: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/44.jpg)
![Page 45: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/45.jpg)
![Page 46: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/46.jpg)
![Page 47: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/47.jpg)
Click to view structure
![Page 48: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/48.jpg)
![Page 49: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/49.jpg)
NCBI FTP >> PubChem Folder
![Page 50: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/50.jpg)
Entrez PubChem: Help and Tabs
![Page 51: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/51.jpg)
Brief Summary
• PubChem is part of the NIH Molecular Libraries Roadmap for Medicine Initiative
• PubChem consists of 3 databases (Substance, Compound, and BioAssay) and a powerful Structure Search engine
• Substances = samples; Compounds = calculated structures and properties
• PubChem is integrated into NCBI’s Entrez search and linking system of databases
• Records are indexed using a number of terms
• Records are linked to each other and to other databases at NCBI
![Page 52: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/52.jpg)
For More Information…
![Page 53: PubChem—Substance, Compound, BioAssay Part 3: Essentials.](https://reader035.fdocuments.us/reader035/viewer/2022062518/56649e7a5503460f94b7b222/html5/thumbnails/53.jpg)
For More Information…
E-mail addresses
• General Help: [email protected]
• [email protected]
Telephone
• Voice: +1 (301) 496-2475
• Fax: +1 (301) 480-9241
The NCBI Education Page
http://www.ncbi.nih.gov/Education/index.html
The (free!) NCBI Newsletter
http://www.ncbi.nih.gov/About/newsletter.html
The NCBI Handbook
Follow the link from the NCBI Home Page