Post on 23-Jun-2015
Chemical Interaction Matrix:
Gerald Lushington / LiS Consulting
http://geraldlushington.com / glushington@yahoo.com
Personalized Medicine
Comprehensive Biochemical & Chemical Biology Understanding
Informatics & Creativity
HTS & Chemical Proteomics
Big data: NGS, medical outcomes, etc.
Example Challenges:
● Toxicology: a single toxin may modulate several different biochemical processes
● Cancer: malignant cells have multiple biochemical sensitivities that may be targeted
● Spectrum disorders (e.g., autism, Alzheimer's): distinct phenotypes produce similar symptoms
Discovery Paradigm:
Chemical screening → prospective hits
Chemical proteomics → prospective targets
How to attain comprehensive understanding?
Data Comprehension Reality
[Figure: targets vs. compounds]
How to make sense of diffuse multimode data?
Mechanism of Action (MOA) discovery: find compound subsets that conserve common mechanism
Excellent (but imperfect) example: TEST (Toxicology Estimation Software Tool)
http://www.epa.gov/nrmrl/std/qsar/qsar.html
TEST
Multiple data sets covering toxicity outcomes for numerous compounds
Predict toxicity of query compounds via on-the-fly training to similar pre-characterized analogs
Uses Tanimoto distances over molecular fingerprints: no validated relevance to specific outcomes
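For reference, the similarity measure mentioned above can be sketched as follows. This is a generic illustration, not TEST's own code: fingerprints are assumed to be sets of on-bit indices, and the compound names and bit values are hypothetical. It also shows the limitation noted on the slide: the ranking reflects shared structural bits only, with no link to any particular outcome.

```python
def tanimoto_similarity(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    return len(a & b) / union


def tanimoto_distance(fp_a, fp_b):
    """Distance form: 0 = identical bit sets, 1 = no bits in common."""
    return 1.0 - tanimoto_similarity(fp_a, fp_b)


def nearest_analogs(query_fp, library, k=3):
    """Names of the k library compounds closest to the query by Tanimoto distance."""
    ranked = sorted(library.items(),
                    key=lambda item: tanimoto_distance(query_fp, item[1]))
    return [name for name, _ in ranked[:k]]


# Hypothetical fingerprints, for illustration only.
library = {
    "cpd_A": {1, 4, 7, 9},
    "cpd_B": {1, 4, 8},
    "cpd_C": {2, 3, 5},
}
query = {1, 4, 7}
print(nearest_analogs(query, library, k=2))
```

The analogs returned this way would form the on-the-fly training set for the query compound.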
MOA method: feature / compound selection
Procedure:
1. Assemble matrix of compounds vs. activity & features
2. Normalize
3. Fold activity into features as per:
$C_i = |\mathrm{Act}^* - X_i^*|$
Folded values: 0 = perfect correlation; 1 = perfect anticorrelation
4. Bicluster
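The normalize and fold steps of the procedure can be sketched as below. Two assumptions are made that the slides do not state: the starred quantities are taken to be min-max scaled to [0, 1] (which is what makes 0 and 1 the extremes of the folded value), and constant columns are mapped to zero. The function names and the tiny data set are hypothetical.

```python
def min_max_normalize(column):
    """Scale a list of numbers to [0, 1] (assumed normalization; not stated on the slides)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column: arbitrary choice made here
    return [(v - lo) / (hi - lo) for v in column]


def fold_activity(activity, features):
    """Fold activity into each feature column: C_i = |Act* - X_i*| per compound.

    activity: list of activity values, one per compound.
    features: dict mapping feature name -> list of values, one per compound.
    Returns a dict mapping feature name -> list of folded values.
    """
    act_star = min_max_normalize(activity)
    folded = {}
    for name, values in features.items():
        x_star = min_max_normalize(values)
        folded[name] = [abs(a - x) for a, x in zip(act_star, x_star)]
    return folded


# Hypothetical three-compound example.
activity = [10.0, 20.0, 30.0]
features = {
    "tracks_activity": [1.0, 2.0, 3.0],
    "opposes_activity": [3.0, 2.0, 1.0],
}
folded = fold_activity(activity, features)
print(folded["tracks_activity"])   # [0.0, 0.0, 0.0]
print(folded["opposes_activity"])  # [1.0, 0.0, 1.0]
```

Note that a mid-range compound yields a folded value near 0 even for an opposing feature, so single cells are ambiguous; it is the contiguous regions of the matrix that carry the signal.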
Clusters: contiguous correlative or anticorrelative regions of the matrix
Within clusters: molecules may share MOA; features may correlate with activity
Confidence: correlative & predictive quality of model derived from cluster
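The slides do not say which biclustering algorithm was used. As a stand-in, the sketch below uses a naive seed-and-grow threshold search (not a published algorithm) to show what a correlative cluster looks like in a folded matrix: a set of compounds (rows) and features (columns) whose folded values are all low. The function name, threshold, and matrix are hypothetical; anticorrelative regions could be found the same way by searching for high values instead.

```python
def grow_bicluster(folded, seed_row, threshold=0.2, max_iter=20):
    """Naive bicluster search on a folded matrix (list of rows, values in [0, 1]).

    Alternates between keeping columns whose mean over the current rows is
    <= threshold and keeping rows whose mean over those columns is <= threshold,
    until the selection stops changing. Returns (row_indices, column_indices).
    """
    n_rows, n_cols = len(folded), len(folded[0])
    rows, cols = [seed_row], []
    for _ in range(max_iter):
        new_cols = [j for j in range(n_cols)
                    if sum(folded[i][j] for i in rows) / len(rows) <= threshold]
        if not new_cols:
            return rows, []  # seed has no correlative features at this threshold
        new_rows = [i for i in range(n_rows)
                    if sum(folded[i][j] for j in new_cols) / len(new_cols) <= threshold]
        if new_rows == rows and new_cols == cols:
            break  # converged
        rows, cols = new_rows, new_cols
    return rows, cols


# Hypothetical folded matrix: 4 compounds x 4 features, two obvious blocks.
folded_matrix = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.7, 0.9],
    [0.9, 0.8, 0.1, 0.0],
    [0.8, 0.9, 0.0, 0.1],
]
print(grow_bicluster(folded_matrix, seed_row=0))  # ([0, 1], [0, 1])
print(grow_bicluster(folded_matrix, seed_row=2))  # ([2, 3], [2, 3])
```

Each returned row set is a candidate group of molecules sharing an MOA, and each column set the features that track activity within that group.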
Example: Oral Bioavailability
Oral uptake depends on:
● Polar solubility
● Membrane permeability
● Interaction with various transporters
Data (from Tingjun Hou): 773 molecules
http://modem.ucsd.edu/adme/databases/databases_bioavailability.htm
Descriptors (from VolSurf and DVS): 298 features passing information content and linear independence (R < 0.90) filters
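A descriptor filter of the kind described above might look like the following sketch. The exact filters used are not given on the slides, so two assumptions are made here: "information content" is reduced to dropping constant descriptors, and "linear independence" to a greedy pass that keeps a descriptor only if its absolute Pearson correlation with every descriptor already kept is below 0.90. The descriptor names and values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant column: treated as uncorrelated
    return sxy / sqrt(sxx * syy)


def filter_descriptors(features, r_max=0.90):
    """Greedy filter: drop constant columns, then columns with |R| >= r_max to a kept one."""
    kept = {}
    for name, values in features.items():
        if len(set(values)) < 2:
            continue  # crude information-content check (assumption)
        if all(abs(pearson_r(values, other)) < r_max for other in kept.values()):
            kept[name] = values
    return kept


# Hypothetical descriptors for four compounds.
descriptors = {
    "logP": [1.0, 2.0, 3.0, 4.0],
    "logP_rescaled": [2.0, 4.0, 6.0, 8.1],  # nearly collinear with logP
    "psa": [4.0, 1.0, 3.0, 2.0],
    "constant": [5.0, 5.0, 5.0, 5.0],
}
print(list(filter_descriptors(descriptors)))  # ['logP', 'psa']
```

Because the pass is greedy, the surviving set depends on the order in which descriptors are visited.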
Preliminary Model (Weka: Bootstrap Aggregating / RepTree):
Q2(5-fold) = 0.4712
CFS & RF: reduced to 27 features
Q2(5-fold) = 0.4739
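For readers unfamiliar with the metric, a cross-validated Q2 can be computed as in the sketch below. It assumes the common definition Q2 = 1 - PRESS / SS_total, where PRESS sums squared errors on held-out folds; the slides do not state which variant was used. A one-variable least-squares line stands in for the Weka bagged REPTree model, which is not reproduced here, and folds are assigned by index without shuffling.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept for one variable."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def q2_kfold(xs, ys, k=5):
    """Cross-validated Q2 = 1 - PRESS / SS_total over k interleaved folds."""
    n = len(xs)
    mean_y = sum(ys) / n
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    if ss_total == 0:
        raise ValueError("Q2 is undefined when all responses are identical")
    press = 0.0
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        slope, intercept = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        # Accumulate squared prediction error on the held-out compounds only.
        press += sum((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx)
    return 1.0 - press / ss_total


# Hypothetical data: an exactly linear response gives Q2 of 1 (up to rounding).
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
print(q2_kfold(xs, ys, k=5))
```

On this scale, a model no better than predicting the mean scores about 0, so values near 0.47 indicate a model that explains roughly half of the held-out variance.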
Biclustering: Before and After
Clusters as local training sets:
Condense to 18 high-quality clusters that cover almost the entire training space (omit only 10 of 768 compounds)
Conclusions
Correlative & predictive performance of subset models gives strong confidence in MOA conservation in clusters
Head-to-head comparison with chemical proteomics data should provide strong basis for target identification
Questions / Suggestions?
glushington@yahoo.com