Artificial intelligence/machine learning at the National ... · ̶Hutchins BI et al. PLoS Biology...
Transcript of Artificial intelligence/machine learning at the National ... · ̶Hutchins BI et al. PLoS Biology...
Office of Portfolio Analysis
Artificial intelligence/machine learningat the National Institutes of Health
(AI/ML at the NIH)
Predicting translational progress in biomedical research
George Santangelo, Ph.D.Ian Hutchins, Ph.D.
Office of Portfolio Analysis, NIH
Mission of the NIH Office of Portfolio Analysis (OPA)
Consult &Collaborate
Develop New Analytical Methods
BuildTools
Data Cleaning & Analysis
Supportdata-driven
decision-making
Disseminate Best PracticesClassroom Training (Custom Classes Available)
Online TrainingWeb Resources (Case Studies, FAQs)
Office HoursSymposia
OPA website: https://dpcpsi.nih.gov/opa/index
Support data-driven decision-making
• Enable NIH research administrators and decision-makers to evaluate and prioritize current and emerging areas of research that will advance scientific knowledge and improve human health
•Help ensure that the NIH research portfolio ― is balanced― is free of unnecessary duplication― takes advantage of collaborative,
cross-cutting research― stimulates the emergence of
transformative ideas
Office of Portfolio Analysis
AI/MLBibliometricsContent analysisEmerging areas
OPA developed the Relative Citation Ratio (RCR) metric to meet the need for a new and thoroughly validated way
to measure the influence of any/all biomedical research papers
Public website to retrieve RCR data:
iCite.od.nih.gov
Office of Portfolio Analysis
Office of Portfolio Analysis
OPA AI to track and predict the impact of NIH decision-makingTrack and parameterize:
• Influence using bibliometric data The Relative Citation Ratio (RCR)
Hutchins BI et al. PLoS Biology 2016 14:e1002541 Hutchins BI et al. PLoS Biology 2017 15:e2003552 Santangelo GM Mol. Biol. Cell 2017 28:1401-1408
Office of Portfolio Analysis
OPA AI to track and predict the impact of NIH decision-makingTrack and parameterize:
• Translational progress / clinical trials (CTs) and tech transfer / patentsThe triangle of biomedicine, APT scores
Hutchins BI et al. PLoS Biology 2019 17(10):e3000416
• Influence using bibliometric data The Relative Citation Ratio (RCR)
Hutchins BI et al. PLoS Biology 2016 14:e1002541 Hutchins BI et al. PLoS Biology 2017 15:e2003552 Santangelo GM Mol. Biol. Cell 2017 28:1401-1408
Office of Portfolio Analysis
OPA AI to track and predict the impact of NIH decision-makingTrack and parameterize:
2.0Influence module
Translation moduleOpen Citation Collection
Hutchins et al. PLOS Biology 201917:e3000385
The publicly available OPA tool
• Influence using bibliometric data The Relative Citation Ratio (RCR)
Hutchins BI et al. PLoS Biology 2016 14:e1002541 Hutchins BI et al. PLoS Biology 2017 15:e2003552 Santangelo GM Mol. Biol. Cell 2017 28:1401-1408
• Translational progress / clinical trials (CTs) and tech transfer / patentsThe triangle of biomedicine, APT scores
Hutchins BI et al. PLoS Biology 2019 17(10):e3000416
Office of Portfolio Analysis
OPA AI to track and predict the impact of NIH decision-makingTrack and parameterize:
• Development of drugs and devicesDisambiguated drug and lead compound name, FDA data
2.0Influence module
Translation moduleOpen Citation Collection
Hutchins et al. PLOS Biology 201917:e3000385
The publicly available OPA tool
• Influence using bibliometric data The Relative Citation Ratio (RCR)
Hutchins BI et al. PLoS Biology 2016 14:e1002541 Hutchins BI et al. PLoS Biology 2017 15:e2003552 Santangelo GM Mol. Biol. Cell 2017 28:1401-1408
• Translational progress / clinical trials (CTs) and tech transfer / patentsThe triangle of biomedicine, APT scores
Hutchins BI et al. PLoS Biology 2019 17(10):e3000416
Office of Portfolio Analysis
OPA AI to track and predict the impact of NIH decision-makingTrack and parameterize:
• Development of drugs and devicesDisambiguated drug and lead compound name, FDA data
• Rate of scientific progress and emergence
2.0Influence module
Translation moduleOpen Citation Collection
Hutchins et al. PLOS Biology 201917:e3000385
The publicly available OPA tool
• Influence using bibliometric data The Relative Citation Ratio (RCR)
Hutchins BI et al. PLoS Biology 2016 14:e1002541 Hutchins BI et al. PLoS Biology 2017 15:e2003552 Santangelo GM Mol. Biol. Cell 2017 28:1401-1408
• Translational progress / clinical trials (CTs) and tech transfer / patentsThe triangle of biomedicine, APT scores
Hutchins BI et al. PLoS Biology 2019 17(10):e3000416
Office of Portfolio Analysis
OPA AI to track and predict the impact of NIH decision-making
• Detection of overlapping proposals submitted to different funders─ Hoppe et al. Science Advances 2019 5:eaaw7238
Track and parameterize:
• Influence using bibliometric data The Relative Citation Ratio (RCR)
Hutchins BI et al. PLoS Biology 2016 14:e1002541 Hutchins BI et al. PLoS Biology 2017 15:e2003552 Santangelo GM Mol. Biol. Cell 2017 28:1401-1408
• Development of drugs and devicesDisambiguated drug and lead compound name, FDA data
• Rate of scientific progress and emergence
2.0Influence module
Translation moduleOpen Citation Collection
Hutchins et al. PLOS Biology 201917:e3000385
The publicly available OPA tool
• Translational progress / clinical trials (CTs) and tech transfer / patentsThe triangle of biomedicine, APT scores
Hutchins BI et al. PLoS Biology 2019 17(10):e3000416
Publicly available web tool: iCite 2.0
Track influential publications
Track and predict translational impact
The Open Citation Collection
Office of Portfolio Analysis
The Translation module of iCite: tracking bench-to-bedside progress
The Translation module of iCite: tracking bench-to-bedside progress
Development of cancer immunotherapy
HighLow Article DensityClinical ResearchTranslational ResearchFundamental Research
2009
1996
human
Citation patterniCite Translation1996 to 2009
2009
1996
Fundamental
Translational/Clinical
Office of Portfolio Analysis
Office of Portfolio Analysis
Total citations are a poor predictor of translational progress(citation by one more clinical articles*)
Seminal articles leading to Nobel Prizes in Physiology or Medicine (2003 to 2015)
Topic Prize year Pub date RCR RCR %ile Total cites Cites by clinical articles % by clinical articles
Treatment of parasitic diseases 2015 1979 6.6 96.1 132 7 5.30%
In vitro fertilization 2010 1978 34.4 99.8 680 33 4.85%
Helicobacter pylori & ulcers 2005 1984 95.8 99.9 2285 100 4.38%
MRI 2003 1984 14.0 99.1 232 7 3.02%
HPV/Cervical Cancer & HIV/AIDS 2008 1983 102.4 99.9 2849 32 1.12%
Positioning system in the brain 2014 1971 60.4 99.9 1773 14 0.79%
Olfactory biology 2004 1991 60.2 99.9 1953 14 0.72%
Activation of the immune system 2011 1998 113.4 99.9 3976 21 0.53%
Telomeres and telomerase 2009 1978 10.9 98.5 407 2 0.49%
Reprogramming mature cells to pluripotency 2012 2007 238.8 99.9 7890 17 0.22%
RNA interference 2006 1998 163.5 99.9 6500 7 0.11%
Introducing specific gene modifications in mice 2007 1987 30.0 99.8 1180 1 0.08%
Vesicle trafficking 2013 1993 51.6 99.9 1776 1 0.06%
*a publication that describes a clinical trial or clinical guideline
Office of Portfolio Analysis
Non-uniform probability of being cited by a clinical article
Low
Mid
High
Office of Portfolio Analysis
Training a machine learning system to predict future translation
Office of Portfolio Analysis
Training a machine learning system to predict future translation
*
*
*Random Forests outperformed Support Vector Machines, Neural Networks, logistic regression, et al.
Office of Portfolio Analysis
Validation of machine learning predictions of future translation
Accurate predictions can be made two years after publication
Office of Portfolio Analysis
Validation of machine learning predictions of future translation
Machine learning outperforms experts rating the clinical impact of publications
Office of Portfolio Analysis
Validation of machine learning predictions of future translation
Increase in APT score is responding to new information in the citing networks
Approximate Potential to Translate (APT) scores are stable but can change over time
Office of Portfolio Analysis
“Genetic” analysis of machine learning predictions of future translation
Office of Portfolio Analysis
Summary
• The Office of Portfolio Analysis at NIH develops tools and new methods, including AI/ML approaches, to improve data-driven decision-making
• Predicting translational progress in biomedical research has the potential to accelerate scientific advances that improve human health
• We built and validated an ML system that accurately determines, within two years post-publication, the likelihood any/all paper(s) will be cited by a clinical article (a paper that describes a trial or guideline)
• Making accurate predictions requires both the features of the paper in question and the features of the papers that cite it
• Analyzing citation patterns indicates that knowledge flows through different domains before percolating into the clinical arena
Office of Portfolio Analysis
Temporal dynamics of translation
Office of Portfolio Analysis
Using fractional in place of binary counting of MeSH terms
Office of Portfolio Analysis
Temporal dynamics of translationLow, mid, or high fraction of Human MeSH terms