When every piece matters · 2019. 10. 3. · Jabe Wilson, Director, Text and Data Analytics,...
Jabe Wilson, Director, Text and Data Analytics, Elsevier
28 February, 2017
Mobilizing informational resources for rare diseases
When every piece matters
| 2
Rare diseases – when every piece matters
Nick Sireau at TEDx ImperialCollege
https://www.youtube.com/watch?v=B4UnVlU5hAY
• No support
• No funding
• No treatments
Findacure is a UK charity that is building the rare disease community to raise awareness,
drive research and develop treatments.
Elsevier is partnering with Findacure scientists to help identify and evaluate treatments
for congenital hyperinsulinism
• Patient community
• Collaboration with medical researchers
• Drug repurposing candidate
• Fundraising
• Clinical Trial
| 3
Findacure / Elsevier collaboration
Dr Rick Thompson
Findacure
Dr Nicolas Sireau
Findacure
Dr Matthew Clark
Elsevier
Dr Maria Shkrob
Elsevier
| 4
Why do we need literature?
PLACES PEOPLE GENES
DRUGS INTERACTIONS PROPERTIES
| 5
Why text mining?
Text mining: analyzing text to extract information that is useful for particular purposes
Amorphous information:
• Hard to deal with
• Hard to deal with algorithmically
• Not scalable
Structured information:
• Search
• Visualize
• Network analysis
• Scalable
• Compressed
Image Source: http://www.thesocialleader.com/wp-content/uploads/2011/03/paper-piles.jpg
20km
| 6
Creating a comprehensive view of CHI with Elsevier R&D Solutions
Information Assets
• Content: Elsevier’s vast set of literature and patent data
• Data normalization: Taxonomies and dictionaries to normalize author names, institutions, drugs, targets, and other important terms
• Information extraction: Finding semantic relationships, targets, pathways, drugs, and bioactivities
Applied
• CHI Library
• Disease, Target, Pathway, and Compound Analysis
• Research Landscape Analysis
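The data normalization idea on this slide (dictionaries that map variant terms to one standardized name) can be illustrated with a minimal sketch. The synonym entries and the function name `normalize_term` are assumptions made for illustration; they are not Elsevier's taxonomy or tooling.

```python
# Illustrative synonym dictionary (assumed entries, not Elsevier's taxonomy).
CANONICAL = "Persistent Hyperinsulinemia Hypoglycemia of Infancy"
SYNONYMS = {
    "chi": CANONICAL,
    "congenital hyperinsulinism": CANONICAL,
    "persistent hyperinsulinemia hypoglycemia of infancy": CANONICAL,
}


def normalize_term(term: str) -> str:
    """Return the standardized name for a term, or the term unchanged."""
    # Case-fold and collapse runs of whitespace before the lookup.
    key = " ".join(term.lower().split())
    return SYNONYMS.get(key, term)
```

A real dictionary would also need to handle ambiguous abbreviations, which a plain lookup like this cannot resolve.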
| 7
Research landscape analysis: connecting patients,
researchers and institutions
KEY AUTHORS (bar chart, axis 0 to 70): Stanley, C.A.; Hussain, K.; De Lonlay, P.; Rahier, J.; Ellard, S.; Flanagan, S.E.; Shyng, S.L.; Nihoul-Fekete, C.; Bellanne-Chantelot, C.; Robert, J.J.; Brunelle, F.
KEY INSTITUTIONS (bar chart, axis 0 to 80): The Children's Hospital of Philadelphia; UCL Institute of Child Health; Hopital Necker Enfants Malades; University of Pennsylvania, School of…; UCL; Universite Paris Descartes; University of Pennsylvania; Cliniques Universitaires Saint-Luc,…; University of Exeter; Oregon Health and Science University
KEY PATENTS (bar chart, axis 0 to 2): Ajinomoto CO., INC.; Arkray, INC.; Korea Research Institute…; ViviaBiotech, S.L.; Bassa, Babu V.; Commisariat a l'Energie…; Glaser, Benjamin; Kowa CO., LTD.; Kyowa Hakko Kogyo…
• Most prolific authors and institutions,
based on full-text searching for terms and
synonyms
• Patent assignee names from Reaxys
| 8
Research landscape analysis: collaboration
• Network of people and organizations collaborating in CHI space based on
co-authorship
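As a rough illustration of how a collaboration network can be derived from co-authorship, the sketch below counts how often each pair of authors shares a paper. The function name `coauthor_edges` and the input shape (one author list per paper) are assumptions for illustration, not the method used in the presentation.

```python
from collections import Counter
from itertools import combinations


def coauthor_edges(papers):
    """Count, for each unordered author pair, the papers they share.

    `papers` is an iterable of author-name lists, one list per paper.
    """
    edges = Counter()
    for authors in papers:
        # De-duplicate and sort so each pair is counted once per paper
        # and always appears in the same (a, b) order.
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges
```

The resulting pair counts can serve as weighted edges in a graph, with authors (or their normalized institutions) as nodes.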
| 9
CHI: finding relevant documents
Finding documents that mention certain aspects of CHI
• CHI in abstract or title
• CHI subtypes
• By publication type
• By study type
(including MeSH terms)
Indicate what to query
Filter by study type
Specify distance
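The "specify distance" option suggests a proximity query: two terms must occur within a given number of words of each other. A minimal sketch of that idea follows, assuming single-word terms and simple word tokenization; `within_distance` is a hypothetical helper, not the query tool shown on the slide.

```python
import re


def within_distance(text, term_a, term_b, max_distance):
    """True if term_a and term_b occur within max_distance tokens.

    Assumes both terms are single words; matching is case-insensitive.
    """
    tokens = re.findall(r"\w+", text.lower())
    pos_a = [i for i, tok in enumerate(tokens) if tok == term_a.lower()]
    pos_b = [i for i, tok in enumerate(tokens) if tok == term_b.lower()]
    # Compare every occurrence of one term with every occurrence of the other.
    return any(abs(i - j) <= max_distance for i in pos_a for j in pos_b)
```

Multi-word terms and synonym expansion would need a phrase matcher on top of this.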
| 10
CHI: finding targets, drugs, and drug effects
Extracting structured information from text
Annotations: Standardized names; Standardized link; Evidence
Columns: "protein"; "terms for genetic variations"; "Persistent Hyperinsulinemia Hypoglycemia of Infancy"; Relevant Text; Title; Authors; Reference Date; DOI

Row 1
"protein": ABCC8
"terms for genetic variations": mutation
"Persistent Hyperinsulinemia Hypoglycemia of Infancy": Persistent Hyperinsulinemia Hypoglycemia of Infancy
Relevant Text: In the literature, nine genes have been reported to be associated with CHI, with the most common genetic causes of CHI being mutations in either ABCC8 or KCNJ11.
Title: Successful treatment of a newborn with congenital hyperinsulinism having a novel heterozygous mutation in the ABCC8 gene using subtotal pancreatectomy
Authors: Yen C.-F, Huang C.-Y, Chan C.-I, Hsu C.-H, Wang N.-L, Wang T.-Y, Lin C.-L, Ting W.-H.
Reference Date: 2016
DOI: 10.1016/j.tcmj.2016.04.001

Row 2
"protein": ABCC8
"terms for genetic variations": loss of function mutation
"Persistent Hyperinsulinemia Hypoglycemia of Infancy": Persistent Hyperinsulinemia Hypoglycemia of Infancy
Relevant Text: GOF and loss-of function mutations in KCNJ11 (Kir6.2) and ABCC8 (SUR1), which encode the predominant KATP channel subunits in pancreatic β-cells and in neurons, are now well-understood to underlie neonatal diabetes and congenital hyperinsulinism, respectively.
Title: Adenosine Triphosphate-Sensitive Potassium Currents in Heart Disease and Cardioprotection
Authors: Nichols C.G.
Reference Date: 2016
DOI: 10.1016/j.ccep.2016.01.005

Row 3
"protein": ATP-activated inward rectifier potassium channel
"terms for genetic variations": mutation
"Persistent Hyperinsulinemia Hypoglycemia of Infancy": Persistent Hyperinsulinemia Hypoglycemia of Infancy
Relevant Text: The prevalence of KATP channel gene mutations, diazoxide responsiveness, and rates for surgery is broadly commensurate with other CHI cohorts.
Title: Feeding Problems Are Persistent in Children with Severe Congenital Hyperinsulinism
Authors: Banerjee I, Forsythe L, Skae M, Avatapalle HB, Rigby L, Bowden LE, Craigie R, Padidela R, Ehtisham S, Patel L, Cosgrove KE, Dunne MJ, Clayton PE.
Reference Date: 2016
DOI: 10.3389/fendo.2016.00008
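One way to picture the extracted output (standardized names, a standardized link, and the supporting evidence) is as a small record type. The sketch below is an assumption about shape only: the class `Finding`, its field names, and `group_evidence` are hypothetical, and the two example records reuse values shown on this slide.

```python
from collections import defaultdict
from dataclasses import dataclass

PHHI = "Persistent Hyperinsulinemia Hypoglycemia of Infancy"


@dataclass(frozen=True)
class Finding:
    subject: str  # standardized name, e.g. a protein
    link: str     # standardized link, e.g. a type of genetic variation
    obj: str      # standardized name, e.g. a disease
    doi: str      # reference supporting the finding (the evidence)


# Two findings taken from the rows shown on the slide.
FINDINGS = [
    Finding("ABCC8", "mutation", PHHI, "10.1016/j.tcmj.2016.04.001"),
    Finding("ABCC8", "loss of function mutation", PHHI,
            "10.1016/j.ccep.2016.01.005"),
]


def group_evidence(findings):
    """Collect the supporting DOIs for each (subject, link, obj) triple."""
    grouped = defaultdict(list)
    for f in findings:
        grouped[(f.subject, f.link, f.obj)].append(f.doi)
    return dict(grouped)
```

Because names and links are standardized, findings from different papers collapse onto the same triple, which is what makes counting and visualization possible. A fuller record would also keep the evidence sentence itself.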
| 11
CHI: summarization and visualization of the findings
• Visualization and summarization of
6.2 M literature findings
• Linking to non-literature sources
| 12
From pathways to treatments: automated analysis combines bioassay data with text-mined data
Step 1: Find all targets that could be used to affect the disease state
Step 2: Query for each protein to find compounds that target it (>6 log units)
Step 3: Collate data by compound to summarize the targets/activities related to disease that the compound hits
• Compute geometric mean of activities for ranking
• Rank by number of targets and geometric mean of activities against targets
Results:
• All compounds that were observed to bind to targets in pathway
• Sorted by number of active targets. Too many targets may suggest lack of specificity.
• Targets and activities for each compound
• Mean of activities among these targets
• Drug-likeness metrics for sorting/classification
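The collate-and-rank step can be sketched as follows. This is a minimal illustration under stated assumptions: activities are taken to be negative-log potencies (so ">6 log units" becomes `activity > 6.0`), the best value is kept when a compound has several measurements on one target, and the sort is descending on both keys. The function name `rank_compounds` and the record layout are hypothetical.

```python
import math
from collections import defaultdict


def rank_compounds(records, disease_targets, threshold=6.0):
    """Rank compounds by number of disease targets hit, then by
    geometric mean of their activities against those targets.

    `records` is an iterable of (compound, target, activity) tuples.
    """
    best = defaultdict(dict)
    for compound, target, activity in records:
        # Keep only disease-related targets above the activity cut-off.
        if target in disease_targets and activity > threshold:
            previous = best[compound].get(target, 0.0)
            best[compound][target] = max(previous, activity)

    rows = []
    for compound, acts in best.items():
        # Geometric mean computed as exp(mean(log(x))).
        gmean = math.exp(sum(math.log(a) for a in acts.values()) / len(acts))
        rows.append((compound, len(acts), gmean))

    # Most targets first; ties broken by higher geometric mean.
    rows.sort(key=lambda row: (-row[1], -row[2]))
    return rows
```

As the slide notes, a very high target count may indicate lack of specificity, so the top of such a list still needs expert review.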
| 13
Summary
• Used Elsevier’s extensive content, tools and capabilities to provide information about a rare disease:
  • Text mining to find targets and summarize what is known about the disease mechanism
  • Bioactivity data to find drugs that act on those targets
  • Normalized names of authors and institutions to find collaborators
• Once the output of interest is decided, answer generation can be automated. Provide a disease name and get:
  • List of targets with supporting information
  • Sorted list of approved drugs with supporting information
  • Key opinion leaders (KOLs) and institutes
Thank you
https://www.elsevier.com/solutions/professional-services