ICIC 2016: ChemAnalyser: Searching 76 Million Unique Structures and 5 Billion Chemical Entities,...
dr-haxel-congress-and-event-management-gmbh -
Transcript of ICIC 2016: ChemAnalyser: Searching 76 Million Unique Structures and 5 Billion Chemical Entities,...
www.infoapps.com
www.ChemAnalyser.com: 76 m unique structures, 6 billion chemical entities
Heidelberg, 17th of October 2016
Products
In-house Patent Information System
Automated classification engine
Full-text boolean and semantic search
Cognitive searchable chemical knowledge database
ChemAnalyser mission
ChemAnalyser's goal is to support R&D people and information
professionals in life science, pharma, cosmetics, chemistry, nutrition,
biology and material science with:
• Easy and intuitive use – Structure - Substructure
• Global chemical information
• Analytics and knowledge extraction (Export)
to support all R&D developments:
• New uses
• Idea generation
• Hypotheses validation
• Business intelligence
SOME EXAMPLES OF SEARCHES DONE IN MAY 2016 BY INDEPENDENT SEARCH AUTHORITIES
ChemAnalyser SciFinder
www.infoapps.de
Case 1: SciFinder - ChemAnalyser: PI3K-gamma kinase inhibitors competitive intelligence
➣ chemistry: check and secure IP for PI3K gamma inhibitors
SciFinder structure search in registry: no hit
Case 1: SciFinder - ChemAnalyser: PI3K-gamma kinase inhibitors competitive intelligence
➣ chemistry: check and secure IP for PI3K inhibitors
ChemAnalyser structure search in ChemDB: 7 hits
Case 2: SciFinder - ChemAnalyser: PI3K-gamma kinase inhibitors competitive intelligence
➣ Markush chemistry: check and secure IP for PI3K inhibitors
SciFinder via Markush structure search: 3 patent families
Case 2: SciFinder - ChemAnalyser: PI3K-gamma kinase inhibitors competitive intelligence
➣ Markush chemistry: check and secure IP for PI3K inhibitors
ChemAnalyser via R group structure search: 333 patent families
Use Case 3 (May 2016):
➣ chemistry classes & classifications
Search term “tomatidine” (a class of natural products for food):
ChemAnalyser: 1,333 patent hit documents; SciFinder: 57 patent hit documents
Search term “sesquiterpenes” (a class of natural products):
ChemAnalyser: 441,690 patent hit documents; SciFinder: 1,837 patent hit documents
Search terms “food additives” AND “natural products” AND “triterpenes”:
ChemAnalyser: 471,460 patent hit documents; SciFinder: 33 hit documents (all)
WHY DOES CHEMANALYSER HAVE MORE THAN 6 BILLION CHEMICAL ENTITIES?
WHY LARGE?
Coverage Bibliography
CC  Country Name  First Year
AM Armenia 2004
AP ARIPO 1971
AR Argentina 1965
AT Austria 1899
AU Australia 1917
BA Bosnia and Herzegovina 1998
BE Belgium 1862
BG Bulgaria 1973
BR Brazil 1972
BY Belarus 1997
CA Canada 1863
CH Switzerland 1888
CL Chile 2005
CN China 1985
CO Colombia 1995
CR Costa Rica 1988
CS Czechoslovakia 1911
CU Cuba 1968
CY Cyprus 1921
CZ Czech Republic 1992
DD German Democratic Republic 1951
DE Germany 1877
DK Denmark 1895
DO Dominican Republic 2001
DZ Algeria 2000
EA Eurasia 1996
EC Ecuador 1990
EE Estonia 1994
EG Egypt 1976
EP European PO 1978
ES Spain 1827
FI Finland 1842
FR France 1855
GB Great Britain 1782
GC Gulf CC 2002
GE Georgia 2000
GR Greece 1920
GT Guatemala 1961
HK Hong Kong 1976
HN Honduras 2005
HR Croatia 1994
HU Hungary 1913
ID Indonesia 1988
IE Ireland 1930
IL Israel 1968
IN India 1912
IS Iceland 1925
IT Italy 1921
JO Jordan 1971
JP Japan 1913
KE Kenya 1975
KG Kyrgyzstan 2003
KR South Korea 1978
KZ Kazakhstan 1993
LT Lithuania 1992
LU Luxembourg 1933
LV Latvia 1993
MA Morocco 1977
MC Monaco 1957
MD Moldova 1994
ME Montenegro 2010
MN Mongolia 1972
MT Malta 1968
MW Malawi 1973
MX Mexico 1980
MY Malaysia 1953
NI Nicaragua 2003
NL Netherlands 1913
NO Norway 1909
NZ New Zealand 1978
OA OAPI 1966
PA Panama 1996
PE Peru 1992
PH Philippines 1975
PL Poland 1930
PT Portugal 1967
RO Romania 1907
RS Republic of Serbia 2013
RU Russia 1975
SA Saudi Arabia 2006
SE Sweden 1888
SG Singapore 1983
SI Slovenia 1992
SK Slovakia 1993
SM San Marino 2000
SU Soviet Union 1919
SV El Salvador 1970
TH Thailand 2010
TJ Tajikistan 1996
TN Tunisia 1990
TR Turkey 1973
TT Trinidad & Tobago 1994
TW Chinese Taipei 1991
UA Ukraine 1987
US USA 1790
UY Uruguay 2000
UZ Uzbekistan 1997
VN Viet Nam 1984
WO WIPO 1978
YU Former Serbia and Montenegro 1964
ZA South Africa 1968
ZM Zambia 1968
ZW Zimbabwe 1980
Full-Text and Machine translation EN
Country Code  Country Name  Earliest Year  Fulltext (F)  Machine Translation English (M)  Human Translation (H)
AR Argentina 2012-current F M
AT Austria 1902-2013 F M
AU Australia 1924-current F
BE Belgium 1897-2013 F M
BR Brazil 2005-current F M
CA Canada 1905-current F
CH Switzerland 1888-current F M
CN China 1985-current F M H (Applicant Names)
DE Germany 1920-current F M
DK Denmark 1980-current F M
EP European Patent Office 1978-current F M
ES Spain 2004-current F M
FI Finland 1980-current F M
FR France 1902-current F M
GB Great Britain 1859-current F
GC Arab States of the Gulf 2000-current F M
HK Hong Kong 2005-current F
IE Ireland 1980-current F
IL Israel 2000-current F M
IN India 2004-current F
JP Japan 1993-current F M
KR South Korea 2006-current F M
MX Mexico 2010-current F M
LU Luxembourg 1980-current F M
MY Malaysia 2005-current F
NL Netherlands 1980-current F M
NO Norway 1980-current F M
PT Portugal 1980-current F M
RU Russia 1993-current F M
SE Sweden 1980-current F M
SG Singapore 2006-current F
TH Thailand 2010-current English claims only
TR Turkey 2015-current F M
US United States of America 1753-current F
VN Viet Nam 2010-current English claims only
WO World Intellectual Property Organization 1978-current F M
Completeness of coverage over country codes (CC) and years
106.312.443 IP, Re-Keyed
ChemAnalyser - Data collection
The patent information pool
100 m documents – articles in full text – all IP in English machine translation
and in the original language
meets
The non-patent information pool
80 m scientific documents (articles and books) + websites, databases
We have collected a large set of abstracts and full-text literature: scientific journals, books and conference
proceedings. Most of these sources are referenced in Scopus, EMBASE, Medline, ScienceDirect or other data
collections. Full-text documents can be obtained from Reprints Desk. In addition, we have collected and
integrated scientific data from public and proprietary databases such as clinical trials, ChEMBL, FDA,
EMEA, EFSA and others, as well as any collection of websites of interest. On demand, news can be extracted on
any topic from world-leading news providers, blogs and social networks.
ChemAnalyser – best annotation methods
Chemical Synonyms > 450.000.000 (450 million)
Protein Synonyms > 4.000.000 (4 million)
Disease Synonyms >
…
Chemical Concepts > 100.000.000 (100 million)
Protein Concepts > 680.000
Disease Concepts >
…
60 Different databases
8 Years of chemical collection
Results
6.000.000.000 chemical terms (6 billion)
76.000.000 unique chemical structures (76 million)
The complete back file is re-indexed quarterly!
Easy and fast to find relevant information
ChemAnalyser - context-sensitive recognition + cognitive search
Homonym resolution: disambiguate based on environment – material properties
ChemAnalyser - context-sensitive recognition – better hits!
Context sensitive recognition: annotate only in desired context
This technology enables:
1. More relevant recognition / information
2. A better relevance sorting of hit lists
3. Fewer hits for the user, with less relevant hits at the end of the list
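The idea of annotating a term only in the desired context can be illustrated with a minimal sketch. It is not ChemAnalyser's implementation: the cue-word table `SENSE_CUES`, the function `annotate` and the window size are all hypothetical, and a simple word-overlap count stands in for whatever model the product actually uses.

```python
import re

# Hypothetical cue words per sense; illustrative only, not ChemAnalyser data.
SENSE_CUES = {
    "chemical_element": {"metal", "alloy", "solder", "oxide", "tin"},
    "drug_candidate": {"compound", "series", "optimization", "candidate"},
}


def annotate(text, term, window=5):
    """Return one sense label (or None) for each occurrence of `term`."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    labels = []
    for i, token in enumerate(tokens):
        if token != term:
            continue
        # The "environment": up to `window` tokens on each side of the term.
        context = set(tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window])
        scores = {sense: len(context & cues) for sense, cues in SENSE_CUES.items()}
        best = max(scores, key=scores.get)
        # Annotate only when the context supports a sense; otherwise skip.
        labels.append(best if scores[best] > 0 else None)
    return labels


print(annotate("The solder contains lead and tin metal.", "lead"))
print(annotate("We selected a lead compound from the series.", "lead"))
print(annotate("This will lead to better results.", "lead"))
```

In this toy version the verb "lead" in the third sentence receives no annotation at all, which mirrors the "annotate only in desired context" point above.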
ChemAnalyser - Understanding chemistry
Chemistry: class - compound class
Chemistry: compound - regular chemical compound
Protein - protein that is targeted by chemistry
Effect or process - "mode of action" of compound on target
Underlined: short form
Overlined: drawable structure
Triangle: multiple annotations
WHY CHEMANALYSER BRINGS YOU RELEVANT INFORMATION FIRST
WHY RELEVANT HITS FIRST?
Result list ranking
To determine the right documents, ranking depends on:
• Number of concepts
• Hierarchy of concepts
• Publication date
• Document structure (Claims, Examples…)
ChemAnalyser decides like an expert whether cancer is only named in the
document, or whether the document relates to cancer and cancer has
chemical reactions with other substances in the document.
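How the four ranking factors listed above might combine can be sketched as follows. This is only an assumption-laden illustration: the slide does not give a formula, so the section weights, the depth discount, the recency term and all names (`Hit`, `Document`, `score`, `rank`) are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical weights: a concept found in the claims counts more than one
# found in the examples or the general description.
SECTION_WEIGHT = {"claims": 3.0, "examples": 2.0, "description": 1.0}


@dataclass
class Hit:
    concept: str
    section: str
    depth: int  # 0 = the searched concept itself, 1 = a child concept, ...


@dataclass
class Document:
    doc_id: str
    published: date
    hits: list


def score(doc, today):
    # Document structure and concept hierarchy: weight by section,
    # discount hits that are further down the hierarchy.
    concept_score = sum(
        SECTION_WEIGHT.get(h.section, 1.0) / (1 + h.depth) for h in doc.hits
    )
    # Number of concepts: count distinct concepts found in the document.
    distinct = len({h.concept for h in doc.hits})
    # Publication date: newer documents get a small bonus.
    age_years = max((today - doc.published).days / 365.25, 0.0)
    recency = 1.0 / (1.0 + age_years)
    return concept_score + distinct + recency


def rank(docs, today):
    return sorted(docs, key=lambda d: score(d, today), reverse=True)


docs = [
    Document("A", date(2015, 1, 1), [Hit("cancer", "description", 0)]),
    Document("B", date(2015, 1, 1), [Hit("cancer", "claims", 0),
                                     Hit("PI3K-gamma", "examples", 0)]),
]
print([d.doc_id for d in rank(docs, date(2016, 10, 19))])
```

A document that merely names a concept in its description ends up below one that claims it and links it to a second concept, which is the behaviour the slide describes in words.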
WHY IS CHEMANALYSER EASY TO USE?
WHY EASY AND FASTER?
Cognitive Search (cognitive computing)
Synonyms
• DB Identifiers like:
CAS No., PubChem, ChEMBL
• Company Numbers
• IUPAC Names
• INN
• Trade Names
• InChIKey, SMILES
• Formulas
• Abbreviations
• Acronyms
• Different languages
Children
• Chemical ontology
• Intellectual Human work
• Database integration
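The combination of synonyms and ontology children can be pictured as a query expansion step. The sketch below is a toy under stated assumptions: the lookup tables `SYNONYMS` and `CHILDREN` and the function `expand` are hypothetical stand-ins, and the entries are a tiny illustrative sample rather than ChemAnalyser data.

```python
# Toy lookup tables; a real system would load these from its synonym
# database and chemical ontology. All entries are illustrative.
SYNONYMS = {
    "aspirin": ["acetylsalicylic acid", "50-78-2"],  # name and CAS No.
}
CHILDREN = {
    "salicylates": ["aspirin", "salicylic acid"],  # class -> members
}


def expand(term, synonyms=SYNONYMS, children=CHILDREN):
    """Collect the term, its synonyms and all descendants with their synonyms."""
    expanded = set()
    pending = [term]
    while pending:
        current = pending.pop()
        if current in expanded:
            continue  # already handled; also guards against cycles
        expanded.add(current)
        pending.extend(synonyms.get(current, []))
        pending.extend(children.get(current, []))
    return sorted(expanded)


print(expand("salicylates"))
```

Searching for the class name then also matches documents that mention only a member compound by trade name or registry number.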
• Cognitive search means understanding the meaning of the search!
• On that basis, we bring only relevant search results, not
search results based on simple full-text hits that ignore the sense of
your search (ontology).
• We sort your search results so that searchers get the best hits
first (like Google).
We will find it, even if you have not searched for the exact name!
We will bring you ideas for how you can use it in another way!
Extensive Export capabilities
SELECT DISEASES RELATED TO PI3K-GAMMA, BEYOND CANCER
PI3K gamma kinase - diseases
Use Case: select diseases related to PI3K-gamma, beyond cancer
➣ what other diseases are affected by PI3K-gamma?
Use Case: select diseases related to PI3K-gamma
➣ is insulin resistance affected by PI3K-gamma?
ONE SIMPLE SEARCH – SEVERAL REPOSITORIES
PI3K gamma search
Use Case: PI3K-gamma kinase inhibitors competitive intelligence
➣ check and secure IP for PI3K gamma target
Use Case: PI3K-gamma kinase inhibitors competitive intelligence
➣ chemistry: check and secure IP for PI3K gamma inhibitors
Use Case: Diabetes type 2 – PI3K gamma - search
➣ chemistry: check and secure IP for PI3K gamma inhibitors
Use Case: Diabetes type 2 – PI3K gamma – search results
➣ chemistry: check and secure IP for PI3K gamma inhibitors
Use Case: Diabetes type 2 – PI3K gamma - hit
➣ chemistry: check and secure IP for PI3K gamma inhibitors
Sascha Kamhuber
Phone: +49 89 30748574
Raphael Marche
Phone: +49 8742 2847039
Johannes Herbert
Phone: +49 89 30748575
Thank you!
www.infoapps.com
WWW.CHEMANALYSER.COM
LIVE!
Use Cases: … typical agroscience workflow
➣ select targets related to project goal
➣ competitive intelligence – am I too late or too early?
➣ check and secure IP for target modulators
➣ select chemistry/modulators for my target
➣ methods and assays related to my target
➣ mode-of-action of inhibitors
backups
➣ off-target effects
➣ predict ADMET compound properties
➣ synthesis
Use Case: select targets for project goal - insecticide
Use Case: select effects related to target chitinase
Use Case: select IP related to chitinase
+(prot:"chitinase" prot:"chitinase 1" prot:"chitinase 3" prot:"chitinase 5" prot:"chitinase-like protein" -prot:"acidic mammalian chitinase") +eff:"insecticidal"
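A query like the one above can be assembled programmatically. The sketch below assumes that `+` marks a required clause and `-` an excluded one; that reading is based only on the Lucene-like look of the query, since the slide does not define the syntax. The helper names `field_clause` and `build_query` are hypothetical; only the field names `prot` and `eff` come from the slide.

```python
def field_clause(field, value):
    """Format one field:"value" clause as it appears on the slide."""
    return f'{field}:"{value}"'


def build_query(include, exclude, required):
    """Build '+(<include...> -<exclude...>) +<required...>' from (field, value) pairs."""
    group = [field_clause(f, v) for f, v in include]
    group += ["-" + field_clause(f, v) for f, v in exclude]
    parts = ["+(" + " ".join(group) + ")"]
    parts += ["+" + field_clause(f, v) for f, v in required]
    return " ".join(parts)


query = build_query(
    include=[
        ("prot", "chitinase"),
        ("prot", "chitinase 1"),
        ("prot", "chitinase 3"),
        ("prot", "chitinase 5"),
        ("prot", "chitinase-like protein"),
    ],
    exclude=[("prot", "acidic mammalian chitinase")],
    required=[("eff", "insecticidal")],
)
print(query)
```

Keeping the include, exclude and required parts as separate lists makes it easy to vary one of them, for example swapping the effect term, without retyping the protein group.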
Use Case: select IP related to chitinase: Expert Search
Use Case: am I too late? Chitinase patents
Use Case: am I too late? Chitinase literature
Use Case: ryanodine receptor ligands competitive intelligence
Use Case: pesticidal pyrazoles
Why infoPatent classifier (text mining) vs. exact chemical relations
Relations bring exact and correct results, compared to patterns:
select diseases related to PI3K-gamma
what other diseases are affected by PI3Kgamma ?
Facets for easy filtering on a document basis
Examples and comparisons with STN and SciFinder
Use Case: PI3K-gamma kinase inhibitors competitive intelligence
➣ chemistry: check and secure IP for PI3K gamma inhibitors
SciFinder structure search in registry: no hit
Use Case: PI3K-gamma kinase inhibitors competitive intelligence
➣ chemistry: check and secure IP for PI3K inhibitors
ChemAnalyser structure search in ChemDB: 7 hits
Use Case: PI3K-gamma kinase inhibitors competitive intelligence
➣ Markush chemistry: check and secure IP for PI3K inhibitors
SciFinder via Markush structure search: 3 patent hit documents
Use Case: PI3K-gamma kinase inhibitors competitive intelligence
➣ Markush chemistry: check and secure IP for PI3K inhibitors
ChemAnalyser via R group structure search: 333 patent families
Use Case: PI3K-gamma kinase inhibitors competitive intelligence
➣ PI3K inhibitors
Use Case: PI3K-gamma kinase inhibitors competitive intelligence
➣ chemistry classes & classifications
Search term “tomatidine” (a class of natural products for food):
ChemAnalyser: 1,333 patent hit documents; SciFinder: 57 patent hit documents
Search term “sesquiterpenes” (a class of natural products):
ChemAnalyser: 441,690 patent hit documents; SciFinder: 1,837 patent hit documents
Search terms “food additives” AND “natural products” AND “triterpenes”:
ChemAnalyser: 471,460 patent hit documents; SciFinder: 33 hit documents (all)
Use Case: known PI3K-gamma kinase inhibitors
➣ including 3rd party DBs (ChEMBL): select modulators for my target
Use Case: PI3K-gamma kinase inhibitors related methods
➣ methods and assays related to my target
Use Case: P2X7 agonists
➣ mode-of-action of modulators
Use Case: PI3K-gamma kinase inhibitors selectivity
➣ target toxicity effects ?
Backups
Use Case: select diseases related to PI3K-gamma
➣ what PI3K compounds are in clinical trials ?
Use Case: PI3K-gamma kinase inhibitors properties
➣ predict ADMET compound properties with data from text
Use Case: inhibitor synthesis planning
➣ synthesis ? ...extracting reactions from patents and articles
Use Case: select diseases related to PI3K-gamma
➣ what other diseases are affected by PI3K-gamma?