BioNLPSADI
Presenter: Ahmad C. Bukhari
Google Project Page: https://code.google.com/p/bionlp-sadi/
Project Demo Page: https://cbakerlab:8080/p/bionlp-sadi/
1
Motivation and Introduction
Past Research Work
Proposed Methodology
System architecture
System design
Ontology Development
SADI Service development
Demo and code view
Experiments and Results
Conclusion and Future work
References
2
Scientific literature is the most up-to-date source of information
Explosive growth has been observed in scientific literature production
The internet is full of biology-related databases and search engines
Text formats are provided by PubMed and OMIM.
Sequence data is provided by GenBank, in terms of DNA, and UniProt, in terms of protein.
Protein structures are provided by PDB, SCOP, and CATH.
3
Thousands of documents are produced weekly: it is impossible to read every published document
Several solutions were developed based on AI techniques
They lost significance because of newly coined terms and their static mechanisms
NLP emerged as a possible solution in the past decade
NLP was widely adopted by scientists
Several NLP-based applications are available on the internet
4
We introduce a semantically rich, interoperable suite of BioNLP services based on the SADI framework.
It exploits NLP technologies to extract biologically useful information from scientific documents.
It presents the extracted information so that it is reusable, searchable and interoperable.
It displays the output in an integrated format, which can in turn lead to better biological system analysis.
5
Existing text mining services
Existing text mining services with web services
• U-Compare
• Whatizit
• EBIMED
6
The scientific community is looking for a sophisticated solution that can handle biological data interoperability, usability and integration challenges.
We coupled useful biological NLP techniques with the SADI framework to cope with biological information logistics issues.
The proposed solution exploits NLP technologies to extract biologically relevant information with semantic support.
The proposed solution provides output in a reusable, searchable and interoperable format.
7
User Interaction Layer
SADI services suite
8
REST, XML, SOAP, or WSDL
KLEIO, U-Compare, GENIA, FACTA+, etc.
XML, RDF, OWL, RDFS
NLP + WS = XML output
SWS+BNLP
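The contrast on this slide, plain XML from conventional NLP web services versus RDF from semantic services, can be illustrated with a minimal sketch that serializes one extracted annotation as N-Triples. The namespace `http://example.org/bionlp#`, the property names `hasText` and `foundIn`, and the function names are assumptions made for illustration; the slides do not give the project's actual ontology terms.

```python
# Hypothetical namespace; the project's real ontology URIs are not given in the slides.
EX = "http://example.org/bionlp#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text):
    """Escape a string so it can be used inside an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def annotation_to_ntriples(doc_id, index, entity_type, surface_text):
    """Describe one annotation as three N-Triples statements."""
    subject = f"<{EX}doc/{doc_id}/annotation/{index}>"
    return [
        # What kind of entity was found (e.g. Drug, Mutation).
        f"{subject} <{RDF_TYPE}> <{EX}{entity_type}> .",
        # The text span exactly as it appeared in the document.
        f'{subject} <{EX}hasText> "{escape_literal(surface_text)}" .',
        # Link back to the source document so results can be merged later.
        f"{subject} <{EX}foundIn> <{EX}doc/{doc_id}> .",
    ]


triples = annotation_to_ntriples("12345", 0, "Drug", "Amoxicillin")
print("\n".join(triples))
```

Because every statement names its subject with a URI, the output of different services can be merged and queried together (for example with SPARQL), which is the interoperability argument made here.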
9
10
11
Deal with Annotation
All document related concepts
Feature Modeling
12
13
mutationFinder
DrugExtractor (enhanced)
DrugDrug Interaction (80% complete)
Drug2Food Interaction (business logic complete)
Pmid2pdf (enhanced)
Pdf2ascii (upgraded overall) // many bugs in existing SADI client-level integration service
14
• Java
• Servlet
• RDF
• SPARQL
• JSP
• JSF
• Javascript
• XHTML
• And several third-party libraries
15
Tools and technologies used
Demo and Code View
16
17
18
Show where the drug Amoxicillin (DB01060) has a positive effect against higher serum levels
Give me the sentences in which a mutation and a drug name occur together.
Extract all the drug names from the text and show me the interactions (if any) among those drugs
Tell me the foods that interact badly with the drug Cytarabine
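The second query above, sentences in which a mutation and a drug name occur together, can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: the regular expression is a simplified pattern for point mutations written as wild-type residue, position, mutant residue (it is not the mutationFinder rule set), the two-entry drug lexicon is hypothetical, and the sample text is invented.

```python
import re

# Simplified point-mutation pattern, e.g. "A123T" or "Ala123Thr".
# This is an assumption for illustration, not the mutationFinder rule set.
ONE = "ACDEFGHIKLMNPQRSTVWY"
THREE = ("Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|"
         "Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val")
MUTATION = re.compile(
    rf"\b(?:[{ONE}]\d+[{ONE}]|(?:{THREE})\d+(?:{THREE}))\b"
)

# Hypothetical drug lexicon; a real service would use a full vocabulary.
DRUGS = {"amoxicillin", "cytarabine"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def cooccurrences(text):
    """Return (sentence, mutations, drugs) for sentences that contain both."""
    results = []
    for sentence in split_sentences(text):
        mutations = MUTATION.findall(sentence)
        words = {w.lower() for w in re.findall(r"[A-Za-z]+", sentence)}
        drugs = sorted(DRUGS & words)
        if mutations and drugs:
            results.append((sentence, mutations, drugs))
    return results


# Invented sample text, used only to exercise the function.
sample = ("Cells carrying A123T were treated with cytarabine. "
          "Amoxicillin was given separately. "
          "The G12D variant was also observed.")
hits = cooccurrences(sample)
for sentence, mutations, drugs in hits:
    print(sentence, mutations, drugs)
```

Only the first sample sentence is reported, because it is the only one that contains both a mutation mention and a drug name. Matching at sentence level keeps the query simple, at the cost of missing mentions that are linked across sentence boundaries.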
19
Consolidated output generated by the system
20
Proposed a generalized architecture: semantic interoperability and integration among BNLP tools
Performed several experiments by designing different corpora and choosing different combinations of services
In most cases, the system generated results that met our requirements
As future work, we will try to enhance the performance of the system by refining the algorithms
A registry feature will be added to give users more freedom to work.
21
Topic Finding
Limited availability of tools
Development challenges (countless)
Integration with web
Finding a case study (still open)
22
E. Gatial, Z. Balogh, M. Ciglan, L. Hluchy, Focused web crawling mechanism based on page relevance, In: Proceedings of (ITAT 2005) information technologies applications and theory, 2005, pp. 41–45
F.N Natalya, LM Deborah, Ontology development 101: a guide to creating your first ontology. http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.htm
H. Cunningham, Y. Wilks, R. J. Gaizauskas, GATE, a General Architecture for Text Engineering. Computers and humanities (2002), 1057-1060.
R. Subhashini, V.J.S Kumar, Shallow NLP techniques for noun phrase extraction, In: Proceeding of Trendz in Information Sciences & Computing (TISC), 2010 , pp.73-77.
S. Nasrolahi, M. Nikdast, M. Boroujerdi, The semantic web: a new approach for future world wide web, In: Proceedings of World Academy of Science, Engineering and Technology, 2009, pp. 1149-1154
A.C. Bukhari, Y.G Kim, Exploiting the Heavyweight Ontology with Multi-Agent System Using Vocal Command System: A Case Study on E-Mall, International Journal of Advancements in Computing Technology 3(2011) 233-241.
A.C. Bukhari, Y.G Kim, Ontology-assisted automatic precise information extractor for visually impaired inhabitants, Artificial Intelligence Review (2005) Issn: 0269-2821.
D.H. Fudholi, N. Maneerat, R. Varakulsiripunth, Y. Kato, Application of Protégé, SWRL and SQWRL in fuzzy ontology-based menu recommendation, International Symposium on Intelligent Signal Processing and Communication Systems, 2009, pp. 631-634.
Baumgartner WA, Cohen KB, Fox L, Acquaah-Mensah G, Hunter L: Manual annotation is not sufficient for curating genomic databases. Bioinformatics 2007, 23:i41-i48.
Laurilla J, Naderi N, Witte R, Riazanov A, Kouznetsov A, Baker CJO: Algorithms and semantic infrastructure for mutation impact extraction and grounding. BMC Genomics 2010, 11(Suppl 4):S24.
23
Many Thanks
24